Concentration Inequalities: Martingale Approach and Entropy Method. Lizhe Sun and Boning Yang. Florida State University, March 1, 2018.
Slide 1: Florida State University, March 1, 2018
Slide 2: Framework. 1. (Lizhe) Basic inequalities, the Chernoff bounding technique, and a review for STA 6448. 2. (Lizhe) Discrete-time martingales and concentration inequalities via the martingale approach. 3. (Boning) The entropy method.
Slide 3: Part I: Basic inequalities and the Chernoff bounding technique
Slide 4: Why concentration inequalities? Concentration inequalities quantify how a random variable $X$ deviates around its mean $\mu$. They usually take the form of two-sided bounds for the tails of $X - \mu$, such as $P(|X - \mu| \ge t) \le \text{something very small}$, $\forall t > 0$. Based on the CLT, we can get asymptotic results as $n \to \infty$. But in the machine learning community and in high-dimensional data analysis, we prefer to exploit the non-asymptotic properties of random variables.
Slide 5: Basic inequalities I. First of all, we recall some basic tools and inequalities. For any nonnegative random variable $X$, $E(X) = \int_0^\infty P(X \ge t)\,dt$. This implies Markov's inequality: for any nonnegative random variable $X$ and $t > 0$, $P(X \ge t) \le \frac{E(X)}{t}$.
Slide 6: Basic inequalities II. In general, if $\varphi$ is a strictly monotonically increasing nonnegative-valued function, then for any random variable $X$ and real number $t$, $P(X \ge t) = P\{\varphi(X) \ge \varphi(t)\} \le \frac{E[\varphi(X)]}{\varphi(t)}$. For example, $\varphi(x) = x^2$ induces Chebyshev's inequality: if $X$ is an arbitrary random variable and $t > 0$, then $P\{|X - EX| \ge t\} = P\{|X - EX|^2 \ge t^2\} \le \frac{E|X - EX|^2}{t^2} = \frac{\operatorname{var}(X)}{t^2}$.
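As a quick numerical illustration of Markov's and Chebyshev's inequalities, here is a minimal Python sketch; the unit-rate exponential distribution, sample size, and thresholds are arbitrary illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.exponential(scale=1.0, size=1_000_000)  # E[X] = 1, var(X) = 1

for t in [2.0, 4.0, 8.0]:
    empirical = np.mean(X >= t)                 # P(X >= t)
    markov = X.mean() / t                       # Markov: E[X] / t
    chebyshev = X.var() / (t - X.mean()) ** 2   # Chebyshev via |X - EX| >= t - E[X]
    print(f"t={t}: empirical={empirical:.4f}  Markov<={markov:.4f}  Chebyshev<={chebyshev:.4f}")
```

For the exponential distribution the true tail decays like $e^{-t}$, so both polynomial bounds become loose for large $t$; this is exactly the gap the Chernoff technique on the next slide addresses.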
Slide 7: Chernoff bounding technique. Taking $\varphi(x) = \exp(\lambda x)$, where $\lambda$ is an arbitrary positive number, for any random variable $X$ and any $t > 0$ we have $P(X \ge t) = P\{\exp(\lambda X) \ge \exp(\lambda t)\} \le \exp(-\lambda t) E[\exp(\lambda X)] = \exp(-\lambda t + \log E[\exp(\lambda X)])$, $\lambda > 0$. Please note that if we want to bound the probability of the lower tail, $P(X \le -t)$, we follow the same steps, but with $-X$ rather than $X$. Now, we need to obtain a tight upper bound for $\exp(-\lambda t + \log E[\exp(\lambda X)])$.
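To make the optimization over the free parameter $\lambda$ concrete, here is a small sketch for a standard Gaussian, where $\log E[\exp(\lambda X)] = \lambda^2/2$ and $\inf_{\lambda > 0} \exp(-\lambda t + \lambda^2/2) = \exp(-t^2/2)$, attained at $\lambda = t$; the Gaussian example and the grid over $\lambda$ are illustrative choices.

```python
import math
import numpy as np

lambdas = np.linspace(1e-3, 20, 20_000)  # grid over the free parameter lambda

for t in [1.0, 2.0, 3.0]:
    # Chernoff bound for X ~ N(0,1): minimize exp(-lam*t + log MGF(lam)) over the grid.
    bound = np.exp(np.min(-lambdas * t + lambdas**2 / 2))
    exact = 0.5 * math.erfc(t / math.sqrt(2))  # true Gaussian tail P(X >= t)
    print(f"t={t}: Chernoff={bound:.5f}  closed form={math.exp(-t**2 / 2):.5f}  exact={exact:.5f}")
```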
Slide 8: Chernoff bounding technique: Hoeffding inequality I. Theorem. Let $X$ be a random variable such that $X \in [a, b]$ a.s. for some finite $a < b$. Then, for any $\lambda \ge 0$, $E[\exp(\lambda(X - EX))] \le \exp\left(\lambda^2 (b - a)^2 / 8\right)$. Proof. For any $p \in [0, 1]$ and $s \in \mathbb{R}$, let us define the function $H_p(s) = \log[p \exp(s(1 - p)) + (1 - p) \exp(-sp)]$. Let $\xi = X - EX$, where $\xi \in [a - EX, b - EX]$. Using the convexity of the exponential function, we can write
Slide 9: Chernoff bounding technique: Hoeffding inequality II. Proof (continued). $\exp(\lambda \xi) = \exp\left(\lambda\left(\frac{X - a}{b - a}(b - EX) + \frac{b - X}{b - a}(a - EX)\right)\right) \le \frac{X - a}{b - a} \exp(\lambda(b - EX)) + \frac{b - X}{b - a} \exp(\lambda(a - EX))$. Taking expectations of both sides, we get $E[\exp(\lambda \xi)] \le \frac{EX - a}{b - a} \exp(\lambda(b - EX)) + \frac{b - EX}{b - a} \exp(\lambda(a - EX))$. Letting $p = \frac{EX - a}{b - a}$ and $s = \lambda(b - a)$, we have $\exp(H_p(s)) = \frac{EX - a}{b - a} \exp(\lambda(b - EX)) + \frac{b - EX}{b - a} \exp(\lambda(a - EX))$.
Slide 10: Chernoff bounding technique: Hoeffding inequality III. Proof (continued). Using a Taylor expansion, we get the bound $H_p(s) \le \frac{s^2}{8}$ for all $p \in [0, 1]$ and all $s \in \mathbb{R}$ (note $H_p(0) = H_p'(0) = 0$ and $H_p''(s) \le 1/4$ for all $s$); with the above definitions of $p$ and $s$, this yields the Hoeffding inequality. A disadvantage of the Hoeffding inequality is that it ignores information about the variance of $X$. The Bernstein inequality provides an improvement in this respect. Question: Why do we use the Chernoff bounding technique?
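The key bound $H_p(s) \le s^2/8$ can also be checked numerically over a grid; a minimal sketch, where the grid resolution is an arbitrary choice:

```python
import numpy as np

def H(p, s):
    # H_p(s) = log(p*exp(s*(1-p)) + (1-p)*exp(-s*p)), as defined in the proof
    return np.log(p * np.exp(s * (1 - p)) + (1 - p) * np.exp(-s * p))

P, S = np.meshgrid(np.linspace(0.01, 0.99, 99), np.linspace(-10, 10, 2001))
gap = S**2 / 8 - H(P, S)
print("min of s^2/8 - H_p(s) over the grid:", gap.min())  # nonnegative, ~0 at s = 0
```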
Slide 11: Example: bounding the random walk. Symmetric Bernoulli distribution: a random variable $X$ has the symmetric Bernoulli distribution (also called the Rademacher distribution) if $P(X = 1) = P(X = -1) = \frac{1}{2}$. Let $X_1, X_2, \ldots, X_n$ be independent symmetric Bernoulli random variables. Then for any $t \ge 0$, we have $P\left(\sum_{i=1}^n X_i \ge t\right) \le \exp\left(-\frac{t^2}{2n}\right)$.
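A minimal Monte Carlo sketch comparing the empirical upper tail of this random walk with the bound $\exp(-t^2/(2n))$; walk length, trial count, and thresholds are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 100, 200_000
S = rng.choice([-1, 1], size=(trials, n)).sum(axis=1)  # S_n = X_1 + ... + X_n

for t in [10, 20, 30]:
    empirical = np.mean(S >= t)
    bound = np.exp(-t**2 / (2 * n))
    print(f"t={t}: empirical P(S_n >= t)={empirical:.5f}  bound={bound:.5f}")
```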
Slide 12: Chernoff bounding technique: beyond Hoeffding inequality. Bernstein's condition: given a random variable $X$ with $EX = \mu$ and $\operatorname{var}(X) = \sigma^2$, we say that Bernstein's condition with parameter $b$ holds if $|E[(X - \mu)^k]| \le \frac{1}{2} k!\, \sigma^2 b^{k-2}$, $k = 3, 4, \ldots$ Theorem. For any random variable $X$ satisfying the Bernstein condition, we have $E[\exp(\lambda(X - \mu))] \le \exp\left(\frac{\lambda^2 \sigma^2 / 2}{1 - b|\lambda|}\right)$ for all $|\lambda| < \frac{1}{b}$, and moreover we have the concentration inequality $P(|X - \mu| \ge t) \le 2 \exp\left(-\frac{t^2}{2(\sigma^2 + bt)}\right)$ for all $t \ge 0$.
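To see the improvement over Hoeffding when the variance is small, here is a sketch comparing the two tail bounds for a variable with $|X - \mu| \le c$ a.s. and $\sigma^2 \ll c^2$; in that case Bernstein's condition holds with $b = c/3$, since $|E[(X - \mu)^k]| \le \sigma^2 c^{k-2} \le \frac{1}{2} k!\, \sigma^2 (c/3)^{k-2}$ for $k \ge 3$. The values of $c$, $\sigma^2$, and the thresholds are illustrative choices.

```python
import numpy as np

c, sigma2 = 1.0, 0.01     # |X - mu| <= c a.s., var(X) = sigma2 << c^2
b = c / 3                 # valid Bernstein parameter in this bounded case

for t in [0.2, 0.5, 1.0]:
    bernstein = 2 * np.exp(-t**2 / (2 * (sigma2 + b * t)))
    hoeffding = 2 * np.exp(-2 * t**2 / (2 * c) ** 2)  # X in [mu - c, mu + c]
    print(f"t={t}: Bernstein={bernstein:.4f}  Hoeffding={hoeffding:.4f}")
```

At $t = 0.5$ the Bernstein bound is already several times smaller than the Hoeffding bound, because it exploits the small variance.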
Slide 13: Review for STA 6448
Slide 14: Part II: Inequalities via the martingale approach
Slide 15: Discrete-time martingales. Definition. Let $(\Omega, \mathcal{F}, P)$ be a probability space. A sequence $\{X_i, \mathcal{F}_i\}_{i=0}^n$, $n \in \mathbb{N}$, where the $X_i$ are random variables and the $\mathcal{F}_i$ are $\sigma$-algebras, is a martingale if the following conditions are satisfied: 1. The $\mathcal{F}_i$ form a filtration, i.e., $\{\emptyset, \Omega\} = \mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \cdots \subseteq \mathcal{F}_n = \mathcal{F}$. 2. $X_i \in L^1(\Omega, \mathcal{F}_i, P)$ for every $i \in \{0, 1, \ldots, n\}$. 3. For all $i \in \{1, 2, \ldots, n\}$, the equality $E[X_i \mid \mathcal{F}_{i-1}] = X_{i-1}$ holds almost surely (a.s.). A martingale can be generated by the following procedure: given a r.v. $X$ associated with a filtration $\{\mathcal{F}_i\}_{i=0}^n$, let $X_i = E[X \mid \mathcal{F}_i]$, $i \in \{0, 1, \ldots, n\}$. Then the sequence $X_0, X_1, X_2, \ldots, X_n$ forms a martingale.
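A minimal sketch of the $X_i = E[X \mid \mathcal{F}_i]$ construction (a Doob martingale) for $X = \varepsilon_1 + \cdots + \varepsilon_n$ with i.i.d. Rademacher steps and $\mathcal{F}_i = \sigma(\varepsilon_1, \ldots, \varepsilon_i)$, where $E[X \mid \mathcal{F}_i]$ equals the partial sum $S_i$; the martingale property is checked by exact enumeration of all $2^n$ paths, and the walk length $n = 4$ is an arbitrary choice.

```python
import itertools
import numpy as np

n = 4
paths = np.array(list(itertools.product([-1, 1], repeat=n)))  # all 2^n outcomes

for i in range(1, n + 1):
    # Condition on F_{i-1} by grouping paths with the same first i-1 steps;
    # on each group, the average of X_i = S_i must equal X_{i-1} = S_{i-1}.
    for prefix in itertools.product([-1, 1], repeat=i - 1):
        mask = np.all(paths[:, : i - 1] == prefix, axis=1)
        assert np.isclose(paths[mask, :i].sum(axis=1).mean(), sum(prefix))
print("E[X_i | F_{i-1}] = X_{i-1} holds for every i and every history")
```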
Slide 16: Martingale decomposition. Consider a r.v. $X$ associated with a filtration $\{\mathcal{F}_i\}_{i=0}^n$, where $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $\mathcal{F}_n = \mathcal{F}$, so that $E[X \mid \mathcal{F}_0] = EX$ and $E[X \mid \mathcal{F}_n] = X$. Then $X - EX = E[X \mid \mathcal{F}_n] - E[X \mid \mathcal{F}_0] = \sum_{i=1}^n \left(E[X \mid \mathcal{F}_i] - E[X \mid \mathcal{F}_{i-1}]\right) = \sum_{i=1}^n \xi_i$, in which $\xi_i = E[X \mid \mathcal{F}_i] - E[X \mid \mathcal{F}_{i-1}]$. Here, we call $\{\xi_i\}_{i=1}^n$ the martingale difference sequence.
Slide 17: Review: Chernoff bounding technique. Here, if we consider the logarithmic moment generating function again, we have the following equality: $\log E[\exp(\lambda(X - EX))] = \log E\left[\exp\left(\lambda \sum_{i=1}^n \xi_i\right)\right] = \log E\left[\prod_{i=1}^n \exp(\lambda \xi_i)\right]$. Here, an intuitive idea is to bound each $\exp(\lambda \xi_i)$, $i = 1, 2, \ldots, n$.
Slide 18: Azuma inequality I. Theorem. Let $\{X_i, \mathcal{F}_i\}_{i=0}^n$ be a real-valued martingale sequence. Suppose that there exist nonnegative real numbers $d_1, d_2, \ldots, d_n$ such that $|X_i - X_{i-1}| \le d_i$ a.s. for all $i \in \{1, 2, \ldots, n\}$. Then, for every $t > 0$, $P(|X_n - X_0| \ge t) \le 2 \exp\left(-\frac{t^2}{2 \sum_{i=1}^n d_i^2}\right)$. Proof. Here, we just consider $P(X_n - X_0 \ge t)$, $t > 0$. Let $\xi_i = X_i - X_{i-1}$ for $i = 1, 2, \ldots, n$ denote the martingale differences. By the assumptions, we have $|\xi_i| \le d_i$ and $E[\xi_i \mid \mathcal{F}_{i-1}] = 0$ a.s. for every $i \in \{1, 2, \ldots, n\}$.
Slide 19: Azuma inequality II. Proof (continued). Now we use the Chernoff technique: $P(X_n - X_0 \ge t) = P\left(\sum_{i=1}^n \xi_i \ge t\right) \le \exp(-\lambda t)\, E\left[\exp\left(\lambda \sum_{i=1}^n \xi_i\right)\right]$, $\lambda \ge 0$. Furthermore, $E\left[\exp\left(\lambda \sum_{i=1}^n \xi_i\right)\right] = E\left[E\left[\exp\left(\lambda \sum_{i=1}^n \xi_i\right) \,\Big|\, \mathcal{F}_{n-1}\right]\right] = E\left[\exp\left(\lambda \sum_{i=1}^{n-1} \xi_i\right) E[\exp(\lambda \xi_n) \mid \mathcal{F}_{n-1}]\right]$.
Slide 20: Azuma inequality III. Proof (continued). The last equality holds since $\exp(\lambda \sum_{i=1}^{n-1} \xi_i)$ is $\mathcal{F}_{n-1}$-measurable, and we can apply the Hoeffding inequality to $\xi_n$ conditioned on $\mathcal{F}_{n-1}$. Because $E[\xi_n \mid \mathcal{F}_{n-1}] = 0$ and $\xi_n \in [-d_n, d_n]$ a.s., the Hoeffding inequality gives $E[\exp(\lambda \xi_n) \mid \mathcal{F}_{n-1}] \le \exp\left(\frac{\lambda^2 d_n^2}{2}\right)$. Continuing recursively, we can bound the moment generating function by $E\left[\exp\left(\lambda \sum_{i=1}^n \xi_i\right)\right] \le \prod_{i=1}^n \exp\left(\frac{\lambda^2 d_i^2}{2}\right) = \exp\left(\frac{\lambda^2}{2} \sum_{i=1}^n d_i^2\right)$.
Slide 21: Azuma inequality IV. Proof (continued). Plugging the above bound into the Chernoff inequality, we have $P(X_n - X_0 \ge t) \le \exp\left(-\lambda t + \frac{\lambda^2}{2} \sum_{i=1}^n d_i^2\right)$, $t \ge 0$. Minimizing the right-hand side over $\lambda$ (the minimizer is $\lambda = t / \sum_{i=1}^n d_i^2$), we get $P(X_n - X_0 \ge t) \le \exp\left(-\frac{t^2}{2 \sum_{i=1}^n d_i^2}\right)$. Combining this with the same bound for the lower tail, we have $P(|X_n - X_0| \ge t) \le 2 \exp\left(-\frac{t^2}{2 \sum_{i=1}^n d_i^2}\right)$.
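A minimal Monte Carlo check of the Azuma inequality for a martingale with independent, bounded increments $\xi_i = \varepsilon_i d_i$ (Rademacher $\varepsilon_i$), so that $|X_i - X_{i-1}| \le d_i$ with $X_0 = 0$; the bounds $d_i$, trial count, and thresholds are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 50, 200_000
d = rng.uniform(0.5, 1.5, size=n)             # increment bounds d_1, ..., d_n
eps = rng.choice([-1, 1], size=(trials, n))   # Rademacher signs
X_n = (eps * d).sum(axis=1)                   # X_n - X_0

for t in [5.0, 10.0, 15.0]:
    empirical = np.mean(np.abs(X_n) >= t)
    azuma = 2 * np.exp(-t**2 / (2 * np.sum(d**2)))
    print(f"t={t}: empirical={empirical:.5f}  Azuma bound={azuma:.5f}")
```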
Slide 22: McDiarmid inequality. Bounded difference assumption: let $f : \mathbb{R}^n \to \mathbb{R}$ be a function that satisfies the bounded difference assumption $\sup_{x_1, \ldots, x_n, x_i' \in \mathbb{R}} |f(x_1, \ldots, x_i, \ldots, x_n) - f(x_1, \ldots, x_i', \ldots, x_n)| \le d_i$ for every $1 \le i \le n$, where $d_1, \ldots, d_n$ are some nonnegative real constants. Theorem. Suppose that a measurable function $f$ satisfies the bounded difference assumption with parameters $(d_1, d_2, \ldots, d_n)$, and let $X_1, \ldots, X_n$ be independent (not necessarily i.i.d.) random variables in some measurable space. Then $P(|f(X_1, X_2, \ldots, X_n) - E[f(X_1, X_2, \ldots, X_n)]| \ge t) \le 2 \exp\left(-\frac{2t^2}{\sum_{i=1}^n d_i^2}\right)$.
Slide 23: McDiarmid inequality: the outline of the proof. Construct the martingale differences $\xi_i = E[f(X_1, X_2, \ldots, X_n) \mid \mathcal{F}_i] - E[f(X_1, X_2, \ldots, X_n) \mid \mathcal{F}_{i-1}]$, so that $f(X_1, X_2, \ldots, X_n) - E[f(X_1, X_2, \ldots, X_n)] = \sum_{i=1}^n \xi_i$. To bound $\xi_i$, construct r.v.'s $A_i$ and $B_i$, and prove $A_i \le \xi_i \le B_i$ with $B_i - A_i \le d_i$. Then apply the Hoeffding inequality, similarly to the Azuma inequality.
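As a concrete instance of the McDiarmid inequality, take $f(x_1, \ldots, x_n) = \frac{1}{n}\sum_i x_i$ with each coordinate in $[0, 1]$: changing one coordinate moves $f$ by at most $d_i = 1/n$, so $\sum_i d_i^2 = 1/n$ and the bound becomes $2\exp(-2nt^2)$. A minimal sketch with uniform coordinates; sample sizes and thresholds are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 100, 200_000
X = rng.uniform(0, 1, size=(trials, n))
f = X.mean(axis=1)               # bounded differences with d_i = 1/n
sum_d2 = n * (1 / n) ** 2        # sum of d_i^2 = 1/n

for t in [0.05, 0.10, 0.15]:
    empirical = np.mean(np.abs(f - 0.5) >= t)   # E[f] = 0.5 for Uniform(0,1)
    bound = 2 * np.exp(-2 * t**2 / sum_d2)
    print(f"t={t}: empirical={empirical:.5f}  McDiarmid bound={bound:.5f}")
```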
Slide 24: A summary
Slide 25: References. [1] Lecture notes and related materials in STA 6448. [2] Raginsky, Maxim, and Igal Sason. "Concentration of measure inequalities in information theory, communications, and coding." Foundations and Trends in Communications and Information Theory (2013).
Slide 26: Thank you