Chapter 7. Basic Probability Theory
1 Chapter 7. Basic Probability Theory. I-Liang Chern, October 20.
2 What kinds of matrices satisfy RIP? Random matrices with
- i.i.d. Gaussian entries,
- i.i.d. Bernoulli entries ($\pm 1$),
- i.i.d. subgaussian entries,
- random Fourier ensembles,
- random ensembles in bounded orthogonal systems.
In each case, with $m = O(s \ln N)$, they satisfy RIP with very high probability ($\geq 1 - e^{-Cm}$). This is a note from S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, Springer.
3 Outline of this chapter:
- Basic probability theory
- Moments and tails
- Concentration inequalities
4 Basic notions of probability theory:
- Probability space
- Random variables
- Expectation and variance
- Sums of independent random variables
5 Probability space. A probability space is a triple $(\Omega, \mathcal{F}, P)$: $\Omega$ the sample space, $\mathcal{F}$ the set of events, $P$ the probability measure. The sample space is the set of all possible outcomes. The collection of events $\mathcal{F}$ must be a $\sigma$-algebra:
1. $\emptyset \in \mathcal{F}$;
2. if $E \in \mathcal{F}$, then $E^c \in \mathcal{F}$;
3. $\mathcal{F}$ is closed under countable unions, i.e. if $E_i \in \mathcal{F}$, $i = 1, 2, \dots$, then $\bigcup_{i=1}^\infty E_i \in \mathcal{F}$.
The probability measure $P : \mathcal{F} \to [0, 1]$ satisfies:
1. $P(\Omega) = 1$;
2. if the $E_i$ are mutually exclusive (i.e. $E_i \cap E_j = \emptyset$ for $i \neq j$), then $P(\bigcup_{i=1}^\infty E_i) = \sum_{i=1}^\infty P(E_i)$.
6 Examples.
1. Bernoulli trial: a trial with only two outcomes, success or failure. We represent it as $\Omega = \{1, 0\}$. The collection of events is $\mathcal{F} = 2^\Omega = \{\emptyset, \{0\}, \{1\}, \{0, 1\}\}$, with probability $P(\{1\}) = p$, $P(\{0\}) = 1 - p$, $0 \leq p \leq 1$. If we denote the outcome of a Bernoulli trial by $x$ (i.e. $x = 1$ or $0$), then $P(x) = p^x (1 - p)^{1 - x}$.
2. Binomial trials: perform $n$ Bernoulli trials independently. An outcome has the form $(x_1, x_2, \dots, x_n)$, where $x_i = 0$ or $1$ is the outcome of the $i$th trial. There are $2^n$ outcomes, and the sample space is $\Omega = \{(x_1, \dots, x_n) \mid x_i = 0 \text{ or } 1\}$. The collection of events $\mathcal{F} = 2^\Omega$ is the collection of all subsets of $\Omega$, and $P(\{(x_1, \dots, x_n)\}) := p^{\sum x_i} (1 - p)^{n - \sum x_i}$.
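A short sketch (plain Python, with illustrative values $n = 10$, $p = 0.3$ chosen for this example only) checks that the binomial mass function $\binom{n}{k} p^k (1-p)^{n-k}$ sums to 1 and agrees with a Monte Carlo simulation of $n$ independent Bernoulli trials:

```python
import math
import random

def binom_pmf(n, k, p):
    # P(S_n = k) = C(n, k) p^k (1 - p)^(n - k)
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
pmf = [binom_pmf(n, k, p) for k in range(n + 1)]
total_mass = sum(pmf)                      # should be 1

# Monte Carlo: run n Bernoulli(p) trials many times, count runs with k = 3 successes
random.seed(0)
trials = 200_000
hits = sum(1 for _ in range(trials)
           if sum(random.random() < p for _ in range(n)) == 3)
empirical = hits / trials                  # close to binom_pmf(10, 3, 0.3)
```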
7 Properties of a probability measure.
1. $P(E^c) = 1 - P(E)$;
2. if $E \subset F$, then $P(E) \leq P(F)$;
3. if $\{E_n, n \geq 1\}$ is either increasing (i.e. $E_n \subset E_{n+1}$) or decreasing to $E$, then $\lim_{n \to \infty} P(E_n) = P(E)$.
8 Independence and conditional probability. Let $A, B \in \mathcal{F}$ with $P(B) \neq 0$. The conditional probability $P(A \mid B)$ (the probability of $A$ given $B$) is defined by $P(A \mid B) := P(A \cap B)/P(B)$. Two events $A$ and $B$ are called independent if $P(A \cap B) = P(A)\, P(B)$. In this case $P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$.
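A concrete finite example (two fair dice, a setup chosen for this note, not from the slides): take $A$ = "the first die is even" and $B$ = "the sum is 7". Exact rational arithmetic shows $P(A \cap B) = P(A)P(B)$, so the events are independent and $P(A \mid B) = P(A)$:

```python
from fractions import Fraction

outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]  # 36 equally likely pairs

def prob(event):
    # P(E) = |E| / |Omega| under the uniform measure on the 36 outcomes
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

A = lambda w: w[0] % 2 == 0          # first die even
B = lambda w: w[0] + w[1] == 7       # sum is 7

p_a, p_b = prob(A), prob(B)
p_ab = prob(lambda w: A(w) and B(w))
cond = p_ab / p_b                    # P(A | B) = P(A ∩ B) / P(B)
```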
9 Random variables. Outline:
- Discrete random variables
- Continuous random variables
- Expectation and variance
10 Discrete random variables. A discrete random variable is a mapping $x : \Omega \to \{a_1, a_2, \dots\} =: \Omega_x$. The random variable $x$ induces a probability on the discrete set $\Omega_x$ with probability $P_x(\{a_k\}) := P(\{x = a_k\})$ and with $\sigma$-algebra $\mathcal{F}_x = 2^{\Omega_x}$, the collection of all subsets of $\Omega_x$. The function $a_k \mapsto P_x(a_k)$ is called the probability mass function of $x$. Once we have $(\Omega_x, \mathcal{F}_x, P_x)$, we can work with this probability space alone whenever we are concerned only with $x$, and forget the original probability space $(\Omega, \mathcal{F}, P)$.
11 Binomial random variable. Let $x_k$ be the $k$th ($k = 1, \dots, n$) outcome of $n$ independent Bernoulli trials. Clearly $x_k$ is a random variable. Let $S_n = \sum_{i=1}^n x_i$ be the number of successes in the $n$ Bernoulli trials; $S_n$ is also a random variable. The sample space that $S_n$ induces is $\Omega_{S_n} = \{0, 1, \dots, n\}$, with $P_{S_n}(S_n = k) = \binom{n}{k} p^k (1 - p)^{n-k}$. The binomial distribution models uncertainties such as:
- the number of successes in $n$ independent Bernoulli trials;
- the relative increase or decrease of a stock in a day.
12 Poisson random variable. The Poisson random variable $x$ takes values $0, 1, 2, \dots$ with probability $P(x = k) = e^{-\lambda} \lambda^k / k!$, where $\lambda > 0$ is a parameter. The sample space is $\Omega = \{0, 1, 2, \dots\}$, and the probability satisfies $P(\Omega) = \sum_{k=0}^\infty P(x = k) = \sum_{k=0}^\infty e^{-\lambda} \lambda^k / k! = 1$. The Poisson random variable can be used to model uncertainties such as:
- the number of customers visiting a specific counter in a day;
- the number of particles hitting a specific radiation detector in a certain period of time;
- the number of phone calls to a specific phone in a week.
The parameter $\lambda$ differs from case to case; it can be estimated from experiments.
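A quick numerical sanity check of the Poisson formulas (illustrative value $\lambda = 4$; the recursion $P(x = k+1) = P(x = k)\,\lambda/(k+1)$ avoids overflowing factorials):

```python
import math

lam = 4.0                       # illustrative; any lambda > 0 works
pmf = []
p = math.exp(-lam)              # P(x = 0) = e^{-lambda}
for k in range(150):            # terms decay factorially; 150 is plenty
    pmf.append(p)
    p *= lam / (k + 1)          # P(x = k+1) = P(x = k) * lambda / (k + 1)

total_mass = sum(pmf)                           # ≈ 1
mean = sum(k * pk for k, pk in enumerate(pmf))  # ≈ lambda
```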
13 Continuous random variables. A continuous random variable is a (Borel) measurable mapping $x : \Omega \to \mathbb{R}$; that is, $x^{-1}([a, b)) \in \mathcal{F}$ for any $[a, b)$. It induces a probability space $(\Omega_x, \mathcal{F}_x, P_x)$ by $\Omega_x = x(\Omega)$; $\mathcal{F}_x = \{A \subset \mathbb{R} \mid x^{-1}(A) \in \mathcal{F}\}$; $P_x(A) := P(x^{-1}(A))$. In particular, define $F_x(x) := P_x((-\infty, x)) = P(\{x < x\})$, called the (cumulative) distribution function. Its derivative with respect to $x$ is called the probability density function: $p_x(x) = dF_x(x)/dx$. In other words, $P(\{a \leq x < b\}) = F_x(b) - F_x(a) = \int_a^b p_x(x)\, dx$. Thus $x$ can be completely characterized by the density function $p_x$ on $\mathbb{R}$.
14 Gaussian distribution. The density function of the Gaussian distribution is $p(x) := \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2/2\sigma^2}$, $-\infty < x < \infty$. A random variable $x$ with this probability density function is called a Gaussian random variable, denoted $x \sim N(\mu, \sigma)$. The Gaussian distribution is used to model:
- the motion of a large particle (called a Brownian particle) in water;
- the limit of the binomial distribution as the number of trials tends to infinity.
15 Exponential distribution. The density is $p(x) = \lambda e^{-\lambda x}$ for $x \geq 0$ and $p(x) = 0$ for $x < 0$. The exponential distribution is used to model:
- the length of a telephone call;
- the time until the next earthquake.
Laplace distribution. The density function is $p(x) = \frac{\lambda}{2} e^{-\lambda |x|}$. It is used to model certain noise in images, and as a prior in Bayesian regression (LASSO).
16 Remarks. A probability mass function taking discrete values on $\mathbb{R}$ can be viewed as a special case of a probability density function by introducing the delta function, defined by $\int \delta(x - a) f(x)\, dx = f(a)$. Thus a discrete random variable $x$ taking value $a_i$ with probability $p_i$ has the probability density function $p(x) = \sum_i p_i \delta(x - a_i)$, and $\int_a^b p(x)\, dx = \sum_{a < a_i < b} p_i$.
17 Expectation. Given a random variable $x$ with pdf $p(x)$, we define the expectation of $x$ by $E(x) = \int x\, p(x)\, dx$. If $f$ is a continuous function on $\mathbb{R}$, then $f(x)$ is again a random variable. Its expectation, denoted $E(f(x))$, satisfies $E(f(x)) = \int f(x)\, p(x)\, dx$.
18 The $k$th moment of $x$ is defined as $m_k := E(|x|^k)$. The first and second moments have special names:
- mean: $\mu := E(x)$;
- variance: $\mathrm{Var}(x) := E((x - E(x))^2)$.
The variance measures the spread of the values of a random variable.
19 Examples.
1. Bernoulli distribution: mean $\mu = p$, variance $\sigma^2 = p(1 - p)$.
2. Binomial distribution $S_n$: mean $\mu = np$, variance $\sigma^2 = np(1 - p)$.
3. Poisson distribution: mean $\mu = \lambda$, variance $\sigma^2 = \lambda$.
4. Normal distribution $N(\mu, \sigma)$: mean $\mu$, variance $\sigma^2$.
5. Uniform distribution on $[a, b]$: mean $\mu = (a + b)/2$, variance $\sigma^2 = (b - a)^2/12$.
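The uniform-distribution formulas can be spot-checked by simulation (a sketch with arbitrary endpoints $a = 2$, $b = 5$; sample statistics should land near $(a+b)/2 = 3.5$ and $(b-a)^2/12 = 0.75$):

```python
import random

random.seed(1)
a, b = 2.0, 5.0
n = 400_000
xs = [random.uniform(a, b) for _ in range(n)]

sample_mean = sum(xs) / n
sample_var = sum((x - sample_mean)**2 for x in xs) / n

expected_mean = (a + b) / 2        # 3.5
expected_var = (b - a)**2 / 12     # 0.75
```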
20 Joint probability. Let $x$ and $y$ be two random variables on $(\Omega, \mathcal{F}, P)$. The joint probability distribution of $(x, y)$ is the measure on $\mathbb{R}^2$ defined by $\mu(A) := P((x, y) \in A)$ for any Borel set $A$. The derivative of $\mu$ with respect to the Lebesgue measure $dx\, dy$ is called the joint probability density: $P((x, y) \in A) = \int_A p_{(x,y)}(x, y)\, dx\, dy$.
21 Independent random variables. Two random variables $x$ and $y$ are called independent if the events $(a < x < b)$ and $(c < y < d)$ are independent for any $a < b$ and $c < d$. If $x$ and $y$ are independent, then taking $A = (a, b) \times (c, d)$ we can show that the joint probability satisfies
$\int_{(a,b) \times (c,d)} p_{(x,y)}(x, y)\, dx\, dy = P(a < x < b \text{ and } c < y < d) = P(a < x < b)\, P(c < y < d) = \left( \int_a^b p_x(x)\, dx \right)\left( \int_c^d p_y(y)\, dy \right)$.
This yields $p_{(x,y)}(x, y) = p_x(x)\, p_y(y)$. If $x$ and $y$ are independent, then $E[xy] = E[x]E[y]$. The covariance of $x$ and $y$ is defined as $\mathrm{cov}[x, y] := E[(x - E[x])(y - E[y])]$. Two random variables are called uncorrelated if their covariance is $0$.
22 Sum of independent random variables. If $x$ and $y$ are independent, then $F_{x+y}(z) = P(x + y < z) = \iint_{x+y<z} p_x(x)\, p_y(y)\, dx\, dy$. Differentiating in $z$ gives $p_{x+y}(z) = \int p_x(z - y)\, p_y(y)\, dy =: (p_x * p_y)(z)$. Sum of $n$ independent identically distributed (i.i.d.) random variables $\{x_i\}_{i=1}^n$ with mean $\mu$ and variance $\sigma^2$: their average $\bar{x}_n := \frac{1}{n}(x_1 + \dots + x_n)$ has mean $\mu$ and variance $\sigma^2/n$:
$E[(\bar{x}_n - \mu)^2] = E\left[ \left( \frac{1}{n}\sum_{i=1}^n x_i - \mu \right)^2 \right] = E\left[ \frac{1}{n^2} \left( \sum_{i=1}^n (x_i - \mu) \right)^2 \right] = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n E[(x_i - \mu)(x_j - \mu)] = \frac{1}{n^2} \sum_{i=1}^n E[(x_i - \mu)^2] = \frac{\sigma^2}{n}$,
where the cross terms vanish by independence.
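The $\sigma^2/n$ variance of the average can be seen empirically (a sketch using Bernoulli(1/2) variables, so $\mu = 1/2$, $\sigma^2 = 1/4$, and the average of $n = 25$ of them should have variance $0.01$):

```python
import random

random.seed(2)
n, reps = 25, 20_000
mu, sigma2 = 0.5, 0.25            # Bernoulli(1/2): mean 1/2, variance 1/4

sq_devs = []
for _ in range(reps):
    xbar = sum(random.random() < 0.5 for _ in range(n)) / n
    sq_devs.append((xbar - mu) ** 2)

emp_var_of_mean = sum(sq_devs) / reps   # ≈ sigma2 / n = 0.01
```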
23 Limits of sums of random variables.
- Moments and tails
- Gaussian, subgaussian, and subexponential distributions
- Law of large numbers, central limit theorem
- Concentration inequalities
24 Tails and moments: Markov inequality.
Theorem (Markov's inequality). Let $x$ be a random variable. Then $P(|x| \geq t) \leq E[|x|]/t$ for all $t > 0$.
Proof. Note that $P(|x| \geq t) = E[I_{\{|x| \geq t\}}]$, where $I_A(\omega) = 1$ if $\omega \in A$ and $0$ otherwise is called the indicator function supported on $A$, which satisfies $I_{\{|x| \geq t\}} \leq |x|/t$. Thus $P(|x| \geq t) = E[I_{\{|x| \geq t\}}] \leq E[|x|]/t$.
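A quick empirical check of Markov's inequality (a sketch with Exp(1) samples, so $E[|x|] = 1$; the empirical tail should sit below the bound $E[|x|]/t$ at every threshold tried):

```python
import random

random.seed(3)
xs = [random.expovariate(1.0) for _ in range(100_000)]  # Exp(1) samples, E|x| = 1
mean_abs = sum(xs) / len(xs)

checks = []
for t in (0.5, 1.0, 2.0, 4.0):
    tail = sum(x >= t for x in xs) / len(xs)   # empirical P(|x| >= t)
    bound = mean_abs / t                       # Markov bound E|x| / t
    checks.append((tail, bound))
```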
25 Remarks.
- For $p > 0$, $P(|x| \geq t) = P(|x|^p \geq t^p) \leq E[|x|^p]/t^p$.
- For $p = 2$, applying Markov's inequality to $x - \mu$ gives the Chebyshev inequality: $P(|x - \mu|^2 \geq t^2) \leq \sigma^2/t^2$.
- For $\theta > 0$, $P(x \geq t) = P(\exp(\theta x) \geq \exp(\theta t)) \leq \exp(-\theta t)\, E[\exp(\theta x)]$.
26 Law of large numbers.
Theorem (Law of large numbers). Let $x_1, \dots, x_n$ be i.i.d. with mean $\mu$ and variance $\sigma^2$. Then the sample average $\bar{x}_n := \frac{1}{n}(x_1 + \dots + x_n)$ converges in probability to the expected value: $\bar{x}_n - \mu \xrightarrow{P} 0$ as $n \to \infty$. That is, for any $\epsilon > 0$, $\lim_{n \to \infty} P(|\bar{x}_n - \mu| \geq \epsilon) = 0$.
27 Proof.
1. Using the i.i.d. property of the $x_j$, the mean and variance of $\bar{x}_n$ are $E[\bar{x}_n] = \mu$ and
$\sigma^2(\bar{x}_n) = E[(\bar{x}_n - \mu)^2] = E\left[ \frac{1}{n^2} \left( \sum_{j=1}^n (x_j - \mu) \right)^2 \right] = \frac{1}{n^2} \sum_{j,k=1}^n E[(x_j - \mu)(x_k - \mu)] = \frac{1}{n^2} \sum_{j=1}^n E[(x_j - \mu)^2] = \frac{\sigma^2}{n}$.
2. Apply Chebyshev's inequality to $\bar{x}_n$:
$P(|\bar{x}_n - \mu| \geq \epsilon) = P(|\bar{x}_n - \mu|^2 \geq \epsilon^2) \leq \frac{\sigma(\bar{x}_n)^2}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2} \to 0$ as $n \to \infty$.
Remarks. The condition on the variance can be removed, but the proof here uses Chebyshev's inequality, which assumes finite variance. This argument gives no convergence rate. Concentration inequalities provide rate estimates, which require tail control.
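The proof can be watched numerically (a sketch with uniform(0, 1) samples, so $\mu = 1/2$ and $\sigma^2 = 1/12$): the deviation probability $P(|\bar{x}_n - \mu| \geq \epsilon)$ shrinks as $n$ grows and stays under the Chebyshev bound $\sigma^2/(n\epsilon^2)$:

```python
import random

random.seed(4)
mu, sigma2, eps = 0.5, 1.0 / 12, 0.05   # uniform(0,1): mean 1/2, variance 1/12

def deviation_prob(n, reps=5_000):
    # empirical estimate of P(|xbar_n - mu| >= eps)
    count = 0
    for _ in range(reps):
        xbar = sum(random.random() for _ in range(n)) / n
        count += abs(xbar - mu) >= eps
    return count / reps

p10, p100, p1000 = deviation_prob(10), deviation_prob(100), deviation_prob(1000)
cheb100 = sigma2 / (100 * eps**2)       # Chebyshev bound at n = 100
```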
28 Tail probability can be controlled by moments.
Lemma (Markov). For $p > 0$, $P(|x| \geq t) = P(|x|^p \geq t^p) \leq E[|x|^p]/t^p$.
Proposition. If $x$ is a random variable satisfying $E[|x|^p] \leq \alpha^p \beta\, p^{p/2}$ for all $p \geq 2$, then $P(|x| \geq e^{1/2} \alpha u) \leq \beta e^{-u^2/2}$ for all $u \geq \sqrt{2}$.
Proof.
1. Use the Markov inequality: $P(|x| \geq e^{1/2} \alpha u) \leq \frac{E[|x|^p]}{(e^{1/2} \alpha u)^p} \leq \beta \left( \frac{p^{1/2}}{e^{1/2} u} \right)^p$.
2. Choosing $p = u^2$, we get the estimate.
29 Moments can be controlled by tail probability.
Proposition. The moments of a random variable $x$ can be expressed as $E[|x|^p] = p \int_0^\infty P(|x| \geq t)\, t^{p-1}\, dt$, $p > 0$.
Proof. Use the Fubini theorem:
$E[|x|^p] = \int_\Omega |x|^p\, dP = \int_\Omega \int_0^{|x|^p} ds\, dP = \int_\Omega \int_0^\infty I_{\{s \leq |x|^p\}}\, ds\, dP = \int_0^\infty \int_\Omega I_{\{s \leq |x|^p\}}\, dP\, ds = \int_0^\infty P(|x|^p \geq s)\, ds = p \int_0^\infty P(|x|^p \geq t^p)\, t^{p-1}\, dt = p \int_0^\infty P(|x| \geq t)\, t^{p-1}\, dt$.
Here $I_{\{s \leq |x|^p\}}$ is a random variable which is $1$ when $s \leq |x|^p$ and $0$ otherwise, and the last line substitutes $s = t^p$.
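The tail-integral identity can be verified numerically for a distribution with a known tail (a sketch using Exp(1), where $P(X \geq t) = e^{-t}$ and $E[X^p] = \Gamma(p+1)$; the midpoint rule approximates the integral):

```python
import math

# Exp(1): P(X >= t) = exp(-t) and E[X^p] = Gamma(p + 1) = p! for integer p
def moment_via_tail(p, T=60.0, steps=300_000):
    # midpoint-rule approximation of p * ∫_0^T P(X >= t) t^(p-1) dt
    h = T / steps
    s = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        s += math.exp(-t) * t ** (p - 1)
    return p * s * h

results = {p: (moment_via_tail(p), math.gamma(p + 1)) for p in (1, 2, 3)}
```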
30 Moments can be controlled by tail probability.
Proposition. Suppose $x$ is a random variable satisfying $P(|x| \geq e^{1/2} \alpha u) \leq \beta e^{-u^2/2}$ for all $u > 0$. Then for all $p > 0$, $E[|x|^p] \leq \beta \alpha^p (2e)^{p/2}\, \Gamma\!\left(\frac{p}{2} + 1\right)$.
Proposition. If $P(|x| \geq t) \leq \beta e^{-\kappa t^2}$ for all $t > 0$, then $E[|x|^n] \leq \frac{n\beta}{2}\, \kappa^{-n/2}\, \Gamma\!\left(\frac{n}{2}\right)$.
31 Subgaussian and subexponential distributions.
Definition. A random variable $x$ is called subgaussian if there exist constants $\beta, \kappa > 0$ such that $P(|x| \geq t) \leq \beta e^{-\kappa t^2}$ for all $t > 0$. It is called subexponential if there exist constants $\beta, \kappa > 0$ such that $P(|x| \geq t) \leq \beta e^{-\kappa t}$ for all $t > 0$.
Notice that $x$ is subgaussian if and only if $x^2$ is subexponential.
32 Subgaussian.
Proposition. A random variable $x$ is subgaussian if and only if there exist $c, C > 0$ such that $E[\exp(c x^2)] \leq C$.
Proof.
($\Rightarrow$) 1. Estimating moments from the tail, we get $E[x^{2n}] \leq \beta \kappa^{-n} n!$.
2. Expand the exponential function: for $c < \kappa$,
$E[\exp(c x^2)] = 1 + \sum_{n=1}^\infty \frac{c^n E[x^{2n}]}{n!} \leq 1 + \beta \sum_{n=1}^\infty \frac{c^n \kappa^{-n} n!}{n!} \leq C$.
($\Leftarrow$) From the Markov inequality, $P(|x| \geq t) = P(\exp(c x^2) \geq e^{c t^2}) \leq E[\exp(c x^2)]\, e^{-c t^2} \leq C e^{-c t^2}$.
33 Subgaussian with mean 0.
Proposition. A random variable $x$ is subgaussian with $E[x] = 0$ if and only if there exists $c > 0$ such that $E[\exp(\theta x)] \leq \exp(c\theta^2)$ for all $\theta \in \mathbb{R}$.
($\Leftarrow$) 1. Apply the Markov inequality: $P(x \geq t) = P(\exp(\theta x) \geq \exp(\theta t)) \leq E[\exp(\theta x)]\, e^{-\theta t} \leq e^{c\theta^2 - \theta t}$. The optimal choice $\theta = t/(2c)$ yields $P(x \geq t) \leq e^{-t^2/(4c)}$.
2. Repeating this for $-x$, we also get $P(-x \geq t) \leq e^{-t^2/(4c)}$. Thus $P(|x| \geq t) = P(x \geq t) + P(-x \geq t) \leq 2 e^{-t^2/(4c)}$.
3. To show $E[x] = 0$: since $1 + \theta x \leq e^{\theta x}$ pointwise, taking expectations gives $1 + \theta E[x] \leq E[\exp(\theta x)] \leq e^{c\theta^2}$, so $\theta E[x] \leq e^{c\theta^2} - 1$. Dividing by $\theta$ and letting $\theta \to 0^+$ gives $E[x] \leq 0$; letting $\theta \to 0^-$ gives $E[x] \geq 0$. Hence $E[x] = 0$.
34 ($\Rightarrow$)
1. It is enough to prove the statement for $\theta \geq 0$; for $\theta < 0$, replace $x$ by $-x$.
2. For $0 \leq \theta \leq \theta_0$ with $\theta_0 := \sqrt{\kappa}/(2Ce)$ (which satisfies $Ce\theta_0 \kappa^{-1/2} = 1/2 < 1$), expand the exponential, use $E[x] = 0$, the moment estimate via the tail, and Stirling's formula:
$E[\exp(\theta x)] = \sum_{n=0}^\infty \frac{\theta^n E[x^n]}{n!} \leq 1 + \beta \sum_{n=2}^\infty \frac{\theta^n C^n \kappa^{-n/2} n^{n/2}}{n!} \leq 1 + \theta^2 \frac{\beta (Ce)^2}{\sqrt{2\pi}\,\kappa} \sum_{n=0}^\infty \left( \frac{Ce\theta_0}{\sqrt{\kappa}} \right)^n = 1 + \theta^2 \frac{\beta (Ce)^2}{\sqrt{2\pi}\,\kappa} \cdot \frac{1}{1 - Ce\theta_0/\sqrt{\kappa}} =: 1 + c_1 \theta^2 \leq \exp(c_1 \theta^2)$.
3. For $\theta > \theta_0$, we aim at proving $E[\exp(\theta x - c_2 \theta^2)] \leq C$, where $c_2 = 1/(4c)$ and $c$ is the constant with $E[\exp(c x^2)] \leq C$. Completing the square, $\theta x - c_2\theta^2 = -q^2 + \frac{x^2}{4c_2}$ for a suitable $q$, so
$E[\exp(\theta x - c_2\theta^2)] = E\!\left[\exp\!\left(-q^2 + \frac{x^2}{4c_2}\right)\right] \leq E\!\left[\exp\!\left(\frac{x^2}{4c_2}\right)\right] = E[\exp(c x^2)] \leq C$.
4. Define $\rho := \ln(C)/\theta_0^2$. Then for $\theta > \theta_0$, $E[\exp(\theta x)] \leq C e^{c_2\theta^2} = e^{\rho\theta_0^2} e^{c_2\theta^2} \leq e^{\rho\theta^2} e^{c_2\theta^2} = e^{(\rho + c_2)\theta^2}$. Setting $c_3 = \max(c_1, c_2 + \rho)$ gives $E[\exp(\theta x)] \leq \exp(c_3\theta^2)$ for all $\theta \geq 0$. This completes the proof.
35 Bounded random variable.
Corollary. If a random variable $x$ has mean $0$ and $|x| \leq B$ almost surely, then $E[\exp(\theta x)] \leq \exp(B^2\theta^2/2)$.
Proof.
1. We write $x = t(-B) + (1 - t)B$, where $t = (B - x)/(2B)$ is a random variable with $0 \leq t \leq 1$ and $E[t] = 1/2$.
2. By convexity of the exponential (Jensen's inequality applied pointwise), $e^{\theta x} \leq t e^{-B\theta} + (1 - t) e^{B\theta}$. Taking expectations,
$E[\exp(\theta x)] \leq \frac{1}{2}\left( e^{-B\theta} + e^{B\theta} \right) = \sum_{n=0}^\infty \frac{(\theta B)^{2n}}{(2n)!} \leq \sum_{n=0}^\infty \frac{(\theta B)^{2n}}{2^n n!} = \exp(B^2\theta^2/2)$.
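The corollary can be checked numerically for the Rademacher variable ($x = \pm 1$, so mean 0 and $B = 1$), whose moment generating function is $\cosh\theta$; the bound says $\cosh\theta \leq e^{\theta^2/2}$:

```python
import math

B = 1.0  # Rademacher: x = ±1 with probability 1/2, so mean 0 and |x| <= B
pairs = []
for k in range(1, 31):
    theta = 0.1 * k
    # E[exp(theta x)] = (e^theta + e^{-theta})/2 = cosh(theta)
    mgf = 0.5 * (math.exp(B * theta) + math.exp(-B * theta))
    bound = math.exp(B**2 * theta**2 / 2)
    pairs.append((mgf, bound))
```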
36 Exponentially decaying tails and the cumulant function $\ln E[\exp(\theta x)]$. When the tail decays fast, the corresponding moment information can be grouped into $E[\exp(\theta x)]$. The function $C_x(\theta) := \ln E[\exp(\theta x)]$ is called the cumulant function of $x$.
37 Example of $C_x$. Let $g \sim N(0, 1)$. Then $E[\exp(a g^2 + \theta g)] = \frac{1}{\sqrt{1 - 2a}} \exp\!\left( \frac{\theta^2}{2(1 - 2a)} \right)$ for $a < 1/2$, $\theta \in \mathbb{R}$. This follows from $E[\exp(a g^2 + \theta g)] = \frac{1}{\sqrt{2\pi}} \int \exp(a x^2 + \theta x)\, \exp(-x^2/2)\, dx$. In particular, $E[\exp(\theta g)] = \exp(\theta^2/2)$. On the other hand, $E[\exp(\theta g)] = \sum_{j=0}^\infty \frac{\theta^j E[g^j]}{j!}$. Comparing the two expansions, we obtain $E[g^{2n+1}] = 0$ and $E[g^{2n}] = \frac{(2n)!}{2^n n!}$, $n = 0, 1, \dots$, and $C_g(\theta) := \ln E[\exp(\theta g)] = \theta^2/2$.
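Both identities can be checked by deterministic numerical integration against the standard normal density (a sketch; the midpoint rule on $[-10, 10]$ is accurate far beyond the tolerances used, and $\theta = 0.7$ is an arbitrary test value):

```python
import math

def gauss_expect(f, L=10.0, steps=200_000):
    # E[f(g)] for g ~ N(0,1), midpoint rule on [-L, L]
    h = 2 * L / steps
    s = 0.0
    for i in range(steps):
        x = -L + (i + 0.5) * h
        s += f(x) * math.exp(-x * x / 2)
    return s * h / math.sqrt(2 * math.pi)

theta = 0.7
mgf = gauss_expect(lambda x: math.exp(theta * x))  # ≈ exp(theta^2 / 2)
m4 = gauss_expect(lambda x: x**4)                  # ≈ (2n)!/(2^n n!) = 3 for n = 2
```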
38 Examples of $C_x$. Let the random variable $x$ have the pdf $\chi_{[-B,B]}/(2B)$ (uniform on $[-B, B]$). Then $E[\exp(\theta x)] = \frac{1}{2B} \int_{-B}^B e^{\theta x}\, dx = \frac{e^{B\theta} - e^{-B\theta}}{2B\theta} = \sum_{n=0}^\infty \frac{(B\theta)^{2n}}{(2n+1)!}$. On the other hand, $E[\exp(\theta x)] = \sum_{k=0}^\infty \frac{\theta^k E[x^k]}{k!}$. Comparing these two expansions, we obtain $E[x^{2n+1}] = 0$ and $E[x^{2n}] = \frac{B^{2n}}{2n+1}$, $n = 0, 1, \dots$, and $C_x(\theta) = \ln\!\left( e^{B\theta} - e^{-B\theta} \right) - \ln\theta + C$.
39 Examples of $C_x$. The pdf of the Rademacher distribution is $p_\epsilon(x) = (\delta(x+1) + \delta(x-1))/2$. Then $E[\exp(\theta\epsilon)] = \frac{1}{2} \int e^{\theta x} (\delta(x+1) + \delta(x-1))\, dx = \frac{e^\theta + e^{-\theta}}{2} = \sum_{n=0}^\infty \frac{\theta^{2n}}{(2n)!}$. Thus we get $E[\epsilon^{2n+1}] = 0$ and $E[\epsilon^{2n}] = 1$, $n = 1, 2, \dots$, and $C_\epsilon(\theta) = \ln(e^\theta + e^{-\theta}) - \ln 2$.
40 Concentration inequalities. Motivation: in the law of large numbers, if the distribution function satisfies a certain growth condition, e.g. exponentially fast tail decay or even bounded support, then we have a sharp estimate of how fast $\bar{x}_n$ tends to $\mu$. This is the large deviation theory below. The rate is controlled by the cumulant function $C_x(\theta) := \ln E[\exp(\theta x)]$.
Theorem (Cramér's theorem). Let $x_1, \dots, x_n$ be independent random variables with cumulant functions $C_{x_l}$. Then
$P\!\left( \frac{1}{n} \sum_{l=1}^n x_l \geq x \right) \leq \exp(-n I(x))$, where $I(x) := \sup_{\theta > 0} \left[ \theta x - \frac{1}{n} \sum_{l=1}^n C_{x_l}(\theta) \right]$.
41 Proof.
1. By Markov's inequality and the independence of the $x_l$, for any $\theta > 0$,
$P(\bar{x}_n \geq x) = P(\exp(\theta\bar{x}_n) \geq \exp(\theta x)) \leq e^{-\theta x} E[\exp(\theta\bar{x}_n)] = e^{-\theta x} E\!\left[ \exp\!\left( \sum_{l=1}^n \frac{\theta x_l}{n} \right) \right] = e^{-\theta x} \prod_{l=1}^n E\!\left[ \exp\!\left( \frac{\theta x_l}{n} \right) \right] = \exp\!\left( -\theta x + \sum_{l=1}^n C_{x_l}(\theta/n) \right)$.
2. Replacing $\theta$ by $n\theta$ gives $P(\bar{x}_n \geq x) \leq \exp\!\left( -n\left[ \theta x - \frac{1}{n} \sum_{l=1}^n C_{x_l}(\theta) \right] \right)$ for every $\theta > 0$; taking the supremum in the exponent yields $P(\bar{x}_n \geq x) \leq \exp(-n I(x))$.
Remark. If each $x_l$ is subgaussian with mean $0$, then $C_{x_l}(\theta) \leq c\theta^2$. This leads to $I(x) \geq x^2/(4c)$.
42 Hoeffding inequality.
Theorem (Hoeffding inequality). Let $x_1, \dots, x_n$ be independent random variables with mean $0$ and $x_l \in [-B, B]$ almost surely for $l = 1, \dots, n$. Then $P(\bar{x}_n > x) \leq \exp(-n I(x))$, where $I(x) := \frac{x^2}{2B^2}$.
43 Proof.
1. Let $S_n = \sum_{l=1}^n x_l$. Use Markov's inequality and the corollary for bounded random variables:
$P(S_n > t) \leq e^{-t\theta} E[\exp(\theta S_n)] = e^{-t\theta} E\!\left[ \prod_{l=1}^n \exp(\theta x_l) \right] = e^{-t\theta} \prod_{l=1}^n E[\exp(\theta x_l)] \leq e^{-t\theta} \prod_{l=1}^n \exp\!\left( \frac{B^2}{2}\theta^2 \right) = \exp\!\left( -t\theta + \frac{nB^2}{2}\theta^2 \right)$.
2. Write $t = nx$ and take the convex conjugate: $I(x) := \sup_\theta \left( x\theta - \frac{B^2\theta^2}{2} \right) = \frac{x^2}{2B^2}$, attained at $\theta = x/B^2$. We get $P(S_n > nx) \leq \exp\!\left( -n\left( x\theta - \frac{B^2}{2}\theta^2 \right) \right) = \exp(-n I(x))$.
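The bound can be compared against simulation (a sketch with Rademacher variables, $n = 100$ and $B = 1$; the empirical tail of the sample mean should sit below $\exp(-n x^2/(2B^2))$ at each threshold):

```python
import math
import random

random.seed(5)
n, B, reps = 100, 1.0, 20_000

# sample means of n Rademacher variables (mean 0, bounded by B = 1)
xbars = [sum(random.choice((-1.0, 1.0)) for _ in range(n)) / n
         for _ in range(reps)]

results = []
for x in (0.1, 0.2, 0.3):
    emp = sum(xb > x for xb in xbars) / reps
    bound = math.exp(-n * x * x / (2 * B * B))   # exp(-n I(x)), I(x) = x^2/(2B^2)
    results.append((emp, bound))
```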
44 Bernstein's inequality.
Theorem (Bernstein's inequality). Let $x_1, \dots, x_n$ be independent random variables with mean $0$ and variances $\sigma_l^2$, and $x_l \in [-B, B]$ almost surely for $l = 1, \dots, n$. Then
$P\!\left( \sum_{l=1}^n x_l > t \right) \leq \exp\!\left( -\frac{t^2/2}{\sigma^2 + Bt/3} \right)$ and $P\!\left( \left| \sum_{l=1}^n x_l \right| > t \right) \leq 2\exp\!\left( -\frac{t^2/2}{\sigma^2 + Bt/3} \right)$,
where $\sigma^2 = \sum_{l=1}^n \sigma_l^2$.
45 Remark. In Hoeffding's inequality we do not use the variance information; the concentration estimate is $P(\bar{x}_n > \epsilon) \leq \exp\!\left( -n \frac{\epsilon^2}{2B^2} \right)$. In Bernstein's inequality we use the variance information. Let $\bar{\sigma}^2 := \frac{1}{n} \sum_{l=1}^n \sigma_l^2$. Then Bernstein's inequality reads $P(\bar{x}_n > \epsilon) \leq \exp\!\left( -n \frac{\epsilon^2/2}{\bar{\sigma}^2 + B\epsilon/3} \right)$. Comparing the denominators, Bernstein's inequality is sharper, provided $\bar{\sigma} < B$.
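A tiny deterministic comparison illustrates the remark (illustrative numbers, not from the slides: $n = 1000$, $B = 1$, average variance $\bar{\sigma}^2 = 0.1 \ll B^2$, deviation $\epsilon = 0.05$):

```python
import math

n, B = 1000, 1.0
sigma_bar2 = 0.1        # average per-variable variance, well below B^2 = 1
eps = 0.05

hoeffding = math.exp(-n * eps**2 / (2 * B**2))
bernstein = math.exp(-(n * eps**2 / 2) / (sigma_bar2 + B * eps / 3))
```

With these numbers the Bernstein bound is several orders of magnitude smaller than the Hoeffding bound, as the remark predicts for $\bar{\sigma} < B$.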
46 Proof.
1. For a random variable $x_l$ with mean $0$, variance $\sigma_l^2$, and $|x_l| \leq B$ almost surely, its moment generating function satisfies
$E[\exp(\theta x_l)] = E\!\left[ \sum_{k=0}^\infty \frac{\theta^k x_l^k}{k!} \right] \leq 1 + \frac{\theta^2\sigma_l^2}{2} F_l(\theta) \leq \exp(\theta^2\sigma_l^2 F_l(\theta)/2)$,
where, using $E[|x_l|^k] \leq \sigma_l^2 B^{k-2}$ and $2/k! \leq (1/3)^{k-2}$ for $k \geq 2$,
$F_l(\theta) = \sum_{k=2}^\infty \frac{2(\theta B)^{k-2}}{k!} \leq \sum_{k=2}^\infty \left( \frac{\theta B}{3} \right)^{k-2} = \frac{1}{1 - B\theta/3} =: \frac{1}{1 - R\theta}$, with $R := B/3$. We require $0 \leq \theta < 1/R$.
2. Let $S_n = \sum_{l=1}^n x_l$. Using the Cramér argument,
$P(S_n > t) \leq e^{-t\theta} \prod_{l=1}^n E[\exp(\theta x_l)] \leq e^{-\theta t} \exp\!\left( \sum_{l=1}^n \theta^2\sigma_l^2 F_l(\theta)/2 \right) \leq \exp\!\left( -\theta t + \frac{\sigma^2\theta^2}{2(1 - R\theta)} \right)$,
where $\sigma^2 = \sum_{l=1}^n \sigma_l^2$.
47 Proof (continued).
3. Choose $\theta = t/(\sigma^2 + Rt)$, which satisfies $\theta < 1/R$. Since $1 - R\theta = \sigma^2/(\sigma^2 + Rt)$, we get
$P(S_n > t) \leq \exp\!\left( -\frac{t^2}{\sigma^2 + Rt} + \frac{t^2}{2(\sigma^2 + Rt)} \right) = \exp\!\left( -\frac{t^2}{2(\sigma^2 + Rt)} \right). \quad (0.1)$
With $R = B/3$, this is the claimed bound.
4. Replacing $x_l$ by $-x_l$ yields the same estimate; we then get the estimate for $P(|S_n| > t)$.
48 Bernstein inequality for subexponential random variables. We can extend Bernstein's inequality to random variables that are unbounded but whose tails decay exponentially fast, i.e. subexponential random variables.
Corollary. Let $x_1, \dots, x_n$ be independent mean-$0$ subexponential random variables, i.e. $P(|x_l| \geq t) \leq \beta e^{-\kappa t}$ for some constants $\beta, \kappa > 0$ and all $t > 0$. Then
$P\!\left( \left| \sum_{l=1}^n x_l \right| \geq t \right) \leq 2\exp\!\left( -\frac{(\kappa t)^2/2}{2\beta n + \kappa t} \right)$.
49 Proof.
1. For a subexponential random variable $x_l$ and $k \geq 2$,
$E[|x_l|^k] = k \int_0^\infty P(|x_l| \geq t)\, t^{k-1}\, dt \leq \beta k \int_0^\infty e^{-\kappa t}\, t^{k-1}\, dt = \beta k \kappa^{-k} \int_0^\infty e^{-u}\, u^{k-1}\, du = \beta \kappa^{-k} k!$.
2. Using this estimate and $E[x_l] = 0$, for $0 \leq \theta < \kappa$,
$E[\exp(\theta x_l)] = E\!\left[ \sum_{k=0}^\infty \frac{\theta^k x_l^k}{k!} \right] \leq 1 + \sum_{k=2}^\infty \frac{\theta^k \beta \kappa^{-k} k!}{k!} = 1 + \beta \frac{\theta^2\kappa^{-2}}{1 - \theta\kappa^{-1}} \leq \exp\!\left( \beta \frac{\theta^2\kappa^{-2}}{1 - \theta\kappa^{-1}} \right)$.
3. Using the Cramér argument, $P(S_n \geq t) \leq \exp\!\left( -\theta t + n\beta \frac{\theta^2\kappa^{-2}}{1 - \kappa^{-1}\theta} \right)$. Comparing this formula with (0.1), with $R = 1/\kappa$ and $\sigma^2 = 2n\beta\kappa^{-2}$, we get
$P(S_n \geq t) \leq \exp\!\left( -\frac{t^2}{2(\sigma^2 + Rt)} \right) = \exp\!\left( -\frac{t^2}{2(2n\beta\kappa^{-2} + \kappa^{-1}t)} \right) = \exp\!\left( -\frac{(\kappa t)^2/2}{2n\beta + \kappa t} \right)$.
Replacing $x_l$ by $-x_l$ gives the two-sided bound.
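Step 1's moment bound can be checked on the exponential distribution itself, where the tail is exactly $e^{-\kappa t}$ (so $\beta = 1$) and $E[X^k] = k!/\kappa^k$, making the bound an equality (a sketch with the arbitrary value $\kappa = 2$):

```python
import math

kappa, beta = 2.0, 1.0   # X ~ Exp(kappa): P(X >= t) = exp(-kappa t), beta = 1
rows = []
for k in range(1, 8):
    moment = math.gamma(k + 1) / kappa**k          # E[X^k] = k! / kappa^k
    bound = beta * math.factorial(k) / kappa**k    # beta * kappa^{-k} * k!
    rows.append((moment, bound))
```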
More informationLecture 11. Probability Theory: an Overveiw
Math 408 - Mathematical Statistics Lecture 11. Probability Theory: an Overveiw February 11, 2013 Konstantin Zuev (USC) Math 408, Lecture 11 February 11, 2013 1 / 24 The starting point in developing the
More informationPerhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.
Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage
More informationLecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable
Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed
More informationRandom Variables. Saravanan Vijayakumaran Department of Electrical Engineering Indian Institute of Technology Bombay
1 / 13 Random Variables Saravanan Vijayakumaran sarva@ee.iitb.ac.in Department of Electrical Engineering Indian Institute of Technology Bombay August 8, 2013 2 / 13 Random Variable Definition A real-valued
More informationSTAT 200C: High-dimensional Statistics
STAT 200C: High-dimensional Statistics Arash A. Amini April 27, 2018 1 / 80 Classical case: n d. Asymptotic assumption: d is fixed and n. Basic tools: LLN and CLT. High-dimensional setting: n d, e.g. n/d
More informationProbability Models. 4. What is the definition of the expectation of a discrete random variable?
1 Probability Models The list of questions below is provided in order to help you to prepare for the test and exam. It reflects only the theoretical part of the course. You should expect the questions
More information1 Review of Probability
1 Review of Probability Random variables are denoted by X, Y, Z, etc. The cumulative distribution function (c.d.f.) of a random variable X is denoted by F (x) = P (X x), < x
More information2. Suppose (X, Y ) is a pair of random variables uniformly distributed over the triangle with vertices (0, 0), (2, 0), (2, 1).
Name M362K Final Exam Instructions: Show all of your work. You do not have to simplify your answers. No calculators allowed. There is a table of formulae on the last page. 1. Suppose X 1,..., X 1 are independent
More informationLectures on Statistics. William G. Faris
Lectures on Statistics William G. Faris December 1, 2003 ii Contents 1 Expectation 1 1.1 Random variables and expectation................. 1 1.2 The sample mean........................... 3 1.3 The sample
More informationQualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf
Part 1: Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section
More informationRandom variables. DS GA 1002 Probability and Statistics for Data Science.
Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities
More informationEcon 508B: Lecture 5
Econ 508B: Lecture 5 Expectation, MGF and CGF Hongyi Liu Washington University in St. Louis July 31, 2017 Hongyi Liu (Washington University in St. Louis) Math Camp 2017 Stats July 31, 2017 1 / 23 Outline
More informationEE514A Information Theory I Fall 2013
EE514A Information Theory I Fall 2013 K. Mohan, Prof. J. Bilmes University of Washington, Seattle Department of Electrical Engineering Fall Quarter, 2013 http://j.ee.washington.edu/~bilmes/classes/ee514a_fall_2013/
More information2 n k In particular, using Stirling formula, we can calculate the asymptotic of obtaining heads exactly half of the time:
Chapter 1 Random Variables 1.1 Elementary Examples We will start with elementary and intuitive examples of probability. The most well-known example is that of a fair coin: if flipped, the probability of
More informationGaussian vectors and central limit theorem
Gaussian vectors and central limit theorem Samy Tindel Purdue University Probability Theory 2 - MA 539 Samy T. Gaussian vectors & CLT Probability Theory 1 / 86 Outline 1 Real Gaussian random variables
More informationEEL 5544 Noise in Linear Systems Lecture 30. X (s) = E [ e sx] f X (x)e sx dx. Moments can be found from the Laplace transform as
L30-1 EEL 5544 Noise in Linear Systems Lecture 30 OTHER TRANSFORMS For a continuous, nonnegative RV X, the Laplace transform of X is X (s) = E [ e sx] = 0 f X (x)e sx dx. For a nonnegative RV, the Laplace
More informationSummary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016
8. For any two events E and F, P (E) = P (E F ) + P (E F c ). Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 Sample space. A sample space consists of a underlying
More informationLecture 1: August 28
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random
More information18.175: Lecture 3 Integration
18.175: Lecture 3 Scott Sheffield MIT Outline Outline Recall definitions Probability space is triple (Ω, F, P) where Ω is sample space, F is set of events (the σ-algebra) and P : F [0, 1] is the probability
More informationQualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf
Part : Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section
More informationMAS113 Introduction to Probability and Statistics
MAS113 Introduction to Probability and Statistics School of Mathematics and Statistics, University of Sheffield 2018 19 Identically distributed Suppose we have n random variables X 1, X 2,..., X n. Identically
More informationChapter 1 Statistical Reasoning Why statistics? Section 1.1 Basics of Probability Theory
Chapter 1 Statistical Reasoning Why statistics? Uncertainty of nature (weather, earth movement, etc. ) Uncertainty in observation/sampling/measurement Variability of human operation/error imperfection
More informationChap 2.1 : Random Variables
Chap 2.1 : Random Variables Let Ω be sample space of a probability model, and X a function that maps every ξ Ω, toa unique point x R, the set of real numbers. Since the outcome ξ is not certain, so is
More informationChapter 2: Random Variables
ECE54: Stochastic Signals and Systems Fall 28 Lecture 2 - September 3, 28 Dr. Salim El Rouayheb Scribe: Peiwen Tian, Lu Liu, Ghadir Ayache Chapter 2: Random Variables Example. Tossing a fair coin twice:
More informationDiscrete Probability Refresher
ECE 1502 Information Theory Discrete Probability Refresher F. R. Kschischang Dept. of Electrical and Computer Engineering University of Toronto January 13, 1999 revised January 11, 2006 Probability theory
More information1 Review of Probability and Distributions
Random variables. A numerically valued function X of an outcome ω from a sample space Ω X : Ω R : ω X(ω) is called a random variable (r.v.), and usually determined by an experiment. We conventionally denote
More informationwhere r n = dn+1 x(t)
Random Variables Overview Probability Random variables Transforms of pdfs Moments and cumulants Useful distributions Random vectors Linear transformations of random vectors The multivariate normal distribution
More information1 Random Variable: Topics
Note: Handouts DO NOT replace the book. In most cases, they only provide a guideline on topics and an intuitive feel. 1 Random Variable: Topics Chap 2, 2.1-2.4 and Chap 3, 3.1-3.3 What is a random variable?
More informationProbability and Measure
Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 84 Paper 4, Section II 26J Let (X, A) be a measurable space. Let T : X X be a measurable map, and µ a probability
More informationIntroduction to Machine Learning
What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes
More information1 Exercises for lecture 1
1 Exercises for lecture 1 Exercise 1 a) Show that if F is symmetric with respect to µ, and E( X )
More informationLecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable
Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed
More informationUQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables
UQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables To be provided to students with STAT2201 or CIVIL-2530 (Probability and Statistics) Exam Main exam date: Tuesday, 20 June 1
More informationProbability reminders
CS246 Winter 204 Mining Massive Data Sets Probability reminders Sammy El Ghazzal selghazz@stanfordedu Disclaimer These notes may contain typos, mistakes or confusing points Please contact the author so
More informationCDA6530: Performance Models of Computers and Networks. Chapter 2: Review of Practical Random
CDA6530: Performance Models of Computers and Networks Chapter 2: Review of Practical Random Variables Definition Random variable (RV)X (R.V.) X: A function on sample space X: S R Cumulative distribution
More informationProduct measure and Fubini s theorem
Chapter 7 Product measure and Fubini s theorem This is based on [Billingsley, Section 18]. 1. Product spaces Suppose (Ω 1, F 1 ) and (Ω 2, F 2 ) are two probability spaces. In a product space Ω = Ω 1 Ω
More informationAppendix A : Introduction to Probability and stochastic processes
A-1 Mathematical methods in communication July 5th, 2009 Appendix A : Introduction to Probability and stochastic processes Lecturer: Haim Permuter Scribe: Shai Shapira and Uri Livnat The probability of
More informationFundamental Tools - Probability Theory II
Fundamental Tools - Probability Theory II MSc Financial Mathematics The University of Warwick September 29, 2015 MSc Financial Mathematics Fundamental Tools - Probability Theory II 1 / 22 Measurable random
More informationChapter 5. Random Variables (Continuous Case) 5.1 Basic definitions
Chapter 5 andom Variables (Continuous Case) So far, we have purposely limited our consideration to random variables whose ranges are countable, or discrete. The reason for that is that distributions on
More informationIf we want to analyze experimental or simulated data we might encounter the following tasks:
Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction
More informationECE534, Spring 2018: Solutions for Problem Set #3
ECE534, Spring 08: Solutions for Problem Set #3 Jointly Gaussian Random Variables and MMSE Estimation Suppose that X, Y are jointly Gaussian random variables with µ X = µ Y = 0 and σ X = σ Y = Let their
More informationSTAT/MATH 395 PROBABILITY II
STAT/MATH 395 PROBABILITY II Chapter 6 : Moment Functions Néhémy Lim 1 1 Department of Statistics, University of Washington, USA Winter Quarter 2016 of Common Distributions Outline 1 2 3 of Common Distributions
More informationSTAT 430/510: Lecture 16
STAT 430/510: Lecture 16 James Piette June 24, 2010 Updates HW4 is up on my website. It is due next Mon. (June 28th). Starting today back at section 6.7 and will begin Ch. 7. Joint Distribution of Functions
More informationEstimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators
Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let
More informationWeek 2. Review of Probability, Random Variables and Univariate Distributions
Week 2 Review of Probability, Random Variables and Univariate Distributions Probability Probability Probability Motivation What use is Probability Theory? Probability models Basis for statistical inference
More information