Random variables. Florence Perronnin. Univ. Grenoble Alpes, LIG, Inria. September 28, 2018



Outline
1. Random variables
2. Expected value
3. The classic discrete distributions
4. Most famous continuous random variables

Random variables
A random variable X is a map $X : \Omega \to F$, where F is an ordered set, such that
$\forall x \in F,\ \{\omega \in \Omega : X(\omega) \le x\} \in \mathcal{F}$,
where $\mathcal{F}$ denotes the set of events of the probability space.
In plain language: a random variable is a function $X : \Omega \to \mathbb{R}$ (or $\mathbb{N}$) characterizing the possible outcomes in $\Omega$, such that intervals in $\mathbb{R}$ have a probability.
X is a discrete r.v. if F is countable (typically $\mathbb{N}$).
X is a continuous r.v. if F is uncountable (typically $\mathbb{R}$).
Random variables are usually written as capital letters: X, ...

Some examples of random variables
X = result of a single die
Y = sum of two dice
(coin tossing) X = 1 if heads, X = 0 if tails
Z = lifetime of a device
$A_n$ = n-th measurement (collection of random variables)
$\frac{1}{k} \sum_{n=1}^{k} A_n$ is also a random variable (mean)

Cumulative distribution function (CDF)
For any r.v. X we define its cumulative distribution function (CDF):
$\forall x,\ F_X(x) = P[X \le x]$
$F_X$ gives the cumulative probabilities. $F_X$ describes the law, or probability distribution, of X.
[Figure: an empirical CDF, ecdf(g), plotting Fn(x) against x, and a CDF plotting P(X <= x) against x]
Properties:
$0 \le F_X(x) \le 1$ for all $x \in \mathbb{R}$
$F_X$ is nondecreasing

Special case of a discrete random variable
The CDF $F_X$ of a discrete random variable X is a step function (cumulative sums).
Figure: Example of CDF: Binom(n,p)

Quantiles
The quantile function is the inverse of the cumulative distribution function.
Median: the median is the value m such that $P[X \le m] = \frac{1}{2}$ (the value that splits $\Omega$ into two equiprobable parts). It is therefore the 1/2-quantile or the 0.5-quantile.
q-quantile: the q-quantile is the value $x_q$ such that $F_X(x_q) = P[X \le x_q] = q$.
Figure credit: By Erzbischof - Own work, CC0, https://commons.wikimedia.org/w/index.php?curid=18

Probability density function (PDF)
The distribution of a random variable can also be defined in a more elementary way:
Discrete case: distribution function (probability mass function) $f_X(k) = P[X = k]$ for every possible value k, with
$\sum_{n \in \mathbb{N}} f(n) = 1$
Continuous (real-valued) case: probability density function (PDF) $f(x)$, with
$P[x \le X \le y] = \int_x^y f_X(u)\,du$ (mind the integration bounds...)
$\int_{-\infty}^{+\infty} f_X(x)\,dx = 1$

PDF for discrete random variables
For a discrete r.v. (integer-valued, for instance) we define the distribution function (or law) by $f_X(k) = P[X = k]$. It is the function that gives the probability of each value.
$0 \le f_X(k) \le 1$
$F_X(k) = \sum_{m \le k} f_X(m)$
$f_X$ can be approximated by the histogram of a sample.
[Figure: "Histogram of g", density against g]

PDF for continuous random variables
The probability density function f can be defined as the derivative of the CDF F:
$F_X(x) - F_X(a) = P[a \le X \le x] = \int_a^x f_X(u)\,du \;\Rightarrow\; f_X(x) = \frac{\partial F_X}{\partial x}(x)$
Intuitively:
The area below the curve of f between a and b gives the probability that the r.v. falls into the x-axis interval [a, b].
The density is the derivative of $F_X$, so $f_X(x)\,dx = dF_X(x)$.

PDF and CDF
$F_X(x) = \int_{-\infty}^{x} f_X(u)\,du = 1 - \int_x^{+\infty} f_X(v)\,dv$
[Figure: density $f_X(\omega)$ of some continuous distribution; the shaded area is P(3 <= X <= 5) = 0.2705; total area = 1]

Density and histograms
A probability density function $f_X$ can also be viewed as the limit of the histogram. Indeed, each box of the histogram gives
$P(x_n \le X \le x_{n+1}) = F_X(x_{n+1}) - F_X(x_n)$.
[Figure: histograms (frequency against observations) for a Gaussian distribution with µ = 3, σ = 2 and a Beta distribution with α = 1, β = 5]

2. Expected value

Expected value of a discrete r.v.
The mean, or expectation, of a discrete r.v. X with values in a countable set I is:
$E[X] = \sum_{x \in I} x\,P[X = x]$
Example: let $0 < p < 1$. A random variable X with values in {0, 1} follows a Bernoulli distribution B(p) if and only if $P[X = 1] = p$ and $P[X = 0] = 1 - p$. Its expectation is $E[X] = 0 \cdot (1 - p) + 1 \cdot p = p$.
Exercise: What is the expected value of the sum of two fair dice?
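As a quick check of the discrete definition, the sketch below enumerates the 36 equally likely outcomes of two fair dice and applies $E[X] = \sum_x x\,P[X = x]$ directly. The variable names are illustrative and not taken from the slides.

```python
from fractions import Fraction
from itertools import product

# Each ordered pair of faces of two fair dice has probability 1/36.
faces = range(1, 7)
p_pair = Fraction(1, 36)

# E[X] = sum over outcomes of value * probability, with X the sum of the two faces.
expected_sum = sum((a + b) * p_pair for a, b in product(faces, repeat=2))

print(expected_sum)
```

Exact fractions are used so that the result is not blurred by floating-point rounding.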

Expected value of a continuous r.v.
Let X be a continuous r.v. with values in $\mathbb{R}$. Then
$E[X] = \int_{-\infty}^{+\infty} x f(x)\,dx$
(or equivalently: $E[X] = \int x\,dF(x)$)

Exercise
Let Y be an exponential r.v. with parameter λ. The PDF is $f(x) = \lambda e^{-\lambda x} 1_{\{x \ge 0\}}$. Compute $E[Y]$.
$E[Y] = \int_0^{\infty} x f(x)\,dx$ (1)
$= \int_0^{\infty} x \lambda e^{-\lambda x}\,dx$ (2)
$= \left[ x(-e^{-\lambda x}) \right]_0^{\infty} - \int_0^{\infty} (-e^{-\lambda x})\,dx$ (3)
$= \left[ -\frac{1}{\lambda} e^{-\lambda x} \right]_0^{\infty} = \frac{1}{\lambda}$ (4)

Expectation of a function of X
Let X be an $\mathbb{R}$-valued r.v. and g a function $g : \mathbb{R} \to \mathbb{R}$. Then
$E[g(X)] = \int g(x) f(x)\,dx$
Exercise: Let Y be a uniform r.v. over [a, b] (with $0 < a < b$). Compute $E[\log(Y)]$.
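One way to sanity-check an answer to this exercise is to evaluate $\int g(x) f(x)\,dx$ numerically. The sketch below uses a midpoint rule and compares it with the closed form $\frac{b \log b - a \log a}{b - a} - 1$, which follows from $\int \log y\,dy = y \log y - y$. The bounds a = 1 and b = 4 and the helper name `expectation_of_g` are illustrative assumptions, not part of the slides.

```python
import math

def expectation_of_g(g, density, lo, hi, steps=100_000):
    """Approximate E[g(X)] = integral of g(x) * density(x) over [lo, hi] (midpoint rule)."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width  # midpoint of the i-th sub-interval
        total += g(x) * density(x)
    return total * width

a, b = 1.0, 4.0  # illustrative bounds with 0 < a < b

def uniform_density(x):
    # Density of the uniform distribution on [a, b]; only evaluated inside [a, b] here.
    return 1.0 / (b - a)

numeric = expectation_of_g(math.log, uniform_density, a, b)
closed_form = (b * math.log(b) - a * math.log(a)) / (b - a) - 1.0
print(numeric, closed_form)
```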

Existence
If the r.v. X has an infinite number of possible values, the expected value is not always finite.
St Petersburg paradox: how much should the bank ask for the following game?
Flip a coin until it lands on tails. If tails happens at the
1st round: reward = 2 dollars
2nd round: reward = 4 dollars
3rd round: reward = 8 dollars
and so on...

Let X be the reward and Y the number of coin tosses until tails, so that Y ~ Geom(1/2):
$E[X] = \sum_{n=1}^{\infty} 2^n P[Y = n] = \sum_{n=1}^{\infty} 2^n \frac{1}{2^n} = (1 + 1 + \dots) = \infty$
However, in reality no bank has an infinite amount of money. Let's assume the bank cannot pay more than 1,000,000 dollars. Then if $Y > \log_2(10^6) \approx 19.9$, that is after 19 heads in a row, the player only earns $10^6$ dollars (when he should earn more).
$E[X] = \sum_{n=1}^{19} 2^n P[Y = n] + 10^6\,P[Y > 19] = \sum_{n=1}^{19} 2^n \frac{1}{2^n} + 10^6 \frac{1}{2^{19}} = 19 + \frac{10^6}{524288} \approx 20.9$
So if the bank can't pay more than a million dollars, the expected reward of the player is only about 20.9 dollars. A lot less than infinity...
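The capped computation can be reproduced in a few lines. This is a minimal sketch assuming the bank pays min(2^n, cap) when tails first appears at toss n; the function name `capped_expected_reward` is hypothetical.

```python
from fractions import Fraction

def capped_expected_reward(cap):
    """Expected reward when the bank pays min(2**n, cap) if tails first shows at toss n."""
    expected = Fraction(0)
    n = 1
    # Rounds whose full reward 2**n the bank can still pay:
    # each contributes 2**n * P[Y = n] = 2**n * (1/2)**n = 1.
    while 2 ** n <= cap:
        expected += Fraction(2 ** n, 2 ** n)
        n += 1
    # Every later round pays the cap; P[Y >= n] = (1/2)**(n - 1).
    expected += cap * Fraction(1, 2 ** (n - 1))
    return expected

reward = capped_expected_reward(10 ** 6)
print(float(reward))
```

With a cap of one million the loop covers rounds 1 to 19, matching the $19 + 10^6 / 2^{19}$ decomposition above.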

Existence
The example of the St Petersburg paradox shows that a random variable that only takes finite values can still have an infinite expected value.

Linearity of the expectation operator
Given two random variables X and Y we have:
$E[X + Y] = E[X] + E[Y]$
even if X and Y are dependent variables. Also, for any constant c we have:
$E[cX] = c\,E[X]$
More generally, if we have n r.v. $X_1, X_2, \dots, X_n$ then:
$E\left[ \sum_{i=1}^{n} X_i \right] = \sum_{i=1}^{n} E[X_i]$

Exercise (using the linearity of E[.])
Initially n objects are correctly placed. But someone came by and rearranged all the objects. On average, 1 object is correctly placed.
Let $X_i$ be the Bernoulli r.v. such that
$X_i = 1$ if object i is correctly placed
$X_i = 0$ otherwise
Let Y be the total number of correctly placed objects. Then clearly $Y = \sum_{i=1}^{n} X_i$.
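If the rearrangement is a uniformly random permutation (an assumption, since the slide does not specify how the objects are shuffled), then $E[X_i] = 1/n$ and linearity gives $E[Y] = n \cdot \frac{1}{n} = 1$. The brute-force sketch below checks this for small n by averaging the number of objects left in place over all n! permutations. The function name is illustrative.

```python
from fractions import Fraction
from itertools import permutations

def mean_correctly_placed(n):
    """Average number of objects left in place over all n! rearrangements."""
    total_fixed = 0
    count = 0
    for perm in permutations(range(n)):
        # Object i is correctly placed when it is still at position i.
        total_fixed += sum(1 for i, j in enumerate(perm) if i == j)
        count += 1
    return Fraction(total_fixed, count)

for n in range(1, 7):
    print(n, mean_correctly_placed(n))
```

The $X_i$ are not independent here, which is exactly why linearity is the convenient tool.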

Expectation exercises
Compute the expectation of a binomial random variable Z.
Let $Y = X - E[X]$. Compute $E[Y]$ as a function of the moments of X.

Conditional expectation
$E[Y \mid Z = z] = \sum_y y\,P[Y = y \mid Z = z]$
Law of total expectation: $E[Y] = E[E[Y \mid Z]]$, where $E[Y \mid Z]$ is a random variable (a function f(Z) of Z).

Variance
Definition: the variance of a r.v. is defined by
$\mathrm{var}(X) = E\left[(X - E[X])^2\right] = E[X^2] - (E[X])^2$
Exercise: prove the above equality.
Discrete case: $\mathrm{var}(X) = \sum_{x \in I} x^2 P[X = x] - \left( \sum_{x \in I} x\,P[X = x] \right)^2$
Continuous case: $\mathrm{var}(X) = \int_{x \in I} x^2 f(x)\,dx - \left( \int_{x \in I} x f(x)\,dx \right)^2$
Question: why not define $E[(X - E[X])]$ instead of the expectation of the squared deviations?
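For reference, one possible route to the equality uses only the linearity of the expectation and the fact that $E[X]$ is a constant. It is offered as a sketch, assuming $E[X^2]$ is finite:

```latex
\begin{align*}
\mathrm{var}(X) &= E\left[(X - E[X])^2\right] \\
                &= E\left[X^2 - 2X\,E[X] + (E[X])^2\right] \\
                &= E[X^2] - 2\,E[X]\,E[X] + (E[X])^2 \\
                &= E[X^2] - (E[X])^2
\end{align*}
```

The same linearity argument gives $E[X - E[X]] = E[X] - E[X] = 0$ for every X with a finite mean, which is a hint for the question about the non-squared deviation.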

Joint distributions
Two random variables X and Y defined on the same probability space have a joint CDF:
$F(x, y) = P[X \le x, Y \le y] = P[(X \le x) \cap (Y \le y)]$
The respective distributions of X and Y are called marginal distributions:
$F_X(x) = P[X \le x] = \lim_{y \to +\infty} F(x, y)$
The joint density is computed as before:
$f_{XY} = \frac{\partial^2 F_{XY}}{\partial x\,\partial y}$, that is $P[X \le x, Y \le y] = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(u, v)\,dv\,du$
Conditional density: $f_{X \mid Y}(x, y) = \frac{f_{XY}(x, y)}{f_Y(y)}$

Multiple random variables: conditional expectation
Discrete case: $E[X \mid Y = y] = \sum_{x \in I} x\,P[X = x \mid Y = y]$
Continuous case: $E[X \mid Y = y] = \int_{x \in I} x f_{X \mid Y}(x, y)\,dx$
Law of conditional expectation:
$E[X] = \sum_{y \in J} E[X \mid Y = y]\,P[Y = y]$
or, in the continuous case:
$E[X] = \int_{\mathbb{R}} E[X \mid Y = y]\,f_Y(y)\,dy$
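The discrete form of this law can be illustrated with two fair dice, taking X as the sum of both dice and Y as the first die. This setup is an illustrative choice, not one from the slides.

```python
from fractions import Fraction

faces = range(1, 7)
p_face = Fraction(1, 6)

def cond_expectation(y):
    """E[X | Y = y], where X is the sum of two fair dice and Y is the first die."""
    return sum((y + second) * p_face for second in faces)

# E[X] = sum over y of E[X | Y = y] * P[Y = y]
total = sum(cond_expectation(y) * p_face for y in faces)
print(total)
```

Conditioning on the first die turns the double sum over 36 outcomes into six simpler conditional means, each equal to y + 7/2.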

Independence
Two r.v. X and Y are independent if and only if, for all x and y, $P[X \le x, Y \le y] = P[X \le x]\,P[Y \le y]$.
If two independent r.v. have an expectation then:
$E[XY] = E[X]\,E[Y]$
Warning: the converse is not true.
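A standard counterexample for the converse, sketched below under the assumption that X is uniform on {-1, 0, 1} and Y = X² (neither appears in the slides): the product rule $E[XY] = E[X]\,E[Y]$ holds, yet the joint CDF does not factorize.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X**2, so Y is completely determined by X.
values = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(x * p for x in values)
e_y = sum(x ** 2 * p for x in values)
e_xy = sum(x * x ** 2 * p for x in values)

# Independence test from the definition, evaluated at the point (x, y) = (0, 0).
p_joint = sum(p for x in values if x <= 0 and x ** 2 <= 0)  # P[X <= 0, Y <= 0]
p_x = sum(p for x in values if x <= 0)                      # P[X <= 0]
p_y = sum(p for x in values if x ** 2 <= 0)                 # P[Y <= 0]

print(e_xy == e_x * e_y)     # product rule for expectations
print(p_joint == p_x * p_y)  # factorization of the joint CDF
```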

3. The classic discrete distributions

Bernoulli trials
Bernoulli: a binary random variable $X \in \{0, 1\}$ such that $P[X = 1] = p$ is a Bernoulli random variable B(p).
Typical example: coin tossing, test result (pass/fail)...

Uniform distribution
A variable $X \in \{n_1, n_2, n_3, \dots, n_k\}$ that can take k values is uniform if each of these values has the same probability of occurring:
$P[X = n_i] = \frac{1}{k}$
R code used for the figure (n is the sample size and br the vector of histogram breaks):
    y=sample.int(10,size=n,replace=TRUE)
    hist(y,breaks=br,freq=FALSE)
[Figure: two panels "Histogram of y", density against observed values, for N=100 and N=10000]

Binomial distribution
Repeating Bernoulli trials...
Binomial: the sum of n independent Bernoulli r.v. with the same parameter p is a binomial r.v. Binom(n, p). Its distribution is:
$P[X_1 + \dots + X_n = k] = C_n^k\,p^k (1 - p)^{n - k}$
Exercise:
1. Prove this distribution.
2. Show that $\sum_{k=0}^{n} P[X = k] = 1$.
3. Show that the expectation is $E[X] = np$.
[Figure: Binom(57, 8/9), density against observed values, N=100000]
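Items 2 and 3 of the exercise can be checked numerically before proving them. The sketch below evaluates the binomial probabilities with exact fractions for the Binom(57, 8/9) case shown in the figure. The helper name `binomial_pmf` is illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Return [P[X = k] for k = 0..n] with X ~ Binom(n, p), as exact fractions."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

n, p = 57, Fraction(8, 9)
pmf = binomial_pmf(n, p)

total = sum(pmf)                                # should equal 1
mean = sum(k * pk for k, pk in enumerate(pmf))  # should equal n * p
print(total, mean, n * p)
```

A numerical check is of course not a proof. The binomial theorem gives item 2, and linearity of the expectation applied to the n Bernoulli terms gives item 3.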

Geometric distribution (1)
Sequence of independent Bernoulli trials B(p) until the first success:
$P[X = k] = p\,(1 - p)^{k - 1}$ for $k \ge 1$
Warning: the geometric distribution in the R language (rgeom) is slightly different (number of failures before the first success).
Exercise:
Prove that $\sum_{k=1}^{\infty} P[X = k] = 1$.
Prove that $E[X] = \frac{1}{p}$.
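The two conventions mentioned in the warning differ by a shift of one, so their means are $1/p$ and $(1 - p)/p$. A small sketch with truncated series makes the difference visible. The value p = 0.2 and the truncation at 2000 terms are arbitrary choices, and the function name is hypothetical.

```python
def geometric_means(p, terms=2000):
    """Truncated series for the mean under the two conventions for Geom(p)."""
    # Number of trials up to and including the first success:
    # P[X = k] = p (1-p)^(k-1), k >= 1.
    mean_trials = sum(k * p * (1 - p) ** (k - 1) for k in range(1, terms + 1))
    # Number of failures before the first success:
    # P[W = k] = p (1-p)^k, k >= 0.
    mean_failures = sum(k * p * (1 - p) ** k for k in range(terms + 1))
    return mean_trials, mean_failures

mean_trials, mean_failures = geometric_means(0.2)
print(mean_trials, mean_failures)
```

When comparing simulated samples with the formula on this slide, adding 1 to samples drawn under the failures-before-success convention converts them to the number of trials.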

Geometric distribution (2)
Note that the geometric r.v. has infinite support, i.e. it can take an infinite number of values.
R code used for the first figure:
    y=rgeom(1000,prob=0.5)
    br=seq(from=min(y)-0.5,to=max(y)+0.5,by=1)
    hist(y,xlab="observed values, N=1000",main="Geom(0.5)",breaks=br)
[Figure: "Geometric samples with p=0.5", frequency against observed values, N=1000; "Geometric samples with p = 0.2", frequency of observations against number of trials before success]

Poisson distribution
Limit of the binomial distribution for large values of n (and small values of p).
Parameter: λ
$P[X = k] = \frac{\lambda^k}{k!} e^{-\lambda}$
Prove that $E[X] = \lambda$.
Link with the binomial? (hint: use λ = np)
In R: rpois(n,lambda)
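Following the hint λ = np, the link with the binomial can be observed numerically: for fixed λ, the probabilities of Binom(n, λ/n) approach those of the Poisson distribution as n grows. The sketch below uses λ = 3 and compares the first eleven probabilities. These values and the function names are illustrative assumptions.

```python
from math import comb, exp, factorial

def binomial_pmf(n, p, k):
    """P[X = k] for X ~ Binom(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(lam, k):
    """P[X = k] for X ~ Poisson(lam)."""
    return lam ** k / factorial(k) * exp(-lam)

def max_gap(n, lam, kmax=10):
    """Largest difference between Binom(n, lam/n) and Poisson(lam) over k = 0..kmax."""
    return max(abs(binomial_pmf(n, lam / n, k) - poisson_pmf(lam, k))
               for k in range(kmax + 1))

for n in (10, 100, 10_000):
    print(n, max_gap(n, 3.0))
```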

Other classical distributions that you should know about
Hypergeometric distribution
Negative binomial distribution
Multinomial distribution

4. Most famous continuous random variables

Uniform r.v.
Let X be a continuous uniform random variable over an interval I. Why can't I be $\mathbb{R}$?
If I = [a, b] then:
$f(x) = \frac{1}{b - a} 1_{\{x \in [a, b]\}}$
[Figure: density $f_X(u)$ of the uniform distribution over [3, 8]; P(5 < X < 6) = 0.2]

Uniform distribution
Exercise: What is the CDF of a uniform r.v. U over [a, b]?
For $x \in [a, b]$:
$P[U \le x] = \int_{-\infty}^{x} \frac{1}{b - a} 1_{\{u \in [a, b]\}}\,du = \int_a^x \frac{1}{b - a}\,du = \left[ \frac{u}{b - a} \right]_a^x = \frac{x - a}{b - a}$
with $P[U \le x] = 0$ for $x < a$ and $P[U \le x] = 1$ for $x > b$.
[Figure: CDF P[X <= u] of the uniform distribution over [3, 8], plotted against u]

Exponential r.v.
Let X be a continuous exponential random variable over $\mathbb{R}^+$ with parameter λ:
$f(x) = \lambda e^{-\lambda x}$
$F(x) = 1 - e^{-\lambda x}$
[Figure: density $f_X(\omega)$ and CDF $F_X(u) = P[X \le u]$ of the exponential distribution with λ = 0.3; P(3 <= X <= 5) = 0.1834]
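The probability quoted in the figure follows from the CDF, since $P(3 \le X \le 5) = F(5) - F(3)$. A minimal sketch, with an illustrative function name:

```python
from math import exp

def exponential_cdf(x, lam):
    """F(x) = 1 - exp(-lam * x) for x >= 0, and 0 for x < 0."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0

lam = 0.3
prob = exponential_cdf(5, lam) - exponential_cdf(3, lam)
print(round(prob, 4))
```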

Playing with the exponential distribution
Memoryless: prove that an exponential r.v. has no memory:
$P[X > x + y \mid X > x] = P[X > y]$
The first event: let X and Y be two exponential r.v. with respective parameters λ and µ (for instance, the time before failure of two devices). Let Z = min(X, Y) (the time of first failure). What is the distribution of Z?
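Both questions can be approached through the tail probability $P[X > x] = 1 - F(x) = e^{-\lambda x}$. The derivation below is a sketch for $x, y, t \ge 0$. Its second line additionally assumes that X and Y are independent, which the slide does not state explicitly.

```latex
\begin{align*}
P[X > x + y \mid X > x] &= \frac{P[X > x + y]}{P[X > x]}
  = \frac{e^{-\lambda (x + y)}}{e^{-\lambda x}} = e^{-\lambda y} = P[X > y] \\
P[Z > t] &= P[X > t,\ Y > t] = P[X > t]\,P[Y > t]
  = e^{-\lambda t}\,e^{-\mu t} = e^{-(\lambda + \mu) t}
\end{align*}
```

Under that independence assumption, Z is again exponential, with parameter λ + µ.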

Gaussian r.v.
$N(\mu, \sigma)$:
$f(x) = \frac{1}{\sigma \sqrt{2\pi}}\,e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}$
[Figure: density $f_X(\omega)$ and CDF $F_X(u)$ of the Gaussian distribution with µ = 2, σ = 3; P(3 <= X <= 5) = 0.2108]
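The Gaussian CDF has no elementary closed form, but it can be written with the error function as $F(x) = \frac{1}{2}\left(1 + \mathrm{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right)$. The sketch below uses this to reproduce the probability shown in the figure, reading σ as the standard deviation (the figure's value is consistent with that reading). The function name is illustrative.

```python
from math import erf, sqrt

def gaussian_cdf(x, mu, sigma):
    """CDF of N(mu, sigma), sigma being the standard deviation, via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

mu, sigma = 2.0, 3.0
prob = gaussian_cdf(5, mu, sigma) - gaussian_cdf(3, mu, sigma)
print(round(prob, 4))
```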