Basics on Probability. Jingrui He 09/11/2007


Coin Flips
You flip a coin: heads with probability 0.5. If you flip 100 coins, how many heads would you expect?

Coin Flips cont.
You flip a coin: heads with probability p. The outcome is a binary random variable, a Bernoulli trial with success probability p. If you flip k coins, how many heads would you expect? The number of heads X is a discrete random variable following a Binomial distribution with parameters k and p.
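
To make this concrete, here is a minimal Python sketch (the slides contain no code; k and p are illustrative values) that simulates k Bernoulli trials and counts heads:

```python
import random

# Simulate k independent coin flips, each heads with probability p.
k, p = 100, 0.5  # illustrative values
heads = sum(random.random() < p for _ in range(k))
print(heads)  # fluctuates around k * p = 50 across runs
```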

Discrete Random Variables
Random variables (RVs) which may take on only a countable number of distinct values, e.g. the total number of heads X you get if you flip 100 coins. X is an RV with arity k if it can take on exactly one value out of $\{x_1, \ldots, x_k\}$, e.g. the possible values that X can take on are 0, 1, 2, ..., 100.

Probability of Discrete RV
Probability mass function (pmf): $P(X = x_i)$. Easy facts about the pmf:
$P(X = x_i \wedge X = x_j) = 0$ if $i \ne j$
$P(X = x_i \vee X = x_j) = P(X = x_i) + P(X = x_j)$ if $i \ne j$
$P(X = x_1 \vee X = x_2 \vee \ldots \vee X = x_k) = 1$
$\sum_i P(X = x_i) = 1$

Common Distributions
Uniform: $X \sim U[1, \ldots, N]$. X takes values 1, 2, ..., N, each with probability $P(X = i) = 1/N$. E.g. picking balls of different colors from a box.
Binomial: $X \sim \mathrm{Bin}(n, p)$. X takes values 0, 1, ..., n, with $P(X = i) = \binom{n}{i} p^i (1 - p)^{n - i}$. E.g. coin flips.
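
As a sanity check on the Binomial pmf above, a short Python sketch using only the standard library (the arguments 100, 0.5, 50 are illustrative):

```python
import math

def binomial_pmf(n: int, p: float, i: int) -> float:
    """P(X = i) for X ~ Bin(n, p)."""
    return math.comb(n, i) * p**i * (1 - p)**(n - i)

# The pmf of 100 fair coin flips peaks at the expected value 50.
print(binomial_pmf(100, 0.5, 50))  # ~0.0796
print(sum(binomial_pmf(100, 0.5, i) for i in range(101)))  # ~1.0
```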

Coin Flips of Two Persons
Your friend and you both flip coins, heads with probability 0.5. You flip 50 times; your friend flips 100 times. How many heads will each of you get?

Joint Distribution
Given two discrete RVs X and Y, their joint distribution is the distribution of X and Y together, e.g. P(you get 21 heads AND your friend gets 70 heads). It satisfies
$\sum_x \sum_y P(X = x, Y = y) = 1$,
e.g. $\sum_{i=0}^{50} \sum_{j=0}^{100} P(\text{you get } i \text{ heads AND your friend gets } j \text{ heads}) = 1$.

Conditional Probability
$P(X = x \mid Y = y)$ is the probability of $X = x$ given the occurrence of $Y = y$, e.g. the probability that you get 0 heads given that your friend gets 61 heads:
$P(X = x \mid Y = y) = \dfrac{P(X = x, Y = y)}{P(Y = y)}$

Law of Total Probability
Given two discrete RVs X and Y, which take values in $\{x_1, \ldots, x_m\}$ and $\{y_1, \ldots, y_n\}$, we have
$P(X = x_i) = \sum_j P(X = x_i, Y = y_j) = \sum_j P(X = x_i \mid Y = y_j) P(Y = y_j)$

Marginalization
Marginal probability from joint probability: $P(X = x_i) = \sum_j P(X = x_i, Y = y_j)$
Marginal probability from conditional probability: $P(X = x_i) = \sum_j P(X = x_i \mid Y = y_j) P(Y = y_j)$
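
A minimal Python sketch of marginalizing a joint pmf stored as a table; the joint values below are made up for illustration:

```python
# Toy joint pmf P(X = x, Y = y) stored as a dict; values are illustrative.
joint = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

# Marginalize out Y: P(X = x) = sum over y of P(X = x, Y = y).
marginal_x = {}
for (x, y), p in joint.items():
    marginal_x[x] = marginal_x.get(x, 0.0) + p

print(marginal_x)  # {0: 0.3, 1: 0.7}
```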

Bayes Rule
For discrete RVs X and Y:
$P(X = x \mid Y = y) = \dfrac{P(Y = y \mid X = x) P(X = x)}{P(Y = y)}$
$P(X = x_i \mid Y = y_j) = \dfrac{P(Y = y_j \mid X = x_i) P(X = x_i)}{\sum_k P(Y = y_j \mid X = x_k) P(X = x_k)}$
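
The second form, with the law of total probability in the denominator, translates directly into a few lines of Python; the coin example below (a fair coin vs. a hypothetical heads-biased coin) is illustrative:

```python
def bayes_posterior(prior, likelihood, j):
    """Posterior P(X = x_i | Y = y_j) for all i, from prior P(X = x_i)
    and likelihood[i][j] = P(Y = y_j | X = x_i). The denominator is
    the law of total probability."""
    evidence = sum(p * lik[j] for p, lik in zip(prior, likelihood))
    return [p * lik[j] / evidence for p, lik in zip(prior, likelihood)]

# Illustrative: X = which coin we hold, Y = outcome of one flip.
prior = [0.5, 0.5]                     # P(fair), P(biased)
likelihood = [[0.5, 0.5], [0.9, 0.1]]  # rows: P(heads | .), P(tails | .)
print(bayes_posterior(prior, likelihood, 0))  # after heads: ~[0.357, 0.643]
```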

Independent RVs
Intuition: X and Y are independent means that knowing $X = x$ neither makes it more nor less probable that $Y = y$.
Definition: X and Y are independent iff $P(X = x, Y = y) = P(X = x) P(Y = y)$

More on Independence
$P(X = x, Y = y) = P(X = x) P(Y = y)$
Equivalently:
$P(X = x \mid Y = y) = P(X = x)$
$P(Y = y \mid X = x) = P(Y = y)$
E.g. no matter how many heads you get, your friend will not be affected, and vice versa.

Conditionally Independent RVs
Intuition: X and Y are conditionally independent given Z means that once Z is known, the value of X does not add any additional information about Y.
Definition: X and Y are conditionally independent given Z iff
$P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z) P(Y = y \mid Z = z)$

More on Conditional Independence
$P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z) P(Y = y \mid Z = z)$
Equivalently:
$P(X = x \mid Y = y, Z = z) = P(X = x \mid Z = z)$
$P(Y = y \mid X = x, Z = z) = P(Y = y \mid Z = z)$

Monty Hall Problem
You're given the choice of three doors: behind one door is a car; behind the others, goats. You pick a door, say No. 1. The host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. Do you want to pick door No. 2 instead?

[Diagram: if you picked the car, the host reveals Goat A or Goat B; if you picked Goat A, the host must reveal Goat B; if you picked Goat B, the host must reveal Goat A.]

Monty Hall Problem: Bayes Rule
Let $C_i$ denote that the car is behind door i, i = 1, 2, 3, so $P(C_i) = 1/3$.
Let $H_{ij}$ denote that the host opens door j after you pick door i. Then
$P(H_{ij} \mid C_k) = \begin{cases} 0 & \text{if } i = j \\ 0 & \text{if } j = k \\ 1/2 & \text{if } i = k \\ 1 & \text{if } i \ne k,\ j \ne k \end{cases}$

Monty Hall Problem: Bayes Rule cont.
WLOG, let i = 1 and j = 3:
$P(C_1 \mid H_{13}) = \dfrac{P(H_{13} \mid C_1) P(C_1)}{P(H_{13})}$
$P(H_{13} \mid C_1) P(C_1) = \dfrac{1}{2} \cdot \dfrac{1}{3} = \dfrac{1}{6}$

Monty Hall Problem: Bayes Rule cont.
$P(H_{13}) = P(H_{13}, C_1) + P(H_{13}, C_2) + P(H_{13}, C_3)$
$= P(H_{13} \mid C_1) P(C_1) + P(H_{13} \mid C_2) P(C_2) = \dfrac{1}{6} + \dfrac{1}{3} = \dfrac{1}{2}$
$P(C_1 \mid H_{13}) = \dfrac{1/6}{1/2} = \dfrac{1}{3}$

Monty Hall Problem: Bayes Rule cont.
$P(C_1 \mid H_{13}) = \dfrac{1/6}{1/2} = \dfrac{1}{3}$
$P(C_2 \mid H_{13}) = 1 - \dfrac{1}{3} = \dfrac{2}{3} > P(C_1 \mid H_{13})$
You should switch!
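
The Bayes-rule answer can also be checked by simulation; a minimal Python sketch (not from the original slides):

```python
import random

def monty_hall(trials=100_000):
    """Estimate win rates for staying with the first pick vs. switching."""
    stay_wins = switch_wins = 0
    for _ in range(trials):
        car = random.randrange(3)
        pick = random.randrange(3)
        # Host opens a door that hides a goat and was not picked.
        # (If the pick is the car, either goat door works; the
        # deterministic choice below does not change the win rates.)
        host = next(d for d in range(3) if d != pick and d != car)
        switch = next(d for d in range(3) if d != pick and d != host)
        stay_wins += (pick == car)
        switch_wins += (switch == car)
    return stay_wins / trials, switch_wins / trials

print(monty_hall())  # roughly (0.333, 0.667): switching wins 2/3 of the time
```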

Continuous Random Variables
What if X is continuous? We use a probability density function (pdf) instead of a probability mass function (pmf). A pdf is any function $f(x)$ that describes the probability density in terms of the input variable x.

PDF
Properties of a pdf:
$f(x) \ge 0, \forall x$
$\int_{-\infty}^{+\infty} f(x)\,dx = 1$
Can $f(x) > 1$? Yes: a density is not a probability, so it may exceed 1. Actual probabilities are obtained by integrating the pdf, e.g. the probability of X being between 0 and 1 is
$P(0 \le X \le 1) = \int_0^1 f(x)\,dx$
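
A minimal Python sketch, assuming a standard normal as the example pdf: a crude Riemann sum approximates $P(0 \le X \le 1)$, and a narrow normal shows a density value above 1.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """pdf of N(mu, sigma^2)."""
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

# Riemann-sum approximation of P(0 <= X <= 1) for X ~ N(0, 1).
n = 10_000
dx = 1.0 / n
prob = sum(normal_pdf(k * dx) * dx for k in range(n))
print(prob)                          # ~0.3413
print(normal_pdf(0.0, 0.0, 0.1))     # ~3.99: a density can exceed 1
```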

Cumulative Distribution Function
$F_X(v) = P(X \le v)$
Discrete RVs: $F_X(v) = \sum_{v_i \le v} P(X = v_i)$
Continuous RVs: $F_X(v) = \int_{-\infty}^{v} f(x)\,dx$, so $\dfrac{d}{dx} F_X(x) = f(x)$

Common Distributions
Normal: $X \sim N(\mu, \sigma^2)$
$f(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\dfrac{(x - \mu)^2}{2\sigma^2}\right), \forall x$
E.g. the height of the entire population.
[Plot: pdf of the standard normal distribution, f(x) against x.]

Common Distributions cont.
Beta: $X \sim \mathrm{Beta}(\alpha, \beta)$
$f(x; \alpha, \beta) = \dfrac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}, x \in [0, 1]$
$\alpha = \beta = 1$: uniform distribution between 0 and 1.
E.g. the conjugate prior for the parameter p in the Binomial distribution.
[Plot: pdf of a Beta distribution, f(x) against x on [0, 1].]

Joint Distribution
Given two continuous RVs X and Y, the joint pdf $f_{X,Y}(x, y)$ satisfies
$\int_x \int_y f_{X,Y}(x, y)\,dx\,dy = 1$

Multivariate Normal
Generalization of the one-dimensional normal to higher dimensions:
$f_{\vec{X}}(x_1, \ldots, x_d) = \dfrac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\dfrac{1}{2} (\vec{x} - \vec{\mu})^T \Sigma^{-1} (\vec{x} - \vec{\mu})\right)$
where $\vec{\mu}$ is the mean vector and $\Sigma$ is the covariance matrix.
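
A sketch of evaluating this density, assuming NumPy is available; the 2-D standard normal at the origin is an illustrative test case:

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Density of N(mu, cov) at a d-dimensional point x."""
    d = len(mu)
    diff = x - mu
    norm = 1.0 / np.sqrt((2 * np.pi)**d * np.linalg.det(cov))
    # solve(cov, diff) computes cov^{-1} diff without forming the inverse.
    return norm * np.exp(-0.5 * diff @ np.linalg.solve(cov, diff))

# 2-D standard normal at the origin: 1 / (2*pi) ~ 0.159.
print(mvn_pdf(np.zeros(2), np.zeros(2), np.eye(2)))
```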

Moments
Mean (expectation): $\mu = E(X)$
Discrete RVs: $E(X) = \sum_{v_i} v_i P(X = v_i)$
Continuous RVs: $E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx$
Variance: $V(X) = E((X - \mu)^2)$
Discrete RVs: $V(X) = \sum_{v_i} (v_i - \mu)^2 P(X = v_i)$
Continuous RVs: $V(X) = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx$
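
For the discrete case, the two sums are a few lines of Python; the pmf below (heads in 2 fair flips) is an illustrative example:

```python
# Mean and variance of a discrete RV from its pmf.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}  # illustrative: heads in 2 fair flips

mean = sum(v * p for v, p in pmf.items())
var = sum((v - mean)**2 * p for v, p in pmf.items())
print(mean, var)  # 1.0 0.5, matching Bin(2, 0.5): np = 1, np(1-p) = 0.5
```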

Properties of Moments
Mean:
$E(X + Y) = E(X) + E(Y)$
$E(aX) = aE(X)$
If X and Y are independent, $E(XY) = E(X) E(Y)$
Variance:
$V(aX + b) = a^2 V(X)$
If X and Y are independent, $V(X + Y) = V(X) + V(Y)$

Moments of Common Distributions
Uniform $X \sim U[1, \ldots, N]$: mean $(1 + N)/2$; variance $(N^2 - 1)/12$
Binomial $X \sim \mathrm{Bin}(n, p)$: mean $np$; variance $np(1 - p)$
Normal $X \sim N(\mu, \sigma^2)$: mean $\mu$; variance $\sigma^2$
Beta $X \sim \mathrm{Beta}(\alpha, \beta)$: mean $\dfrac{\alpha}{\alpha + \beta}$; variance $\dfrac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$

Probability of Events
X denotes an event that could possibly happen, e.g. X = "you will fail in this course". P(X) denotes the likelihood that X happens, i.e. that X = true. What's the probability that you will fail in this course? $\Omega$ denotes the entire event set: $\Omega = \{X, \bar{X}\}$

The Axioms of Probability
$0 \le P(X) \le 1$
$P(\Omega) = 1$
$P(X_1 \vee X_2 \vee \ldots) = \sum_i P(X_i)$, where the $X_i$ are disjoint events
Useful rules:
$P(X_1 \vee X_2) = P(X_1) + P(X_2) - P(X_1 \wedge X_2)$
$P(\bar{X}) = 1 - P(X)$

Interpreting the Axioms
[Venn diagram: events $X_1$ and $X_2$ shown as regions inside the sample space $\Omega$.]