Lecture 2: Probability


Statistical Paradigms

  Paradigm             Estimator            Method of Estimation     Output                      Data Complexity   Prior Info
  Classical            Cost Function        Analytical Solution      Point Estimate              Simple            No
  Maximum Likelihood   Probability Theory   Numerical Optimization   Point Estimate              Intermediate      No
  Bayesian             Probability Theory   Sampling                 Probability Distribution    Complex           Yes

The unifying principle for this course is statistical estimation based on probability.

Overview
- Basic probability
- Joint, marginal, and conditional probability
- Bayes' Rule
- Random variables
- Probability distributions: discrete and continuous
- Moments

One could spend half a semester on this alone...

Example: the white-breasted fruit dove (Ptilinopus rivoli) and the yellow-bibbed fruit dove (Ptilinopus solomonensis).

Events

        S    Sᶜ
  R     2     9
  Rᶜ   18     3

Pr(A) = probability that event A occurs
Pr(R) = ?  Pr(Rᶜ) = ?  Pr(S) = ?  Pr(Sᶜ) = ?

Events

        S    Sᶜ
  R     2     9
  Rᶜ   18     3

Pr(A) = probability that event A occurs
Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Joint Probability
Pr(A,B) = probability that both A and B occur
Pr(R,Sᶜ) = ?  Pr(S,Rᶜ) = ?  Pr(R,S) = ?  Pr(Rᶜ,Sᶜ) = ?

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Joint Probability
Pr(A,B) = probability that both A and B occur
Pr(R,Sᶜ) = 9/32, Pr(S,Rᶜ) = 18/32, Pr(R,S) = 2/32, Pr(Rᶜ,Sᶜ) = 3/32
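These joints and marginals are just normalized counts. A minimal Python sketch (my own illustration, not part of the original slides; variable names are arbitrary) makes the bookkeeping explicit:

```python
# Joint and marginal probabilities from the 2x2 count table (illustrative sketch).
counts = {("R", "S"): 2, ("R", "Sc"): 9, ("Rc", "S"): 18, ("Rc", "Sc"): 3}
n = sum(counts.values())  # 32 observations

# Joint: Pr(A, B) = count(A, B) / n
joint = {pair: c / n for pair, c in counts.items()}
print(joint[("R", "Sc")])  # 9/32 = 0.28125

# Marginals: sum the joint over the other variable
pr_R = joint[("R", "S")] + joint[("R", "Sc")]  # 11/32
pr_S = joint[("R", "S")] + joint[("Rc", "S")]  # 20/32
print(pr_R, pr_S)  # 0.34375 0.625
```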

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Pr(A or B) = Pr(A) + Pr(B) − Pr(A,B) = 1 − Pr(neither)
Pr(R or S) = ?

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Pr(A or B) = Pr(A) + Pr(B) − Pr(A,B) = 1 − Pr(neither)
Pr(R or S) = 11/32 + 20/32 − 2/32 = 29/32 = 32/32 − 3/32 = 29/32

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

If Pr(A,B) = Pr(A) Pr(B), then A and B are independent.
Is Pr(R,S) = Pr(R) Pr(S)?

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

If Pr(A,B) = Pr(A) Pr(B), then A and B are independent.
Pr(R,S) = 2/32 = 0.0625, but Pr(R) Pr(S) = (11/32)(20/32) ≈ 0.215, so R and S are not independent.

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Conditional Probability
Pr(A|B) = probability of A given that B occurred
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(R|S) = ?

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(R|S) = Pr(R,S) / Pr(S) = (2/32) / (20/32) = 2/20

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(S|R) = Pr(S,R) / Pr(R) = (2/32) / (11/32) = 2/11

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Joint = Conditional × Marginal
Pr(B|A) = Pr(B,A) / Pr(A), so Pr(B,A) = Pr(B|A) Pr(A)
Pr(S,R) = Pr(S|R) Pr(R): 2/32 = (2/11)(11/32)

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Marginal Probability
Pr(B) = Σᵢ Pr(B, Aᵢ)
Pr(R) = ?

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Marginal Probability
Pr(B) = Σᵢ Pr(B, Aᵢ)
Pr(R) = Pr(R,S) + Pr(R,Sᶜ) = 2/32 + 9/32 = 11/32

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Marginal Probability
Pr(B) = Σᵢ Pr(B, Aᵢ) = Σᵢ Pr(B|Aᵢ) Pr(Aᵢ)
Pr(R) = ?

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Marginal Probability
Pr(B) = Σᵢ Pr(B|Aᵢ) Pr(Aᵢ)
Pr(R) = Pr(R|S) Pr(S) + Pr(R|Sᶜ) Pr(Sᶜ) = (2/20)(20/32) + (9/12)(12/32) = 2/32 + 9/32 = 11/32
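A quick numerical check of the law of total probability on the same table (again my own sketch, not part of the lecture):

```python
# Law of total probability: Pr(R) = Pr(R|S) Pr(S) + Pr(R|Sc) Pr(Sc)
pr_R_given_S = (2 / 32) / (20 / 32)    # = 2/20
pr_R_given_Sc = (9 / 32) / (12 / 32)   # = 9/12
pr_S, pr_Sc = 20 / 32, 12 / 32

pr_R = pr_R_given_S * pr_S + pr_R_given_Sc * pr_Sc
print(pr_R, 11 / 32)  # both 0.34375
```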

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(A,B) = Pr(B|A) Pr(A)
Pr(B,A) = Pr(A|B) Pr(B)

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(A,B) = Pr(B|A) Pr(A)
Pr(B,A) = Pr(A|B) Pr(B)
Joint = Conditional × Marginal

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(A,B) = Pr(B|A) Pr(A)
Pr(B,A) = Pr(A|B) Pr(B)
Joint = Conditional × Marginal
Competition: Best mnemonic

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(A,B) = Pr(B|A) Pr(A)
Pr(B,A) = Pr(A|B) Pr(B)
Since Pr(A,B) = Pr(B,A): Pr(A|B) Pr(B) = Pr(B|A) Pr(A)

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(A,B) = Pr(B|A) Pr(A)
Pr(B,A) = Pr(A|B) Pr(B)
Pr(A|B) Pr(B) = Pr(B|A) Pr(A)
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)   BAYES' RULE

Events table as above (n = 32): Pr(R) = 11/32, Pr(Rᶜ) = 21/32, Pr(S) = 20/32, Pr(Sᶜ) = 12/32.

Bayes' Rule
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)
Pr(R|S) = Pr(S|R) Pr(R) / Pr(S) = (2/11)(11/32) / (20/32) = 2/20

BAYES' RULE: alternate form
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)
Pr(A|B) = Pr(B|A) Pr(A) / Σᵢ Pr(B,Aᵢ)
Pr(A|B) = Pr(B|A) Pr(A) / (Σᵢ Pr(B|Aᵢ) Pr(Aᵢ))
The denominator is the normalizing constant.
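In this form, Bayes' rule is straightforward to code for any discrete set of hypotheses Aᵢ. A hedged sketch (function and variable names are my own):

```python
def posterior(priors, likelihoods):
    """Return Pr(A_i | B) from priors Pr(A_i) and likelihoods Pr(B | A_i)."""
    unnorm = [p * l for p, l in zip(priors, likelihoods)]
    z = sum(unnorm)                 # Pr(B): the normalizing constant
    return [u / z for u in unnorm]

# Events table example: hypotheses {S, Sc}, observed event B = R
print(posterior([20/32, 12/32], [2/20, 9/12]))  # [2/11, 9/11]
```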

Monty Hall Problem
A prize sits behind one of three doors. You pick door 1; the host, who knows where the prize is, opens door 3 and reveals no prize. Should you stick with door 1 or switch to door 2?

Monty Hall Problem
Let Pᵢ = the event that the prize is behind door i.
Priors: Pr(P₁) = 1/3, Pr(P₂) = 1/3, Pr(P₃) = 1/3

Monty Hall Problem
Let open₃ = the event that the host opens door 3.
Pr(P₁|open₃) = ?  Pr(P₂|open₃) = ?  Pr(P₃|open₃) = ?

Monty Hall Problem
We need the likelihoods: Pr(open₃|P₁), Pr(open₃|P₂), Pr(open₃|P₃)

Monty Hall Problem
Pr(open₃|P₁) = 1/2, Pr(open₃|P₂) = 1, Pr(open₃|P₃) = 0

Monty Hall Problem
Pr(P₁|open₃) = 1/3, Pr(P₂|open₃) = 2/3, Pr(P₃|open₃) = 0
Switching to door 2 doubles your chance of winning.
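The result is easy to confirm by simulation. A quick Monte Carlo sketch (my own, not from the slides): sticking wins about 1/3 of the time, switching about 2/3.

```python
import random

def monty_hall(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)   # door hiding the prize
        pick = random.randrange(3)    # contestant's initial pick
        # Host opens a door that is neither the pick nor the prize
        opened = next(d for d in range(3) if d != pick and d != prize)
        if switch:                    # switch to the remaining closed door
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == prize)
    return wins / trials

print(monty_hall(switch=False), monty_hall(switch=True))  # ~0.333 ~0.667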


BAYES' RULE: alternate form
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B) = Pr(B|A) Pr(A) / (Σᵢ Pr(B|Aᵢ) Pr(Aᵢ))
Factoring probabilities:
Pr(A,B,C) = Pr(A|B,C) Pr(B,C) = Pr(A|B,C) Pr(B|C) Pr(C)

BAYES' RULE: alternate form
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B) = Pr(B|A) Pr(A) / (Σᵢ Pr(B|Aᵢ) Pr(Aᵢ))
Factoring probabilities:
Pr(A,B,C) = Pr(A|B,C) Pr(B,C) = Pr(A|B,C) Pr(B|C) Pr(C)
Joint = Conditional × Marginal

Random Variables
A random variable is a variable that can take on more than one value, with the values determined by probabilities:
Pr(Z = zₖ) = pₖ, where 0 ≤ pₖ ≤ 1
Random variables can be continuous or discrete.

Discrete random variables
zₖ can take on only discrete values (typically integers). We can define two important and interrelated functions:
Probability mass function (pmf): f(z) = Pr(Z = zₖ) = pₖ, where Σ f(z) = 1 (if this condition is not met, f is just a density function)
Cumulative distribution function (cdf): F(z) = Pr(Z ≤ zₖ) = Σ f(zⱼ) summed over all zⱼ ≤ zₖ; 0 ≤ F(z) ≤ 1, though the support can extend to infinity in z

Example
For the set {1, 2, 3, 4, 5, 6}:
f(z) = Pr(Z = zₖ) = {1/6, 1/6, 1/6, 1/6, 1/6, 1/6}
F(z) = Pr(Z ≤ zₖ) = {1/6, 2/6, 3/6, 4/6, 5/6, 6/6}
For z < 1: f(z) = 0, F(z) = 0. For z > 6: f(z) = 0, F(z) = 1.
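The die example translates directly into code; a small sketch (mine, for illustration) builds the pmf and accumulates it into the cdf:

```python
from itertools import accumulate

faces = [1, 2, 3, 4, 5, 6]
pmf = {z: 1 / 6 for z in faces}                    # f(z) = Pr(Z = z)
cdf = dict(zip(faces, accumulate(pmf.values())))   # F(z) = Pr(Z <= z)
print(cdf[3])  # 0.5, i.e. 3/6
```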

Continuous Random Variables
z is real (though it can still be bounded).
Cumulative distribution function (cdf): F(z) = Pr(Z ≤ z), where 0 ≤ F(z) ≤ 1
Pr(Z = z) is infinitesimally small, so instead consider
Pr(z ≤ Z ≤ z + dz) = Pr(Z ≤ z + dz) − Pr(Z ≤ z) = F(z + dz) − F(z)
Probability density function (pdf): f(z) = dF/dz; f(z) ≥ 0 but NOT bounded by 1

Continuous Random Variables
f is the derivative of F; F is the integral of f:
F(z) = ∫ f(u) du from −∞ to z, and Pr(z ≤ Z ≤ z + dz) ≈ f(z) dz
ANY function that meets these rules (non-negative, integrates to 1) is a valid pdf. Wednesday we will go over a number of standard distributions and discuss their interpretation and application.

Example: exponential distribution
f(z) = λ exp(−λz), F(z) = 1 − exp(−λz), for z ≥ 0
What are the values of F(z) and f(z) at z = 0? As z → ∞? What do F(z) and f(z) look like?

Exponential: f(z) = λ exp(−λz)
At z = 0: f(z) = λ, F(z) = 0
As z → ∞: f(z) → 0, F(z) → 1
[Figure: plots of the pdf f(z) and the cdf F(z).]
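A numerical sanity check of these limits, assuming an arbitrary rate λ = 0.5 for illustration:

```python
import math

lam = 0.5                                # arbitrary rate for illustration
f = lambda z: lam * math.exp(-lam * z)   # pdf
F = lambda z: 1 - math.exp(-lam * z)     # cdf
print(f(0), F(0))    # lam and 0.0 at z = 0
print(f(50), F(50))  # ~0 and ~1 for large z
```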

Moments of probability distributions
E[xⁿ] = ∫ xⁿ f(x) dx
E[·] = expected value; the first moment (n = 1) is the mean.
Example: exponential
E[x] = ∫₀^∞ x λ exp(−λx) dx = [−x exp(−λx)]₀^∞ + ∫₀^∞ exp(−λx) dx = 0 + 1/λ = 1/λ
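The integration-by-parts result E[x] = 1/λ can be checked numerically, e.g. with scipy's quad integrator (a sketch with an arbitrary λ = 0.5):

```python
import math
from scipy.integrate import quad

lam = 0.5
mean, _ = quad(lambda x: x * lam * math.exp(-lam * x), 0, math.inf)
print(mean, 1 / lam)  # both ~2.0
```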

Properties of means
E[c] = c
E[x + c] = E[x] + c
E[cx] = c E[x]
E[x + y] = E[x] + E[y] (even if x is not independent of y)
E[xy] = E[x] E[y] only if x and y are independent
E[g(x)] ≠ g(E[x]) in general (Jensen's inequality; see the sketch below)
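A tiny illustration of Jensen's inequality with made-up data (g(x) = x² is convex, so E[g(x)] ≥ g(E[x])):

```python
xs = [1, 2, 3, 4]                            # made-up sample
mean = sum(xs) / len(xs)                     # E[x] = 2.5
mean_sq = sum(x**2 for x in xs) / len(xs)    # E[x^2] = 7.5
print(mean_sq, mean**2)  # 7.5 > 6.25, as Jensen's inequality predicts
```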

Central Moments
E[(x − E[x])ⁿ] = ∫ (x − E[x])ⁿ f(x) dx
Second central moment = variance = σ²

Properties of variance
Var(aX) = a² Var(X)
Var(X + b) = Var(X)
Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)
Var(Σᵢ Xᵢ) = Σᵢ Var(Xᵢ) + 2 Σ over i<j of Cov(Xᵢ, Xⱼ)
Var(X) = Var(E[X|Y]) + E[Var(X|Y)]
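The Var(aX + bY) identity is easy to verify by simulation; a sketch using numpy with arbitrary a, b and correlated X, Y (my own example, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.6 * x + rng.normal(size=100_000)   # y is correlated with x
a, b = 2.0, -1.5

lhs = np.var(a * x + b * y)
cov_xy = np.cov(x, y, bias=True)[0, 1]   # population covariance (ddof = 0)
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * cov_xy
print(lhs, rhs)  # agree to a few decimal places
```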

Distributions and Probability
All the same properties apply to random variables:
Pr(A, B)  joint distribution
Pr(A|B) = Pr(A,B) / Pr(B)  conditional distribution
Pr(A) = Σᵢ Pr(A|Bᵢ) Pr(Bᵢ)  marginal distribution
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)  Bayes' Rule

Looking forward... we will use probability distributions to:
- Quantify the match between models and data
- Represent uncertainty about model parameters
- Partition sources of process variability