Lecture 2: Probability

Statistical Paradigms

Paradigm             Estimator            Method of Estimation     Output                     Data Complexity   Prior Info
Classical            Cost Function        Analytical Solution      Point Estimate             Simple            No
Maximum Likelihood   Probability Theory   Numerical Optimization   Point Estimate             Intermediate      No
Bayesian             Probability Theory   Sampling                 Probability Distribution   Complex           Yes

The unifying principle for this course is statistical estimation based on probability.

Overview
- Basic probability: joint, marginal, and conditional probability; Bayes' Rule
- Random variables
- Probability distributions: discrete and continuous
- Moments
One could spend half a semester on this alone...

Example: White-breasted fruit dove (Ptilinopus rivoli) and Yellow-bibbed fruit dove (Ptilinopus solomonensis)

Events

       S    Sc
R      2     9
Rc    18     3

Pr(A) = probability that event A occurs
Pr(R) = ?   Pr(Rc) = ?   Pr(S) = ?   Pr(Sc) = ?

Events

       S    Sc
R      2     9
Rc    18     3

Pr(A) = probability that event A occurs
Pr(R) = 11/32   Pr(Rc) = 21/32   Pr(S) = 20/32   Pr(Sc) = 12/32

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

Joint Probability
Pr(A,B) = probability that both A and B occur
Pr(R,Sc) = ?   Pr(S,Rc) = ?   Pr(R,S) = ?   Pr(Rc,Sc) = ?

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

Joint Probability
Pr(A,B) = probability that both A and B occur
Pr(R,Sc) = 9/32   Pr(S,Rc) = 18/32   Pr(R,S) = 2/32   Pr(Rc,Sc) = 3/32
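As a quick check, here is a minimal Python sketch (Python and the variable names are illustrative additions, not part of the lecture) that recovers the joint and marginal probabilities from the table counts:

```python
from fractions import Fraction

# Counts from the 2x2 table: rows R/Rc, columns S/Sc
counts = {("R", "S"): 2, ("R", "Sc"): 9, ("Rc", "S"): 18, ("Rc", "Sc"): 3}
total = sum(counts.values())  # 32

# Joint probability: Pr(A, B) = count(A, B) / total
joint = {k: Fraction(v, total) for k, v in counts.items()}
print(joint[("R", "Sc")])  # 9/32

# Marginal probability: sum the joint over the other event
pr_R = joint[("R", "S")] + joint[("R", "Sc")]  # 11/32
pr_S = joint[("R", "S")] + joint[("Rc", "S")]  # 20/32, reduced to 5/8
print(pr_R, pr_S)
```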

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

Pr(A or B) = Pr(A) + Pr(B) - Pr(A,B) = 1 - Pr(neither)
Pr(R or S) = ?

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

Pr(A or B) = Pr(A) + Pr(B) - Pr(A,B) = 1 - Pr(neither)
Pr(R or S) = 11/32 + 20/32 - 2/32 = 29/32 = 32/32 - 3/32 = 29/32
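The same inclusion-exclusion identity, checked numerically (a sketch continuing the example above):

```python
from fractions import Fraction

pr_R, pr_S, pr_RS = Fraction(11, 32), Fraction(20, 32), Fraction(2, 32)

# Inclusion-exclusion: Pr(R or S) = Pr(R) + Pr(S) - Pr(R, S)
print(pr_R + pr_S - pr_RS)   # 29/32
# Equivalent route: 1 - Pr(neither) = 1 - Pr(Rc, Sc)
print(1 - Fraction(3, 32))   # 29/32
```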

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

If Pr(A,B) = Pr(A) Pr(B), then A and B are independent.
Does Pr(R,S) = Pr(R) Pr(S)?

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

If Pr(A,B) = Pr(A) Pr(B), then A and B are independent.
Pr(R,S) = 2/32 = 0.0625 ≠ Pr(R) Pr(S) = 11/32 × 20/32 ≈ 0.215
so R and S are not independent.
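In code, the independence check amounts to comparing the joint to the product of the marginals (again a sketch using the table's values):

```python
from fractions import Fraction

pr_RS = Fraction(2, 32)                  # joint Pr(R, S)
pr_R, pr_S = Fraction(11, 32), Fraction(20, 32)

# Independence would require Pr(R, S) == Pr(R) * Pr(S)
print(pr_RS == pr_R * pr_S)              # False: R and S are not independent
print(float(pr_RS), float(pr_R * pr_S))  # 0.0625 vs ~0.215
```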

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

Conditional Probability
Pr(A|B) = probability of A given that B occurred
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(R|S) = ?

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(R|S) = Pr(R,S) / Pr(S) = (2/32) / (20/32) = 2/20

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(A,B) = Pr(B|A) Pr(A)
Pr(B,A) = Pr(A|B) Pr(B)

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(A,B) = Pr(B|A) Pr(A)
Pr(B,A) = Pr(A|B) Pr(B)
Pr(A|B) Pr(B) = Pr(B|A) Pr(A)

Conditional Probability
Pr(B|A) = Pr(B,A) / Pr(A)
Pr(A|B) = Pr(A,B) / Pr(B)
Pr(A,B) = Pr(B|A) Pr(A)
Pr(B,A) = Pr(A|B) Pr(B)
Pr(A|B) Pr(B) = Pr(B|A) Pr(A)
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)   BAYES' RULE

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

Bayes' Rule
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)
Pr(R|S) = Pr(S|R) Pr(R) / Pr(S) = (2/11)(11/32) / (20/32) = 2/20
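A one-line verification of Bayes' rule on these numbers (illustrative sketch):

```python
from fractions import Fraction

pr_S_given_R = Fraction(2, 11)   # Pr(S|R) = Pr(S, R) / Pr(R) = (2/32) / (11/32)
pr_R, pr_S = Fraction(11, 32), Fraction(20, 32)

# Bayes' rule: Pr(R|S) = Pr(S|R) Pr(R) / Pr(S)
print(pr_S_given_R * pr_R / pr_S)   # 1/10, i.e. 2/20 as above
```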

Events

       S    Sc
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

Marginal Probability
Pr(B) = Σ_i Pr(B,A_i)
Pr(B) = Σ_i Pr(B|A_i) Pr(A_i)
Pr(R) = ?

Events

       S    Sc     Marginal
R      2     9     Pr(R)  = 11/32
Rc    18     3     Pr(Rc) = 21/32
Marginal: Pr(S) = 20/32   Pr(Sc) = 12/32   (n = 32)

Marginal Probability
Pr(B) = Σ_i Pr(B|A_i) Pr(A_i)
Pr(R) = Pr(R|S) Pr(S) + Pr(R|Sc) Pr(Sc)
      = (2/20)(20/32) + (9/12)(12/32)
      = 2/32 + 9/32 = 11/32
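The law of total probability, checked with the same numbers (sketch):

```python
from fractions import Fraction

# Conditionals and marginals read off the table
pr_R_given_S, pr_S = Fraction(2, 20), Fraction(20, 32)
pr_R_given_Sc, pr_Sc = Fraction(9, 12), Fraction(12, 32)

# Total probability: Pr(R) = Pr(R|S) Pr(S) + Pr(R|Sc) Pr(Sc)
print(pr_R_given_S * pr_S + pr_R_given_Sc * pr_Sc)   # 11/32
```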

BAYES' RULE: alternate form
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)
Pr(A|B) = Pr(B|A) Pr(A) / (Σ_i Pr(B|A_i) Pr(A_i))

Factoring probabilities:
Pr(A,B,C) = Pr(A|B,C) Pr(B,C) = Pr(A|B,C) Pr(B|C) Pr(C)
Joint = Conditional × Marginal

Random Variables
A random variable is a variable that can take on more than one value, where the values are determined by probabilities:
Pr(Z = z_k) = p_k, with 0 ≤ p_k ≤ 1
Random variables can be continuous or discrete.

Discrete random variables
z can only take on discrete values z_k (typically integers). We can define two important and interrelated functions:
Probability mass function (pmf): f(z) = Pr(Z = z_k) = p_k, where Σ_k f(z_k) = 1 (if this is not met, f is just a density function)
Cumulative distribution function (cdf): F(z) = Pr(Z ≤ z_k) = Σ_{j ≤ k} f(z_j)
0 ≤ F(z) ≤ 1, though the set of possible values z_k can be infinite

Example
For the set {1, 2, 3, 4, 5, 6} (e.g., a fair die):
f(z) = Pr(Z = z_k) = {1/6, 1/6, 1/6, 1/6, 1/6, 1/6}
F(z) = Pr(Z ≤ z_k) = {1/6, 2/6, 3/6, 4/6, 5/6, 6/6}
For z < 1: f(z) = 0, F(z) = 0. For z > 6: f(z) = 0, F(z) = 1.
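A short sketch of the die's pmf and cdf (the names pmf/cdf are just local variables):

```python
from fractions import Fraction
from itertools import accumulate

faces = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6    # f(z) = Pr(Z = z) = 1/6 for each face
cdf = list(accumulate(pmf))   # F(z) = Pr(Z <= z): 1/6, 2/6, ..., 6/6

assert sum(pmf) == 1          # a valid pmf must sum to 1
print([str(c) for c in cdf])  # ['1/6', '1/3', '1/2', '2/3', '5/6', '1']
```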

Continuous Random Variables
z is real-valued (though it can still be bounded).
Cumulative distribution function (cdf): F(z) = Pr(Z ≤ z), where 0 ≤ F(z) ≤ 1
Pr(Z = z) is infinitesimally small, so consider instead
Pr(z ≤ Z ≤ z + dz) = Pr(Z ≤ z + dz) - Pr(Z ≤ z) = F(z + dz) - F(z)
Probability density function (pdf): f(z) = dF/dz
f(z) ≥ 0 but NOT bounded by 1

Continuous Random Variables
f is the derivative of F; F is the integral of f:
Pr(z ≤ Z ≤ z + dz) = ∫_z^{z+dz} f(u) du
ANY function that meets these rules (non-negative, integrates to 1) is a valid pdf.
Monday we will go over a number of standard distributions and discuss their interpretation/application.

Example: exponential distribution
f(z) = λ exp(-λz)
F(z) = 1 - exp(-λz)
where z ≥ 0.
What are the values of F(z) and f(z) at z = 0? As z → ∞?
What do F(z) and f(z) look like?

Exponential: f(z) = λ exp(-λz)
At z = 0: f(z) = λ, F(z) = 0
As z → ∞: f(z) → 0, F(z) → 1
[Plots of the pdf f(z) and cdf F(z)]
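A sketch that evaluates the exponential pdf and cdf at the limits discussed (λ = 0.5 is an arbitrary illustrative choice):

```python
import math

lam = 0.5  # rate parameter lambda; arbitrary choice for illustration

def f(z):
    """pdf: f(z) = lambda * exp(-lambda * z), for z >= 0"""
    return lam * math.exp(-lam * z)

def F(z):
    """cdf: F(z) = 1 - exp(-lambda * z), for z >= 0"""
    return 1 - math.exp(-lam * z)

print(f(0), F(0))      # at z = 0: f = lambda, F = 0
print(f(100), F(100))  # as z grows: f -> 0, F -> 1
# f is the derivative of F (finite-difference check)
print((F(1 + 1e-6) - F(1)) / 1e-6, f(1))
```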

Moments of probability distributions
E[x^n] = ∫ x^n f(x) dx
First moment = expected value = mean
Example: exponential
E[x] = ∫_0^∞ x λ exp(-λx) dx = [-x exp(-λx)]_0^∞ + ∫_0^∞ exp(-λx) dx = 0 + 1/λ
E[x] = 1/λ
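The same expectation can be checked by brute-force numerical integration (a crude Riemann-sum sketch; the step size and truncation point are assumptions):

```python
import math

lam = 0.5  # rate parameter, as above

# E[x] = integral of x * f(x) dx over [0, inf),
# truncated where the exponential tail is negligible
dx, upper = 1e-4, 60.0
mean = sum(x * lam * math.exp(-lam * x) * dx
           for x in (i * dx for i in range(int(upper / dx))))
print(mean, 1 / lam)  # both approximately 2.0
```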

Properties of means
E[c] = c
E[x + c] = E[x] + c
E[cx] = c E[x]
E[x + y] = E[x] + E[y] (even if x is not independent of y)
E[E[x|y]] = E[x] (iterated expectation)
E[g(x)] ≠ g(E[x]) (Jensen's inequality)
E[xy] = E[x] E[y] only if x and y are independent
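A small simulation illustrating two of these properties: linearity of expectation holds even for dependent variables, while E[xy] = E[x] E[y] fails when they are correlated (a sketch; the dependence structure is made up for illustration):

```python
import random

random.seed(0)
n = 100_000
x = [random.gauss(0, 1) for _ in range(n)]
y = [2 * xi + random.gauss(0, 1) for xi in x]  # y depends on x

def mean(v):
    return sum(v) / len(v)

# Linearity: E[x + y] = E[x] + E[y], even though x and y are dependent
print(mean([xi + yi for xi, yi in zip(x, y)]), mean(x) + mean(y))

# E[xy] != E[x] E[y] here, because x and y are correlated
print(mean([xi * yi for xi, yi in zip(x, y)]), mean(x) * mean(y))
```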

Central Moments
E[(x - E[x])^n] = ∫ (x - E[x])^n f(x) dx
Second central moment = variance = σ²

Properties of variance
Var(aX) = a² Var(X)
Var(X + b) = Var(X)
Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X,Y)
Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X,Y)
Var(Σ_i X_i) = Σ_i Var(X_i) + 2 Σ_{i<j} Cov(X_i, X_j)
Var(X) = Var(E[X|Y]) + E[Var(X|Y)]
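And a matching simulation for the covariance term in Var(X + Y) (same illustrative caveats as above):

```python
import random

random.seed(1)
n = 100_000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.5 * xi + random.gauss(0, 1) for xi in x]  # correlated with x

def mean(v):
    return sum(v) / len(v)

def var(v):  # second central moment
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / len(u)

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
lhs = var([xi + yi for xi, yi in zip(x, y)])
rhs = var(x) + var(y) + 2 * cov(x, y)
print(lhs, rhs)  # approximately equal
```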

Distributions and Probability
All the same properties apply to random variables:
Pr(A,B): joint distribution
Pr(A|B) = Pr(A,B) / Pr(B): conditional distribution
Pr(A) = Σ_i Pr(A|B_i) Pr(B_i): marginal distribution
Pr(A|B) = Pr(B|A) Pr(A) / Pr(B): Bayes' Rule

Looking forward... Use probability distributions to:
- Quantify the match between models and data
- Represent uncertainty about model parameters
- Partition sources of process variability