Lecture 2: Probability and Distributions


Ani Manichaikul (amanicha@jhsph.edu), 17 April 2007

Probability: Why do we care? Probability helps us by allowing us to translate scientific questions into mathematical notation, and by providing a framework for answering scientific questions. Later, we will see how some common statistical methods in the scientific literature are actually probability concepts in disguise.

What is Probability? Probability is a measure of uncertainty about the occurrence of events. There are two definitions of probability: the classical definition and the relative frequency definition.

Classical Definition: If an event can occur in N equally likely and mutually exclusive ways, and if m of these ways possess the characteristic E, then the probability of E is P(E) = m/N.

Example: Coin toss. Flip one coin. Tails and heads are equally likely, so there are N = 2 possible events. Let H = Heads and T = Tails. We are interested in the probability of tails: P(Tails) = P(T) = 1/2.

Relative Frequency Definition: If an experiment is repeated n times, and characteristic E occurs m of those times, then the relative frequency of E is P(E) = m/n, and it is approximately equal to the probability of E.

Example: Multiple coin tosses I. Flip 100 coins.

Outcome      Frequency
T = Tails    53
H = Heads    47
Total        100

P(Tails) = P(T) ≈ 53/100 = 0.53 ≈ 0.50

Example: Multiple coin tosses II. What happens if we flip 10,000 coins?

Outcome      Frequency
T = Tails    5063
H = Heads    4937
Total        10000

P(Tails) = P(T) ≈ 5063/10000 ≈ 0.51 ≈ 0.50

Relative frequency intuition: The probability of T is the limit of the relative frequency of T as the sample size n goes to infinity — the long-run relative frequency.
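The long-run relative frequency idea can be illustrated with a small simulation — a minimal sketch using only the Python standard library (the function name and seed are ours, not the lecture's):

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

def relative_frequency_of_tails(n_flips):
    """Flip a fair coin n_flips times; return the relative frequency of tails."""
    tails = sum(random.random() < 0.5 for _ in range(n_flips))
    return tails / n_flips

# The relative frequency settles toward P(T) = 0.5 as n grows
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency_of_tails(n))
```

Small samples (n = 100) wander noticeably around 0.5; by n = 1,000,000 the frequency is very close to the true probability.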

Outcome characteristics: statistical independence; mutually exclusive events.

Statistical Independence: Two events are statistically independent if the joint probability of both events occurring is the product of the probabilities of each event occurring: P(A and B) = P(A) × P(B).

Example: Let A = first-born child is female, and B = second child is female. P(A and B) = probability that the first and second children are both female. Assuming independence: P(A and B) = P(A) × P(B) = (1/2) × (1/2) = 1/4.
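The independence calculation above is just a product of probabilities; a one-line sketch with exact fractions (variable names are ours):

```python
from fractions import Fraction

p_A = Fraction(1, 2)  # A: first-born child is female
p_B = Fraction(1, 2)  # B: second child is female

# Assuming independence, the joint probability is the product
p_A_and_B = p_A * p_B
print(p_A_and_B)  # 1/4
```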

Statistical independence: comment. In a study where we are selecting patients at random from a population of interest, we assume that the outcomes we observe are independent. In what situations would this assumption be violated?

Mutually exclusive: Two events are mutually exclusive if the joint probability of both events occurring is 0: P(A and B) = 0. Ex: A = first child is female, B = first child is male.

Probability rules:
1. The probability of any event is non-negative and no greater than 1: 0 ≤ P(E) ≤ 1.
2. Given n mutually exclusive events E1, E2, ..., En covering the sample space, the probabilities of the events sum to 1: P(E1) + P(E2) + ... + P(En) = 1.
3. If Ei and Ej are mutually exclusive events, then the probability that either Ei or Ej occurs is P(Ei ∪ Ej) = P(Ei) + P(Ej).

Set notation: A set is a collection of distinct objects; an element of a set is an object in the set. The union of two sets A and B is the set containing all elements in A, B, or both; notation: A ∪ B. The intersection of two sets A and B is the set containing all elements found in both A and B; notation: A ∩ B.

The addition rule: If two events A and B are not mutually exclusive, then the probability that event A or event B occurs is P(A ∪ B) = P(A) + P(B) − P(A ∩ B), where P(A ∩ B) is the probability that both events occur.

Conditional probability: The conditional probability of an event A given an event B is P(A | B) = P(A ∩ B) / P(B), where P(B) ≠ 0.

The multiplication rule: In general, P(A ∩ B) = P(B) × P(A | B). When events A and B are independent, P(A | B) = P(A) and P(A ∩ B) = P(A) × P(B).

Bayes' rule: Useful for computing P(B | A) if P(A | B) and P(A | Bᶜ) are known. Ex: Screening — we know P(test positive | true positive), and we want P(true positive | test positive). Ex: Bayesian statistics uses assumptions about P(data | state of the world) to derive statements about P(state of the world | data). The rule:

P(B | A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | Bᶜ) P(Bᶜ)]

where Bᶜ denotes the complement of B, or "not B".

Example: Sex and Age I

             Young (B1)   Older (B2)   Total
Male (A1)    30           20           50
Female (A2)  40           10           50
Total        70           30           100

Example: Sex and Age II
A1 = {all males}, A2 = {all females}
B1 = {all young}, B2 = {all older}
A1 ∪ A2 = {all people} = B1 ∪ B2
A1 ∩ A2 = {no people} = ∅ = B1 ∩ B2
A1 ∪ B1 = {all males and young females}
A1 ∪ B2 = {all males and older females}
A2 ∩ B2 = {older females}

Example: Sex and Age III
P(A1) = P(male) = 50/100 = 0.5
P(A2) = P(female) = 50/100 = 0.5
P(B1) = P(young) = 70/100 = 0.7
P(B2) = P(older) = 30/100 = 0.3

Example: Sex and Age IV
P(A2 ∩ B2) = P(older and female) = 10/100 = 0.1
P(A1 ∪ B1) = P(young or male) = P(A1) + P(B1) − P(A1 ∩ B1) = 50/100 + 70/100 − 30/100 = 90/100 = 0.9

Example: Sex and Age V
P(B2 | A2) = P(older | female) = P(B2 ∩ A2) / P(A2) = (10/100) / (50/100) = 10/50 = 0.2
P(B2 | A1) = P(older | male) = P(B2 ∩ A1) / P(A1) = (20/100) / (50/100) = 20/50 = 0.4
P(B2) = P(older) = 30/100 = 0.3
Since P(B2 | A2) ≠ P(B2 | A1) ≠ P(B2), sex and age are not independent in this group.

Example: Sex and Age VI (exercise)
P(A1 ∩ A2) = ?
P(B1 ∪ B2) = ?
P(A2 | B2) = ?

Example: Blood Groups I

Blood group   Male   Female   Total
O             113    170      283
A             103    155      258
B             25     37       62
AB            10     15       25
Total         251    377      628

Example: Blood Groups II
P(male) = 1 − P(female) = 251/628 ≈ 0.4
P(O) = 283/628 ≈ 0.45
P(A) = 258/628 ≈ 0.41
P(B) = 62/628 ≈ 0.10
P(AB) = 25/628 ≈ 0.04

Example: Blood Groups III. Question: Are sex and blood group independent?
P(O | male) = 113/251 ≈ 0.45
P(O | female) = 170/377 ≈ 0.45
These are the same as P(O) = 283/628 ≈ 0.45, and we can show the same equalities for all blood types. Yes, sex and blood group appear to be independent of each other in this sample.

Example: Disease in the population I. For patients with Disease X, suppose we knew the age proportions per sex, as well as the sex distribution. Question: Could we compute the sex proportions in each age group (young/older)? Answer: Use Bayes' rule.

Example: Disease in the population II
P(A1) = P(A2) = 0.5, P(B2 | A2) = 0.2, P(B2 | A1) = 0.4
P(A2 | B2) = P(B2 | A2) P(A2) / [P(B2 | A2) P(A2) + P(B2 | A1) P(A1)]
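Plugging the slide's numbers into Bayes' rule gives P(A2 | B2) = (0.2 × 0.5) / (0.2 × 0.5 + 0.4 × 0.5) = 0.1/0.3 = 1/3, matching the sex-and-age table (10 of the 30 older people are female). A small sketch of the calculation (variable names are ours):

```python
# Numbers from the sex-and-age example:
# P(A1) = P(A2) = 0.5, P(B2 | A2) = 0.2, P(B2 | A1) = 0.4
p_A1, p_A2 = 0.5, 0.5
p_B2_given_A2, p_B2_given_A1 = 0.2, 0.4

# Bayes' rule: P(A2 | B2)
p_A2_given_B2 = (p_B2_given_A2 * p_A2) / (
    p_B2_given_A2 * p_A2 + p_B2_given_A1 * p_A1
)
print(p_A2_given_B2)  # ≈ 1/3: a third of the older group is female
```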

Probability Distributions: Often, we assume a true underlying distribution. Ex: P(tails) = 1/2, P(heads) = 1/2. This distribution is characterized by a mathematical formula and a set of possible outcomes. There are two types of distributions: discrete and continuous.

Most Commonly Used Discrete Distributions: Binomial — two possible outcomes; underlies much of the statistical applications to epidemiology; basic model for logistic regression. Poisson — uses counts of events or rates; basis for log-linear models.

Most Commonly Used Continuous Distributions: Normal — bell-shaped curve; many characteristics are normally distributed or approximately normally distributed; basic model for linear regression. Exponential — useful in describing growth.

Counting techniques: Factorials count the number of ways to arrange things. Permutations count the number of possible ordered arrangements of subsets. Combinations count the number of possible unordered arrangements of subsets.

Factorials: Notation: n! ("n factorial"), the number of possible arrangements of n objects: n! = n(n−1)(n−2)(n−3)···(3)(2)(1).

Permutations: The number of ordered arrangements of n objects taken r at a time: nPr = n!/(n−r)! = n(n−1)(n−2)···(n−r+1).

Combinations: The number of arrangements of n objects taken r at a time without regard to order: C(n, r) = "n choose r" = n!/(r!(n−r)!). Example: C(4, 2) = 4!/(2!(4−2)!) = (4 × 3)/2! = 12/2 = 6. Note: the number of combinations is less than or equal to the number of permutations.
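Python's standard library (3.8+) exposes these counting functions directly, which lets us check the 4-choose-2 example above:

```python
from math import comb, factorial, perm

# 4 objects taken 2 at a time
assert perm(4, 2) == factorial(4) // factorial(4 - 2)               # ordered arrangements
assert comb(4, 2) == factorial(4) // (factorial(2) * factorial(2))  # unordered arrangements

print(perm(4, 2), comb(4, 2))  # 12 6
```

As the slide notes, the number of combinations (6) is no greater than the number of permutations (12): each unordered pair corresponds to 2! orderings.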

The Binomial Distribution: You've seen it before — 2 × 2 tables and applications; proportions (CIs and tests); sensitivity and specificity; odds ratio and relative risk; logistic regression.

Binomial Distribution Assumptions: the Bernoulli trial model. The study or experiment consists of n smaller experiments (trials), each of which has only two possible outcomes: dead or alive, success or failure, diseased or not diseased. The outcomes of the trials are independent, and the probabilities of the outcomes remain the same from trial to trial.

Binomial Distribution Function: The probability of obtaining x successes in n Bernoulli trials is P(X = x) = C(n, x) p^x (1 − p)^(n−x), where p = probability of a success, q = 1 − p = probability of failure, X is a random variable, and x is a particular number.

Example: Binomial (n=2) I. What is the probability, in a random sample of size 2, of observing 0, 1, or 2 heads?

# heads   Possible outcomes   Probability
2         HH                  p × p = p²
1         HT, TH              p × q + q × p = 2pq
0         TT                  q × q = q²

Example: Binomial (n=2) II
P(X = x) = C(n, x) p^x (1 − p)^(n−x)
P(X = 0) = C(2, 0) (0.5)⁰ (0.5)² = [2!/(0!(2−0)!)] (1)(0.5)² = 0.25 = q²

Example: Binomial (n=2) III
P(X = 1) = C(2, 1) (0.5)¹ (0.5)¹ = [2!/(1!(2−1)!)] (0.5)(0.5) = 2(0.5)(0.5) = 0.5 = 2pq
P(X = 2) = C(2, 2) (0.5)² (0.5)⁰ = [2!/(2!(2−2)!)] (0.5)² = 0.25 = p²
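The binomial probability function above is short enough to code directly; this sketch (the function name is ours) reproduces the n = 2 coin example:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Two fair coin flips: probabilities of 0, 1, or 2 heads
for x in range(3):
    print(x, binomial_pmf(x, n=2, p=0.5))  # 0.25, 0.5, 0.25
```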

Example: Binomial (n=3) I

# successes   Samples            P(X = x)
3             {+++}              C(3, 3) p³q⁰ = p³
2             {++−, +−+, −++}    C(3, 2) p²q = 3p²q
1             {+−−, −+−, −−+}    C(3, 1) pq² = 3pq²
0             {−−−}              C(3, 0) p⁰q³ = q³

Example: Binomial (n=3) II. Since X takes discrete values only:
P(X ≤ 1) = P(X = 0) + P(X = 1)
P(X < 1) = P(X = 0)
P(X > 2) = P(X = 3)
P(1 ≤ X ≤ 2) = P(X = 1) + P(X = 2)
P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) = 1 − P(X = 0)

Example: Binomial (n=3) III. The probability that a person suffering from a head cold will obtain relief with a particular drug is 0.9. Three randomly selected sufferers from the cold are given the drug. Here p = 0.9, q = 1 − p = 0.1, and n = 3.

Example: Binomial (n=3) IV. There are 2³ = 8 possible outcomes:

Trial 1   Trial 2   Trial 3   Probability
S         S         S         ppp
S         S         F         ppq
S         F         S         pqp
F         S         S         qpp
F         F         S         qqp
F         S         F         qpq
S         F         F         pqq
F         F         F         qqq

Example: Binomial (n=3) V
Probability that exactly zero (none) obtain relief: P(X = 0) = C(3, 0) p⁰q³ = q³ = (0.1)³ = 0.001
Probability that exactly one obtains relief: P(X = 1) = C(3, 1) p¹q² = [3!/(1!2!)] pq² = 3pq² = 3(0.9)(0.1)² = 0.027
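The head-cold calculation can be checked with the same pmf sketch as before (function name ours):

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p, n = 0.9, 3  # probability of relief; number of cold sufferers
print(binomial_pmf(0, n, p))  # ≈ 0.001, nobody obtains relief
print(binomial_pmf(1, n, p))  # ≈ 0.027, exactly one obtains relief
```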

Mean and Variance
Mean of a random variable (r.v.) X, also called the expected value or expectation:
µ = E(X) = Σᵢ xᵢ P(X = xᵢ) for a discrete r.v., or ∫ x f(x) dx for a continuous r.v.
Variance of a random variable X: σ² = Var(X) = E(X − µ)² = E(X²) − µ²
The standard deviation: σ = √σ² = √Var(X)

Example: Bernoulli Distribution. Let X = 1 with probability p, and 0 otherwise.
Calculation of the mean: E(X) = µ = Σ xᵢ P(X = xᵢ) = 1 × p + 0 × (1 − p) = p
Calculation of the variance: E(X²) = 1² × p + 0² × (1 − p) = p, so Var(X) = E(X²) − µ² = p − p² = p(1 − p)
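The discrete mean and variance formulas above translate directly into code; this sketch (function name ours) checks the Bernoulli results for p = 0.3:

```python
def mean_and_variance(outcomes):
    """Mean and variance of a discrete r.v. given (value, probability) pairs,
    using mu = sum(x * P(X = x)) and Var(X) = E(X^2) - mu^2."""
    mu = sum(x * p for x, p in outcomes)
    var = sum(x**2 * p for x, p in outcomes) - mu**2
    return mu, var

# Bernoulli(p): X = 1 with probability p, else 0
p = 0.3
mu, var = mean_and_variance([(1, p), (0, 1 - p)])
print(mu, var)  # ≈ p = 0.3 and p(1 - p) = 0.21
```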

Properties of Expectation
1. E(c) = c, where c is a constant
2. E(cX) = c E(X)
3. E(X1 + X2) = E(X1) + E(X2)

Properties of Variance
1. Var(c) = 0, where c is a constant
2. Var(cX) = c² Var(X)
3. Var(X1 + X2) = Var(X1) + Var(X2), if X1 and X2 are independent

Binomial Mean and Variance. If S is Binomial(n, p), then S = X1 + X2 + ... + Xn, where the Xi are independent Bernoulli(p) random variables.
E(S) = Σ E(Xi) = Σ p = np
Var(S) = Σ Var(Xi) = Σ p(1 − p) = np(1 − p)
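We can verify E(S) = np and Var(S) = np(1 − p) numerically by summing over the binomial pmf — a quick check, not part of the lecture (n and p below are our choices):

```python
from math import comb

n, p = 10, 0.4
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# mu = sum(x * P(X = x)); Var = E(X^2) - mu^2
mean = sum(x * q for x, q in enumerate(pmf))
var = sum(x**2 * q for x, q in enumerate(pmf)) - mean**2

print(mean, n * p)           # both ≈ 4.0
print(var, n * p * (1 - p))  # both ≈ 2.4
```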

Poisson Distribution: Describes occurrences or objects which are distributed randomly in space or time. Often used to describe the distribution of the number of occurrences of a rare event. The underlying assumptions are similar to those for the binomial distribution. Useful when there are counts with no denominator.

Distribution   Parameters needed
Binomial       n, p
Poisson        λ = np = the expected number of events per unit time

Poisson Distribution Examples: the number of Prussian officers killed by horse kicks between 1875 and 1894; the spatial distribution of stars, weeds, bacteria, or flying-bomb strikes; emergency room or hospital admissions; typographical errors; deaths due to a rare disease.

Poisson Assumptions: The occurrences of a random event in an interval of time are independent. In theory, an infinite number of occurrences of the event are possible (though perhaps rare) within the interval. In any extremely small portion of the interval, the probability of more than one occurrence of the event is approximately zero.

Poisson Probability: The probability of x occurrences of an event in an interval is P(X = x) = e^(−λ) λ^x / x!, for x = 0, 1, 2, ..., where λ = the expected number of occurrences in the interval and e ≈ 2.718 is a constant. For the Poisson distribution, mean = variance = λ.
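The Poisson pmf is a one-liner; this sketch (function name ours) evaluates it for λ = 3 and x = 5:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

print(poisson_pmf(5, 3))  # ≈ 0.101
```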

Example: Traffic accidents I. Suppose the goal has been set of bringing the expected number of traffic accidents per day in Baltimore down to 3, and there are 5 fatal accidents today. Has the goal been attained? The number of accidents follows a Poisson distribution because: the population that drives in Baltimore is large; the number of accidents is relatively small; people have similar risks of having an accident (?); the number of people driving each day is fairly stable; and the probability of two accidents occurring at exactly the same time is approximately zero.

Example: Traffic accidents II. We are aiming for a rate of λ = 3 fatal accidents per day, or lower. The observed number is 5. P(X = 5; λ = 3) = e⁻³ 3⁵ / 5! = 0.101. Has the goal been attained?
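The slide leaves the question open. One common way to address it — our addition, not the lecture's — is to compute the tail probability of seeing 5 or more accidents on a day when the true rate really is 3:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

# P(X >= 5) when lam = 3: complement of P(X <= 4)
p_5_or_more = 1 - sum(poisson_pmf(x, 3) for x in range(5))
print(p_5_or_more)  # ≈ 0.185
```

Seeing 5 or more accidents happens about 18% of the time even when the goal is met, so a single day with 5 accidents is weak evidence against the goal having been attained.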

Example: Suicide in the City. If the rate of a given rare condition is expressed as µ per time period, the expected number of events is µt, where t is the time period. Suppose the weekly rate of suicide in a large city is 2. What is the probability of one suicide in a given week? What is the probability of 2 suicides in 2 weeks?
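The two quantities the slide asks for follow from the Poisson model with λ = µt (a sketch; function name ours):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

mu = 2  # weekly rate
print(poisson_pmf(1, mu * 1))  # one suicide in one week (lam = 2):  ≈ 0.271
print(poisson_pmf(2, mu * 2))  # two suicides in two weeks (lam = 4): ≈ 0.147
```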

Poisson and Binomial: The Poisson distribution can be used to approximate a binomial distribution when n is large and p is very small, or when np = λ is fixed and n becomes infinitely large.

Example: Cancer in a large population. Yearly cases of esophageal cancer in a large city; 30 cases observed in 1990. P(X = 30) = e^(−λ) λ³⁰ / 30!, where λ = the yearly average number of cases of esophageal cancer.

Example: Down's syndrome I. Suppose the incidence of Down's syndrome in babies of 40-year-old mothers is 1/100. Out of 25 babies born to 40-year-old women, what is the frequency of babies with Down's syndrome? We can approach this problem using a Binomial(25, 1/100) model or using a Poisson(λ = 0.25) model.

Example: Down's syndrome II

Babies with          P(X = x)
Down's syndrome      Poisson   Binomial
0                    0.779     0.778
1                    0.195     0.196
2                    0.024     0.024
>2                   0.002     0.002

Note: the approximation becomes even better for larger values of n.
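The table can be reproduced by evaluating both pmfs side by side — a sketch combining the two functions defined earlier (names ours):

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 25, 1 / 100
lam = n * p  # 0.25, matching lam = np in the approximation
for x in range(3):
    print(x, round(poisson_pmf(x, lam), 3), round(binomial_pmf(x, n, p), 3))
```

The printed rows match the table above to three decimal places.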