Chapter 7: Theoretical Probability Distributions

BSTT523: Pagano & Gauvreau, Chapter 7

Variable: a measured or categorized characteristic.

Random Variable (R.V.) X: assumes its values (x) by chance.
  Discrete R.V.: can assume a finite or countably infinite number of values.
  Continuous R.V.: can assume any value within an interval.

Outline:
  1. Discrete Random Variables
  2. Some Discrete Distributions
       Bernoulli distribution
       Binomial distribution
       Poisson distribution
  3. Continuous Random Variables
  4. The Normal/Gaussian Distribution

1. Discrete Random Variables

Definition: The probability distribution of a discrete r.v. X is a table, graph, formula, or other device that specifies all possible values of X and their respective probabilities.

Example 7.1 (table form): Birth order of children in the U.S.

  x: Birth Order    P(X = x)
  1                 0.416
  2                 0.330
  3                 0.158
  4                 0.058
  5                 0.021
  6                 0.009
  7                 0.004
  8+                0.004
  Total             1.000

Discrete Probability Density Function (discrete PDF): $f(x) = P(X = x)$

Properties of the discrete PDF:
  i.   $0 \le f(x_i) \le 1$ for all $x_i$   (non-negative)
  ii.  $\sum_{\text{all } x_i} f(x_i) = 1$   (exhaustive)
  iii. $P(X = x_i \cup X = x_j) = f(x_i) + f(x_j)$ for $x_i \ne x_j$   (additive)

Cumulative Distribution Function (CDF):
  $F(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i) = \sum_{x_i \le x} f(x_i)$

Note (with the possible values ordered $x_1 < x_2 < \dots$):
  i.   PDF: $f(x_i) = P(X = x_i) = F(x_i) - F(x_{i-1})$
  ii.  $P(X < x_i) = F(x_{i-1})$
  iii. $P(a < X \le b) = F(b) - F(a)$

Example 7.1: Birth Order

  x: Birth Order    PDF $f(x) = P(X = x)$    CDF $F(x) = P(X \le x)$
  1                 0.416                    0.416
  2                 0.330                    0.746
  3                 0.158                    0.904
  4                 0.058                    0.962
  5                 0.021                    0.983
  6                 0.009                    0.992
  7                 0.004                    0.996
  8+                0.004                    1.000

Q1. Probability that a child picked at random was the mother's 1st or 2nd child?
Q2. Probability that a child picked at random was of birth order lower than 4?
Q3. Probability that a child picked at random was of birth order 5 or more?
Q4. Probability that a child picked at random was of birth order between 3 and 5?
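The four questions can be answered from the CDF column alone. The following is a minimal Python sketch, not part of the original slides; the dictionary `pmf` and the helper `cdf` are illustrative names, and Q4 is read as inclusive (birth order 3, 4, or 5), which is an assumption because the slide does not say whether the endpoints count.

```python
# Birth-order probabilities copied from the Example 7.1 table.
# The open-ended category "8+" is stored under the key 8 for convenience.
pmf = {1: 0.416, 2: 0.330, 3: 0.158, 4: 0.058,
       5: 0.021, 6: 0.009, 7: 0.004, 8: 0.004}


def cdf(x):
    """F(x) = P(X <= x): add f(x_i) over every x_i <= x."""
    return sum(prob for value, prob in pmf.items() if value <= x)


q1 = cdf(2)            # P(X <= 2): first or second child
q2 = cdf(3)            # P(X < 4) = F(3)
q3 = 1 - cdf(4)        # P(X >= 5) = 1 - F(4)
q4 = cdf(5) - cdf(2)   # P(3 <= X <= 5) = F(5) - F(2), inclusive reading (assumption)

for name, value in [("Q1", q1), ("Q2", q2), ("Q3", q3), ("Q4", q4)]:
    print(name, round(value, 3))
```

Each answer uses one of the CDF identities from the note above: a "less than" probability is the CDF at the previous value, and an interval probability is a difference of two CDF values.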

2. Some Discrete Distributions

Bernoulli Distribution

Bernoulli variable: a binary variable
  $X = \begin{cases} 1, & \text{success} \\ 0, & \text{failure} \end{cases}$

Bernoulli trial: one performance of an experiment with a 0/1 outcome.

Denote
  $p = P(X = 1)$
  $q = P(X = 0) = 1 - p$

The PDF of the Bernoulli distribution is
  $f(x) = \begin{cases} p, & x = 1 \\ q, & x = 0 \end{cases} = p^x q^{1-x} = p^x (1-p)^{1-x}, \quad x = 0, 1$

The Bernoulli distribution has one parameter, p.

If X follows a Bernoulli distribution, then
  Mean:     $\mu = E(X) = p$
  Variance: $\sigma^2 = \mathrm{Var}(X) = pq = p(1-p)$

Examples of Bernoulli variables:
  Ex. 1: flip a coin.
    $X = \begin{cases} 1, & \text{Heads} \\ 0, & \text{Tails} \end{cases}$
  Ex. 2: roll a die, interested in 3's.
    $X = \begin{cases} 1, & \text{die falls on 3} \\ 0, & \text{otherwise} \end{cases}$
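The Bernoulli mean and variance follow in two lines from the PDF. The short derivation below is a sketch that assumes the usual definitions $E(X) = \sum_x x f(x)$ and $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$, which the slides use but do not write out at this point.

```latex
\begin{aligned}
E(X)   &= \sum_{x=0}^{1} x\, f(x) = 0 \cdot q + 1 \cdot p = p, \\
E(X^2) &= 0^2 \cdot q + 1^2 \cdot p = p, \\
\mathrm{Var}(X) &= E(X^2) - [E(X)]^2 = p - p^2 = p(1 - p) = pq.
\end{aligned}
```

For the die example, $p = 1/6$, so the mean is $1/6$ and the variance is $(1/6)(5/6) = 5/36$.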

Binomial Distribution

Perform n independent Bernoulli trials.
  X = number of successes (1's)
  p = probability of success in each trial, $q = 1 - p$
  $X \sim \mathrm{BIN}(n, p)$

Q: What is the PDF $f(x)$, $x = 0, 1, \dots, n$, of $X \sim \mathrm{BIN}(n, p)$? That is, what is the probability of x successes in n Bernoulli trials?

Q1. 5 Bernoulli trials, $X \sim \mathrm{BIN}(5, p)$. P(result is 10010) = ?
  Solution: $p\,q\,q\,p\,q = p^2 q^3$

Q2. Other results with 2 successes out of 5?

  Number    Sequence
  1         11000
  2         10100
  3         10010
  4         10001
  5         01100
  6         01010
  7         01001
  8         00110
  9         00101
  10        00011

There are 10 ways to get 2 successes out of 5, and the probability of each sequence is $p^2 q^3$, so
  $P(\text{Sequence 1 or 2 or} \dots \text{or 10}) = 10\, p^2 q^3$

Definition: A combination of n subjects taken x at a time is the number of unordered subsets of size x ("n choose x"):
  ${}_nC_x = \dfrac{n!}{x!\,(n-x)!}$
where $x! = x(x-1)(x-2)\cdots(2)(1)$ and we define $0! = 1$.

Example: "5 choose 2": how many subsets of 2 out of 5?
  ${}_5C_2 = \dfrac{5!}{2!\,(5-2)!} = \dfrac{5 \cdot 4}{2 \cdot 1} = 10$

Back to the binomial distribution question: $X \sim \mathrm{BIN}(5, p)$; $f(2) = ?$ $f(5) = ?$
  Ans: $f(2) = {}_5C_2\, p^2 q^3 = 10\, p^2 q^3$
       $f(5) = {}_5C_5\, p^5 q^0 = \dfrac{5!}{5!\,0!}\, p^5 q^0 = 1 \cdot p^5 \cdot 1 = p^5$

Binomial PDF $f(x)$: $X \sim \mathrm{BIN}(n, p)$

P(x successes in n Bernoulli trials):
  $f(x) = {}_nC_x\, p^x q^{n-x}$ for $x = 0, 1, \dots, n$, and $f(x) = 0$ otherwise.

  Number of successes x    Probability f(x)
  0                        ${}_nC_0\, q^n$
  1                        ${}_nC_1\, p\, q^{n-1}$
  ...                      ...
  x                        ${}_nC_x\, p^x q^{n-x}$
  ...                      ...
  n-1                      ${}_nC_{n-1}\, p^{n-1} q$
  n                        ${}_nC_n\, p^n$
  Total                    1

Important binomial distribution features:
  Mean:     $\mu = E(X) = np$
  Variance: $\sigma^2 = \mathrm{Var}(X) = npq$
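The binomial PDF and its two summary features can be checked numerically. This is a minimal Python sketch rather than anything from the slides: `binom_pmf` is an illustrative helper name, and the values n = 5, p = 0.3 are arbitrary choices made only for the demonstration.

```python
from math import comb


def binom_pmf(x, n, p):
    """f(x) = nCx * p^x * q^(n-x) for x = 0..n, and 0 otherwise."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.3  # illustrative values, not taken from the text
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)                                           # should equal 1
mean = sum(x * f for x, f in enumerate(probs))               # should equal np
var = sum((x - mean) ** 2 * f for x, f in enumerate(probs))  # should equal npq

print(round(total, 10))
print(round(mean, 10), n * p)
print(round(var, 10), n * p * (1 - p))
```

Summing over the whole table reproduces the "Total 1" row, and the mean and variance computed directly from the definition agree with np and npq up to floating-point rounding.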

Example 7.2 Smoking in the U.S.: 29% are smokers, i.e., $p = .29$. Select a random sample of size 10.

Q1. What is P(4 smokers in the sample)?
  X = number of smokers out of n = 10; $X \sim \mathrm{BIN}(10, .29)$

  Solution 1. $f(4) = {}_{10}C_4\, (0.29)^4 (0.71)^6 = \dfrac{10!}{4!\,6!}\, (.00707)(.1281) = .1903$

  Solution 2. Table A.1 (p. A1): Binomial PDF for p = 0.05 to 0.5, n = 2 to 20.
    With n = 10 and $p \approx .30$: $f(4) \approx .2001$

  Solution 3. SAS:
    PROBBNML(p, n, m)           CDF
    PDF('BINOMIAL', x, p, n)    PDF
    CDF('BINOMIAL', x, p, n)    CDF

Q2. P(6 or more smokers in the sample) = ?
  $P(X \ge 6) = 1 - F(5) = 1 - .9596 = .0404$
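The figures in Example 7.2 can be reproduced without a table or SAS. The sketch below uses plain Python; `binom_pmf` and `binom_cdf` are illustrative helper names, not functions from any package mentioned in the slides.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial PDF f(x) = nCx * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """Binomial CDF F(x) = f(0) + f(1) + ... + f(x)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


n, p = 10, 0.29

f4 = binom_pmf(4, n, p)             # Q1, Solution 1: exact value at p = .29
f4_table = binom_pmf(4, n, 0.30)    # Q1, Solution 2: table lookup rounds p to .30
six_or_more = 1 - binom_cdf(5, n, p)  # Q2: P(X >= 6) = 1 - F(5)

print(round(f4, 4), round(f4_table, 4), round(six_or_more, 4))
```

The gap between the first two numbers shows how much is lost by rounding p from .29 to .30 to fit the printed table.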

Q3. Among the 10 individuals chosen, what is the expected number of smokers?
  $E(X) = np = 10 \times .29 = 2.9$

Variance and SD:
  $\mathrm{Var}(X) = npq = 10\,(.29)(.71) = 2.059$
  $SD = \sqrt{npq} = \sqrt{2.059} = 1.43$

Note: Using Table A.1, what if p > 0.5?
  $f(x; n, p) = {}_nC_x\, p^x (1-p)^{n-x}$
  $f(n-x; n, 1-p) = {}_nC_{n-x}\, (1-p)^{n-x} p^x$
  ${}_nC_x = \dfrac{n!}{x!\,(n-x)!} = \dfrac{n!}{(n-x)!\,x!} = {}_nC_{n-x}$
Therefore
  $f(x; n, p) = f(n-x; n, 1-p)$
i.e., if p > 0.5, treat the complementary outcome as "success" and count $X^C = n - X$ instead:
  $P(X \le x)$ with $X \sim \mathrm{BIN}(n, p)$ equals $P(X^C \ge n - x)$ with $X^C \sim \mathrm{BIN}(n, 1-p)$

Example 7.3 "What do you think about the problem of childhood obesity?" Poll in 2003: 55% of residents think it is serious. Randomly select n = 12 residents.

Q1. P(8 people think it is "serious") = ?
  $X \sim \mathrm{BIN}(12, .55)$: $f(8) = .1700$
  Same as P(4 out of 12 do not think "serious"): $X \sim \mathrm{BIN}(12, .45)$, $f(4) = .1700$

Q2. P(5 or fewer think "serious") = ?
  $P(X \le 5 \mid n = 12, p = .55) = P(X \ge 7 \mid n = 12, p = .45)$
  $= 1 - P(X \le 6 \mid n = 12, p = .45) = 1 - .7393 = .2607$

Q3. Among the sample of 12, what is the expected number of people who think childhood obesity is serious?
  $E(X) = np = 12 \times .55 = 6.6$

Q4. What is the variance of the number who think childhood obesity is serious?
  $\mathrm{Var}(X) = npq = 12\,(.55)(.45) = 2.97$
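The complement trick used in Example 7.3 (working with p = .45 because the table stops at p = 0.5) can be verified directly. This is an illustrative Python sketch; `binom_pmf` is a hypothetical helper written for the check, and a computer does not need the trick at all, since it can evaluate p = .55 as easily as p = .45.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial PDF f(x) = nCx * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 12, 0.55

# Q1: 8 "serious" answers is the same event as 4 "not serious" answers.
f8 = binom_pmf(8, n, p)
f4_complement = binom_pmf(4, n, 1 - p)

# Q2: P(X <= 5) under p = .55 equals P(X^C >= 7) under p = .45.
at_most_5 = sum(binom_pmf(k, n, p) for k in range(0, 6))
complement_at_least_7 = sum(binom_pmf(k, n, 1 - p) for k in range(7, n + 1))

print(round(f8, 4), round(f4_complement, 4))
print(round(at_most_5, 4), round(complement_at_least_7, 4))
```

Both pairs agree up to floating-point rounding, which is the identity f(x; n, p) = f(n - x; n, 1 - p) applied term by term.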

Poisson Distribution

X = number of event occurrences in a given interval of time, space, volume, etc., i.e., count data.

Probability that x events will occur:
  $f(x) = \dfrac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \dots$
  $X \sim \mathrm{POI}(\lambda)$

Important Poisson features:
  Mean:     $E(X) = \lambda$
  Variance: $\mathrm{Var}(X) = \lambda$

When λ is small, the distribution is right-skewed; as λ increases ($\lambda \ge 10$), the distribution becomes approximately symmetric.

Example 7.4 Allergic reaction to anesthesia (Laake and Rottingen)

Occurrences of the reaction follow a Poisson distribution, with about 12 incidents expected per year.

Q1. In the next year, what is the probability of seeing 3 incidents?
  Solution: $X \sim \mathrm{POI}(12)$
    $f(3) = \dfrac{e^{-12}\, 12^3}{3!} = .00177$

Q2. What is the probability that at least 3 will have a reaction in the next year?
  Solution 1:
    $P(X \ge 3) = 1 - P(X \le 2) = 1 - F(2) = 1 - \{f(0) + f(1) + f(2)\}$
    $= 1 - \left\{ \dfrac{e^{-12}\, 12^0}{0!} + \dfrac{e^{-12}\, 12^1}{1!} + \dfrac{e^{-12}\, 12^2}{2!} \right\}$
    $= 1 - .000522 = .999478$

  Solution 2: Table A.2 (p. A-6): Poisson PDF
    $P(X \ge 3) = 1 - F(2) = 1 - (.0000 + .0001 + .0004) = .9995$

  Solution 3: SAS:
    POISSON(λ, x)            CDF
    PDF('POISSON', x, λ)     PDF
    CDF('POISSON', x, λ)     CDF
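Example 7.4 can also be worked with a few lines of Python instead of the table or SAS. This is a sketch under the example's own assumption that the yearly count is Poisson with λ = 12; `poisson_pmf` is an illustrative helper name.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Poisson PDF f(x) = e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)


lam = 12  # expected incidents per year, from Example 7.4

f3 = poisson_pmf(3, lam)                                     # Q1: exactly 3 incidents
at_least_3 = 1 - sum(poisson_pmf(k, lam) for k in range(3))  # Q2: 1 - {f(0)+f(1)+f(2)}

print(round(f3, 5))
print(round(at_least_3, 6))
```

The second value carries more digits than the Table A.2 answer of .9995, because the table entries are themselves rounded to four decimal places before being added.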

3. Continuous Random Variables

A continuous X can assume any value within its range. Within any interval, there are theoretically an infinite number of values.

  Subareas of a histogram represent the frequency of occurrence of values within class intervals.
  Total frequency of values between a and b: add all subareas for the intervals from a through b.
  If the width of the class intervals is very small, then connecting the midpoints (creating a frequency polygon) gives a smooth curve.
  If probability is shown on the y-axis and we have a smooth curve, the curve is the probability density function (PDF) $f(x)$.

$P(a < X \le b)$ = total area under $f(x)$ between a and b, or
  $P(a < X \le b) = \int_a^b f(t)\,dt$

Cumulative distribution function (CDF) of X:
  $F(x) = \int_{-\infty}^{x} f(t)\,dt$

Note: The total area under $f(x)$ is 1, i.e.,
  $\int_{-\infty}^{+\infty} f(t)\,dt = 1$
and
  $f(x) = \dfrac{d}{dx} F(x) = F'(x)$

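4. A special continuous distribution: the Normal or Gaussian

Normal PDF:
  $f(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < +\infty$
  $X \sim N(\mu, \sigma^2)$

Characteristics:
  The distribution is symmetric around μ.
  Mean = Median = Mode = μ
  Total area under the curve = 1, i.e.,
    $\int_{-\infty}^{+\infty} \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = 1$
  Area under the curve between $\mu - \sigma$ and $\mu + \sigma$ ≈ .68
  Area under the curve between $\mu - 2\sigma$ and $\mu + 2\sigma$ ≈ .95
  Area under the curve between $\mu - 3\sigma$ and $\mu + 3\sigma$ ≈ .997
  $E(X) = \mu$ (location parameter)
  $\mathrm{Var}(X) = \sigma^2$ (scale parameter)

Standard Normal Distribution: $Z \sim N(0, 1)$ has PDF
  $\varphi(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \quad -\infty < z < +\infty$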

Table A.3: Standard Normal Upper Tail Cumulative Probabilities
  $P(Z \ge z_0) = 1 - \Phi(z_0), \quad z_0 \ge 0$
where $\Phi(z) = \int_{-\infty}^{z} \varphi(t)\,dt$ is the CDF of Z.
For $z_0 < 0$: $\Phi(z_0) = P(Z \le z_0) = P(Z \ge -z_0)$, with $-z_0 > 0$.

Example 7.5 Given a variable that follows the standard normal distribution, i.e., $Z \sim N(0, 1)$, what are $P(Z \ge 1)$ and $P(Z \le -1)$?
  Solution: By Table A.3, $P(Z \ge 1) = 0.159$, and by symmetry $P(Z \le -1) = P(Z \ge 1) = 0.159$.

Example 7.6 Randomly pick a value z from the standard normal distribution. P(z has a value between -2 and +2) = ?
  Solution: Note that for a continuous distribution $P(X = x) = 0$, so
    $P(-2 \le Z \le +2) = P(-2 < Z < +2)$
    $= 1 - P(Z \ge 2) - P(Z \le -2)$
    $= 1 - 2\,P(Z \ge 2) = 1 - 2\,(.023) = 0.954$
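The upper-tail probabilities that Table A.3 tabulates can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The sketch below is illustrative and not part of the slides; `upper_tail` is a hypothetical helper name.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal, Z ~ N(0, 1)


def upper_tail(z0):
    """P(Z >= z0) = 1 - Phi(z0), the quantity listed in Table A.3."""
    return 1 - Z.cdf(z0)


# Example 7.5: upper tail at 1 and, by symmetry, lower tail at -1.
p_upper = upper_tail(1)
p_lower = Z.cdf(-1)

# Example 7.6: P(-2 <= Z <= 2) = 1 - 2 * P(Z >= 2).
p_between = 1 - 2 * upper_tail(2)

print(round(p_upper, 3), round(p_lower, 3), round(p_between, 3))
```

The last value comes out slightly above the slide's 0.954 in the fourth decimal place, because the slide rounds P(Z ≥ 2) to .023 before doubling it.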

How is the N(0, 1) distribution related to $N(\mu, \sigma^2)$?
  If $X \sim N(\mu, \sigma^2)$ and $Z = \dfrac{X - \mu}{\sigma}$, then $Z \sim N(0, 1)$.

Example 7.7 Systolic Blood Pressure (SBP) (p. 181, P&G)
  X = SBP for 18-74 year old males; $X \sim N(\mu, \sigma^2)$ with μ = 129 mm Hg and σ = 19.8 mm Hg.

  Find x, the cutoff for the upper 2.5% of the SBP distribution; i.e., find x such that $P(X > x) = .025$.
  Solution: By Table A.3 we know that $P(Z \ge 1.96) = .025$, so
    $\dfrac{x - \mu}{\sigma} = 1.96$
    $\dfrac{x - 129}{19.8} = 1.96$
    $x = (1.96)(19.8) + 129 = 167.8$

  What proportion of men in this population have SBP > 150 mm Hg?
  Solution:
    $P(X > 150) = P\left( \dfrac{X - \mu}{\sigma} > \dfrac{150 - 129}{19.8} \right) = P(Z > 1.06) = 0.145$, or 14.5%
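Both parts of Example 7.7 can be checked with `statistics.NormalDist`, whose `inv_cdf` method returns the value below which a given proportion of the distribution falls. This is a minimal sketch under the example's stated parameters; the variable names are illustrative.

```python
from statistics import NormalDist

sbp = NormalDist(mu=129, sigma=19.8)  # SBP model from Example 7.7

# Cutoff x with P(X > x) = .025, i.e. P(X <= x) = .975.
cutoff = sbp.inv_cdf(0.975)

# Proportion with SBP > 150, computed directly and via standardization.
p_above_150 = 1 - sbp.cdf(150)
z = (150 - 129) / 19.8
p_above_150_std = 1 - NormalDist().cdf(z)

print(round(cutoff, 1))
print(round(z, 2), round(p_above_150, 3))
```

The direct and standardized calculations agree, which is the statement Z = (X − μ)/σ ~ N(0, 1) in numerical form. The computed proportion is marginally lower than the slide's 0.145 because the table lookup rounds z to 1.06 first.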

Example 7.8 Breath study (Diskin et al.)
  X = ammonia concentration in parts per billion (ppb); μ = 491 ppb, σ = 119 ppb; i.e., $X \sim N(491, 119^2)$
  $P(292 \le X \le 649) = ?$

  Solution 1:
    $P(292 \le X \le 649) = P\left( \dfrac{292 - 491}{119} \le \dfrac{X - \mu}{\sigma} \le \dfrac{649 - 491}{119} \right)$
    $= P(-1.67 \le Z \le 1.33)$
    $= 1 - P(Z \le -1.67) - P(Z \ge 1.33)$
    $= 1 - .047 - .092 = .861$

  Solution 2: SAS:
    ProbNorm(x)                N(0,1) CDF
    PDF('NORMAL', x)           N(0,1) PDF
    PDF('NORMAL', x, μ, σ)     N(μ, σ) PDF
    CDF('NORMAL', x)           N(0,1) CDF
    CDF('NORMAL', x, μ, σ)     N(μ, σ) CDF
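For readers without SAS, the interval probability in Example 7.8 can be reproduced in Python. In this sketch `NormalDist(mu, sigma).cdf(x)` plays roughly the role of the CDF('NORMAL', x, μ, σ) call listed above; that correspondence is an informal analogy, not a claim about how either function is implemented.

```python
from statistics import NormalDist

ammonia = NormalDist(mu=491, sigma=119)  # X ~ N(491, 119^2) from Example 7.8

# Direct route: F(649) - F(292) on the original scale.
p_direct = ammonia.cdf(649) - ammonia.cdf(292)

# Table route: standardize the endpoints, then use the N(0, 1) CDF.
z_lo = (292 - 491) / 119
z_hi = (649 - 491) / 119
std = NormalDist()
p_standardized = std.cdf(z_hi) - std.cdf(z_lo)

print(round(z_lo, 2), round(z_hi, 2))
print(round(p_direct, 3), round(p_standardized, 3))
```

The two routes give the same probability. It differs from the slide's .861 only in the third or fourth decimal place, since the table-based solution rounds the z-values to two decimals and the tail areas to three.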