Chapter 4: Continuous Probability Distributions

Similar documents
Chapter 4: Continuous Random Variable

STAT509: Continuous Random Variable

Applied Statistics and Probability for Engineers. Sixth Edition. Chapter 4 Continuous Random Variables and Probability Distributions.

Chapter 4 Continuous Random Variables and Probability Distributions

Ching-Han Hsu, BMES, National Tsing Hua University c 2015 by Ching-Han Hsu, Ph.D., BMIR Lab. = a + b 2. b a. x a b a = 12

Continuous Random Variables and Continuous Distributions

BMIR Lecture Series on Probability and Statistics Fall, 2015 Uniform Distribution

SDS 321: Introduction to Probability and Statistics

STAT 430/510 Probability Lecture 12: Central Limit Theorem and Exponential Distribution

STAT 509 Section 3.4: Continuous Distributions. Probability distributions are used a bit differently for continuous r.v. s than for discrete r.v. s.

Closed book and notes. 60 minutes. Cover page and four pages of exam. No calculators.

Chapter Learning Objectives. Probability Distributions and Probability Density Functions. Continuous Random Variables

Probability Density Functions

Gamma and Normal Distribuions

STAT Chapter 5 Continuous Distributions

Chapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued

Continuous Random Variables

3 Continuous Random Variables

Exponential, Gamma and Normal Distribuions

Continuous random variables

Common ontinuous random variables

Moments. Raw moment: February 25, 2014 Normalized / Standardized moment:

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions

Chapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued

S n = x + X 1 + X X n.

Brief Review of Probability

IE 230 Probability & Statistics in Engineering I. Closed book and notes. 60 minutes.

Chapter 4 Continuous Random Variables and Probability Distributions

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models

Continuous distributions

Expected Values, Exponential and Gamma Distributions

p. 6-1 Continuous Random Variables p. 6-2

Random variables. DS GA 1002 Probability and Statistics for Data Science.

Chapter 4: CONTINUOUS RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS

Probability and Distributions

Continuous Distributions

Continuous Probability Distributions. Uniform Distribution

Random Variables and Their Distributions

Probability Distributions for Continuous Variables. Probability Distributions for Continuous Variables

2DI90 Probability & Statistics. 2DI90 Chapter 4 of MR

Chapter 2 Continuous Distributions

Basics of Stochastic Modeling: Part II

STAT2201. Analysis of Engineering & Scientific Data. Unit 3

Page 312, Exercise 50

HW7 Solutions. f(x) = 0 otherwise. 0 otherwise. The density function looks like this: = 20 if x [10, 90) if x [90, 100]

Expected Values, Exponential and Gamma Distributions

CIVL 7012/8012. Continuous Distributions

Some Continuous Probability Distributions: Part I. Continuous Uniform distribution Normal Distribution. Exponential Distribution

2 Continuous Random Variables and their Distributions

Let X be a continuous random variable, < X < f(x) is the so called probability density function (pdf) if

Chapter 4. Continuous Random Variables 4.1 PDF

(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3)

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 8 10/1/2008 CONTINUOUS RANDOM VARIABLES

Chapter 2. Continuous random variables

Chapter 3 Common Families of Distributions

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable

Math438 Actuarial Probability

STAT509: Discrete Random Variable

General Random Variables

CSE 312, 2017 Winter, W.L. Ruzzo. 7. continuous random variables

4Continuous Random. Variables and Probability Distributions CHAPTER OUTLINE LEARNING OBJECTIVES

SDS 321: Introduction to Probability and Statistics

Slides 8: Statistical Models in Simulation

Review for the previous lecture

Continuous Random Variables

Stat 426 : Homework 1.

Stat410 Probability and Statistics II (F16)

Probability Midterm Exam 2:15-3:30 pm Thursday, 21 October 1999

ECE 302 Division 2 Exam 2 Solutions, 11/4/2009.

Northwestern University Department of Electrical Engineering and Computer Science

Chapter 8: Continuous Probability Distributions

Will Landau. Feb 21, 2013

STAT 516 Midterm Exam 2 Friday, March 7, 2008

Continuous Distributions

Exponential Distribution and Poisson Process

Special distributions

Continuous RVs. 1. Suppose a random variable X has the following probability density function: π, zero otherwise. f ( x ) = sin x, 0 < x < 2

(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3)

Statistics for Engineers Lecture 4 Reliability and Lifetime Distributions

Chapter 3: Random Variables 1

STAT 3610: Review of Probability Distributions

Recall Discrete Distribution. 5.2 Continuous Random Variable. A probability histogram. Density Function 3/27/2012

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Notes on Continuous Random Variables

Fundamental Tools - Probability Theory II

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

HW Solution 12 Due: Dec 2, 9:19 AM

Experimental Design and Statistics - AGA47A

Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University

2 Functions of random variables

Chapter 5. Statistical Models in Simulations 5.1. Prof. Dr. Mesut Güneş Ch. 5 Statistical Models in Simulations

Lecture 3 Continuous Random Variable

F X (x) = P [X x] = x f X (t)dt. 42 Lebesgue-a.e, to be exact 43 More specifically, if g = f Lebesgue-a.e., then g is also a pdf for X.

Guidelines for Solving Probability Problems

Chapter 3: Discrete Random Variable

1 Review of Probability and Distributions

Lecture 4. Continuous Random Variables and Transformations of Random Variables

STA 256: Statistics and Probability I

Math Spring Practice for the Second Exam.

STA 256: Statistics and Probability I

Transcription:

Chapter 4: Continuous Probability Distributions Seungchul Baek Department of Statistics, University of South Carolina STAT 509: Statistics for Engineers 1 / 57

Continuous Random Variable A continuous random variable is one with an interval (either finite or infinite) of real numbers for its range. Examples Let X = length in meter. Let X = temperature in F. Let X = time in seconds 2 / 57

Continuous Random Variable Cont d Because the number of possible values of X is uncountably infinite, the probability mass function (pmf) is no longer suitable. For a continuous random variable, P(X = x) = 0, the reason for that will become clear shortly. For a continuous random variable, we are interested in probabilities of intervals, such as P(a X b), where a and b are real numbers. We will introduce the probability density function (pdf) to calculate probabilities, such as P(a X b). 3 / 57

Probability Density Function Every continuous random variable X has a probability density function (pdf), denoted by f X (x). Probability density function f X (x) is a function such that a f X (x) 0 for any x R b f X (x)dx = 1 c P(a X b) = b a f X (x)dx, which represents the area under f X (x) from a to b for any b > a. d If x 0 is a specific value, then P(X = x 0 ) = 0. We assign 0 to area under a point. 4 / 57

Cumulative Distribution Function Here is a pictorial illustration of pdf: Let x 0 be a specific value of interest, the cumulative distribution function (CDF) is defined via F X (x 0 ) = P(X x 0 ) = x0 f X (x)dx. 5 / 57

Cumulative Distribution Function Cont d If x 1 and x 2 are specific values, then P(x 1 X x 2 ) = x2 x 1 f X (x)dx = F X (x 2 ) F X (x 1 ). From last property of a pdf, we have P(x 1 X x 2 ) = P(x 1 < X < x 2 ) = P(x 1 X < x 2 ) = P(x 1 < X x 2 ) 6 / 57

Example: Electric Current Let the continuous random variable X denote the current measured in a thin copper wire in milliamperes. Assume that the range of X (measured in ma) is [0, 20], and assume that the probability density function of X is f X (x) = 0.05 for 0 x 20. What is the probability that a current measurement is less than 10 milliamperes? Solution: The plot of pdf of X is 7 / 57

More Example Suppose that Y has the pdf { 3y 2, 0 < y < 1 f Y (y) = 0, otherwise. (a) Find the CDF of Y. 8 / 57

More Example Cont d Suppose that Y has the pdf { 3y 2, 0 < y < 1 f Y (y) = 0, otherwise. (b) Calculate P(Y < 0.3) 9 / 57

More Example Cont d Suppose that Y has the pdf { 3y 2, 0 < y < 1 f Y (y) = 0, otherwise. (c) Calculate P(0.3 Y 0.8) 10 / 57

Mean of a Continuous r.v. The mean (expectation) and variance can also be defined for a continuous random variable. Integration replaces summation in the calculation of expectation for a discrete r.v. Recall that for a discrete random variable Y. The mean of Y is defined as E (Y ) = µ Y = all y y p Y (y). Definition: For a continuous random variable X. The mean of X is defined as E (X ) = µ X = xf X (x)dx 11 / 57

Mean of a Continuous r.v. Cont d NOTE: The limits of the integral in this definition, while technically correct, will always be the lower and upper limits corresponding to the nonzero part of the pdf. Theorem: Just like in the discrete case, we can calculate the expected value for a function of a continuous r.v. Let X be a continuous random variable with pdf f X (x). Suppose that g is a real-valued function. Then, g(x ) is a random variable and E [g(x )] = g(x)f X (x)dx. 12 / 57

Variance of a Continuous r.v. Definition: The variance of X, denoted as Var (X ) or σ 2, is σ 2 = Var (X ) = E[(X µ) 2 ] = The population standard deviation of X is σ = σ 2, the positive square root of the variance. (x µ) 2 f X (x) dx. The computational formula for variance is the same as the discrete case, i.e., Var (X ) = E (X 2 ) [E (X )] 2. 13 / 57

Example: Electric Current Recall that the pdf of X is { 0.05, 0 x 20 f X (x) = 0, otherwise. Let s calculate the mean of X : 14 / 57

Example: Electric Current Cont d What about the variance of X? 15 / 57

Introduction to Exponential Distribution We have discussed Poisson distribution in the previous chapter, which, for example, can model the number of car accidents for a given length of time t. The waiting time between accidents is another random variable that is often of interest. We can use exponential distribution to model such a waiting period. 16 / 57

Introduction to Exponential Distribution Define Define X = the waiting time between two car accidents N = the number of accidents during time of length x We know that if the mean number of accidents is λ per base unit, then the random variable N Poisson(λx). 17 / 57

Introduction to Exponential Distribution To model the waiting time, suppose there is no accident during the time of length x. Now, P(X > x) = P(N = 0) = e λx (λx) 0 which means that there is no events in [0, x]. By the complement rule, it follows that 0! = e λx, F X (x) = P(X x) = 1 P(X > x) = 1 e λx. By differentiating the CDF of X, the pdf of X is { λe λx, x 0 f X (x) = 0, otherwise. 18 / 57

PDF & CDF Definition: A r.v. Y is continuous if its cdf F Y (y) is continuous for < y <. Definition: Suppose Y is a continuous r.v. with cdf F Y (y). Then pdf of Y is f Y (y) = d dy F Y (y). Corollary: see that From the Fundamental Theorem of Calculus, We F Y (y) = y f (t)dt. Then pdf of Y is f Y (y). 19 / 57

Exponential Distribution The plot of pdf of exponential distribution with different values of λ is shown below: The shorthand notation for X following exponential distribution is given by X exp(λ) 20 / 57

Mean and Variance for an Exponential Random Variable Suppose that X exp(λ), then E(X ) = 1 λ var(x ) = 1 λ 2 Note, for exponential r.v., mean = standard deviation. 21 / 57

Summary on exponential distribution Suppose that X exp(λ), then pdf: f X (x) = λe λx, for x 0 dexp(x, λ) in R CDF: F X (x) = P(X x) = 1 e λx, for x 0 pexp(x, λ) in R Note: P(X > x) = 1 F X (x) = e λx, for x 0 Mean: E(X ) = 1 λ Variance: Var(X ) = 1 λ 2 22 / 57

Example: Computer Usage Let X denote the time in hours from the start of the interval until the first log-on. Then, X has an exponential distribution with 1 log-on per hour. We are interested in the probability that X exceeds 6 minutes. (Hint: Because λ is given in log-ons per hour, we express all time units in hours. That is, 6 minutes =0.1 hour) Solution: 23 / 57

Example: Accidents The time between accidents at a factory follows an exponential distribution with a historical average of 1 accident every 900 days. What is the probability that there will be more than 1200 days between the next two accidents? Solution: 24 / 57

Example: Accidents Cont d What is the probability that there will be less than 900 days between the next two accidents? Solution: 25 / 57

Exponential or Poisson Distribution? We model the number of industrial accidents occurring in one year. We model the length of time between two industrial accidents (assuming an accident occurring is a Poisson event). We model the time between radioactive particles passing by a counter (assuming a particle passing by is a Poisson event). We model the number of radioactive particles passing by a counter in one hour 26 / 57

Example: Radioactive Particles The arrival of radioactive particles at a counter is Poisson events. The number of particles in an interval of time follows a Poisson distribution. Suppose, on average, we have 2 particles per millisecond. What is the probability that no particles will pass the counter in the next 3 milliseconds? Solution: 27 / 57

More Example: Machine Failures If the number of machine failures in a given interval of time follows a Poisson distribution with an average of 1 failure per 1000 hours, what is the probability that there will be no failures during the next 2000 hours? Solution: What is the probability that the time until the next failure is more than 2000 hours? Solution: 28 / 57

Lack of Memory Property An even more interesting property of an exponential random variable is concerned with conditional probabilities. The exponential distribution is often used in reliability studies as the model for the time until failure of a device. The lack of memory property of the exponential distribution implies that the device does not wear out, i.e., P(X > t + t X > t) remains the same for any t. However, the lifetime L of a device that suffers slow mechanical wear, such as bearing wear, is better modeled by a distribution s.t. P(X > t + t X > t) increases with t, such as the Weibull distribution (later). 29 / 57

Proof of Lack of Memory Property Show P(X > t + t X > t) = P(X > t). 30 / 57

Understanding Lack of Memory Property To understand this property of exponential distribution, let us assume X models the life time of a light bulb. The lack of memory property tells you that given the fact that the light bulb still survives at time t, the probability it will work greater than additional t amount of time (the conditional probability) equals to the probability that it will work greater than t amount of time from the beginning (the unconditional probability). The exponential distribution is the only continuous distribution with the memoryless property. 31 / 57

Introduction to Weibull Distribution As mentioned previously, the Weibull distribution is often used to model the time until failure of many different physical systems. The random variable X with probability density function f X (x) = β δ ( x δ ) β 1 e (x/δ) β, for x 0 is a Weibull random variable with scale parameter δ > 0 and shape parameter β > 0. The shorthand notation is X Weibull(β, δ). 32 / 57

Introduction to Weibull Distribution Cont d By changing the values of β and δ, the Weibull pdf can assume many shapes. Because of this flexibility (and for other reasons), the Weibull distribution is very popular among engineers in reliability applications. 33 / 57

CDF of Weibull Distribution If X has a Weibull distribution with parameters δ and β, then the cumulative distribution function of X is F X (x) = P(X x) = 1 e (x/δ)β, for x 0 It follows that P(X > x) = e (x/δ)β. 34 / 57

Example: Bearing Wear The time to failure (in hours) of a bearing in a mechanical shaft is satisfactorily modeled as a Weibull random variable with β = 1/2 and δ = 5000 hours. Determine the probability that a bearing lasts at least 6000 hours. Solution: 35 / 57

Summary on Weibull Distribution Suppose that X Weibull(β, δ), pdf: f X (x) = β ( x ) β 1 δ δ e (x/δ) β, for x 0 dweibull(x, β, δ) in R CDF: F X (x) = 1 e (x/δ)β, for x 0 pweibull(x, β, δ) in R ( ) Mean: E(X ) = δγ 1 + 1 β ( ) ( ( )) ] 2 Variance: var(x ) = δ [Γ 2 1 + 2 β Γ 1 + 1 β where Γ is called gamma function and Γ(n) = (n 1)! if n is a positive integer. 36 / 57

Gamma Function Definition: The gamma function is defined as Γ(α) = where α > 0. Facts about the gamma function: 1. Γ(1) = 1. 0 y α 1 e y dy, 2. Γ(α) = (α 1)Γ(α 1), for any α > 1. 3. Γ(n) = (n 1)!, n is an integer. 37 / 57

Example: Battery Lifetime The lifetime of a rechargeable battery under constant usage conditions, denoted by T (measured in hours), follows a Weibull distribution with parameters β = 2 and δ = 10. 1. What is the probability that a battery is still functional at time t = 20? 2. What is the probability that a battery is still functional at time t = 20 given that the battery is functional at time t = 10? 38 / 57

Introduction to Normal Distribution Most widely used distribution. Central Limit Theorem (Chapter 7): Whatever the distribution the random variable follows, if we repeat the random experiment over and over, the average result over the replicates follows normal distribution almost all the time when the number of the replicates goes to large. Other names: Gaussian distribution, bell-shaped distribution or bell-shaped curve. 39 / 57

Density of Normal Distribution A random variable X with probability density function f (x) = 1 2πσ e (x µ)2 2σ 2, < x < is a normal random variable with parameters µ and σ, where < µ <, and σ > 0. Also, E (X ) = µ and Var (X ) = σ 2. We use X N (µ, σ 2 ) to denote the distribution. Now our objective is to calculate probabilities (of intervals) for a normal random variable through R or normal probability table. 40 / 57

Density of Normal Distribution Cont d The plot of the pdf of normal distribution with different parameter values: CDF: The cdf of a normal random variable does not exist in closed form. Probabilities involving normal random variables and normal quantiles can be computed numerically. 41 / 57

Characteristics of Normal pdf Bell-shaped curve. < x <, i.e., the range of X is the whole real line. µ determines distribution location and is the highest point on curve. Curve is symmetric about µ. σ determines distribution spread. Inflection points at µ ± σ. 42 / 57

It s Your Turn... Which pdf in the above has greater variance? 43 / 57

Example: Strip of Wire Assume that the current measurements in a strip of wire follow a normal distribution with a mean of 10 milliamperes and a variance of 4 (milliamperes) 2. What is the probability that a measurement exceeds 13 milliamperes? Solution: Let X denotes the measure in that strip of wire. Then, X N (10, 4). We want to calculate > 1-pnorm(13,10,2) [1] 0.0668072 P(X > 13) = 1 P(X 13). Note code of calculating F (x) = P(X x) is of the form pnorm(x, µ, σ). 44 / 57

Empirical Rule For any normal random variable X whose distribution is close to bell-shaped, P(µ σ < X < µ + σ) = 0.6827 P(µ 2σ < X < µ + 2σ) = 0.9543 P(µ 3σ < X < µ + 3σ) = 0.9973 These are summarized in the following plot: 45 / 57

Earthquakes in a California Town Since 1900, the magnitude of earthquakes that measures 0.1 or higher on the Richter Scale in a certain location in California is distributed approximately normally, with µ = 6.2 and σ = 0.5, according to data obtained from the United States Geological Survey. Earthquake Richter Scale Readings 46 / 57

Earthquakes in a California Town Cont d Approximately what percent of the earthquakes are above 5.7 on the Richter Scale? What is the approximate probability an earthquake is above 6.7? 47 / 57

Standard Normal Distribution If X is a normal random variable with E (X ) = µ and Var (X ) = σ 2, the random variable Z = X µ σ is a normal random variable with E (Z) = 0 and Var (Z) = 1. That is, Z is called the standard normal random variable. Creating a new random variable by this transformation is referred to as standardizing. Z is traditionally used as the symbol for a standard normal random variable. 48 / 57

Standard Normal Probability Table 49 / 57

Normal Probability Table Cont d With the help of normal probability table, we can calculate the probabilities for nonstandard normal distribution through standardizing. Suppose X N (10, 4), we want to calculate P(X > 13). P(X > 13) = P( X 10 13 10 > ) 4 4 = P(Z > 1.5) = 1 P(Z 1.5) = 1 0.9332 (from table) = 0.0668 50 / 57

Normal Probability Table Cont d If we want to calculate P(X < 7), P(X < 7) = P( X 10 4 < 7 10 4 ) = P(Z < 1.5) = 0.0668 If we want to calculate P(X > 7), P(X > 7) = P( X 10 4 > 7 10 4 ) = P(Z > 1.5) = P(Z < 1.5) (by symmetry) = 0.9332 51 / 57

Example: Steel Bolt The thickness of a certain steel bolt that continuously feeds a manufacturing process is normally distributed with a mean of 10.0 mm and standard deviation of 0.3 mm. Manufacturing becomes concerned about the process if the bolts get thicker than 10.5 mm or thinner than 9.5 mm. Find the probability that the thickness of a randomly selected bolt is greater than 10.5 or smaller than 9.5 mm. Solution: 52 / 57

Inverse Normal Probabilities Sometimes we want to answer a question which is the reverse situation. We know the probability, and want to find the corresponding value of Y. 53 / 57

Inverse Normal Probabilities Cont d What is the cutoff value that approximately 2.5% of the bolts produced will have thicknesses less than this value? Solution: We need to find the z value such that P(Z < z) = 0.025. We can transform back to the original cutoff value from z. The R code of finding z is > qnorm(0.025, 0, 1) [1] -1.959964 Or, z = 1.96 by table. It follows that P(Z < 1.96) = 0.025 ( ) X 10 P < 1.96 = 0.025 0.3 P(X < 9.412) = 0.025 Therefore, the cutoff value is 9.412. 54 / 57

Inverse Normal Probabilities Cont d What is the cutoff value that approximately 1% of the bolts produced will have thicknesses greater than this value? Solution: 55 / 57

Normal Distribution Exercises The fill volume of an automatic filling machine used for filling cans of carbonated beverage is normally distributed with a mean of 12.4 fluid ounces and a standard deviation of 0.1 fluid ounce. What is the probability that a randomly chosen can will contain between 12.3 and 12.5 ounces? Solution: 2.5% of the cans will contain less than Solution: ounces. 56 / 57

Normal Distribution Exercises Cont d The army reports that the distribution of head circumferences among male soldiers is approximately normal with a mean of 22.8 inches and standard deviation of 1.1 inch. The army plans to make helmets in advance to fit the middle 98% of head circumferences for male soldiers. What head circumferences are small enough or big enough to require custom fitting? Solution: 57 / 57