Topic 6 page Topic 6 Continuous Random Variables Reference: Chapter 5.-5.3 Probability Density Function The Uniform Distribution The Normal Distribution Standardizing a Normal Distribution Using the Standard Normal Table Linear Transformations of Random Variables Further Examples of Continuous Distributions
Topic 6 page2 Probability Density Function If an experiment can result in an infinite, non-countable number of outcomes, then the random variable defined can be continuous. Whenever the value of a random variable is measured rather than counted, a continuous random variable is defined. Water level in the reservoir Distance between two points Amount of peanut butter in a jar Level of gasoline in fuel tank of a car
Topic 6 page3 The values of the random variables in these examples can be any of an infinite number of values within a defined interval [a,b]. These random variables could be any number from minus to plus b. Recall, that for the probability mass function, (p.m.f.), the value of P(X=x) was represented by the height of the spike at the point X=x.
Topic 6 page4 P(x) 2 3 X One of the major differences between discrete and continuous probability distributions is that this representation no longer holds. As you will see, a continuous p.d.f. is represented by the area between the x axis and the density function.
Topic 6 page5 An important fundamental rule of the continuous random variable is when the random variable is continuous, the probability that any one specific value takes place is zero. P(X=x)= =.
Topic 6 page6 Hence, we can determine probability values only for intervals, such as: P(a x b), (where a and b are some 2 points.) NOTE: f(x) does not represent probability! It represents how high or dense the function is at any specified value of x. P(X=x) = in the continuous case, so f(x) probability. P(a x b) = Area under f(x) from a to b.
Topic 6 page7 f(x) P(a <x < b) in the continuous case. p.d.f.(x) a b X Pa ( < x< b) = f( xdx ) b a
Topic 6 page8 Pa ( < x< b) = f( xdx ) b a To determine the probability, one must integrate the function between two points. Properties of All Probability Density Functions ) The total area under f(x), from - to has to equal. P( x ) = f ( x) dx = 2) f x ( ) The density function is never negative. A major difference between the p.m.f. and p.d.f. is f(x) does not have to be.
Topic 6 page9 Also, for a continuous random variable, it does not matter whether the endpoints are included in the interval or not. The addition of the endpoint changes the probability only by the value of =, so it has no effect. f(x) 2 p.d.f.(x) Example : x
Topic 6 page 2 f(x) f ( x) = 2; x 5. ; otherwise Area =.5 2 X Example 2:
Topic 6 page f(x) 4 + X 8 f ( x) = + X; 2 x 2 4 8 ; otherwise -2 2 X f(x) is a straight line for values between x= -2 and x=2. (X does not have to be positive.)
Topic 6 page2 Example 3: f(x) e x f(x) is a decreasing function for x between zero and infinity. 2 3 4 x
Topic 6 page3 Uniform Distribution: The uniform distribution is a rectangular distribution. Uniform random variable. f(x) f ( x) = = ; < x < = ; otherwise Area = X
Topic 6 page4 f ( x) dx = f ( x) dx + f ( x) dx + f ( x) dx = + f ( x) dx + ] = dx = x = x + Rule: dx = x dx = = X ] = = + -------------------------------------------------------------------------- Example: For the uniform distribution: 2 P( < x < ) = f ( x) dx = x] 2 = ( ) = 2 2 2 --------------------------------------------------------------------------
Topic 6 page5 Recall, in the discrete case, the cumulative distribution function: F( X*) = P( x) and x x* E( X) = xp( x) x 2 2 V( x) = E( X ) [ E( X)] ;
Topic 6 page6 In the continuous case: F( X*) = f ( x) dx and x* E( X) = xf ( x) dx V( x) = x f ( x) dx [ E( x)] = x f ( x) dx xf ( x) dx 2 2 2 2
Topic 6 page7 From the last example: f(x)= for ( < x < ); otherwise: 2 E( X) = xf ( x) dx = xdx = 2 x = ( ) = 2 V ( x) = ( x μ) f ( x) dx = ( x ) dx = ( x x + ) dx [ ] x x x [ ] = + = + = 3 3 2 2 4 ( Sheppard' s Correction) 3 2 4 2 ] 2 2 2 2 2 4
Topic 6 page8 Uniform random variable. f(x) f ( x) = = ; < x < = ; otherwise Area = X
Topic 6 page9 The Cumulative Distribution Function: Recall that for a discrete random variable, we obtained the cumulative probability distribution by adding up the probabilities accordingly: For example: x p(x) p(x).3.3.5.8 2.2 That is, PX ( ) = 8. PX ( 2) =. ; etc.
Topic 6 page2 Similarly, for a continuous random variable, we can work out: F(b)=P(X b)=p(x<b)= b f ( x) dx. (F(b) = Area under the p.d.f. to the left of X=b.) Note: Values of F(b) = probabilities. Values of f(x) probabilities. (Density function)
Topic 6 page2 For example: F(b) represents the probability that the random variable x assumes a value less than or equal to some specified value, say b. To calculate F(b) in the continuous case, it is necessary to integrate f(x) over the relevant range, rather than sum discrete probabilities. F(b)= P(x b) = all area under f(x) from x b.
Topic 6 page22 Notation: Values of F(x) represent probabilities, while values of f(x) do not. f(x) F(b) = P(x b) b x
Topic 6 page23 Example: Let X= time (months) taken to get a job. f ( x) = = 2 8 X; x 4 = ; otherwise First determine if this is a proper p.d.f.. f(x).5 Area = 2 4 X
Topic 6 page24 (a) 4 4 ( 2 X X ] ( X ) f ( x) dx = f ( x) dx = dx = 2 6 4 [( 4 6 ) ( )] 2 6 2 6 = [( ) ] = 2 = So this is a proper p.d.f.. 2 8
Topic 6 page25 (b) What is the probability of waiting more than 2 months to get a job? ( 2 X X ] ( X ) P( X > 2) = f ( x) dx = dx or: 4 2 = 2 6 2 4 ( ) ( ) [ 4 6 2 4 ] 2 6 2 6 = [ ] 4 = ( 2 ) ( ) = 25. 4 2 2 8
Topic 6 page26 P( X > 2) = P( X 2) = F( 2) 2 ( X ) = f ( x) dx = 2 ( ] 2 X X 2 6 [( 2 4 ) ( 2 6 2 6 )] [( 4 ) ]. = = = = 25 2 2 8 dx
Topic 6 page27 Mean and Variance: Recall, for discrete random variables: E( X) = xp( x) = [ μ ] 2 2 V( x) = E ( x ) = ( x μ) P( x) and x E( g( x)) = g( x) P( x). x μ x
Topic 6 page28 In the continuous case: E( X) = xf ( x) dx = V( X) = ( x μ) 2 f ( x) dx since = x f ( x) dx xf ( x) dx 2 Egx [ ( )] = gx ( ) f( xdx ) μ The summary measures of central location and dispersion are important in describing the density function. 2
Topic 6 page29 The mean represents central location: the balancing point of p.d.f.: μ = E( x) The variance and standard deviation of p.d.f.: dispersion σ 2 = V( x) and σ = V( x)
Topic 6 page3 Rule of Thumb Interpretation of Standard Deviation: μ ± σ contains 68% of probability ( area) μ ± 2σ contains 95% of probability ( area) For the continuous probability distribution, the weights are areas given by the f(x)dx (height times width). Example: From the example on waiting time to get a job, what are the average and the standard deviation of waiting time? f ( x) = = 2 8 X; x 4 = ; otherwise
Topic 6 page3 E( X) = xf ( x) dx = [ 2 3 X X ] 4 4 24 ( ) 2 ( X ) X X 2 8 ( 2 8) = xf ( x) dx = x dx = dx [ ] 6 64 8 = = 4 24 ( ) = ( 4 3 ) = months. V( X) = E( x E( x)) = E( x ) [ E( x)] 4 4 2 x f x dx ( 4 2 2 ) x ( X = ( ) ( ) 3 = ( ) 2 8 ) dx = 4 μ 4 2 2 2 4 2 3 X ( 2 ) ( ) [ 3 4 X X X dx = ] 4 6 8 9 6 32 ( 6 9 ) ( 64 256 [ 6 ) ] ( 6 32 9 ) = = 889. standard deviation =.889 = 4. 943 months. 6 9 3
Topic 6 page32 The Median Value: Recall that the median divides the (ranked) data into two equal parts. For a continuous random variable we need to find the value of X, say m, such that: P(X m)=.5. (That is the point where 5% of the area is on either side of x, or F(m) =.5, the cumulative area up to some point where area = 5%.)
Topic 6 page33 Example: Suppose: f x X; x ( ) = = 2 = ; otherwise f(x) 2.. X m
Topic 6 page34 Now find the median. Find m such that F(m) =.5 m m m f ( x) dx = 5. [ 2 x] m ( 2xdx ) = = 5. 2 = 5. m = 5. = 77.
Topic 6 page35 The Normal Distribution The most important distribution in statistics is the Normal or Gaussian distribution: (i) Relates to a continuous random variable that can take any real value. (ii) Many natural phenomenons involve this distribution. (iii) Even if a phenomenon of interest follows some other probability distribution, if we average enough I independent random effects, the result is a normal random variable.
Topic 6 page36 A normal r.v. is characterized by two parameters: Mean (μ) Variance (σ 2 ) The formula for the p.d.f. when X is N(μ, σ 2 ) is: f ( x) = e x σ 2Π 2σ 2 where ( < x < ). ( μ) The normal distribution is a continuous distribution in which x can assume any value between minus infinity and plus infinity. 2
Topic 6 page37 The normal density function is symmetrical bell-shaped probability density function. f(x) σ σ X μ-σ μ μ+σ Since П and e are constants, if μ and σ are known, it is possible to evaluate areas under this function by using calculus. (i.e. integration.)
Topic 6 page38 All normal distributions have the same bell shaped curve regardless of the μ (center) and σ (the spread or width) of the distribution. f(x) σ=.5 σ=2 μ Properties: A symmetric bell shaped p.d.f., centered at μ. So, μ = Mean = Mode = Median X
Topic 6 page39 The spread or shape is determined by the variance. f(x) μ-a μ μ+a X Area under the curve =. So, P(x<μ) = P(x>μ) =.5 by symmetry. Similarly, P(x > μ+a)=p(x < μ-a).
Topic 6 page4 Example: Suppose X~N(μ=5,σ 2 =4). What is the P(x 4)? (F(4)) f ( x) = e x σ 2Π 2σ 2 where ( < x < ). 4 ( μ) 2 P( x 4) = f( x) dx = e( )( x 5) 2 2Π 4 This is going to be tedious! 8 2 Fortunately, we can use an idea we met earlier to assist us.
Topic 6 page4 We learned that we can standardize a random variable by subtracting its mean, and dividing by its standard deviation: Z=(x-μ) / σ.
Topic 6 page42 The Two Properties of the Normal Standardized Random Variable: ) E(Z) = Proof: ( EZ ( ) E x μ) = = E( x ) μ σ σ = σ [ E( x) μ] = ( ) = σ μ μ Expected value of a standard normal variable is zero.
Topic 6 page43 2) V(Z)= Proof: ( V Z V x μ) ( ) = = V( x ) μ σ σ = 2 [ V( x) ] σ 2 = 2 ( σ ) = σ Variance of a standard normal variable is one. 2
Topic 6 page44 Standardized Normal Values of x for the normal distribution usually are described in terms of how many standard deviations they are away from the mean. Treating the values of x in a normal distribution in terms of standard deviations about the mean, has the advantage of permitting all normal distributions to be compared to one common or standard normal distribution. It is easier to compare normal distributions having different values of μ and σ if these curves are transformed to one common form: Standardized Normal Distribution.
Topic 6 page45 The standardized variable represents a new variable. Z = x μ σ. Using the Standardized Normal: Since the normal distribution is completely symmetrical, tables of Z-values usually include only positive values of Z. The lowest value in the text table is Z=.. F() = P(Z )=.5, the cumulative probability up to this point.
Topic 6 page46 Four Basic Rules: Rule : P(Z a) is given by F(a) where a is positive. f(z) F(a)=P(Z a) Z If a=2.25 P(Z 2.25)=.9878 μ a
Topic 6 page47 Rules 2: P(Z a) is given by the complement rule as - F(a). f(z) F(a)=P(Z a) P(Z a)=-f(a) Z μ a
Topic 6 page48 Rule 3: P(Z -a), where (-a) is negative is given by (-F(a)). f(z) F(a)=P(Z a) P(Z a)=-f(a) Z -a μ a
Topic 6 page49 Rule 4: P(a Z b) is given by F(b) F(a): f(z) F(a)=P(Z a) F(a) F(b) μ a b Z The area under the curve between two points a and b is found by subtracting the area to the left of a, F(a), from the area to the left of b, F(b).
Topic 6 page5 If a and b are negative: P(-a z -b) = P(b Z a)=f(a)-f(b). Also, linear transformations of Normal random variables are also normal. So: Z ~ N(,) Why is this transformation useful? To solve probability questions!! Example: X~ N(μ =5, σ 2 = 4) P(x<4):
Topic 6 page5 Px ( < ) = P x μ < a μ 4 σ σ = P Z < 4 5 2 = PZ ( < ) 2 2 f () z dz = F ( ) 2 4 5 X -½ Z If we have a table of such integrals already evaluated for Z, we can always transform any problem into the form where we can evaluate probability from just one table.
Topic 6 page52 Using Table from the text: f(z) Z -a μ a Table III: a Fa ( ) = f( zdz ) = PZ ( a)
Topic 6 page53 Examples: (i) P(Z < )=.5 (ii) P(Z < ) =.843 (iii) P(Z > 2) = - P(Z 2) = -.9772 =.228 (iv) P(Z < -)= P(Z > ) = P(Z ) = -.843 =.587 - Z (v) P(-<Z<2)= F(2) F(-) =.9772-.587=.885
Topic 6 page54