Probability and Statistics for Decision Making Fall, 2017
Course Outline
Main Textbook
Probability & Statistics for Engineers & Scientists, Ronald E. Walpole, Raymond H. Myers, and Sharon L. Myers, 9th Edition
Other Textbooks
Applied Statistics and Probability for Engineers, 6th Edition, Douglas C. Montgomery and George C. Runger
Probability & Statistics With R for Engineers and Scientists, Michael Akritas
The slides are based on these textbooks and will be available at: http://scholar.cu.edu.eg/?qdoaashawky/
Week      Topic
Week 1    Overview on Probability & Introduction to Statistics (Ch 1-4)
Week 2    Sampling Distributions & Estimation (Ch 9)
Week 3    Hypothesis Testing with 1- or 2-group samples (Ch 10)
Week 4    Analysis of Variance (Ch 13)
Week 5    Simple Linear Regression and Correlation (Ch 11)
Week 6    Multiple Linear Regression (Ch 12)
Tools
The R tool: https://cran.r-project.org/bin/windows/base/
RStudio (an IDE for R): https://www.rstudio.com/products/rstudio/
R Tutorial: https://cran.r-project.org/doc/manuals/r-release/r-intro.html
Grading Policy
Assignment 1: posted Week 3, due Week 4, 30% of points
Assignment 2: posted Week 4, due Week 5, 30% of points
Final Exam: Week 7, 40% of points
Week 1: Introduction to Statistics & Overview on Probability
Objectives
By the end of this lesson, you should be able to:
Identify the basic engineering process and the role of statistics.
Recall the basic concepts of probability and statistics terminology.
Recall the basic concepts of descriptive statistics.
Recall some basic discrete and continuous random distributions.
Identify some engineering applications of random distributions.
Learn the basics of R using the RStudio IDE.
The Engineering Process
An engineer is someone who solves problems of interest to society through the efficient application of scientific principles by:
Refining existing products
Designing new products or processes
[Figure: The engineering method. Source: Applied Statistics and Probability for Engineers, Wiley]
Statistics: Basic Ideas
Statistics is the area of science that deals with the collection, organization, analysis, and interpretation of data. It also deals with methods and techniques for drawing conclusions about the characteristics of a large set of data points, commonly called a population, by using a smaller subset of the entire data (a sample).
Statistics in Engineering
Engineers perform tests to learn how things behave under different scenarios, and at what point they might fail. As engineers perform experiments, they collect data that can be used to explain relationships better and to reveal information about the quality of the products and services they provide.
Probability: A review of basic concepts
The Sample Space: (Optional)
Definition: The set of all possible outcomes of a statistical experiment is called the sample space and is denoted by S. Each outcome (element or member) of the sample space S is called a sample point.
Example: Camera Specifications
Suppose that the recycle times of two cameras are recorded. The extension of the positive real line $R^+$ is to take the sample space to be the positive quadrant of the plane: $S = R^+ \times R^+$.
If the objective of the analysis is to consider only whether or not the cameras conform to the manufacturing specifications, either camera may or may not conform. We abbreviate yes and no as y and n. If the ordered pair yn indicates that the first camera conforms and the second does not, the sample space can be represented by the four outcomes: $S = \{yy, yn, ny, nn\}$.
If we are interested only in the number of conforming cameras in the sample, we might summarize the sample space as $S = \{0, 1, 2\}$.
As another example, consider an experiment in which cameras are tested until the flash recycle time fails to meet the specifications. The sample space can be represented as $S = \{n, yn, yyn, yyyn, yyyyn, \ldots\}$, and this is an example of a discrete sample space that is countably infinite.
Sample spaces can also be described graphically with tree diagrams. When a sample space can be constructed in several steps or stages, we can represent each of the $n_1$ ways of completing the first step as a branch of a tree. Each of the ways of completing the second step can be represented as $n_2$ branches starting from the ends of the original branches, and so forth.
Example: Message Delays
Each message in a digital communication system is classified as to whether it is received within the time specified by the system design. If three messages are classified, use a tree diagram to represent the sample space of possible outcomes.
Each message can be received either on time or late. The possible results for three messages can be displayed by eight branches in the tree diagram shown in Fig. 2-5.
Practical Interpretation: A tree diagram can effectively represent a sample space. Even if a tree becomes too large to construct, it can still conceptually clarify the sample space.
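As a quick check of the message-delay example, the eight leaves of the tree can be enumerated programmatically. This is a minimal sketch in Python (the course itself uses R; Python is used here only for illustration). The labels "O" (on time) and "L" (late) are our own shorthand, not notation from the slides.

```python
from itertools import product

# Hypothetical labels: "O" = received on time, "L" = received late
labels = ["O", "L"]

# Each of the 3 messages has 2 possible results, so the tree has 2**3 leaves
sample_space = ["".join(path) for path in product(labels, repeat=3)]

print(len(sample_space))
print(sample_space)
```

Each string is one root-to-leaf path of the tree, read from the first message to the third.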
Measures of Location (Central Tendency)
The data (observations) often tend to be concentrated around the center of the data. Some measures of location are the mean, mode, and median. These measures are considered representatives (or typical values) of the data. They are designed to give a quantitative measure of where the center of the data is in the sample.
The sample mean of the observations ($\bar{x}$): If $x_1, x_2, \ldots, x_n$ are the sample values, then the sample mean is
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (unit)
Example: Suppose that the following sample represents the ages (in years) of a sample of 3 men: $x_1 = 30$, $x_2 = 35$, $x_3 = 27$ ($n = 3$). Then the sample mean is:
$\bar{x} = \frac{30 + 35 + 27}{3} = \frac{92}{3} = 30.67$ (years)
Note: $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$
Measures of Variability (Dispersion or Variation)
The variation or dispersion in a set of data refers to how spread out the observations are from each other. The variation is small when the observations are close together. There is no variation if the observations are all the same. Some measures of dispersion are the range, variance, and standard deviation. These measures are designed to give a quantitative measure of the variability in the data.
The Sample Variance ($S^2$):
Let $x_1, x_2, \ldots, x_n$ be the observations of the sample. The sample variance is denoted by $S^2$ and is defined by:
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n-1}$ (unit$^2$)
where $\bar{x} = \sum_{i=1}^{n} x_i / n$ is the sample mean.
The Standard Deviation (S):
The standard deviation is another measure of variation. It is the square root of the variance, i.e.:
$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$ (unit)
The standard deviation can also provide information about the relative spread of a data set.
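Example: Compute the sample variance and standard deviation of the following observations (ages in years): 10, 21, 33, 53, 54.
Solution: $n = 5$
$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{10 + 21 + 33 + 53 + 54}{5} = \frac{171}{5} = 34.2$ (years)
$S^2 = \frac{\sum_{i=1}^{5} (x_i - 34.2)^2}{5 - 1} = \frac{(10 - 34.2)^2 + (21 - 34.2)^2 + (33 - 34.2)^2 + (53 - 34.2)^2 + (54 - 34.2)^2}{4} = \frac{1506.8}{4} = 376.7$ (year$^2$)
The sample standard deviation is: $S = \sqrt{S^2} = \sqrt{376.7} = 19.41$ (years)
* Another Formula for Calculating $S^2$ (this form is simpler for hand calculation):
$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}$
For the previous example:
$x_i$:     10    21    33     53     54     (sum = 171)
$x_i^2$:   100   441   1089   2809   2916   (sum = 7355)
$S^2 = \frac{7355 - 5(34.2)^2}{5 - 1} = \frac{1506.8}{4} = 376.7$ (year$^2$)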
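The descriptive statistics above can be reproduced with Python's standard `statistics` module. This is a minimal sketch, assuming the five observations are 10, 21, 33, 53, 54 as read from the example. `statistics.variance` and `statistics.stdev` use the $n - 1$ divisor, matching the sample formulas on these slides.

```python
import statistics

ages = [10, 21, 33, 53, 54]  # observations from the example (years)

mean = statistics.mean(ages)      # sample mean
var = statistics.variance(ages)   # sample variance, divides by n - 1
sd = statistics.stdev(ages)       # sample standard deviation

# Shortcut formula: S^2 = (sum of x_i^2 - n * mean^2) / (n - 1)
n = len(ages)
var_shortcut = (sum(x * x for x in ages) - n * mean ** 2) / (n - 1)

print(mean, round(var, 1), round(var_shortcut, 1), round(sd, 2))
```

Both variance computations agree up to floating-point rounding, which is the point of the alternative formula.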
Events:
Definition: An event A is a subset of the sample space S. That is, $A \subseteq S$. We say that an event A occurs if the outcome (the result) of the experiment is an element of A.
$\emptyset \subseteq S$ is an event ($\emptyset$ is called the impossible event).
$S \subseteq S$ is an event (S is called the sure event).
Example: Experiment: Selecting a ball from a box containing 6 balls numbered 1, 2, 3, 4, 5, and 6 (or tossing a die).
This experiment has 6 possible outcomes. The sample space is $S = \{1, 2, 3, 4, 5, 6\}$. Consider the following events:
$E_1$ = getting an even number = $\{2, 4, 6\} \subseteq S$
$E_2$ = getting a number less than 4 = $\{1, 2, 3\} \subseteq S$
$E_3$ = getting 1 or 3 = $\{1, 3\} \subseteq S$
$E_4$ = getting an odd number = $\{1, 3, 5\} \subseteq S$
$E_5$ = getting a negative number = $\{\,\} = \emptyset \subseteq S$
$E_6$ = getting a number less than 10 = $\{1, 2, 3, 4, 5, 6\} = S \subseteq S$
Notation: n(S) = no. of outcomes (elements) in S. n(E) = no. of outcomes (elements) in the event E.
Example: Experiment: Selecting 3 items from a manufacturing process; each item is inspected and classified as defective (D) or non-defective (N).
This experiment has 8 possible outcomes: $S = \{DDD, DDN, DND, DNN, NDD, NDN, NND, NNN\}$. Consider the following events:
A = {at least 2 defectives} = $\{DDD, DDN, DND, NDD\} \subseteq S$
B = {at most one defective} = $\{DNN, NDN, NND, NNN\} \subseteq S$
C = {3 defectives} = $\{DDD\} \subseteq S$
Some Operations on Events:
Let A and B be two events defined on the sample space S.
Definition 2.3: Complement of the Event A: $A^c$ or $A'$
$A^c = \{x \in S : x \notin A\}$
$A^c$ consists of all points of S that are not in A. $A^c$ occurs if A does not.
Definition 2.4: Intersection: $A \cap B$
$A \cap B = \{x \in S : x \in A \text{ and } x \in B\}$
$A \cap B$ consists of all points in both A and B. $A \cap B$ occurs if both A and B occur together.
Definition: Mutually Exclusive (Disjoint) Events:
Two events A and B are mutually exclusive (or disjoint) if and only if $A \cap B = \emptyset$; that is, A and B have no common elements (they do not occur together).
$A \cap B \neq \emptyset$: A and B are not mutually exclusive.
$A \cap B = \emptyset$: A and B are mutually exclusive (disjoint).
Definition: Union: $A \cup B$
$A \cup B = \{x \in S : x \in A \text{ or } x \in B\}$
$A \cup B$ consists of all outcomes in A or in B or in both A and B. $A \cup B$ occurs if A occurs, or B occurs, or both A and B occur. That is, $A \cup B$ occurs if at least one of A and B occurs.
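The event operations defined above map directly onto set operations. The following is a small Python sketch using the die example (sample space 1 to 6), with the "even", "less than 4", and "odd" events from the earlier slide; the variable names are our own.

```python
S = {1, 2, 3, 4, 5, 6}              # sample space for one die toss

E1 = {x for x in S if x % 2 == 0}   # getting an even number
E2 = {x for x in S if x < 4}        # getting a number less than 4
E4 = {x for x in S if x % 2 == 1}   # getting an odd number

complement_E1 = S - E1              # outcomes of S not in E1
intersection = E1 & E2              # outcomes in both E1 and E2
union = E1 | E2                     # outcomes in E1 or E2 or both
disjoint = E1.isdisjoint(E4)        # True when E1 and E4 share no outcome

print(complement_E1, intersection, union, disjoint)
```

Here the complement of the "even" event is exactly the "odd" event, so the two are mutually exclusive.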
Counting Sample Points
There are many counting techniques that can be used to count the number of points in the sample space (or in some event) without listing each element. In many cases, we can compute the probability of an event by using these counting techniques.
Multiplication rule
Assume an operation can be described as a sequence of k steps, and the number of ways of completing step 1 is $n_1$, and the number of ways of completing step 2 is $n_2$ for each way of completing step 1, and so on. Then the total number of ways of completing the operation is $n_1 \times n_2 \times \cdots \times n_k$.
Example: Web Site Design
The design for a Website is to consist of four colors, three fonts, and three positions for an image. From the multiplication rule, $4 \times 3 \times 3 = 36$ different designs are possible.
Practical Interpretation: The use of the multiplication rule and other counting techniques enables one to easily determine the number of outcomes in a sample space or event, and this, in turn, allows probabilities of events to be determined.
Permutations
Another useful calculation is to find the number of ordered sequences of the elements of a set. Consider a set of elements, such as $S = \{a, b, c\}$. A permutation of the elements is an ordered sequence of the elements. For example, abc, acb, bac, bca, cab, and cba are all of the permutations of the elements of S.
The number of permutations of n different elements is $n!$, where $n! = n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1$.
The number of permutations of subsets of r elements selected from a set of n different elements is
$P_r^n = n \times (n-1) \times (n-2) \times \cdots \times (n-r+1) = \frac{n!}{(n-r)!}$
Example: Printed Circuit Board
A printed circuit board has eight different locations in which a component can be placed. If four different components are to be placed on the board, how many different designs are possible?
Each design consists of selecting a location from the eight locations for the first component, a location from the remaining seven for the second component, a location from the remaining six for the third component, and a location from the remaining five for the fourth component.
$P_4^8 = 8 \times 7 \times 6 \times 5 = \frac{8!}{4!} = 1680$ different designs
Combinations
In many problems, we are interested in the number of ways of selecting r objects from n objects without regard to order. These selections are called combinations.
Notation: n factorial is denoted by $n!$ and is defined by: $n! = n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1$ for $n = 1, 2, \ldots$; $0! = 1$.
Theorem: The number of combinations of n distinct objects taken r at a time is denoted by $\binom{n}{r}$ and is given by:
$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$; $r = 0, 1, 2, \ldots, n$
Notes:
$\binom{n}{r}$ is read as "n choose r".
$\binom{n}{n} = 1$, $\binom{n}{0} = 1$, $\binom{n}{1} = n$, $\binom{n}{r} = \binom{n}{n-r}$
$\binom{n}{r}$ = the number of different ways of selecting r objects from n distinct objects (order doesn't matter).
Example: If we have 10 equal-priority operations and only 4 operating rooms are available, in how many ways can we choose the 4 patients to be operated on first?
Solution: $n = 10$, $r = 4$
$\binom{10}{4} = \frac{10!}{4!\,(10-4)!} = \frac{10!}{4!\,6!} = \frac{10 \times 9 \times 8 \times 7}{4 \times 3 \times 2 \times 1} = 210$ (different ways)
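The three counting examples can be checked with Python's `math` module (`math.perm` and `math.comb` are available from Python 3.8). This is an illustrative sketch; the variable names are ours, and the inputs (4 colors, 3 fonts, 3 positions; 8 locations and 4 components; 10 operations and 4 rooms) are taken from the examples as read above.

```python
from math import comb, factorial, perm

# Multiplication rule: 4 colors, 3 fonts, 3 image positions
web_designs = 4 * 3 * 3

# Permutations: 4 different components placed in 8 locations (order matters)
board_designs = perm(8, 4)
board_designs_formula = factorial(8) // factorial(8 - 4)  # n! / (n - r)!

# Combinations: choose 4 patients out of 10 (order does not matter)
patient_choices = comb(10, 4)

print(web_designs, board_designs, patient_choices)
```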
Probability of an Event:
To every point (outcome) in the sample space S of an experiment, we assign a weight (or probability), ranging from 0 to 1, such that the sum of all weights (probabilities) equals 1. The weight (or probability) of an outcome measures its likelihood (chance) of occurrence. To find the probability of an event A, we sum all probabilities of the sample points in A. This sum is called the probability of the event A and is denoted by P(A).
Definition: The probability of an event A is the sum of the weights (probabilities) of all sample points in A. Therefore:
1) $0 \le P(A) \le 1$
2) $P(S) = 1$
3) $P(\emptyset) = 0$
Example: A balanced coin is tossed twice. What is the probability that at least one head occurs?
Solution: $S = \{HH, HT, TH, TT\}$
A = {at least one head occurs} = $\{HH, HT, TH\}$
Since the coin is balanced, the outcomes are equally likely; i.e., all outcomes have the same weight or probability.
Outcome    Weight (Probability)
HH         P(HH) = w
HT         P(HT) = w
TH         P(TH) = w
TT         P(TT) = w
sum        4w = 1
$4w = 1 \Rightarrow w = 1/4 = 0.25$, so P(HH) = P(HT) = P(TH) = P(TT) = 0.25.
The probability that at least one head occurs is:
P(A) = P({at least one head occurs}) = P({HH, HT, TH}) = P(HH) + P(HT) + P(TH) = 0.25 + 0.25 + 0.25 = 0.75
Theorem: If an experiment has n(S) = N equally likely different outcomes, then the probability of the event A is:
$P(A) = \frac{n(A)}{n(S)} = \frac{n(A)}{N} = \frac{\text{no. of outcomes in } A}{\text{no. of outcomes in } S}$
Example: A mixture of candies consists of 6 mints, 4 toffees, and 3 chocolates. If a person makes a random selection of one of these candies, find the probability of getting: (a) a mint, (b) a toffee or chocolate.
Solution: Define the following events:
M = {getting a mint}
T = {getting a toffee}
C = {getting a chocolate}
Experiment: selecting a candy at random from 13 candies.
n(S) = no. of outcomes of the experiment of selecting a candy = no. of different ways of selecting a candy from 13 candies = $\binom{13}{1} = 13$
The outcomes of the experiment are equally likely because the selection is made at random.
(a) M = {getting a mint}
n(M) = no. of different ways of selecting a mint candy from 6 mint candies = $\binom{6}{1} = 6$
$P(M) = P(\{\text{getting a mint}\}) = \frac{n(M)}{n(S)} = \frac{6}{13}$
(b) $T \cup C$ = {getting a toffee or chocolate}
$n(T \cup C)$ = no. of different ways of selecting a toffee or a chocolate candy
= no. of different ways of selecting a toffee candy + no. of different ways of selecting a chocolate candy
= $\binom{4}{1} + \binom{3}{1} = 4 + 3 = 7$
= no. of different ways of selecting a candy from 7 candies = $\binom{7}{1} = 7$
$P(T \cup C) = P(\{\text{getting a toffee or chocolate}\}) = \frac{n(T \cup C)}{n(S)} = \frac{7}{13}$
Example: In a poker hand consisting of 5 cards, find the probability of holding 2 aces and 3 jacks.
Solution: Experiment: selecting 5 cards from 52 cards.
n(S) = no. of outcomes of the experiment of selecting 5 cards from 52 cards = $\binom{52}{5} = \frac{52!}{5!\,47!} = 2598960$
The outcomes of the experiment are equally likely because the selection is made at random.
Define the event A = {holding 2 aces and 3 jacks}.
n(A) = no. of ways of selecting 2 aces and 3 jacks
= (no. of ways of selecting 2 aces from 4 aces) $\times$ (no. of ways of selecting 3 jacks from 4 jacks)
= $\binom{4}{2} \times \binom{4}{3} = \frac{4!}{2!\,2!} \times \frac{4!}{3!\,1!} = 6 \times 4 = 24$
$P(A) = P(\{\text{holding 2 aces and 3 jacks}\}) = \frac{n(A)}{n(S)} = \frac{24}{2598960} = 0.000009$
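Both equally-likely-outcome examples reduce to a ratio of counts, which can be kept exact with `fractions.Fraction`. The sketch below assumes the figures as read from the examples (6 mints, 4 toffees, 3 chocolates; 2 aces and 3 jacks from a 52-card deck); the variable names are ours.

```python
from fractions import Fraction
from math import comb

# Candy example: 6 mints, 4 toffees, 3 chocolates (13 candies in total)
n_S_candy = 6 + 4 + 3
p_mint = Fraction(6, n_S_candy)
p_toffee_or_chocolate = Fraction(4 + 3, n_S_candy)

# Poker example: 5 cards out of 52, holding 2 aces and 3 jacks
n_S_poker = comb(52, 5)               # all possible 5-card hands
n_A_poker = comb(4, 2) * comb(4, 3)   # choose the aces, then the jacks
p_hand = Fraction(n_A_poker, n_S_poker)

print(p_mint, p_toffee_or_chocolate)
print(n_S_poker, n_A_poker, float(p_hand))
```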
Additive Rules:
Theorem: If A and B are any two events, then: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Corollary 1: If A and B are mutually exclusive (disjoint) events, then: $P(A \cup B) = P(A) + P(B)$
Corollary 2: If $A_1, A_2, \ldots, A_n$ are n mutually exclusive (disjoint) events, then:
$P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n)$, that is, $P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$
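The additive rule can be verified numerically on a small equally likely sample space. This Python sketch reuses the die experiment from the events slides, with A = "even number" and B = "number less than 4"; the helper `prob` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one toss of a fair die

def prob(event):
    """Probability of an event under equally likely outcomes: n(E) / n(S)."""
    return Fraction(len(event), len(S))

A = {2, 4, 6}   # getting an even number
B = {1, 2, 3}   # getting a number less than 4

lhs = prob(A | B)                       # P(A union B) counted directly
rhs = prob(A) + prob(B) - prob(A & B)   # additive rule

print(lhs, rhs)
```

Subtracting $P(A \cap B)$ corrects for the outcome 2, which would otherwise be counted twice.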
Example: The probability that Paula passes Mathematics is 2/3, and the probability that she passes English is 4/9. If the probability that she passes both courses is 1/4, what is the probability that she will: (a) pass at least one course? (b) pass Mathematics and fail English? (c) fail both courses?
Solution: Define the events: M = {Paula passes Mathematics}, E = {Paula passes English}.
We know that P(M) = 2/3, P(E) = 4/9, and $P(M \cap E) = 1/4$.
(a) Probability of passing at least one course is:
$P(M \cup E) = P(M) + P(E) - P(M \cap E) = \frac{2}{3} + \frac{4}{9} - \frac{1}{4} = \frac{31}{36}$
(b) Probability of passing Mathematics and failing English is:
$P(M \cap E^C) = P(M) - P(M \cap E) = \frac{2}{3} - \frac{1}{4} = \frac{5}{12}$
(c) Probability of failing both courses is:
$P(M^C \cap E^C) = 1 - P(M \cup E) = 1 - \frac{31}{36} = \frac{5}{36}$
Theorem: If A and $A^C$ are complementary events, then: $P(A) + P(A^C) = 1 \Leftrightarrow P(A^C) = 1 - P(A)$
Conditional Probability:
The probability that an event A occurs when it is known that some event B has occurred is called the conditional probability of A given B and is denoted $P(A \mid B)$.
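Definition: The conditional probability of the event A given the event B is defined by:
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$; $P(B) > 0$
Notes:
1. $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{n(A \cap B)/n(S)}{n(B)/n(S)} = \frac{n(A \cap B)}{n(B)}$ for the equally likely outcomes case
2. $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$; $P(A) > 0$
3. $P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$ (Multiplicative Rule)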
Multiplicative Rule:
Theorem: If $P(A) \neq 0$ and $P(B) \neq 0$, then: $P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$
Example: Suppose we have a fuse box containing 20 fuses, of which 5 are defective (D) and 15 are non-defective (N). If 2 fuses are selected at random and removed from the box in succession without replacing the first, what is the probability that both fuses are defective?
Solution: Define the following events:
A = {the first fuse is defective}
B = {the second fuse is defective}
$A \cap B$ = {the first fuse is defective and the second fuse is defective} = {both fuses are defective}
We need to calculate $P(A \cap B)$.
$P(A) = \frac{5}{20}$
$P(B \mid A) = \frac{4}{19}$
$P(A \cap B) = P(A)\,P(B \mid A) = \frac{5}{20} \times \frac{4}{19} = 0.052632$
[Figure: first selection; second selection given that the first is defective (D)]
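The multiplicative-rule answer for the fuse example can be cross-checked by brute-force enumeration of all ordered draws. This is a sketch, assuming a box of 20 fuses with 5 defective and 15 non-defective as read from the example; the list encoding of the box is our own device.

```python
from fractions import Fraction
from itertools import permutations

# Box of 20 fuses: 5 defective ("D") and 15 non-defective ("N")
fuses = ["D"] * 5 + ["N"] * 15

# Multiplicative rule: P(A and B) = P(A) * P(B | A)
p_rule = Fraction(5, 20) * Fraction(4, 19)

# Enumeration: every ordered pair of two distinct fuses is equally likely
draws = list(permutations(range(len(fuses)), 2))
both_defective = sum(1 for i, j in draws if fuses[i] == "D" and fuses[j] == "D")
p_enum = Fraction(both_defective, len(draws))

print(p_rule, p_enum, round(float(p_rule), 6))
```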
Theorem: Two events A and B are independent if and only if $P(A \cap B) = P(A)\,P(B)$ (Multiplicative Rule for independent events)
Note: Two events A and B are independent if one of the following equivalent conditions is satisfied:
(i) $P(A \mid B) = P(A)$
(ii) $P(B \mid A) = P(B)$
(iii) $P(A \cap B) = P(A)\,P(B)$
Theorem: If $A_1, A_2, A_3$ are 3 events, then: $P(A_1 \cap A_2 \cap A_3) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 \cap A_2)$
If $A_1, A_2, A_3$ are 3 independent events, then: $P(A_1 \cap A_2 \cap A_3) = P(A_1)\,P(A_2)\,P(A_3)$
Bayes' Rule:
Definition: The events $A_1, A_2, \ldots, A_n$ constitute a partition of the sample space S if:
$\bigcup_{i=1}^{n} A_i = A_1 \cup A_2 \cup \cdots \cup A_n = S$
$A_i \cap A_j = \emptyset$ for all $i \neq j$
Theorem (Total Probability): If the events $A_1, A_2, \ldots, A_n$ constitute a partition of the sample space S such that $P(A_k) \neq 0$ for $k = 1, 2, \ldots, n$, then for any event B:
$P(B) = \sum_{k=1}^{n} P(A_k \cap B) = \sum_{k=1}^{n} P(A_k)\,P(B \mid A_k)$
Example: Three machines $A_1$, $A_2$, and $A_3$ make 20%, 30%, and 50%, respectively, of the products. It is known that 1%, 4%, and 7% of the products made by each machine, respectively, are defective. If a finished product is randomly selected, what is the probability that it is defective?
Solution: Define the following events:
B = {the selected product is defective}
$A_1$ = {the selected product is made by machine $A_1$}
$A_2$ = {the selected product is made by machine $A_2$}
$A_3$ = {the selected product is made by machine $A_3$}
$P(A_1) = \frac{20}{100} = 0.2$; $P(B \mid A_1) = \frac{1}{100} = 0.01$
$P(A_2) = \frac{30}{100} = 0.3$; $P(B \mid A_2) = \frac{4}{100} = 0.04$
$P(A_3) = \frac{50}{100} = 0.5$; $P(B \mid A_3) = \frac{7}{100} = 0.07$
$P(B) = \sum_{k=1}^{3} P(A_k)\,P(B \mid A_k) = P(A_1)P(B \mid A_1) + P(A_2)P(B \mid A_2) + P(A_3)P(B \mid A_3)$
$= 0.2 \times 0.01 + 0.3 \times 0.04 + 0.5 \times 0.07 = 0.002 + 0.012 + 0.035 = 0.049$
Example: If it is known that the selected product is defective, what is the probability that it is made by machine $A_1$?
Answer: $P(A_1 \mid B) = \frac{P(A_1 \cap B)}{P(B)} = \frac{P(A_1)\,P(B \mid A_1)}{P(B)} = \frac{0.2 \times 0.01}{0.049} = \frac{0.002}{0.049} = 0.0408$
This rule is called Bayes' rule.
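Theorem (Bayes' rule): If the events $A_1, A_2, \ldots, A_n$ constitute a partition of the sample space S such that $P(A_k) \neq 0$ for $k = 1, 2, \ldots, n$, then for any event B such that $P(B) \neq 0$:
$P(A_i \mid B) = \frac{P(A_i \cap B)}{P(B)} = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{k=1}^{n} P(A_k)\,P(B \mid A_k)} = \frac{P(A_i)\,P(B \mid A_i)}{P(B)}$ for $i = 1, 2, \ldots, n$.
Example: In the previous example, if it is known that the selected product is defective, what is the probability that it is made by: (a) machine $A_2$? (b) machine $A_3$?
Solution:
(a) $P(A_2 \mid B) = \frac{P(A_2)\,P(B \mid A_2)}{\sum_{k=1}^{3} P(A_k)\,P(B \mid A_k)} = \frac{P(A_2)\,P(B \mid A_2)}{P(B)} = \frac{0.3 \times 0.04}{0.049} = \frac{0.012}{0.049} = 0.2449$
(b) $P(A_3 \mid B) = \frac{P(A_3)\,P(B \mid A_3)}{\sum_{k=1}^{3} P(A_k)\,P(B \mid A_k)} = \frac{P(A_3)\,P(B \mid A_3)}{P(B)} = \frac{0.5 \times 0.07}{0.049} = \frac{0.035}{0.049} = 0.7143$
Note: $P(A_1 \mid B) = 0.0408$, $P(A_2 \mid B) = 0.2449$, $P(A_3 \mid B) = 0.7143$, and $\sum_{k=1}^{3} P(A_k \mid B) = 1$.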
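The total-probability and Bayes' rule calculations for the three-machine example can be sketched in a few lines of Python. The inputs are assumed to be production shares of 20%, 30%, 50% and defect rates of 1%, 4%, 7%, as read from the example; the dictionary names are ours.

```python
# Prior probabilities P(A_k): share of production per machine
priors = {"A1": 0.20, "A2": 0.30, "A3": 0.50}

# Conditional probabilities P(B | A_k): defect rate per machine
defect_rates = {"A1": 0.01, "A2": 0.04, "A3": 0.07}

# Total probability: P(B) = sum over k of P(A_k) * P(B | A_k)
p_defective = sum(priors[m] * defect_rates[m] for m in priors)

# Bayes' rule: P(A_k | B) = P(A_k) * P(B | A_k) / P(B)
posteriors = {m: priors[m] * defect_rates[m] / p_defective for m in priors}

print(round(p_defective, 3))
for machine, p in posteriors.items():
    print(machine, round(p, 4))
```

Although machine $A_3$ makes half of the products, it accounts for over 70% of the defectives because its defect rate is the highest.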
Concept of a Random Variable:
In a statistical experiment, it is often very important to allocate numerical values to the outcomes.
Example: Experiment: testing two components (D = defective, N = non-defective).
Sample space: $S = \{DD, DN, ND, NN\}$
Let X = number of defective components when two components are tested. The numerical values assigned to the outcomes are:
Sample point (Outcome)    Assigned Numerical Value (x)
DD                        2
DN                        1
ND                        1
NN                        0
Notice that the set of all possible values of the random variable X is $\{0, 1, 2\}$.
Definition: A random variable X is a function that associates each element in the sample space with a real number (i.e., $X : S \to R$).
Notation: "X" denotes the random variable. "x" denotes a value of the random variable X.
Types of Random Variables:
A random variable X is called a discrete random variable if its set of possible values is countable, i.e., $x \in \{x_1, x_2, \ldots, x_n\}$ or $x \in \{x_1, x_2, \ldots\}$.
A random variable X is called a continuous random variable if it can take values on a continuous scale, i.e., $x \in \{x : a < x < b;\ a, b \in R\}$.
In most practical problems:
o A discrete random variable represents count data, such as the number of defectives in a sample of k items.
o A continuous random variable represents measured data, such as height.
Discrete Probability Distributions
A discrete random variable X assumes each of its values with a certain probability.
Example: Experiment: tossing a non-balanced coin 2 times independently. H = head, T = tail.
Sample space: $S = \{HH, HT, TH, TT\}$
Suppose $P(H) = \frac{1}{2} P(T)$, so that P(H) = 1/3 and P(T) = 2/3.
Let X = number of heads.
Sample point (Outcome)    Probability                                        Value of X (x)
HH                        P(HH) = P(H) P(H) = 1/3 $\times$ 1/3 = 1/9         2
HT                        P(HT) = P(H) P(T) = 1/3 $\times$ 2/3 = 2/9         1
TH                        P(TH) = P(T) P(H) = 2/3 $\times$ 1/3 = 2/9         1
TT                        P(TT) = P(T) P(T) = 2/3 $\times$ 2/3 = 4/9         0
The possible values of X are: 0, 1, and 2. X is a discrete random variable. Define the following events:
Event (X = x)         Probability P(X = x)
(X = 0) = {TT}        P(X = 0) = P(TT) = 4/9
(X = 1) = {HT, TH}    P(X = 1) = P(HT) + P(TH) = 2/9 + 2/9 = 4/9
(X = 2) = {HH}        P(X = 2) = P(HH) = 1/9
The possible values of X with their probabilities are:
x                    0      1      2      Total
P(X = x) = f(x)      4/9    4/9    1/9    1.00
The function f(x) = P(X = x) is called the probability function (probability distribution or density) of the discrete random variable X, or simply the pdf of X.
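The probability function of X can be built mechanically by enumerating the sample space and accumulating the probability of each outcome under its value of X. The following Python sketch assumes P(H) = 1/3 and P(T) = 2/3, as in the example; the dictionary-based representation is our own choice.

```python
from fractions import Fraction
from itertools import product

p_side = {"H": Fraction(1, 3), "T": Fraction(2, 3)}  # unbalanced coin

pmf = {}  # maps each value x of X to P(X = x)
for first, second in product("HT", repeat=2):
    x = (first + second).count("H")             # X = number of heads
    p_outcome = p_side[first] * p_side[second]  # independent tosses
    pmf[x] = pmf.get(x, Fraction(0)) + p_outcome

for x in sorted(pmf):
    print(x, pmf[x])
```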
Definition: The function f(x) is a probability function of a discrete random variable X if, for each possible value x, we have:
1) $f(x) \ge 0$
2) $\sum_{\text{all } x} f(x) = 1$
3) $f(x) = P(X = x)$
Note: $P(X \in A) = \sum_{x \in A} f(x) = \sum_{x \in A} P(X = x)$
Example: For the previous example, we have:
x                    0      1      2      Total
f(x) = P(X = x)      4/9    4/9    1/9    $\sum_{x=0}^{2} f(x) = 1$
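$P(X < 1) = P(X = 0) = 4/9$
$P(X \le 1) = P(X = 0) + P(X = 1) = 4/9 + 4/9 = 8/9$
$P(X \ge 0.5) = P(X = 1) + P(X = 2) = 4/9 + 1/9 = 5/9$
$P(X > 8) = P(\emptyset) = 0$
$P(X < 10) = P(X = 0) + P(X = 1) + P(X = 2) = P(S) = 1$
Example: A shipment of 8 similar microcomputers to a retail outlet contains 3 that are defective and 5 that are non-defective. If a school makes a random purchase of 2 of these computers, find the probability distribution of the number of defectives.
Solution: We need to find the probability distribution of the random variable X = the number of defective computers purchased.
Experiment: selecting 2 computers at random out of 8.
$n(S) = \binom{8}{2} = 28$ equally likely outcomes
The possible values of X are: x = 0, 1, 2. Consider the events:
(X = 0) = {0D and 2N}: $n(X = 0) = \binom{3}{0}\binom{5}{2} = 10$
(X = 1) = {1D and 1N}: $n(X = 1) = \binom{3}{1}\binom{5}{1} = 15$
(X = 2) = {2D and 0N}: $n(X = 2) = \binom{3}{2}\binom{5}{0} = 3$
$f(0) = P(X = 0) = \frac{n(X = 0)}{n(S)} = \frac{\binom{3}{0}\binom{5}{2}}{\binom{8}{2}} = \frac{10}{28}$
$f(1) = P(X = 1) = \frac{n(X = 1)}{n(S)} = \frac{\binom{3}{1}\binom{5}{1}}{\binom{8}{2}} = \frac{15}{28}$
$f(2) = P(X = 2) = \frac{n(X = 2)}{n(S)} = \frac{\binom{3}{2}\binom{5}{0}}{\binom{8}{2}} = \frac{3}{28}$
In general, for x = 0, 1, 2: $f(x) = P(X = x) = \frac{n(X = x)}{n(S)} = \frac{\binom{3}{x}\binom{5}{2-x}}{\binom{8}{2}}$
The probability distribution of X is:
x                    0        1        2       Total
f(x) = P(X = x)      10/28    15/28    3/28    1.00
$f(x) = P(X = x) = \frac{\binom{3}{x}\binom{5}{2-x}}{\binom{8}{2}}$ for x = 0, 1, 2; and 0 otherwise (Hypergeometric Distribution)
Definition: The cumulative distribution function (CDF), F(x), of a discrete random variable X with the probability function f(x) is given by:
$F(x) = P(X \le x) = \sum_{t \le x} f(t) = \sum_{t \le x} P(X = t)$; for $-\infty < x < \infty$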
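Example: Find the CDF of the random variable X with the probability function:
x       0        1        2
f(x)    10/28    15/28    3/28
Solution: $F(x) = P(X \le x)$ for $-\infty < x < \infty$
For $x < 0$: $F(x) = 0$
For $0 \le x < 1$: $F(x) = P(X = 0) = \frac{10}{28}$
For $1 \le x < 2$: $F(x) = P(X = 0) + P(X = 1) = \frac{10}{28} + \frac{15}{28} = \frac{25}{28}$
For $x \ge 2$: $F(x) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{10}{28} + \frac{15}{28} + \frac{3}{28} = 1$
The CDF of the random variable X is:
$F(x) = P(X \le x) = 0$ for $x < 0$; $\frac{10}{28}$ for $0 \le x < 1$; $\frac{25}{28}$ for $1 \le x < 2$; $1$ for $x \ge 2$
Note:
$F(-0.5) = P(X \le -0.5) = 0$
$F(1.5) = P(X \le 1.5) = F(1) = \frac{25}{28}$
$F(3.8) = P(X \le 3.8) = F(2) = 1$
Result:
$P(a < X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a)$
$P(a \le X \le b) = P(a < X \le b) + P(X = a) = F(b) - F(a) + f(a)$
$P(a < X < b) = P(a < X \le b) - P(X = b) = F(b) - F(a) - f(b)$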
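The hypergeometric probability function and its CDF for the microcomputer example can be written directly from the formulas. This is a minimal Python sketch, assuming 3 defective and 5 non-defective computers with 2 purchased; the function names `f` and `F` mirror the slide notation.

```python
from fractions import Fraction
from math import comb

SUPPORT = (0, 1, 2)  # possible numbers of defectives among 2 purchased

def f(x):
    """P(X = x): choose x of the 3 defectives and 2 - x of the 5 good ones."""
    if x in SUPPORT:
        return Fraction(comb(3, x) * comb(5, 2 - x), comb(8, 2))
    return Fraction(0)

def F(x):
    """CDF: P(X <= x), the sum of f(t) over support points t <= x."""
    return sum((f(t) for t in SUPPORT if t <= x), Fraction(0))

print([f(x) for x in SUPPORT])      # printed in lowest terms, e.g. 10/28 as 5/14
print(F(-0.5), F(1.5), F(3.8))
print(F(2) - F(0))                  # P(0 < X <= 2)
```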
Continuous Probability Distributions
For any continuous random variable X, there exists a non-negative function f(x), called the probability density function (p.d.f.), through which we can find probabilities of events expressed in terms of X.
$f : R \to [0, \infty)$
$P(a < X < b) = \int_a^b f(x)\,dx$ = area under the curve of f(x) over the interval (a, b)
$P(X \in A) = \int_A f(x)\,dx$ = area under the curve of f(x) over the region A
Definition: The function f(x) is a probability density function (pdf) for a continuous random variable X, defined on the set of real numbers, if:
1. $f(x) \ge 0$ for all $x \in R$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P(a \le X \le b) = \int_a^b f(x)\,dx$ for all $a, b \in R$; $a \le b$
Note: For a continuous random variable X, we have:
1. $f(x) \neq P(X = x)$ (in general)
2. $P(X = a) = 0$ for any $a \in R$
3. $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$
4. $P(X \in A) = \int_A f(x)\,dx$
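Example: Suppose that the error in the reaction temperature, in °C, for a controlled laboratory experiment is a continuous random variable X having the following probability density function:
$f(x) = \frac{x^2}{3}$ for $-1 < x < 2$; $f(x) = 0$ elsewhere
1. Verify that (a) $f(x) \ge 0$ and (b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
2. Find $P(0 < X \le 1)$.
Solution: X = the error in the reaction temperature in °C. X is a continuous random variable.
1. (a) $f(x) \ge 0$ because $x^2/3$ is non-negative for every x, and f(x) = 0 elsewhere.
(b) $\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{-1} 0\,dx + \int_{-1}^{2} \frac{x^2}{3}\,dx + \int_{2}^{\infty} 0\,dx = \left[\frac{x^3}{9}\right]_{-1}^{2} = \frac{1}{9}(8 - (-1)) = 1$
2. $P(0 < X \le 1) = \int_0^1 f(x)\,dx = \int_0^1 \frac{x^2}{3}\,dx = \left[\frac{x^3}{9}\right]_0^1 = \frac{1}{9}(1 - 0) = \frac{1}{9}$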
Definition: The cumulative distribution function (CDF), F(x), of a continuous random variable X with probability density function f(x) is given by:
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$; for $-\infty < x < \infty$
Result: $P(a < X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a)$
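In the previous example, find the CDF, then find $P(0 < X \le 1)$.
Solution: $f(x) = \frac{x^2}{3}$ for $-1 < x < 2$; $f(x) = 0$ elsewhere
For $x < -1$: $F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{x} 0\,dt = 0$
For $-1 \le x < 2$: $F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{-1} 0\,dt + \int_{-1}^{x} \frac{t^2}{3}\,dt = \left[\frac{t^3}{9}\right]_{-1}^{x} = \frac{1}{9}(x^3 - (-1)) = \frac{1}{9}(x^3 + 1)$
For $x \ge 2$: $F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{-1} 0\,dt + \int_{-1}^{2} \frac{t^2}{3}\,dt + \int_{2}^{x} 0\,dt = 1$
Therefore, the CDF is:
$F(x) = P(X \le x) = 0$ for $x < -1$; $\frac{1}{9}(x^3 + 1)$ for $-1 \le x < 2$; $1$ for $x \ge 2$
Using the CDF: $P(0 < X \le 1) = F(1) - F(0) = \frac{2}{9} - \frac{1}{9} = \frac{1}{9}$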
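The integrals in the temperature-error example can be sanity-checked numerically. The sketch below assumes the density is $x^2/3$ on $(-1, 2)$, as in the example, and uses a simple midpoint rule written by hand; the helper `integrate` is a hypothetical name for this illustration, not a library function.

```python
def f(x):
    """Density from the example: x^2 / 3 on (-1, 2), zero elsewhere."""
    return x * x / 3 if -1 < x < 2 else 0.0

def F(x):
    """Closed-form CDF derived above."""
    if x < -1:
        return 0.0
    if x < 2:
        return (x ** 3 + 1) / 9
    return 1.0

def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total_area = integrate(f, -1, 2)   # should be close to 1
p_numeric = integrate(f, 0, 1)     # P(0 < X <= 1) by integration
p_from_cdf = F(1) - F(0)           # the same probability from the CDF

print(round(total_area, 6), round(p_numeric, 6), round(p_from_cdf, 6))
```

The numerical and closed-form answers agree to several decimal places, which is a useful habit when a density is integrated by hand.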
Mathematical Expectation
Mean of a Random Variable:
Definition: Let X be a random variable with a probability distribution f(x). The mean (or expected value) of X is denoted by $\mu_X$ (or E(X)) and is defined by:
$E(X) = \mu_X = \sum_{\text{all } x} x\,f(x)$ if X is discrete; $E(X) = \mu_X = \int_{-\infty}^{\infty} x\,f(x)\,dx$ if X is continuous
Example: A shipment of 8 similar microcomputers to a retail outlet contains 3 that are defective and 5 that are non-defective. If a school makes a random purchase of 2 of these computers, find the expected number of defective computers purchased.
Solution: Let X = the number of defective computers purchased. Previously, we found that the probability distribution of X is:
x                    0        1        2
f(x) = P(X = x)      10/28    15/28    3/28
or: $f(x) = P(X = x) = \frac{\binom{3}{x}\binom{5}{2-x}}{\binom{8}{2}}$ for x = 0, 1, 2; and 0 otherwise
The expected value of the number of defective computers purchased is the mean (or the expected value) of X, which is:
$E(X) = \mu_X = \sum_{x=0}^{2} x\,f(x) = (0)\,f(0) + (1)\,f(1) + (2)\,f(2) = (0)\frac{10}{28} + (1)\frac{15}{28} + (2)\frac{3}{28} = \frac{15}{28} + \frac{6}{28} = \frac{21}{28} = 0.75$ (computers)
Example: Let X be a continuous random variable that represents the life (in hours) of a certain electronic device. The pdf of X is given by:
$f(x) = \frac{20{,}000}{x^3}$ for $x > 100$; $f(x) = 0$ elsewhere
Find the expected life of this type of device.
Solution:
$E(X) = \mu_X = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_{100}^{\infty} x\,\frac{20000}{x^3}\,dx = 20000 \int_{100}^{\infty} \frac{1}{x^2}\,dx = 20000 \left[-\frac{1}{x}\right]_{100}^{\infty} = -20000\left(0 - \frac{1}{100}\right) = 200$ (hours)
Therefore, we expect this type of electronic device to last, on average, 200 hours.
Theorem: Let X be a random variable with a probability distribution f(x), and let g(X) be a function of the random variable X. The mean (or expected value) of the random variable g(X) is denoted by $\mu_{g(X)}$ (or E[g(X)]) and is defined by:
$E[g(X)] = \mu_{g(X)} = \sum_{\text{all } x} g(x)\,f(x)$ if X is discrete; $E[g(X)] = \mu_{g(X)} = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx$ if X is continuous
Example: Let X be a discrete random variable with the following probability distribution:
x       0        1        2
f(x)    10/28    15/28    3/28
Find E[g(X)], where $g(X) = (X - 1)^2$.
Solution: $g(X) = (X - 1)^2$
$E[g(X)] = \mu_{g(X)} = \sum_{x=0}^{2} g(x)\,f(x) = \sum_{x=0}^{2} (x - 1)^2 f(x)$
$= (0 - 1)^2 f(0) + (1 - 1)^2 f(1) + (2 - 1)^2 f(2)$
$= (1)\frac{10}{28} + (0)\frac{15}{28} + (1)\frac{3}{28} = \frac{10}{28} + 0 + \frac{3}{28} = \frac{13}{28}$
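Example: In Example 4.3 (the electronic-device life example), find $E\left(\frac{1}{X}\right)$.
Solution: $f(x) = \frac{20{,}000}{x^3}$ for $x > 100$; $f(x) = 0$ elsewhere; $g(X) = \frac{1}{X}$
$E\left(\frac{1}{X}\right) = E[g(X)] = \mu_{g(X)} = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx = \int_{100}^{\infty} \frac{1}{x} \cdot \frac{20000}{x^3}\,dx = 20000 \int_{100}^{\infty} \frac{1}{x^4}\,dx = 20000 \left[-\frac{1}{3x^3}\right]_{100}^{\infty} = -\frac{20000}{3}\left(0 - \frac{1}{1000000}\right) = 0.0067$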
Variance (of a Random Variable)
The most important measure of variability of a random variable X is called the variance of X and is denoted by Var(X) or $\sigma_X^2$.
Definition: Let X be a random variable with a probability distribution f(x) and mean $\mu$. The variance of X is defined by:
$\mathrm{Var}(X) = \sigma_X^2 = E[(X - \mu)^2] = \sum_{\text{all } x} (x - \mu)^2 f(x)$ if X is discrete; $= \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$ if X is continuous
Definition: The positive square root of the variance of X, $\sigma_X = \sqrt{\sigma_X^2}$, is called the standard deviation of X.
Note: Var(X) = E[g(X)], where $g(X) = (X - \mu)^2$.
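Theorem: The variance of the random variable X is given by:
$\mathrm{Var}(X) = \sigma_X^2 = E(X^2) - \mu^2$
where $E(X^2) = \sum_{\text{all } x} x^2 f(x)$ if X is discrete; $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$ if X is continuous
Example: Let X be a discrete random variable with the following probability distribution:
x       0       1       2       3
f(x)    0.51    0.38    0.10    0.01
Find $\mathrm{Var}(X) = \sigma_X^2$.
Solution:
$\mu = \sum_{x=0}^{3} x\,f(x) = (0)\,f(0) + (1)\,f(1) + (2)\,f(2) + (3)\,f(3) = (0)(0.51) + (1)(0.38) + (2)(0.10) + (3)(0.01) = 0.61$
1. First method:
$\mathrm{Var}(X) = \sigma_X^2 = \sum_{x=0}^{3} (x - \mu)^2 f(x) = \sum_{x=0}^{3} (x - 0.61)^2 f(x)$
$= (0 - 0.61)^2 f(0) + (1 - 0.61)^2 f(1) + (2 - 0.61)^2 f(2) + (3 - 0.61)^2 f(3)$
$= (-0.61)^2 (0.51) + (0.39)^2 (0.38) + (1.39)^2 (0.10) + (2.39)^2 (0.01) = 0.4979$
2. Second method: $\mathrm{Var}(X) = \sigma_X^2 = E(X^2) - \mu^2$
$E(X^2) = \sum_{x=0}^{3} x^2 f(x) = (0^2)\,f(0) + (1^2)\,f(1) + (2^2)\,f(2) + (3^2)\,f(3) = (0)(0.51) + (1)(0.38) + (4)(0.10) + (9)(0.01) = 0.87$
$\mathrm{Var}(X) = \sigma_X^2 = E(X^2) - \mu^2 = 0.87 - (0.61)^2 = 0.4979$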
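Both variance methods can be confirmed exactly by using `fractions.Fraction` for the decimal probabilities. This Python sketch assumes the distribution f(0) = 0.51, f(1) = 0.38, f(2) = 0.10, f(3) = 0.01 as read from the example.

```python
from fractions import Fraction

# Probability distribution from the example, kept exact as fractions
pmf = {
    0: Fraction("0.51"),
    1: Fraction("0.38"),
    2: Fraction("0.10"),
    3: Fraction("0.01"),
}

mu = sum(x * p for x, p in pmf.items())                     # E(X)

# First method: Var(X) = E[(X - mu)^2]
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Second method: Var(X) = E(X^2) - mu^2
e_x2 = sum(x ** 2 * p for x, p in pmf.items())
var_shortcut = e_x2 - mu ** 2

print(float(mu), float(e_x2), float(var_definition), float(var_shortcut))
```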
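Example: Let X be a continuous random variable with the following pdf:
$f(x) = 2(x - 1)$ for $1 < x < 2$; $f(x) = 0$ elsewhere
Find the mean and the variance of X.
Solution:
$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_1^2 x\,[2(x - 1)]\,dx = 2\int_1^2 x(x - 1)\,dx = \frac{5}{3}$
$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_1^2 x^2\,[2(x - 1)]\,dx = 2\int_1^2 x^2(x - 1)\,dx = \frac{17}{6}$
$\mathrm{Var}(X) = \sigma_X^2 = E(X^2) - \mu^2 = \frac{17}{6} - \left(\frac{5}{3}\right)^2 = \frac{1}{18}$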
Means and Variances of Linear Combinations of Random Variables:
If $X_1, X_2, \ldots, X_n$ are n random variables and $a_1, a_2, \ldots, a_n$ are constants, then the random variable
$Y = \sum_{i=1}^{n} a_i X_i = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$
is called a linear combination of the random variables $X_1, X_2, \ldots, X_n$.
Theorem: If X is a random variable with mean $\mu = E(X)$, and if a and b are constants, then:
$E(aX \pm b) = a\,E(X) \pm b \Leftrightarrow \mu_{aX \pm b} = a\,\mu_X \pm b$
Corollary 1: E(b) = b
Corollary 2: E(aX) = a E(X)
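Example: Let X be a random variable with the following probability density function:
$f(x) = \frac{x^2}{3}$ for $-1 < x < 2$; $f(x) = 0$ elsewhere
Find E(4X + 3).
Solution:
$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_{-1}^{2} x \left[\frac{x^2}{3}\right] dx = \frac{1}{3}\int_{-1}^{2} x^3\,dx = \frac{1}{3}\left[\frac{x^4}{4}\right]_{-1}^{2} = \frac{5}{4}$
$E(4X + 3) = 4\,E(X) + 3 = 4\left(\frac{5}{4}\right) + 3 = 8$
Another solution: $E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx$; $g(X) = 4X + 3$
$E(4X + 3) = \int_{-\infty}^{\infty} (4x + 3)\,f(x)\,dx = \int_{-1}^{2} (4x + 3)\left[\frac{x^2}{3}\right] dx = 8$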
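Theorem: If $X_1, X_2, \ldots, X_n$ are n random variables and $a_1, a_2, \ldots, a_n$ are constants, then:
$E(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1 E(X_1) + a_2 E(X_2) + \cdots + a_n E(X_n)$, that is, $E\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i E(X_i)$
Corollary: If X and Y are random variables, then: $E(X \pm Y) = E(X) \pm E(Y)$
Theorem: If X is a random variable with variance $\mathrm{Var}(X) = \sigma_X^2$, and if a and b are constants, then:
$\mathrm{Var}(aX \pm b) = a^2\,\mathrm{Var}(X) \Leftrightarrow \sigma_{aX + b}^2 = a^2 \sigma_X^2$
Theorem: If $X_1, X_2, \ldots, X_n$ are n independent random variables and $a_1, a_2, \ldots, a_n$ are constants, then:
$\mathrm{Var}(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1^2\,\mathrm{Var}(X_1) + a_2^2\,\mathrm{Var}(X_2) + \cdots + a_n^2\,\mathrm{Var}(X_n)$
that is, $\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i) \Leftrightarrow \sigma_{a_1 X_1 + a_2 X_2 + \cdots + a_n X_n}^2 = a_1^2 \sigma_{X_1}^2 + a_2^2 \sigma_{X_2}^2 + \cdots + a_n^2 \sigma_{X_n}^2$
Corollary: If X and Y are independent random variables, then:
$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$
$\mathrm{Var}(aX - bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$
$\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$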
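Example: Let X and Y be two independent random variables such that E(X) = 2, Var(X) = 4, E(Y) = 7, and Var(Y) = 1. Find:
1. E(3X + 7) and Var(3X + 7)
2. E(5X + 2Y - 2) and Var(5X + 2Y - 2)
Solution:
1. $E(3X + 7) = 3\,E(X) + 7 = 3(2) + 7 = 13$
$\mathrm{Var}(3X + 7) = (3)^2\,\mathrm{Var}(X) = (3)^2 (4) = 36$
2. $E(5X + 2Y - 2) = 5\,E(X) + 2\,E(Y) - 2 = (5)(2) + (2)(7) - 2 = 22$
$\mathrm{Var}(5X + 2Y - 2) = \mathrm{Var}(5X + 2Y) = 5^2\,\mathrm{Var}(X) + 2^2\,\mathrm{Var}(Y) = (25)(4) + (4)(1) = 104$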
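The rules for means and variances of linear combinations can be packaged as two small helpers. This Python sketch takes E(X) = 2, Var(X) = 4, E(Y) = 7, Var(Y) = 1, and reads the second expression as 5X + 2Y - 2, which is an assumption consistent with the stated variance of 104. The function names are hypothetical, and the variance helper is valid only for independent variables.

```python
def mean_linear(coeffs, means, const=0):
    """E(a1*X1 + ... + an*Xn + const) = a1*E(X1) + ... + an*E(Xn) + const."""
    return sum(a * m for a, m in zip(coeffs, means)) + const

def var_linear_independent(coeffs, variances):
    """Var(a1*X1 + ... + an*Xn + const) for independent X_i.

    The additive constant does not change the variance.
    """
    return sum(a * a * v for a, v in zip(coeffs, variances))

E_X, VAR_X = 2, 4
E_Y, VAR_Y = 7, 1

# 1. 3X + 7
mean_1 = mean_linear([3], [E_X], const=7)
var_1 = var_linear_independent([3], [VAR_X])

# 2. 5X + 2Y - 2 (assumed form of the second expression)
mean_2 = mean_linear([5, 2], [E_X, E_Y], const=-2)
var_2 = var_linear_independent([5, 2], [VAR_X, VAR_Y])

print(mean_1, var_1, mean_2, var_2)
```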