Math493 - Fall HW 2 Solutions


Renato Feres - Wash. U.

Preliminaries. In this assignment you will do a few more simulations in the style of the first assignment to explore conditional probability, Bayes' theorem, and also play with the concept of random variables. Some useful R tricks are introduced below: creating your own functions, and plotting graphs and histograms. I also elaborate a bit on the notion of random variables.

Conditional probability. To illustrate the idea of conditional probability, consider the following problem. From a bowl containing five red, three white, and seven blue chips, select four at random and without replacement. Compute the conditional probability of one red, zero white and three blue chips, given that there are at least three blue chips in this sample of four chips.

Suppose that the bowl has a total of N chips, of which N_r are red, N_w are white, and N_b are blue. Therefore, N_r + N_w + N_b = N. We select a sample of size n from the bowl, at random and without replacement, and ask for the probability that there are n_r red, n_w white, and n_b blue chips in the sample, so that n_r + n_w + n_b = n. I will indicate this event simply by (n_r, n_w, n_b). Then an elementary counting argument already used in Homework Set 1 gives the probability

$$P(n_r, n_w, n_b) = \frac{\binom{N_r}{n_r}\binom{N_w}{n_w}\binom{N_b}{n_b}}{\binom{N}{n}}.$$

It is not hard to see how this generalizes to any number of colors. Keep this example in mind as a model for the general (multivariate) hypergeometric distribution. This is the key information you need to solve the problem by hand, in addition to the definition of conditional probability:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

(Hint: A = (1,0,3) and B is the union of the events (1,0,3), (0,1,3), and (0,0,4).)
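
The binomial coefficients in the formula above can be evaluated in R with the built-in choose function. The following is a minimal sketch for checking the individual hypergeometric probabilities appearing in the hint (the name pmf is my own label for this helper, not part of the assignment):

#Multivariate hypergeometric probabilities for this bowl (5 red, 3 white, 7 blue), sample of 4:
pmf=function(nr,nw,nb) choose(5,nr)*choose(3,nw)*choose(7,nb)/choose(15,4)
pmf(1,0,3) #P(1,0,3)
pmf(0,1,3) #P(0,1,3)
pmf(0,0,4) #P(0,0,4)
#Since A is contained in B, P(A|B) = pmf(1,0,3)/(pmf(1,0,3)+pmf(0,1,3)+pmf(0,0,4)).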

I won't give away the answer, but if you'd like to check your calculation, try to run a simulation of the problem. The following script does this. It also illustrates how conditional probability may be approximated by Monte Carlo simulation. (You may need to go back to the R tutorial in the preliminaries section of the first homework assignment to make sense of the programs below.)

Chips=c(1,1,1,1,1,2,2,2,3,3,3,3,3,3,3) #1 for red, 2 for white, 3 for blue
N=10^5 #Number of trials of the experiment (an arbitrary large value)
Total_s=0 #Counts the number of successes (right number of
          #red, white and blue chips in sample of 4)
Total_c=0 #Counts the number of trials with at least 3 blue
          #chips in sample of 4
for(i in 1:N){
  a=sample(Chips,4,replace=FALSE) #Select 4 chips
  #Count number of red, white and blue chips in sample
  r=sum(a==1)
  w=sum(a==2)
  b=sum(a==3)
  if (b>=3){
    Total_c=Total_c+1
    #Success means: 1 red, 0 white and 3 blue chips:
    Total_s=Total_s+(r==1 && w==0 && b==3)
  }
}
Total_s/Total_c #Relative frequency approximates probability

One run of this program gave me a numerical estimate of the conditional probability that the sample of size 4 contains one red, zero white and three blue chips given that it contains at least three blue chips.

Bayesian probability. Bayes' theorem is a very simple but extremely significant idea. It is at the heart of many subjects, from machine learning to Bayesian statistics, which is a method of inference in which Bayes' theorem is used to revise our estimate of some hypothesis as additional data are acquired. The following is a standard textbook example. (It may serve as a model for one of the homework problems, below.)

Let A be the event that a given person has a disease for which a clinical test is available, and let B be the event that the test gives a positive reading. We may have prior information as to how reliable the test is, so we may already know the conditional probability P(B | A) that if the test is given to a person who is known to have the disease, the reading will be positive, and similarly P(B | A^c), the probability of a positive reading if the person is healthy (i.e., a false positive). We may also know how common or rare the disease is in the population at large, so the prior probability P(A) that a person has the disease may be known. Assume that 3% of the population has the disease, that the probability of a diseased person being tested positive is 98%, and of a healthy person being tested positive is 1%.

1. What is the probability of actually having the disease given that the test is positive? By Bayes' formula, this probability is given by

$$P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid A^c)P(A^c)}.$$

Substituting the numerical values:

$$P(A \mid B) = \frac{0.98 \times 0.03}{0.98 \times 0.03 + 0.01 \times 0.97} \approx 0.752.$$

This is approximately 75%.

2. What is the probability of having the disease given that the test is negative? Similarly,

$$P(A \mid B^c) = \frac{P(B^c \mid A)P(A)}{P(B^c \mid A)P(A) + P(B^c \mid A^c)P(A^c)}.$$

Substituting the numerical values:

$$P(A \mid B^c) = \frac{(1-0.98) \times 0.03}{(1-0.98) \times 0.03 + (1-0.01) \times 0.97} \approx 0.0006.$$

This is approximately 0.06%.
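
As a quick sanity check of the arithmetic, both posterior probabilities can be evaluated in R in a few lines (a sketch using only the numbers given above; the variable names are my own):

pA=0.03 #prior probability of having the disease
pBA=0.98 #P(B|A), probability of a positive test given the disease
pBAc=0.01 #P(B|A^c), probability of a false positive
pBA*pA/(pBA*pA+pBAc*(1-pA)) #P(A|B), approximately 0.752
(1-pBA)*pA/((1-pBA)*pA+(1-pBAc)*(1-pA)) #P(A|B^c), approximately 0.0006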

We can confirm the above results by stochastic simulation. This amounts to running many independent trials (say N, which we think of as the number of persons tested) of the following random experiment: (1) to each person assign the condition of having the disease or not at random and independently, with the given probability for having the disease; (2) independently, assign to this person a test result, with the given probabilities for a positive or negative test result; (3) keep a counter for the number of times the test is positive, and for the number of times that both the test is positive and the person has the disease. The probability that the person has the disease given that the test is positive is then approximated by the fraction of individuals with the disease among the individuals who tested positive. The following program implements this in R.

N = 10^5 #Number of persons selected (the larger the better; this value is arbitrary)
#Each person is randomly "assigned" a value 1 (has the disease) or 0 (does not
#have it). So D[j]=1 if person j has the disease and 0 otherwise.
D = sample(c(0,1),N,replace=TRUE,c(0.97,0.03))
#Initialize the vector T of test results: T[j]=1 if test is positive
#for person j and 0 if negative.
T = 0*c(1:N) #Arbitrarily giving test result 0 to all persons, simply to initialize T.
#Now person j is tested (this will assign T[j] a test result value):
for (j in 1:N){
  if (D[j]==1) {
    T[j]=sample(c(0,1),1,replace=TRUE,c(0.02,0.98))
  } else {
    T[j]=sample(c(0,1),1,replace=TRUE,c(0.99,0.01))
  }
}
#Now create a vector TD whose jth entry is 1 if the jth person both has the
#disease and tested positive, and 0 otherwise. Note that this vector
#is simply the elementwise product of T and D:
TD = T*D
f = sum(TD)/sum(T) #This fraction approximates the conditional probability we want.
f

One run of the script gave me a value of f in good agreement with the result obtained analytically.

The following script is equivalent, but avoids the for loop. It uses the R function which and employs vector operations more fully. This helps to speed up the program considerably. (Observe that I have used a much bigger N than in the first version. Run the two versions to see the difference in speed.)

N = 10^6 #Number of persons selected (again an arbitrary large value)
#Each person is randomly "assigned" a value 1 (has the disease) or 0 (does not
#have it). So D[j]=1 if person j has the disease and 0 otherwise.
D = sample(c(0,1),N,replace=TRUE,c(0.97,0.03))
#Initialize the vector T of test results: T[j]=1 if test is positive
#for person j and 0 if negative.
T = 0*c(1:N) #Arbitrarily giving test result value 0 to all the persons.
#Now look for the indices of those persons with and without the disease:
d1 = which(D==1)
d2 = which(D==0)
n1 = sum(D==1)
n2 = N-n1
#Assign test values to the persons:
T[d1] = sample(c(0,1),n1,replace=TRUE,c(0.02,0.98))
T[d2] = sample(c(0,1),n2,replace=TRUE,c(0.99,0.01))
#Now create a vector TD whose jth entry is 1 if the jth person both has the
#disease and tested positive, and 0 otherwise. Note that this vector
#is simply the elementwise product of T and D:
TD = T*D
f = sum(TD)/sum(T) #This fraction approximates the conditional probability we want.
f

This gave me a value of f close to the first version's. I then ran the same script with a still bigger N (it took about one minute on my Mac) and the result agreed with the analytical solution (part 1 of the problem) to 4 decimal places.

User defined functions in R. A very useful device in R is to create your own functions. The following examples should give you the basic idea.

#We define three functions, f, g, h.
#Note that h is a piecewise defined function.
#You may enter these function definitions directly on the command line console in R.
f=function(x) x^2+3*x-5
g=function(x) 2*sin(3*x)-5*cos(5*x)
h=function(x) {
  f(x)*(x<0)+g(x)*(x>=0)
}
#Now we calculate a few values:
> f(4)
[1] 23
> g(-2)
[1] 4.754189
> f(g(1))
[1] -7.117556
> h(-1)-f(-1)
[1] 0
#We can plot a graph of h over the interval [-5,2*pi] as follows.

x=seq(from=-5, to=2*pi, by=(2*pi+5)/1000)
#The sequence x consists of 1001 equally spaced numbers between -5 and 2*pi.
#The following shows how to plot h(x) over this sequence, adding labels and a title.
#You can choose different types of line. The simplest choice is type="l" for line.
plot(x,h(x),type="l",main="Graph of the function h",xlab="x variable",ylab="y variable")
grid() #Add a grid.
#The resulting graph is shown below.

[Figure: graph of the function h]

Random variables. I will need to get a little abstract for a moment. Please bear with me here, as there is a concrete and very useful point about computer simulation worth noting in the end. (See the main remark below.) The concept of random variable is introduced in Chapter 4 of the textbook. I will present it here in a more mathematically formal fashion.

Let $(\mathcal{X}, \mathcal{F}, P)$ be a probability space, where $\mathcal{X}$ is a sample space, or set of outcomes (it is simply a set; I had used the symbol X for this in class, but I'll reserve X for random variables here), $\mathcal{F}$ is a σ-algebra (whose elements are subsets of $\mathcal{X}$ referred to as events), and P is a probability measure on $\mathcal{F}$. Recall the axioms for $\mathcal{F}$ and P:

Axioms of the σ-algebra of events $\mathcal{F}$:
1. $\mathcal{X} \in \mathcal{F}$;
2. if $A \in \mathcal{F}$ then $A^c \in \mathcal{F}$;
3. if $A_1, A_2, \dots \in \mathcal{F}$ then $\bigcup_i A_i \in \mathcal{F}$.

In class I used the term algebra for $\mathcal{F}$, but σ-algebra ("sigma-algebra") is the technically proper name.

Axioms of the probability function $P : \mathcal{F} \to [0,1]$:
1. $P(\mathcal{X}) = 1$;
2. if $A_1, A_2, \dots$ are mutually disjoint events, then $P\left(\bigcup_i A_i\right) = \sum_i P(A_i)$.

One often thinks of $(\mathcal{X}, \mathcal{F}, P)$ as the mathematical specification of a random experiment. Now let Y be some other set and $X : \mathcal{X} \to Y$ a function. (I like to indicate functions by such arrows. This means that $\mathcal{X}$ is the domain of X and $X(x) \in Y$ for each $x \in \mathcal{X}$.) A random variable is simply any such function. (But see a refinement of this definition below.) A couple of points to observe. First note that Y can be given its own σ-algebra, which I will denote by $\mathcal{F}_X$, defined as follows:

A subset D of Y belongs to $\mathcal{F}_X$ if (by definition)

$$X^{-1}(D) := \{x \in \mathcal{X} : X(x) \in D\}$$

is an event in the σ-algebra $\mathcal{F}$. (I use the symbol := to indicate that the left-hand side of this equality is defined by the right-hand side.) Furthermore, X induces a probability measure $P_X$ on $\mathcal{F}_X$ as follows. For any $D \in \mathcal{F}_X$ (hence $D \subseteq Y$),

$$P_X(D) := P\left(X^{-1}(D)\right). \qquad (1)$$

Notice that we are using the probability measure P on $\mathcal{F}$ and the function X to define a probability measure $P_X$ on $\mathcal{F}_X$. The measure $P_X$ is often called the law of X. The following exercise is optional, but think about it for a few minutes!

Exercise 1. Show that $(Y, \mathcal{F}_X, P_X)$ satisfies the axioms of a probability space.

It may be the case that Y already had a σ-algebra of its own before we chose X. Let $\mathcal{F}'$ denote this σ-algebra. Then for $X : \mathcal{X} \to Y$ to be called a random variable, and for the probability measure $P_X$ to make logical sense, we require that $X^{-1}(D)$ be an event in $\mathcal{F}$ for every event $D \in \mathcal{F}'$. In other words, we require $\mathcal{F}' \subseteq \mathcal{F}_X$. For example, if the random variable X takes values in R, then we always consider the σ-algebra of subsets of R generated from intervals through the use of basic set operations applied at most countably many times. This is called the Borel σ-algebra. (More on this in the next bullet point.) In this case, we say that the function $X : \mathcal{X} \to \mathbb{R}$ is a random variable if every set of the form $X^{-1}(I)$ belongs to $\mathcal{F}$, where I is any interval.

Random numbers in [0,1]. Consider the experiment of choosing a random (real) number between 0 and 1 so that all numbers are equally likely to be chosen. This experiment is formally represented by the probability space $([0,1], \mathcal{B}, \lambda)$, where $\mathcal{B}$ is the Borel σ-algebra (whose elements are called Borel sets), which is the σ-algebra generated by the open intervals of R intersected with the unit interval; and λ is the length measure, also called the Lebesgue measure, that assigns the value b - a to an interval [a,b] ⊆ [0,1]. That is, λ([a,b]) = b - a. (Technical point: one typically throws into $\mathcal{B}$ also all those subsets of [0,1] which are contained in Borel sets of zero length. This process of adding all sets that ought to be assigned measure 0 but may not have been in $\mathcal{B}$ initially is called completion of the probability space.)

Finally, here's the main remark: It turns out that, for essentially anything we could possibly want to do in this course (as well as in more advanced graduate level courses in probability theory), the domain of our random variables (say, taking on numerical values) can be assumed simply to be $([0,1], \mathcal{B}, \lambda)$. This makes the concept of a random variable very concrete. The moral of this long story is this: If we have any means of producing random numbers in [0,1] with the uniform probability distribution, then we can simulate essentially any other numerical random variable of interest by constructing an appropriate function X from [0,1] into R. Other target spaces for X that will be useful to us are the n-dimensional coordinate spaces $\mathbb{R}^n$. The precise statement and proof of this general claim belong to a more advanced course, but the following discussion should hint at why it may be true.

Discrete random variables and the indicator function of sets. In R, the random experiment of producing a uniformly distributed random number between 0 and 1 amounts to invoking the function runif. Thus, for example, to produce a sequence of 5 such random numbers:

> runif(5)
[1] (five random numbers between 0 and 1; the output varies from run to run)

We can define a (mathematical) fair coin as a random variable $X : [0,1] \to \{0,1\}$, where 0 stands for Heads and 1 for Tails, as follows:

$$X(x) = \begin{cases} 0 & \text{if } x \in [0,1/2) \\ 1 & \text{if } x \in [1/2,1] \end{cases}$$

Then flipping a coin amounts to choosing a random number x ∈ [0,1] with probability distribution λ (the uniform distribution) and evaluating X(x). More generally, if p is the probability of heads, then the random experiment of flipping the possibly biased coin amounts to the function

$$X(x) = \begin{cases} 0 & \text{if } x \in [0,p) \\ 1 & \text{if } x \in [p,1] \end{cases}$$

That this does what is expected can be shown as follows:

$$P_X(\{0\}) = \lambda(X^{-1}(\{0\})) = \lambda(\{x \in [0,1] : X(x) = 0\}) = \lambda([0,p)) = p - 0 = p.$$

Similarly, we have $P_X(\{1\}) = 1 - p$. This random variable may be implemented as an R function as follows.

Coin=function(n,p) {
  x=runif(n)
  1*(x>=p)
}

If I want 5 independent tosses of a fair coin:

> Coin(5,1/2)
[1] (a sequence of five 0s and 1s; the output varies from run to run)

Suppose now that X is the random variable taking on values in $\{a_1, \dots, a_k\}$ with probabilities $p_1, \dots, p_k$, respectively, where the $a_i$ are simply numbers. There is a convenient way to write X using the so-called indicator function of a set. (Another name for it is the characteristic function of the set.) If A is a (Borel) subset of [0,1], we let $1_A(x)$ represent the function which is 1 if x ∈ A and 0 if not:

$$1_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \in A^c \end{cases}$$

We want to write X as a linear combination of indicator functions. Let $A_1, A_2, \dots, A_k$ form a partition of [0,1] such that $\lambda(A_i) = p_i$. Any partition will do so long as the lengths are equal to the $p_i$. For example, let the $A_i$ be the intervals

$$A_1 = [0, p_1),\quad A_2 = [p_1, p_1+p_2),\quad A_3 = [p_1+p_2, p_1+p_2+p_3),\quad \dots,\quad A_k = [p_1+\dots+p_{k-1}, 1].$$

Note that these sets indeed form a partition of [0,1] (meaning that they are disjoint Borel sets and their union is all of [0,1]) and each $A_i$ is an interval of length

$$\lambda(A_i) = (p_1 + \dots + p_{i-1} + p_i) - (p_1 + \dots + p_{i-1}) = p_i.$$
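
As a quick empirical check that the biased coin defined above really has $P_X(\{0\}) = p$, we can draw many tosses with the Coin function and compare the relative frequency of heads with p. A sketch (the value p = 0.3 is an arbitrary choice):

p=0.3
tosses=Coin(10^5,p)
mean(tosses==0) #relative frequency of heads; should be close to p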

Now, the punchline of this story is that any discrete random variable X with k possible values can be written as

$$X(x) = \sum_{j=1}^{k} a_j 1_{A_j}(x). \qquad (2)$$

In fact, it was not important that k be finite; the same argument works also for k = ∞. It is also easy to check from the definition of $P_X$ that $P_X(\{a_j\}) = \lambda(A_j) = p_j$ for each j. Allowing k to be infinite, equation (2) gives in fact the general form of a discrete random variable.

Indicator functions are easy to implement in R using the logical relations:

==   Equals
!=   Not equal
>    Greater than
>=   Greater than or equal to
<    Less than
<=   Less than or equal to
||   Or
|    Or (use with vectors and matrices)
&&   And
&    And (use with vectors and matrices)

As an example, suppose that X is a random variable taking on the possible values -2, 5, 6 with probabilities 1/2, 1/3, 1/6, respectively. Here is a user defined R function implementing this random variable.

#Defining the indicator functions of the three intervals
Ind1=function(x) (x>=0 & x<1/2)
Ind2=function(x) (x>=1/2 & x<1/2+1/3)
Ind3=function(x) (x>=1/2+1/3 & x<=1)
#Now define the random variable X
X=function(x) -2*Ind1(x)+5*Ind2(x)+6*Ind3(x)

To see the graph of X:

#Create a sequence in the interval [0,1]
x=seq(from=0,to=1,by=1/1000)
plot(x,X(x),type="l",main="Graph of the random variable X")
grid()
#The graph is shown next. Note that X is discontinuous at 1/2 and 5/6.

[Figure: graph of the random variable X]

All we need to know about X is already contained in its probability distribution $P_X$. From the definition of X we have $P_X(-2) = 1/2$, $P_X(5) = 1/3$, $P_X(6) = 1/6$. It is possible to approximate this distribution by Monte Carlo simulation. We simply draw a large number of independent values of X and plot the relative frequency of the three values -2, 5, 6. Such a frequency plot is also called a histogram. I will use this example as a model. The idea is to generate a large number of sample values of X and then represent the frequency (or relative frequency) of each value in {-2, 5, 6} by the height of a bar in a bar chart. Here it is:

#First I will generate, say, 10000 random values of X with the given
#probabilities.
#As noted above, this amounts to generating random numbers in [0,1]
#with the uniform distribution, then evaluating the function X at
#those numbers.
x=runif(10000)
Xvalues=X(x)
#Now the hist command. It requires specifying the break points of the
#bins, or intervals, containing the values of X. This is done as follows:
breakpoints=c(-2.5,-1.5,-0.5,0.5,1.5,2.5,3.5,4.5,5.5,6.5)
hist(Xvalues,breaks=breakpoints,prob=TRUE)
#prob=TRUE indicates that we want the relative frequencies rather
#than absolute counts.
#The histogram is shown below.

[Figure: histogram of Xvalues]

It is good to know how to translate mathematical expressions into R more or less verbatim, as we just did, but in practice we could have used the built-in function sample for generating random values of X. Here is how this would look:

#Create a vector of values
x=c(-2,5,6)
#Create a vector of probabilities
p=c(1/2,1/3,1/6)
#Now generate sample values of X
Xvalues=sample(x,10000,replace=TRUE,p)
#From this we could go on to plot the histogram just as above.

Continuous random variables. The above idea of representing discrete random variables as linear combinations of indicator functions of sets in [0,1] suggests that one could similarly approximate continuous random variables by such stepwise functions. We will explore this idea later in the course. For now I just note the conclusion that continuous random variables can also be represented as functions $X : [0,1] \to \mathbb{R}$, now no longer taking values in a discrete set. Just as in the discrete case, to find sample values of X we choose random numbers x in [0,1] with the uniform probability distribution and evaluate X(x). The probability distribution $P_X$ is just as given in the general definition above. (See the section on "Random variables.") Continuous random variables are characterized by the property that $P_X$ has a probability density. This means that there is a (often continuous) function f(x) on the range of values of X such that

$$P_X(A) = \int_A f(x)\,dx.$$

We can visualize the density function f(x) in a similar way as we did above for the discrete random variables, using a histogram. By making the break points of the histogram bins (the bases of the rectangular bars) sufficiently close together, we can make it look as continuous as we want. (An alternative is to use the R function density.) This is illustrated with the next example.

#First I create a continuous random variable. This is an arbitrary example:
X=function(x) {
  100*x*(1-x)*(x-0.5)^2
}
#Let us see first what this function looks like by plotting its graph:
x=seq(from=0,to=1,by=1/1000)
plot(x,X(x),type="l")

[Figure: graph of X(x)]

#We now generate 10^6 (this number is arbitrary) independent values of X
x=runif(10^6)
Xvalues=X(x)
#We now produce a histogram plot approximating the probability density of X.
#In the argument breaks I now simply specify the number of bins rather than exact breakpoints.
hist(Xvalues,breaks=100,prob=TRUE)
#The plot is below.

[Figure: histogram of Xvalues]

Can you explain qualitatively the shape of this histogram? Note that the two peaks seem to coincide with the values of X where the derivative of X equals 0.
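
Two optional follow-ups, assuming the objects X and Xvalues from the script above are still in the workspace. The first overlays the smooth estimate produced by the density function mentioned earlier; the second locates the critical points of X numerically with optimize, which shows where the peaks sit. This is only a sketch:

lines(density(Xvalues)) #smooth density estimate drawn on top of the histogram
#X has zero derivative at x=0.5 (where X=0, the left peak) and at two maxima:
optimize(X,c(0,0.5),maximum=TRUE) #maximum near x=0.15, value about 1.56
optimize(X,c(0.5,1),maximum=TRUE) #maximum near x=0.85, value about 1.56
#So the right-hand peak of the histogram sits near the common maximum value 1.5625.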

Problems

1. Each bag in a large box contains 25 tulip bulbs. It is known that 60% of the bags contain bulbs for 5 red and 20 yellow tulips, while the remaining 40% of the bags contain bulbs for 15 red and 10 yellow tulips. A bag is selected at random and a bulb taken at random from this bag is planted. (a) What is the probability that it will be a yellow tulip? (b) Given that it is yellow, what is the conditional probability it comes from a bag that contained 5 red and 20 yellow bulbs? Solve this problem by hand. (That is, not using computer simulation.)

Solution. Part (a). Let Y be the event that the bulb selected will be a yellow tulip, and R the event that it will be red. Let $B_1$ be the event that the bulb is taken from the first kind of bag and $B_2$ that it was taken from the second kind. From the data given we can say:

$$P(Y \mid B_1) = \frac{20}{25},\quad P(Y \mid B_2) = \frac{10}{25},\quad P(B_1) = 0.6,\quad P(B_2) = 0.4.$$

Then the probability of Y can be obtained using

$$P(Y) = P(Y \mid B_1)P(B_1) + P(Y \mid B_2)P(B_2).$$

Therefore, $P(Y) = \frac{20}{25} \times 0.6 + \frac{10}{25} \times 0.4 = 0.48 + 0.16 = 0.64$.

Part (b). The problem is to find $P(B_1 \mid Y)$. Here we need to invert a conditional probability, which calls for Bayes' theorem:

$$P(B_1 \mid Y) = \frac{P(Y \mid B_1)P(B_1)}{P(Y)} = \frac{0.48}{0.64} = 0.75,$$

where we have used the value of P(Y) obtained in part (a).

2. Confirm your solution to part (a) of the previous problem by doing a computer simulation. In other words, simulate the experiment of selecting a bag at random with the given probabilities, then choosing a bulb at random from that bag. This experiment should be repeated a large number of times, from which you compute the fraction of times bulbs for yellow tulips were selected. This fraction approximates the probability P(Y). What approximate value do you obtain? (Make your approximate answer close to the exact one to at least two digits after the decimal point by choosing a sufficiently large number of trials of the experiment.)

Solution. The following program does it. The number of times the experiment is repeated is somewhat arbitrary. It is big enough to give a precise solution but not so big that I have to wait a long time to get the answer. The problem of how big is big enough for some desired precision is a statistical problem that we will (in principle, even if we may not have the time to get to it) be able to answer using the methods of this course.

N=10^5 #Number of times the experiment is repeated (an arbitrary large value)
Y=0 #Initializing the count of yellow tulips.
for (j in 1:N){
  bag=sample(c(1,2),1,replace=TRUE,c(0.6,0.4))
  if (bag==1){
    #Here we pick a bulb from bag 1.
    #a==1 means yellow, 2 means red.
    a=sample(c(1,2),1,replace=TRUE,c(20/25,5/25))
    Y=Y+(a==1)
  } else {
    #Here we pick a bulb from bag 2. Again,
    #a==1 means yellow, 2 means red.
    a=sample(c(1,2),1,replace=TRUE,c(10/25,15/25))
    Y=Y+(a==1)
  }
}
#The fraction of yellow tulips is
Y/N

The value I obtained in one run of this program was in good agreement with the exact answer 0.64.

3. A chemist wishes to detect an impurity in a certain compound that she is making. There is a test that detects an impurity with probability 0.90; however, this test indicates that an impurity is there when it is not about 5% of the time. The chemist produces compounds with the impurity about 20% of the time. A compound is selected at random from the chemist's output. The test indicates that an impurity is present. What is the conditional probability that the compound actually has the impurity? (a) Find the exact solution for the conditional probability. (b) Obtain the solution by simulation, as in the above example of the clinical test. Your solution should be correct to at least two decimal places.

Solution. Part (a). This is a standard application of Bayes' theorem. Let D be the event that the test detects the impurity, and I the event that the impurity is present in the compound. The event that the impurity is not present will be denoted by $I^c$ (the complementary event). From the problem's data we have

$$P(D \mid I) = 0.9,\quad P(D \mid I^c) = 0.05,\quad P(I) = 0.2.$$

We wish to find $P(I \mid D)$, the probability that the impurity is present given that the test was positive. The probability of a positive test is

$$P(D) = P(D \mid I)P(I) + P(D \mid I^c)P(I^c) = 0.9 \times 0.2 + 0.05 \times 0.8 = 0.22.$$

We can now use Bayes' formula:

$$P(I \mid D) = \frac{P(D \mid I)P(I)}{P(D)} = \frac{0.18}{0.22} \approx 0.818.$$

Part (b). The program I used is a simple modification of the one for the clinical test example.

N = 10^5 #Number of trial compounds (an arbitrary large value)
pD = 0.9 #Probability that test detects impurity
pFP = 0.05 #Probability of a false positive
pCI = 0.2 #Probability a compound has the impurity
#Each compound is "assigned" a value 1 (has the impurity) or 0 (does not
#have it). So C[j]=1 if compound j has the impurity and 0 otherwise.
C = sample(c(0,1),N,replace=TRUE,c(1-pCI,pCI))
#Initialize the vector T of test results: T[j]=1 if test is positive
#for compound j and 0 if negative.
T = 0*c(1:N) #Arbitrarily giving test result value 0 to all compounds.
#Now compound j is tested (this will assign T[j] a test result value):
for (j in 1:N){
  if (C[j]==1) { #If compound j has the impurity
    T[j]=sample(c(0,1),1,replace=TRUE,c(1-pD,pD))
  } else {
    T[j]=sample(c(0,1),1,replace=TRUE,c(1-pFP,pFP))
  }
}
#Now create a vector TC whose jth entry is 1 if the jth compound both has the
#impurity and tested positive, and 0 otherwise. Note that this vector
#is simply the elementwise product of T and C:
TC = T*C
f = sum(TC)/sum(T) #This fraction approximates the conditional probability we want.
f

One run gave me a value certainly in the ballpark of what was obtained by hand. I also tried a 10 times bigger number of trials and obtained the correct value to 4 decimal places.
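
Just as in the clinical test example, the for loop here can be avoided with which and vector operations. The following sketch mirrors the vectorized script given earlier (it reuses pD, pFP and pCI from the script above) and should run noticeably faster for large N:

N = 10^6 #an arbitrary large value
C = sample(c(0,1),N,replace=TRUE,c(1-pCI,pCI))
T = 0*c(1:N)
i1 = which(C==1) #indices of compounds with the impurity
i0 = which(C==0) #indices of compounds without it
T[i1] = sample(c(0,1),length(i1),replace=TRUE,c(1-pD,pD))
T[i0] = sample(c(0,1),length(i0),replace=TRUE,c(1-pFP,pFP))
sum(T*C)/sum(T) #approximates P(I|D)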

4. Histogram. A random variable $X : [0,1] \to \{1,2,3,4,5\}$ on the Lebesgue probability space $([0,1], \mathcal{B}, \lambda)$ has the following expression:

$$X(x) = 1_{[0,1/5]}(x) + 2 \cdot 1_{(1/5,1/4]}(x) + 3 \cdot 1_{(1/4,1/3]}(x) + 4 \cdot 1_{(1/3,1/2]}(x) + 5 \cdot 1_{(1/2,1]}(x).$$

(Recall that $1_I(x)$ is my notation for the indicator function of a set I ⊆ [0,1]. So, for example, X(0.8) = 5.) (a) Draw the graph of the function X using R along the lines of the example given in the tutorial. (b) What is the value of the probability $P_X(\{1,3,5\})$? (This is to be done by hand.) (c) Using R, obtain a histogram plot for 1000 independent sample values of X.

Solution. Part (a). Here is the script for part (a):

#I first define the indicator functions for the intervals
Ind1=function(x) x>0 & x<=1/5
Ind2=function(x) x>1/5 & x<=1/4
Ind3=function(x) x>1/4 & x<=1/3
Ind4=function(x) x>1/3 & x<=1/2
Ind5=function(x) x>1/2 & x<=1
#The function X is a linear combination of the indicator functions
X=function(x) Ind1(x)+2*Ind2(x)+3*Ind3(x)+4*Ind4(x)+5*Ind5(x)
#Now, plotting the graph of X
x=seq(from=0,to=1,by=1/1000)
plot(x,X(x),type="l")
grid()
#The plot is shown below.

[Figure: graph of X(x)]

Part (b). The probability $P_X$ of the set {1,3,5} is the sum of the lengths of the intervals [0,1/5], (1/4,1/3], and (1/2,1]. This sum is

$$P_X(\{1,3,5\}) = \frac{1}{5} + \left(\frac{1}{3} - \frac{1}{4}\right) + \left(1 - \frac{1}{2}\right) = \frac{47}{60} \approx 0.783.$$

Part (c). To obtain independent values of the random variable X, I will use the function X from part (a).

x=runif(1000)
Xvalues=X(x)
hist(Xvalues,breaks=c(0.5,1.5,2.5,3.5,4.5,5.5),prob=TRUE)

[Figure: histogram of Xvalues]
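
The hand computation in part (b) can also be checked against simulation: the relative frequency of the values 1, 3 and 5 among many sample values of X should be close to 47/60 ≈ 0.783. A one-line sketch, reusing the function X from part (a):

mean(X(runif(10^5)) %in% c(1,3,5)) #should be close to 47/60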

5. Density plots. Produce density plots using the histogram operation in R (as in the above example) for the following random variables defined by functions on [0,1]. (a) X(x) = -log(x). (b) X(x) = 0.5 - |x - 0.5|. Note: in R the absolute value function is abs. For example, abs(-2) gives 2.

Solution. Part (a).

X=function(x) -log(x)
x=runif(10^6)
Xvalues=X(x)
hist(Xvalues,breaks=50,prob=TRUE)
#The plot is below.

[Figure: histogram of Xvalues]
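
For part (a) there is a known benchmark to compare against: if U is uniform on [0,1], then -log(U) has the exponential distribution with rate 1, since P(-log U ≤ t) = P(U ≥ e^{-t}) = 1 - e^{-t}. So the histogram should match the density e^{-x}, which can be overlaid as a check:

curve(dexp(x),add=TRUE) #exponential(1) density on top of the histogram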

Part (b).

X=function(x) 0.5-abs(x-0.5)
x=runif(10^6)
Xvalues=X(x)
hist(Xvalues,breaks=50,prob=TRUE)
#The plot is below.

[Figure: histogram of Xvalues]
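
Part (b) also has a simple exact answer: |U - 0.5| is uniform on [0, 0.5], hence so is X = 0.5 - |U - 0.5|, and the density of X is the constant 2 on [0, 0.5]. This can be added to the histogram for comparison:

abline(h=2) #the exact density of X is 2 on the interval [0,0.5]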
