DEPARTMENT OF MATHEMATICS AND STATISTICS


Memorial University of Newfoundland
St. John's, Newfoundland, CANADA A1C 5S7
ph. (709) fax (709)

Alwell Julius Oyet, PhD

STATISTICS FOR PHYSICAL SCIENCES: LECTURE NOTES

4. CONTINUOUS RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS

4.1 Continuous Random Variables

Recall that for a discrete random variable, the set of all possible outcomes has to be discrete in the sense that the elements of the set can be listed sequentially, beginning with the smallest number. On the other hand, if the set of all possible values of a random variable is an entire interval of numbers, then the random variable is said to be continuous. In this case, between any two possible values we can find infinitely many other possible values. This implies that the values cannot be listed sequentially.

Example: For instance, let X = length of a randomly chosen rattlesnake. Suppose the smallest rattlesnake has length a and the longest rattlesnake is b units long. Then all possible values lie between a and b, and we write a ≤ x ≤ b as the range of possible values.

Example: Let X = reaction temperature (in °C) in a certain chemical process. Suppose the minimum temperature is −5°C and the maximum temperature for the process is 5°C. Then the set of all possible values is {x : −5 ≤ x ≤ 5}.

It is possible to argue that the restrictions of our measuring instruments may sometimes limit us to a discrete world. For instance, our measurements of temperature may sometimes make us feel that temperature is discrete. However, continuous random variables and distributions often approximate real-world situations very well. Furthermore, problems based on continuous variables are often easier to solve than those based on discrete variables.

4.2 Probability Distributions For Continuous Random Variables

In the discrete case, the probability distribution of a random variable, called the probability mass function (pmf), shows us how the total probability of 1 is distributed among the possible values of the random variable. The probability distribution of a continuous random variable is called the probability density function.

Now, let Y be a continuous random variable. Then a probability density function (pdf) of Y is a function f(y) such that for any two numbers a and b, with a ≤ b,

P(a ≤ Y ≤ b) = ∫_a^b f(y) dy.

That is, if we plot the graph of f(y), the density curve, the probability that Y takes a value between a and b is the area between these two numbers and under the graph of f(y). In other words, a probability involving a continuous random variable is an area under the curve. It therefore makes sense that for any constant c, P(Y = c) = 0; the area under the graph at a single point is zero. This implies that if Y is a continuous random variable, then for any two numbers a and b with a ≤ b,

P(a ≤ Y ≤ b) = P(a < Y ≤ b) = P(a ≤ Y < b) = P(a < Y < b).

We wish to emphasize that the equality of the four probabilities above holds only for continuous random variables.

For any function f(y) to be a legitimate probability density function, it must satisfy

1. f(y) ≥ 0, for all y;
2. ∫_{−∞}^{∞} f(y) dy = total area under the density curve = 1.

Example: Problem 5, Page 151. A college professor never finishes his lecture before the end of the hour and always finishes his lectures within 2 min after the hour. Let X = the time that elapses between the end of the hour and the end of the lecture, and suppose the pdf of X is

f(x) = kx², 0 ≤ x ≤ 2; 0 otherwise.

(a) Find the value of k.
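A quick numeric sketch of part (a), not the book's worked solution: the second legitimacy condition forces the total area under f(x) = kx² on [0, 2] to equal 1, and since the integral of x² from 0 to 2 is 8/3, this gives k = 3/8.

```python
# Sketch for Problem 5(a): choose k so that f(x) = k*x^2 integrates to 1 on [0, 2].

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

k = 1.0 / integrate(lambda x: x**2, 0.0, 2.0)   # total area must equal 1
print(round(k, 4))   # 0.375, i.e. k = 3/8
```

The same midpoint-rule helper can be reused to check any of the area computations in this section.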

(b) What is the probability that the lecture ends within 1 min of the end of the hour?

Solution:

In our discussion of discrete random variables we studied some random variables with special mass functions, such as the binomial, hypergeometric and Poisson. The first random variable with a special density function we shall discuss is the uniformly distributed random variable. By definition, a continuous random variable X is said to have a uniform distribution on the interval [A, B] if the pdf of X is

f(x; A, B) = 1/(B − A), A ≤ x ≤ B; 0 otherwise.

This is the continuous counterpart of the notion of equally likely outcomes of a sample space.

Example: Problem 7, Page 151. The time X (min) for a lab assistant to prepare the equipment for a certain experiment is believed to have a uniform distribution with parameters 25 and 35.

(a) Write the pdf of X and sketch its graph.
(b) For any a such that 25 < a < a + 2 < 35, what is the probability that preparation time is between a and a + 2 min?

Solution:

4.3 Cumulative Distribution Functions and Expected Values

The cumulative distribution function (cdf) F(x) of a continuous random variable is, for every number x, the entire area to the left of the point x under the density curve f(x) of X. That is,

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(y) dy.

Notice that the variable in the integrand has been changed to y in order not to confuse it with the upper limit of the integral. For the purpose of illustration, let us determine the cdf of the uniform random variable X in Problem 7, Page 151. Now, for this problem,

f(x; 25, 35) = 1/10, 25 ≤ x ≤ 35; 0 otherwise.

It follows that for every x in [25, 35),

F(x) = ∫_{−∞}^{x} f(y) dy = ∫_{25}^{x} (1/10) dy = (x − 25)/10.
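The uniform pdf and the cdf just derived can be coded directly; this is a small sketch of Problem 7's distribution, with F(x) = (x − 25)/10 on the interval.

```python
# Uniform(25, 35) density and cumulative distribution function.

def pdf(x, A=25.0, B=35.0):
    """Uniform density: constant 1/(B - A) on [A, B], zero elsewhere."""
    return 1.0 / (B - A) if A <= x <= B else 0.0

def cdf(x, A=25.0, B=35.0):
    """Area under the density to the left of x."""
    if x < A:
        return 0.0
    if x >= B:
        return 1.0
    return (x - A) / (B - A)

print(cdf(30))                       # halfway through the interval: 0.5
print(round(cdf(33) - cdf(31), 4))   # P(31 <= X <= 33) = 0.2
```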

Therefore the cdf of X = time for a lab assistant to prepare equipment is

F(x) = 0, x < 25; (x − 25)/10, 25 ≤ x < 35; 1, x ≥ 35.

The idea of complements can also be used to compute probabilities when dealing with continuous random variables. If X is a continuous random variable and a is any constant, then

P(X ≥ a) = P(X > a) = 1 − P(X ≤ a) = 1 − F(a).

Some students may have noticed that this is different from the result for discrete random variables. Recall that if Y is a discrete random variable and a is any integer, then

P(Y ≥ a) = 1 − P(Y < a) = 1 − F(a − 1).

Students should therefore take note of this difference between discrete and continuous random variables. Also, we note that if X is a continuous random variable and a and b are constants such that a ≤ b, then

P(a ≤ X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a).

Again, this is slightly different from what we would have done if X were discrete.

Percentiles of a Continuous Distribution

Now, we have seen that given a value, say x = 4, and the density function of the continuous random variable X, we can compute the cdf of X at the point x = 4. For instance, let

f(x) = 0.5x, 0 ≤ x ≤ 2; 0 otherwise.

Then it is easy to verify that

F(x) = 0, x < 0; 0.25x², 0 ≤ x < 2; 1, x ≥ 2.

Thus, we can see that at x = 3, F(3) = 1 and F(−0.2) = 0. Similarly, at x = 1.5, F(1.5) = 0.25(1.5)² = 0.5625. Now, the question is, suppose we know that F(x) = 0.375 and we are given the density function f(x); is it possible to find the corresponding value of x, say x = c? In this case, solving 0.25x² = 0.375 shows that the value of x that gives F(x) = 0.375 is x = √1.5 ≈ 1.22. In statistics, we call the point x = √1.5 the (100 × 0.375)th = 37.5th percentile of the distribution of the continuous random variable X. Students may have observed that for both discrete and continuous random variables, the smallest value of the cdf is zero and the largest value is 1. A general definition of the percentile of a distribution is as follows.

Definition: Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous random variable X, which we shall denote by c, is that value for which

p = F(c) = ∫_{−∞}^{c} f(y) dy.

Important cases of p are as follows.

(a) p = 0.5 gives the 50th percentile, which is also the median of the distribution of X. The median is the point on the x axis which divides the area under the density curve of X into two equal parts. In the course text, the median is denoted by µ̃.

(b) p = 0.25 gives the 25th percentile, which is also the first quartile of the distribution of X. The first quartile is the point on the x axis which divides the area under the density curve of X into two parts such that the area to the left is 25% of the total area and the area to the right is 75% of the total area.

(c) p = 0.75 gives the 75th percentile, which is also the third quartile of the distribution of X. The third quartile is the point on the x axis which divides the area under the density curve of X into two parts such that the area to the left is 75% of the total area and the area to the right is 25% of the total area.

As an example, suppose we wish to find the median of the distribution of the random variable X with density function given by

f(x) = 0.5x, 0 ≤ x ≤ 2; 0 otherwise.

The first step is to compute F(x). From previous results, we have

F(x) = 0, x < 0; 0.25x², 0 ≤ x < 2; 1, x ≥ 2.

Now, we are seeking the value of x for which F(x) = 50% = 0.5. Thus, we solve the equation F(x) = 0.25x² = 0.5 for x. We obtain x = ±√2. Since the negative value falls outside the acceptable limits, the median of the distribution of X is x = √2 ≈ 1.414.

Examples and Practice Problems (see class notes for solutions)
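For this particular cdf the percentile equation can be inverted in closed form: solving F(c) = 0.25c² = p on [0, 2] gives c = 2√p. A minimal sketch of that inversion:

```python
# Percentiles of the distribution with F(x) = 0.25*x^2 on [0, 2].
import math

def percentile(p):
    """Solve 0.25*c**2 = p for c in [0, 2], i.e. c = 2*sqrt(p)."""
    return 2.0 * math.sqrt(p)

print(round(percentile(0.5), 4))     # median = sqrt(2) = 1.4142
print(round(percentile(0.375), 4))   # 37.5th percentile = sqrt(1.5) = 1.2247
```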

Problem 1, Page 150. Let X denote the amount of time for which a book on 2-hour reserve at a college library is checked out by a randomly selected student, and suppose that X has density function

f(x) = 0.5x, 0 ≤ x ≤ 2; 0 otherwise.

Calculate the following probabilities:

(a) P(X ≤ 1)
(b) P(0.5 ≤ X ≤ 1.5)
(c) P(1.5 ≤ X)

Problem 7, Page 151. The time X (min) for a lab assistant to prepare the equipment for a certain experiment is believed to have a uniform distribution with A = 25 and B = 35.

(a) Write the pdf of X and sketch its graph.
(b) What is the probability that preparation time exceeds 33 min?
(c) What is the probability that preparation time is within 2 min of the mean time?
(d) For any a such that 25 < a < a + 2 < 35, what is the probability that preparation time is between a and a + 2 min?

Problem. Find the cdf of X = measurement error with pdf

f(x) = (3/32)(4 − x²), −2 ≤ x ≤ 2; 0 otherwise.

(a) Compute P(X < 0).
(b) Compute P(−1 < X < 1).
(c) Verify that the median of the distribution of X is zero.

4.4 Expected or Mean Values

Definition: The expected or mean value of a continuous random variable X with pdf f(x) is

µ = µ_X = E(X) = ∫_{−∞}^{∞} x f(x) dx.

If h(X) is any function of X, then

µ_{h(X)} = E[h(X)] = ∫_{−∞}^{∞} h(x) f(x) dx.

A very important special case, which we have met in our study of discrete random variables, is h(X) = (X − µ)². The expectation is then called the variance of X and denoted by σ², σ²_X or V(X). That is,

σ² = V(X) = ∫_{−∞}^{∞} (x − µ)² f(x) dx.

We repeat, once again, that for computational purposes it is easier to use the expression

V(X) = E(X²) − [E(X)]².

Examples and Practice Problems (see class notes for solutions)

Problem 21, Page 160. An ecologist wishes to mark off a circular sampling region having radius 10 m. However, the radius of the resulting region is actually a random variable R with pdf

f(r) = (3/4)[1 − (10 − r)²], 9 ≤ r ≤ 11; 0 otherwise.

(a) Compute E(R) and V(R).
(b) What is the expected area of the resulting circular region?

Problem 22, Page 160. The weekly demand for propane gas (in 1000s of gallons) from a particular facility is a continuous random variable X with pdf

f(x) = 2(1 − 1/x²), 1 ≤ x ≤ 2; 0 otherwise.

(a) Compute the cdf of X.
(b) Obtain an expression for the (100p)th percentile. What is the median?
(c) Compute the mean and variance of X.
(d) If 1.5 thousand gallons are in stock at the beginning of the week and no new supply is due in during the week, how much of the 1.5 thousand gallons is expected to be left at the end of the week?
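As a numeric check of these definitions, here is the mean and variance of the checkout-time density f(x) = 0.5x on [0, 2] from the percentile example; analytically E(X) = 4/3 and V(X) = 2/9.

```python
# Mean and variance of f(x) = 0.5*x on [0, 2] by numerical integration.

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

mean = integrate(lambda x: x * (0.5 * x), 0.0, 2.0)        # E(X)
ex2  = integrate(lambda x: x**2 * (0.5 * x), 0.0, 2.0)     # E(X^2)
var  = ex2 - mean**2          # the shortcut V(X) = E(X^2) - [E(X)]^2
print(round(mean, 4), round(var, 4))   # 1.3333 0.2222
```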

4.5. The Normal Distribution

In statistics, the normal distribution is the most important distribution, because several random variables that we encounter in real life have distributions that can be approximated by the normal distribution. Students who choose to pursue a career in statistics, or take more statistics courses, will find that the normal distribution is central to several statistical techniques that are used in data analysis.

Definition: A continuous random variable X is said to have a normal distribution with parameters µ and σ² if the pdf of X is

f(x; µ, σ²) = (1/(σ√(2π))) exp{−(1/2)((x − µ)/σ)²}, −∞ < x < ∞,

where −∞ < µ < ∞ and σ² > 0. It can be shown that the mean and variance of a normally distributed random variable X are E(X) = µ and V(X) = σ². The function g(x) = exp(x) is called the exponential function, with g(1) = exp(1) ≈ 2.718. In statistics, the notation X ∼ N(µ, σ²) is interpreted to mean that the random variable X has the normal distribution with mean µ and variance σ².

Some properties of the normal distribution are as follows.

(a) The normal density curve is bell shaped.
(b) The normal density curve is symmetric about the mean x = µ.
(c) The median of the normal density curve is equal to the mean of the normal density curve. That is, the curve is centered on x = µ, and the point x = µ divides the area under the normal curve into two equal parts. Therefore, the area to the left of x = µ is 0.5 and the area to the right is also 0.5.
(d) If X ∼ N(µ, σ₁²) and Y ∼ N(µ, σ₂²) are two normally distributed random variables with the same mean but different variances (σ₁² > σ₂²), then the density curve of X, the random variable with the larger variance σ₁², will be more spread out than the density curve for Y.

Now, let X ∼ N(µ, σ²) and consider a change of variable from x to z by defining a new random variable

Z = (X − µ)/σ,

which comes from the argument of the exponential function in the expression for the normal density. This technique is called standardizing the random variable X.
If Q is a random variable with mean µ_Q = E(Q) and standard deviation σ_Q = √V(Q), then we can define a new random variable by standardizing Q. In general, to standardize Q, we subtract µ_Q from Q and divide by σ_Q to obtain the expression, say U, given by

U = (Q − µ_Q)/σ_Q.

It is quite straightforward to verify that µ_Z = E(Z) = 0 and σ²_Z = V(Z) = 1. Now, using a common technique in statistics, which is beyond the scope of this course, we can see that the new density function for Z, called the standard normal density function, is

f(z; 0, 1) = (1/√(2π)) exp(−z²/2), −∞ < z < ∞.

The properties of the standard normal density curve are the same as before, except that µ = 0 and σ² = 1. We shall denote the cdf of a standard normal random variable Z by Φ(z):

Φ(z) = P(Z ≤ z) = ∫_{−∞}^{z} f(y; 0, 1) dy.

Observe that computing probabilities involving normal random variables will require that we integrate the normal density function. This integral is not easy to evaluate. As a result, values of the cumulative distribution function Φ(z) of the standard normal curve have been evaluated and tabulated for us to use.

Examples (see notes for solutions)

Problem 26, Page 171. Let Z be a standard normal random variable and calculate the following probabilities:

(a) P(0 ≤ Z ≤ 2.17)
(b) P(−1.53 ≤ Z ≤ 2)
(c) P(−1.75 ≤ Z)

Problem 27, Page 171. In each case, determine the value of the constant c that makes the probability statement correct.

(a) Φ(c) =
(b) P(c ≤ Z) =
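Although the integral has no elementary antiderivative, Φ can be computed from the error function, Φ(z) = (1 + erf(z/√2))/2, which is exactly what the printed tables tabulate. A sketch using this identity for Problem 26(a):

```python
# Standard normal cdf Phi(z) via the error function.
import math

def phi(z):
    """Phi(z) = P(Z <= z) for standard normal Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Problem 26(a): P(0 <= Z <= 2.17) = Phi(2.17) - Phi(0)
p_a = phi(2.17) - phi(0.0)
print(round(p_a, 4))   # 0.485, matching the table value .4850
```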

Percentiles of the Standard Normal Distribution

We recall that the point c such that Φ(c) = P(Z ≤ c) = p is called the (100p)th percentile of the standard normal distribution. A notation that is commonly used in relation to the normal distribution is z_α, where z_α is the 100(1 − α)th percentile of the standard normal distribution. That is,

Φ(z_α) = P(Z ≤ z_α) = 1 − α.

This means that the area under the standard normal curve to the right of z_α is α and the area to the left is 1 − α. As a result of the symmetry of the standard normal curve about the point z = 0, we have that the area to the left of −z_α is α and the area to the right of −z_α is 1 − α. The z_α values are usually called critical values rather than percentiles, for reasons that will be made clear later in our studies. This notation is particularly useful and convenient because later in this course we will need the critical values for some small values of α.

Example: From the standard normal table, we can see that

(a) z_0.05 = 95th percentile of the standard normal distribution = 1.645. That is, for Φ(z_0.05) = P(Z ≤ z_0.05) = 1 − 0.05 = 0.95, we must have z_0.05 = 1.645.
(b) Similarly, z_0.025 = 97.5th percentile of the standard normal distribution = 1.96.
(c) z_0.01 = 99th percentile of the standard normal distribution = 2.33.

Students will sometimes encounter problems that involve nonstandard normal distributions, in the sense that although the distribution is normal, the mean is not zero and the variance may or may not be 1. In that case, we cannot use the standard normal tables directly to compute probabilities. We first have to transform, or standardize, the nonstandard normal random variable into a standard normal random variable, as previously discussed, before we can use the standard normal table. For instance, let X ∼ N(µ, σ²) with µ ≠ 0 and σ ≠ 1, and suppose we wish to compute P(a ≤ X ≤ b).
Since X has a nonstandard normal distribution, we first apply the transformation Z = (X − µ)/σ to the variable X and to the limits a and b, and then use the standard normal table to obtain the required probability as follows:

P(a ≤ X ≤ b) = P((a − µ)/σ ≤ (X − µ)/σ ≤ (b − µ)/σ)
             = P((a − µ)/σ ≤ Z ≤ (b − µ)/σ)
             = Φ((b − µ)/σ) − Φ((a − µ)/σ).
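This standardization is a one-line change of variable in code. The sketch below applies it to the numbers of Problem 31 on the next page (µ = 15 kips, σ = 1.25 kips), so that, for example, P(X ≤ 18) = Φ((18 − 15)/1.25) = Φ(2.4).

```python
# Probabilities for a nonstandard normal via Z = (X - mu)/sigma.
import math

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    return phi((x - mu) / sigma)     # the change of variable Z = (X - mu)/sigma

p_at_most_18 = normal_cdf(18, 15, 1.25)                                # Phi(2.4)
p_10_to_12   = normal_cdf(12, 15, 1.25) - normal_cdf(10, 15, 1.25)
print(round(p_at_most_18, 4))   # 0.9918
print(round(p_10_to_12, 4))     # 0.0082
```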

Example (see class notes for solution)

Problem 31, Page 171. Suppose the force acting on a column that helps to support a building is normally distributed with mean 15 kips and standard deviation 1.25 kips.

(a) What is the probability that the force is at most 18 kips?
(b) What is the probability that the force is between 10 and 12 kips?
(c) What is the probability that the force differs from 15 kips by at most 2 standard deviations?
(d) How would you characterize the largest 5% of all values of the force?

Problem 32, Page 171. The article "Reliability of Domestic-Waste Biofilm Reactors" (J. of Envir. Engr., 1995: ) suggests that substrate concentration (mg/cm³) of influent to a reactor is normally distributed with µ = 0.3 and σ = 0.06.

(a) What is the probability that the concentration exceeds 0.25?
(b) How would you characterize the largest 5% of all concentration values?

The Normal Approximation to the Binomial Distribution

Students may have noticed that the normal distribution is a continuous distribution. However, it is sometimes used to approximate distributions that are not continuous. Surprisingly, these approximations are sometimes very close to the true value. In this section we shall discuss one of several possible approximations. In such cases a correction, called the continuity correction, is often introduced into the calculations to account for the fact that we are using a continuous distribution to approximate a discrete distribution.

Let X ∼ Bin(n, p). Then, for a sufficiently large value of n, the distribution of X can be approximated by a normal distribution with mean µ = np and variance σ² = np(1 − p). The approximation appears to work well if both np ≥ 10 and n(1 − p) ≥ 10. In particular, if x is any one of the possible values of X, then

P(X ≤ x) = B(x; n, p) ≈ P(X ≤ x + 0.5)
         = P((X − µ)/σ ≤ (x + 0.5 − µ)/σ)
         = P(Z ≤ (x + 0.5 − np)/√(np(1 − p)))
         = Φ((x + 0.5 − np)/√(np(1 − p))).
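How close is the approximation in practice? A sketch comparing the exact binomial probability with the continuity-corrected normal value for the seat-belt numbers of Problem 51 on the next page (n = 500, p = 0.4, P(180 ≤ X ≤ 230)):

```python
# Exact binomial probability vs. continuity-corrected normal approximation.
import math

n, p = 500, 0.4
mu = n * p                               # 200
sigma = math.sqrt(n * p * (1 - p))       # about 10.95

def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(180, 231))
approx = phi((230 + 0.5 - mu) / sigma) - phi((180 - 0.5 - mu) / sigma)
print(round(exact, 4), round(approx, 4))   # the two values agree to about 3 decimals
```

Here np = 200 and n(1 − p) = 300, so the rule of thumb is comfortably satisfied.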

Example (see class notes for solution)

Problem 51, Page 173. Suppose only 40% of all drivers in a certain state regularly wear a seat belt. A random sample of 500 drivers is selected. What is the probability that between 180 and 230 (inclusive) of the drivers in the sample regularly wear a seat belt?

5. JOINTLY DISTRIBUTED DISCRETE RANDOM VARIABLES AND RANDOM SAMPLES

In some experimental situations, the experimenter may define more than one random variable, say X and Y, on the sample space of the experiment and may be interested in studying some properties of these random variables. The experimenter may wish to obtain the distribution of X and the distribution of Y separately. It is also possible to obtain the distribution of X and Y together (i.e., the joint distribution of X and Y). The joint probability distribution of X and Y outlines how much probability mass is placed on each possible pair of values (x, y). In this section, we shall restrict our discussion to joint distributions for discrete random variables.

Definition: Let X and Y be two discrete random variables defined on the sample space of an experiment. The joint probability mass function (pmf) p(x, y) of X and Y is defined, for each pair of numbers (x, y), by

p(x, y) = P(X = x and Y = y).

Given the joint pmf of two discrete random variables, it is possible to obtain the pmf of each of the random variables separately. These separate pmfs are called marginal pmfs. The marginal pmf of, say, X is obtained by summing the joint pmf over all possible values of Y.

Definition: Let the joint probability mass function (pmf) of X and Y be p(x, y). Then the marginal pmf of X, denoted by p_X(x), is given by

p_X(x) = Σ_{all y} p(x, y).

Similarly, the marginal pmf of Y, denoted by p_Y(y), is given by

p_Y(y) = Σ_{all x} p(x, y).
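The marginal-pmf definition translates directly into a sum over a table. The joint table below is a made-up two-by-two illustration, not one of the textbook's tables, stored as a dictionary keyed by (x, y).

```python
# Marginal pmfs obtained by summing a joint pmf, as in the definition above.
joint = {(0, 0): 0.10, (0, 1): 0.20,    # hypothetical joint pmf
         (1, 0): 0.30, (1, 1): 0.40}

def marginal_x(joint):
    """p_X(x) = sum over all y of p(x, y)."""
    px = {}
    for (x, y), prob in joint.items():
        px[x] = px.get(x, 0.0) + prob
    return px

px = marginal_x(joint)
print({x: round(prob, 2) for x, prob in sorted(px.items())})   # {0: 0.3, 1: 0.7}
```

Swapping the roles of x and y in the accumulator gives p_Y in the same way.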

Given the marginal pmfs of X and Y, we say that the two random variables are independent if the joint pmf is equal to the product of the marginal pmfs at every possible pair of values (x, y). That is, if X and Y are independent, then

p(x, y) = p_X(x) · p_Y(y), for every (x, y).

If this relation fails to hold, even for only one of all the possible pairs, then the random variables are not independent; they are dependent.

Examples and Practice Problems (see class notes for solutions)

Problem 1, Page 215. A service station has both self-service and full-service islands. On each island, there is a single regular unleaded pump with two hoses. Let X denote the number of hoses being used on the self-service island, and let Y denote the number of hoses in use on the full-service island. The joint pmf of X and Y is given in the accompanying table.

(a) What is P(X = 1 and Y = 1)?
(b) Compute P(X ≤ 1 and Y ≤ 1).
(c) Compute P(X ≠ 0 and Y ≠ 0).
(d) Compute P(X + Y ≤ 1).
(e) Compute the marginal pmf of X and of Y.
(f) Are X and Y independent random variables?

Problem 3, Page 216. A certain market has both an express checkout line and a superexpress checkout line. Let X₁ denote the number of customers in line at the express checkout at a particular time of day, and let X₂ denote the number of customers in line at the superexpress checkout at the same time. Suppose the joint pmf of X₁ and X₂ is as given in the accompanying table.

(a) What is the probability that there is exactly one customer in each line?
(b) What is the probability that the numbers of customers in the two lines are identical?
(c) What is the probability that there are at least two more customers in one line than in the other line?
(d) What is the probability that the total number of customers in the two lines is exactly four? At least four?

Problem 4, Page 216. Return to the situation described in Problem 3, Page 216.

(a) Determine the marginal pmf of X₁ and then calculate the expected number of customers in line at the express checkout.
(b) Determine the marginal pmf of X₂.
(c) Are X₁ and X₂ independent?

5.1. Expected Values, Covariance and Correlation

Consider two jointly distributed discrete random variables X and Y with joint pmf p(x, y). Let h(X, Y) be any function of the random variables; then h(X, Y) is itself a random variable. The expected or mean value of h(X, Y), denoted by µ_{h(X,Y)} or E[h(X, Y)], is given by

µ_{h(X,Y)} = E[h(X, Y)] = Σ_{all x} Σ_{all y} h(x, y) p(x, y).

A special case is when h(X, Y) takes the form h(X, Y) = (X − µ_X)(Y − µ_Y). In that case, E[h(X, Y)] is called the covariance of X and Y and denoted by Cov(X, Y). That is, the covariance between X and Y is

Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = Σ_{all x} Σ_{all y} (x − µ_X)(y − µ_Y) p(x, y).

For ease of computation it is best to use the formula

Cov(X, Y) = E(XY) − µ_X µ_Y.

If X and Y are independent, it can be shown that E(XY) = E(X)·E(Y) = µ_X µ_Y. It is then clear that when X and Y are independent, Cov(X, Y) = 0. On the other hand, Cov(X, Y) = 0 does not necessarily imply independence.

We have seen that the variance σ²_X or V(X) of a random variable is a measure of the size of the error made by an experimenter. If the experimenter committed a lot of mistakes during experimentation, the variance will tend to be large. Alternatively, if the experimenter was very careful and made little or no mistakes during repeated trials of the experiment, the variance will tend to be small. Now, the covariance between X and Y measures how the two variables change together. For instance, if the values of Y tend to increase as X increases, then we say that there is a positive linear relationship between X and Y. In that case, the value of Cov(X, Y) will be positive, and large if the linear relationship is very strong. On the other hand, if the values of Y tend to decrease as X increases, then there is a negative linear relationship between X and Y, and Cov(X, Y) will be negative.

One problem with using covariance as a measure of the strength of linear relationships is the fact that it is usually affected by the units in which the observations were measured. In order to remove the dependence of the covariance on the units, it is standardized to obtain the correlation coefficient, denoted by ρ_{X,Y}, or simply ρ, or Corr(X, Y), and given by

ρ_{X,Y} = Cov(X, Y)/(σ_X σ_Y),

where σ_X is the standard deviation of X and σ_Y is the standard deviation of Y.

Examples and Practice Problems (see class notes for solutions)

Refer to Problem 1, Page 215.

(a) Compute the expected total number of hoses in use, E(X + Y).
(b) Compute the covariance for X and Y.
(c) Compute the correlation for X and Y.

Problem 22, Page 224. An instructor has given a short quiz consisting of two parts. For a randomly selected student, let X = the number of points earned on the first part and Y = the number of points earned on the second part. Suppose that the joint pmf of X and Y is given in the accompanying table.
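Both the covariance shortcut and the standardization to ρ can be sketched on a small joint pmf. The table here is the same made-up two-by-two illustration as before, not a textbook table.

```python
# Covariance via Cov(X, Y) = E(XY) - mu_X*mu_Y, then correlation rho.
import math

joint = {(0, 0): 0.10, (0, 1): 0.20,    # hypothetical joint pmf
         (1, 0): 0.30, (1, 1): 0.40}

def expect(h):
    """E[h(X, Y)] = double sum of h(x, y) * p(x, y)."""
    return sum(h(x, y) * prob for (x, y), prob in joint.items())

mux, muy = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - mux * muy
sx = math.sqrt(expect(lambda x, y: x**2) - mux**2)
sy = math.sqrt(expect(lambda x, y: y**2) - muy**2)
rho = cov / (sx * sy)
print(round(cov, 4), round(rho, 4))   # -0.02 and about -0.0891
```

A small negative ρ like this signals an almost negligible negative linear relationship, which illustrates why ρ, not the raw covariance, is the unit-free measure of strength.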

(a) If the score recorded in the grade book is the total number of points earned on the two parts, what is the expected recorded score E(X + Y)?
(b) If the maximum of the two scores is recorded, what is the expected recorded score?
(c) Compute the covariance for X and Y.
(d) Compute the correlation for X and Y.
(e) Instead of marking the first part out of 15 and the second part out of 10, suppose the instructor changed the points so that y = 0, 1, 2, 3 for the first part and x = 0, 1, 2 for the second part. How would this affect the covariance and the correlation? Explain.

5.2. Statistics and Their Distribution

Suppose that the owner of a vending machine on campus is trying to determine how often the machine should be refilled in a week. The owner decides to observe the number of items left in the vending machine at the end of each day. Let X = number of items left in the machine on any given day. Now, at the end of Day 1, suppose the observed number of items is x₁. Similarly, let x₂, x₃, ..., x₃₀ be the observed numbers from Day 2 to Day 30. We shall call these observed values a sample of size 30. From these observed numbers we can compute the average number of items the owner expects will be left over in any month of the year,

x̄ = (x₁ + x₂ + ··· + x₃₀)/30 = (1/30) Σ_{i=1}^{30} x_i.

We can also determine how the numbers vary from day to day through

s² = (1/29) Σ_{i=1}^{30} (x_i − x̄)²,

called the sample variance. Suppose we repeat the experiment for another 30 days. We expect the observed values x₁, x₂, x₃, ..., x₃₀ to be slightly or very different from the observed values from the first experiment. This will in turn lead to different values of x̄ and s². This variation in observed values in turn implies that the value of any function of the sample observations, such as the mean, variance, standard deviation, and so on, also varies from sample to sample.

Definition: A statistic is any quantity that is calculated from sample data.
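The two formulas above are a direct computation on observed data. The leftover counts below are made up for illustration; note the divisor n − 1 in the sample variance, matching the 1/29 for a sample of size 30.

```python
# Sample mean and sample variance from observed data.
data = [4, 7, 3, 8, 5, 6, 4, 7]     # hypothetical leftover counts for 8 days

n = len(data)
xbar = sum(data) / n
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)   # divisor n - 1
print(xbar, round(s2, 4))   # 5.5 3.1429
```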
Now, before we actually observe the daily number of leftover items, the owner has no way of knowing what the value will be on any given day. The number could range from 0 to the total number of items the machine can contain, say n. Therefore, before the experiment is actually performed, the

daily number of leftover items is itself viewed as a random variable. Denote the random variables by X₁, X₂, X₃, ..., X₃₀. Therefore, the mean and variance computed from these random variables will also be random variables,

X̄ = (X₁ + X₂ + ··· + X₃₀)/30 = (1/30) Σ_{i=1}^{30} X_i and S² = (1/29) Σ_{i=1}^{30} (X_i − X̄)².

Notice that we have used the lower case or small letter x for the actual observations and the upper case or capital letter X to represent the random variable. Thus, the sample mean regarded as a statistic is denoted by X̄, with calculated or observed value x̄. Similarly, the sample variance regarded as a statistic is represented as S², with s² as its computed value.

Definition: The random variables X₁, X₂, ..., X_n are said to form a random sample of size n if

1. the X_i's are independent random variables;
2. every X_i has the same distribution.

Another way of saying the same thing is to say that X₁, X₂, ..., X_n are independent and identically distributed, abbreviated as iid.

Recall that every random variable has a probability distribution. This implies that every statistic, being a random variable, has a probability distribution. The probability distribution of a statistic is sometimes called its sampling distribution, simply to emphasize how the values of a statistic vary from sample to sample. The main issue is how the sampling distribution of a statistic can be derived. There are a number of ways in which this can be done. If the X_i's form a random sample and their distribution is known, then we can use probability rules to determine the distribution of a statistic computed from the random sample. In other situations, where it is too complicated to use probability rules, we may obtain information about the sampling distribution of a statistic by using a computer to mimic the behaviour of the statistic through what is normally called a simulation experiment. We shall use an example to illustrate how the probability mass function of a statistic can be derived.
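A minimal sketch of such a simulation experiment: draw many samples of size 30 from a known population and watch how the statistic x̄ varies from sample to sample. The uniform(0, 10) population here is an arbitrary stand-in, chosen only so the answer is known (its mean is 5).

```python
# Simulation experiment: the sampling variability of the sample mean.
import random

random.seed(1)                     # fixed seed so the run is reproducible
n, reps = 30, 2000
means = []
for _ in range(reps):
    sample = [random.uniform(0, 10) for _ in range(n)]
    means.append(sum(sample) / n)  # one observed value of the statistic X-bar

grand = sum(means) / reps
print(round(grand, 2))   # the simulated means cluster near the population mean 5
```

Plotting a histogram of `means` would show the bell shape promised by the central limit theorem discussed below.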
Example (see class notes for solution)

Problem 38, Page 234. There are two traffic lights on my way to work. Let X₁ be the number of lights at which I must stop, and suppose that the distribution of X₁ is as follows:

x₁:     0    1    2
p(x₁):  0.2  0.5  0.3

with µ = 1.1 and σ² = 0.49.

Let X₂ be the number of lights at which I must stop on the way home; X₂ is independent of X₁. Assume that X₂ has the same distribution as X₁, so that X₁, X₂ is a random sample of size n = 2.

(a) Let T₀ = X₁ + X₂ and determine the probability distribution of T₀.
(b) Calculate µ_{T₀} and σ²_{T₀}.

Problem 41, Page 236. Let X be the number of packages being mailed by a randomly selected customer at a certain shipping facility. Suppose the distribution of X is as given in the accompanying table.

(a) Consider a random sample of size n = 2 and let X̄ be the sample mean number of packages shipped. Obtain the probability distribution of X̄.
(b) Refer to Part (a) and calculate P(X̄ ≤ 2.5).
(c) Calculate µ_X̄ and σ²_X̄.
(d) Again consider a random sample of size n = 2, but now focus on the statistic R = the sample range (difference between the largest and the smallest values in the sample). Obtain the distribution of R.

5.3. The Distribution of the Sample Mean

In Problem 41, Page 236, we saw that the population mean of the random variable X was µ = E(X) = 2 and the population variance was σ² = V(X) = E(X²) − µ² = 5 − 4 = 1. The sample mean X̄ of a random sample of size n on the random variable X provides an indication of what a typical value of X will be, and the sample variance S² tells us how spread out the values of X will be around this typical value X̄. Intuitively, we can say that since X̄ is a typical value of X, the mean or expected value of X̄ should be the same as that of X. That is, we expect that µ_X̄ = E(X̄) = E(X) = µ = 2. In our example in class, Problem 41, Page 236, we saw that this was the case. However, we saw that the variance of X̄ was slightly different from the variance of X, in the sense that

σ²_X̄ = V(X̄) = V(X)/n = σ²/n = 1/2 = 0.5.
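Returning to Problem 38, the exact distribution of T₀ = X₁ + X₂ can be enumerated by multiplying the two independent pmfs over every pair of values. The sketch uses p(0) = 0.2, p(1) = 0.5, p(2) = 0.3, the unique pmf on {0, 1, 2} consistent with µ = 1.1 and σ² = 0.49.

```python
# Exact distribution of T0 = X1 + X2 for two iid stops (Problem 38).
p = {0: 0.2, 1: 0.5, 2: 0.3}

t0 = {}
for x1, p1 in p.items():
    for x2, p2 in p.items():          # independence: joint prob is the product
        t0[x1 + x2] = t0.get(x1 + x2, 0.0) + p1 * p2

mu_t0 = sum(t * q for t, q in t0.items())
var_t0 = sum(t**2 * q for t, q in t0.items()) - mu_t0**2
print({t: round(q, 2) for t, q in sorted(t0.items())})
print(round(mu_t0, 2), round(var_t0, 2))   # 2.2 and 0.98, i.e. 2*mu and 2*sigma^2
```

Note that µ_{T₀} = 2µ and σ²_{T₀} = 2σ², exactly as the propositions below predict.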

This gives us an idea of the relationship between the mean of X, µ, and the mean of X̄, µ_X̄, and between the variance of X, σ², and the variance of X̄, σ²_X̄. This relationship can be extended to more general situations. We state this extension below.

Proposition: Let X₁, X₂, ..., X_n be a random sample from a distribution with mean value µ and standard deviation σ. Then

1. µ_X̄ = E(X̄) = E(X) = µ.
2. σ²_X̄ = V(X̄) = V(X)/n = σ²/n, and σ_X̄ = √(V(X)/n) = σ/√n.

In addition, if T₀ = X₁ + X₂ + X₃ + ··· + X_n is the sample total, then E(T₀) = nµ, σ²_{T₀} = V(T₀) = nσ², and σ_{T₀} = σ√n.

We remark that the above statement refers to a random sample from any distribution. We state below the distribution of X̄ if the random sample is from a normal distribution.

Proposition: If in particular the random sample X₁, X₂, ..., X_n is from a normal distribution with mean µ and variance σ² (X ∼ N(µ, σ²)), then

1. X̄ ∼ N(µ, σ²/n).
2. T₀ ∼ N(nµ, nσ²).

Remark: Even when the random sample X₁, X₂, ..., X_n is not from a normal distribution, if the sample size n is sufficiently large it can be shown that both the sample mean X̄ and the sample total T₀ can be approximated by a normal distribution. That is, if the random sample is not from a normal distribution, then approximately

1. X̄ ∼ N(µ, σ²/n).
2. T₀ ∼ N(nµ, nσ²).

The larger the value of n, the better the approximation. This statement of approximation is called the central limit theorem (CLT). In this class, by sufficiently large we mean n > 30.

Example (see class notes for solution)

Problems 46, 47, Page 242. The inside diameter of a randomly selected piston ring is a random variable with mean value 12 cm and standard deviation 0.04 cm.
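The propositions make short work of problems like this one. A sketch for the piston rings: with µ = 12, σ = 0.04 and n = 16, the standard deviation of X̄ is 0.04/√16 = 0.01, and under normality P(11.99 ≤ X̄ ≤ 12.01) = Φ(1) − Φ(−1).

```python
# Sampling distribution of X-bar for the piston-ring example.
import math

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 12.0, 0.04, 16
se = sigma / math.sqrt(n)                                  # sigma of X-bar
prob = phi((12.01 - mu) / se) - phi((11.99 - mu) / se)     # standardize both limits
print(round(se, 4), round(prob, 4))   # 0.01 and about 0.6827
```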

(a) If X̄ is the sample mean diameter for a random sample of n = 16 rings, where is the sampling distribution of X̄ centered, and what is the standard deviation of the X̄ distribution?
(b) Suppose the distribution of diameter is normal. Calculate P(11.99 ≤ X̄ ≤ 12.01).

Problem 48, Page 242
Let X₁, X₂, …, X₁₀₀ denote the actual net weights of 100 randomly selected 50-lb bags of fertilizer. If the expected weight of each bag is 50 and the variance is 1, calculate P(49.75 ≤ X̄ ≤ 50.25).

Problem 50, Page 243
The breaking strength of a rivet has a mean value of 10,000 psi and a standard deviation of 500 psi. What is the probability that the sample mean breaking strength for a random sample of 40 rivets is between 9,900 and 10,200?

5.4 The Distribution of a Linear Combination

Some students may have noticed that the sample mean can also be written as

X̄ = (X₁ + X₂ + ⋯ + Xₙ)/n = (1/n)X₁ + (1/n)X₂ + ⋯ + (1/n)Xₙ.

If we replace the coefficients of X₁, X₂, …, Xₙ in the expression for X̄ by arbitrary constants a₁, a₂, …, aₙ, we obtain a more general expression

Y = a₁X₁ + a₂X₂ + ⋯ + aₙXₙ = Σ_{i=1}^n aᵢXᵢ,

called a linear combination, which, for different choices of the constants, represents many commonly used statistics. We observe that the sample mean X̄ is the special case of the linear combination Y with a₁ = a₂ = ⋯ = aₙ = 1/n. When a₁ = a₂ = ⋯ = aₙ = 1, the linear combination Y becomes the sample total T₀. The next issue is how to compute the expectation and variance of a linear combination.

Proposition: Let X₁, X₂, …, Xₙ be random variables with mean values E(X₁) = µ₁, E(X₂) = µ₂, …, E(Xₙ) = µₙ and variances V(X₁) = σ₁², V(X₂) = σ₂², …, V(Xₙ) = σₙ². Then the mean value of the linear combination Y is

µ_Y = E(a₁X₁ + a₂X₂ + ⋯ + aₙXₙ) = a₁E(X₁) + a₂E(X₂) + ⋯ + aₙE(Xₙ) = a₁µ₁ + a₂µ₂ + ⋯ + aₙµₙ = Σ_{i=1}^n aᵢµᵢ.
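A quick numerical check of the mean proposition, using made-up coefficients aᵢ and means µᵢ (assumptions for illustration only), with the sample mean recovered as the special case aᵢ = 1/n:

```python
# Hypothetical coefficients and means for illustration only.
a = [0.5, -1.0, 2.0]
mu = [10.0, 4.0, 3.0]

# E(a1 X1 + ... + an Xn) = a1 mu1 + ... + an mun, by linearity of expectation.
mean_Y = sum(ai * mi for ai, mi in zip(a, mu))
print(mean_Y)  # 0.5*10 - 1.0*4 + 2.0*3 = 7.0

# The sample mean is the special case a_i = 1/n:
n = len(mu)
mean_xbar = sum((1.0 / n) * mi for mi in mu)  # equals (10 + 4 + 3)/3
print(mean_xbar)
```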

Furthermore, if the random variables X₁, X₂, …, Xₙ are independent, then the variance of the linear combination Y is

σ²_Y = V(a₁X₁ + a₂X₂ + ⋯ + aₙXₙ) = a₁²V(X₁) + a₂²V(X₂) + ⋯ + aₙ²V(Xₙ) = a₁²σ₁² + a₂²σ₂² + ⋯ + aₙ²σₙ² = Σ_{i=1}^n aᵢ²σᵢ².

Now, if the random variables X₁, X₂, …, Xₙ are not independent, then the variance of the linear combination Y is

σ²_Y = V(a₁X₁ + a₂X₂ + ⋯ + aₙXₙ) = Σ_{i=1}^n Σ_{j=1}^n aᵢaⱼ Cov(Xᵢ, Xⱼ).

To fix ideas, we discuss the following examples.

Example (see class notes for solution)
Problem 59, Page 247
Let X₁, X₂ and X₃ represent the times necessary to perform three successive repair tasks at a certain service facility. Suppose they are independent normal random variables with expected values µ₁, µ₂ and µ₃ and variances σ₁², σ₂² and σ₃², respectively. If µ₁ = µ₂ = µ₃ = 60 and σ₁² = σ₂² = σ₃² = 15, calculate
(a) P(X₁ + X₂ + X₃ ≤ 200).
(b) P(58 ≤ X̄ ≤ 62).
(c) P(−10 ≤ X₁ − 0.5X₂ − 0.5X₃ ≤ 5).
(d) If µ₁ = 40, µ₂ = 50, µ₃ = 60, σ₁² = 10, σ₂² = 12 and σ₃² = 14, calculate P(X₁ + X₂ + X₃ ≤ 160).

Problem 70, Page 248
Suppose we take a random sample of size n from a continuous distribution having median 0, so that the probability of any one observation being positive is 0.5. We now disregard the signs of the observations, rank them from smallest to largest in absolute value, and then let W = the sum of the ranks of the observations having positive signs. For example, if the observations are −0.3, 0.7, 2.1, and −2.5, then the

ranks of the positive observations are 2 and 3, so W = 5. In Chapter 15, W is called Wilcoxon's signed-rank statistic. W can be represented as follows:

W = 1·Y₁ + 2·Y₂ + 3·Y₃ + ⋯ + n·Yₙ = Σ_{i=1}^n i·Yᵢ,

where the Yᵢ's are independent Bernoulli random variables, each with p = 0.5 (Yᵢ = 1 corresponds to the observation with rank i being positive). Compute the following:

(a) E(Yᵢ) and then E(W). Hint: Σ_{i=1}^n i = n(n + 1)/2.
(b) V(Yᵢ) and then V(W). Hint: Σ_{i=1}^n i² = n(n + 1)(2n + 1)/6.

6. POINT ESTIMATION

NOT PART OF SYLLABUS. SKIP TO CHAPTER 7.

7. INTERVAL ESTIMATION

Let us assume that we are trying to figure out the amount of money a student needs to survive at a typical university for a given semester. Now, based on our experience as students, or on other information available to us in the form of data, we may estimate the amount in two ways. We can choose to use the data to specify a single amount as our estimate. Such an estimate, which specifies a single value, is called a point estimate; it is a single value, a point in some sense. Such estimates are discussed in Chapter 6. Someone who is more careful and does not like to be wrong may suggest instead that we specify an interval as an estimate. So we may say that the amount should lie between $3500 and $4500. Such an estimate is

called an interval estimate. We may decide not to be too precise by specifying a very wide range of values. In that case, the interval estimate will not be very helpful to students because it is too wide. In this section, we will focus on how to use the information in data to obtain interval estimates.

7.1 Basic Properties of Confidence Intervals

We shall introduce the basic concepts of interval estimation with an example that is practically unrealistic. We shall then consider situations that may arise in practice. Consider a random variable X = alcohol content of a certain brand of wine manufactured by a company. Suppose the problem is to estimate the unknown population mean µ of the alcohol content of all bottles of this particular brand of wine manufactured by the company. Let us assume that X ~ N(µ, σ²), with σ² known, and suppose that we take a random sample of n wine bottles of this brand with alcohol contents X₁, X₂, …, Xₙ. On average, the alcohol content of the n wine bottles will be X̄. From the results of Chapter 5, we know that X̄ ~ N(µ, σ²/n). If we standardize X̄, we obtain a standard normal random variable

Z = (X̄ − µ)/(σ/√n).

Using the cumulative standard normal probability table, we find that P(−1.96 < Z < 1.96) = 0.95. If we replace Z by the expression for standardizing the mean, we have

P(−1.96 < (X̄ − µ)/(σ/√n) < 1.96) = 0.95.

Then, by simplifying the expression within the brackets, we can rewrite the probability statement above as

P(X̄ − 1.96σ/√n < µ < X̄ + 1.96σ/√n) = 0.95.

This simply means that the probability that the random interval

(X̄ − 1.96σ/√n, X̄ + 1.96σ/√n)

will cover the true mean µ is 0.95. The interval is random because the statistic X̄ is random. That is, before the experiment is performed and measurements are taken, we are 95% certain or confident

that the true mean µ will lie inside the interval above. Once the experiment is performed and measurements are observed, say X₁ = x₁, X₂ = x₂, …, Xₙ = xₙ, the observed average or mean becomes X̄ = x̄. The resulting fixed, non-random interval, given by

(x̄ − 1.96σ/√n, x̄ + 1.96σ/√n),

or written as x̄ − 1.96σ/√n < µ < x̄ + 1.96σ/√n, or in the more compact form x̄ ± 1.96σ/√n, is called a 95% confidence interval for µ.

The situation we have used to obtain the expression for the confidence interval is practically unrealistic because of the assumptions we have made. We assumed that the random sample is from a normal population with unknown mean µ and known variance σ². In practice, the mean is needed in order to compute the variance. So, if the mean is unknown, it is highly unlikely that the variance will be known.

Some students may have noticed that there is a relationship between the value 1.96 in the expression for the confidence interval and the confidence level 0.95. We observe that 1.96 is the (1 − 0.05/2)100th = (1 − 0.025)100th = 97.5th percentile or critical value of the standard normal distribution. Using the notation from Chapter 4, we write z_{0.025} = 1.96. Thus, for a general confidence level, say 99% or 98% or 90%, a (1 − α)100% confidence interval for the population mean µ of a distribution is obtained by replacing 1.96 with the critical value z_{α/2}:

(x̄ − z_{α/2} σ/√n, x̄ + z_{α/2} σ/√n), or x̄ ± z_{α/2} σ/√n.

The width of the (1 − α)100% confidence interval is

(x̄ + z_{α/2} σ/√n) − (x̄ − z_{α/2} σ/√n) = 2 z_{α/2} σ/√n.

An interval with a large width shows that the error in our estimation is too large; such an interval is not very informative, and the estimate becomes unrealistic.

Example (see class notes for solution)
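As a minimal sketch of the computation, assuming made-up values of x̄, σ and n (none of these numbers come from the notes), the 95% interval x̄ ± 1.96σ/√n can be computed as:

```python
import math

# Hypothetical numbers for illustration (sigma known, as assumed in 7.1).
xbar, sigma, n = 12.5, 0.8, 25
z = 1.96  # z_{0.025}: the 97.5th percentile of the standard normal

half_width = z * sigma / math.sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print((lower, upper))  # 95% confidence interval for mu
print(2 * half_width)  # width = 2 * z_{alpha/2} * sigma / sqrt(n)
```

For other confidence levels, only z changes (e.g. a 90% interval uses z_{0.05} = 1.645).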

Problems 1, 5, Page 290
Assume that the helium porosity (in percentage) of coal samples taken from any particular seam is normally distributed with a known true standard deviation.
(a) What is the confidence level for the interval x̄ ± 2.81σ/√n?
(b) What value of z_{α/2} in the CI formula results in a confidence level of 99.7%?
(c) Compute a 98% CI for the true average porosity of a certain seam, given the average porosity for 16 specimens from the seam.
(d) How large a sample size is necessary if the width of the 95% interval is to be 0.4?

Problem 2, Page 290
Each of the following is a confidence interval for µ = true average (i.e., population mean) resonance frequency (Hz) for all tennis rackets of a certain type: (114.4, 115.6) and (114.1, 115.9).
(a) What is the value of the sample mean resonance frequency?
(b) Both intervals were calculated from the same sample data. The confidence level for one of these intervals is 90% and for the other is 99%. Which of the intervals has the 90% confidence level, and why?

7.2 Large-Sample Confidence Intervals for a Population Mean and Proportion

In Section 7.1 we made some assumptions which are practically unrealistic. We assumed that the variance of the population distribution, σ², is known. The assumption that the random sample is from a normal population may be valid sometimes but may fail in other situations. We will now discuss large-sample confidence intervals whose validity does not depend on these assumptions.

Large-Sample Interval for µ

Let X₁, X₂, …, Xₙ be a random sample from a population whose distribution is not known, with mean µ and variance σ² that are also unknown. Recall that if the sample size n is large, then by the central limit theorem we discussed in Chapter 5, the distribution of the sample mean X̄ is approximately normal with mean E(X̄) = µ and variance V(X̄) = σ²/n. That is, X̄ ≈ N(µ, σ²/n). It follows that the standardized random variable

Z = (X̄ − µ)/(σ/√n)

has approximately the standard normal distribution. Therefore,

P(−z_{α/2} ≤ (X̄ − µ)/(σ/√n) ≤ z_{α/2}) ≈ 1 − α.

Following the technique used in Section 7.1, we can rearrange the expression in the parentheses to obtain an approximate (1 − α)100% confidence interval for the population mean µ as x̄ ± z_{α/2} σ/√n. The difficulty with this expression for the confidence interval is that σ is not known. Therefore, since n is sufficiently large, we simply replace σ by the sample standard deviation

s = √[ (1/(n − 1)) ( Σ_{i=1}^n xᵢ² − n x̄² ) ],

to obtain the large-sample confidence interval for µ with an approximate confidence level of (1 − α)100%:

x̄ ± z_{α/2} s/√n.

Examples (see class notes for solution)
Problem 13, Page 297
The article "Extravisual Damage Detection? Defining the Standard Normal Tree" (Photogrammetric Engr. and Remote Sensing, 1981) discusses the use of color infrared photography in the identification of normal trees in Douglas fir stands. Among the data reported were summary statistics for green-filter analytic optical densitometric measurements on samples of both healthy and diseased trees. For a sample of 69 healthy trees, the sample mean dye-layer density was 1.028, with a reported sample standard deviation.
(a) Calculate a 95% (two-sided) CI for the true average dye-layer density for all such trees.
(b) Suppose the investigators had made a rough guess of 0.16 for the value of s before collecting data. What sample size would be necessary to obtain an interval width of 0.05 for a confidence level of 95%?
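A sketch of the large-sample interval x̄ ± z_{α/2} s/√n with hypothetical data (the sample values are assumptions for illustration). The sample-size step at the end uses the width relation w = 2z_{α/2}s/√n rearranged to n = (2z_{α/2}s/w)², with the rough guess s₀ = 0.16 and width w = 0.05 from Problem 13(b):

```python
import math
import statistics

# Hypothetical large sample (n > 30) for illustration only; in practice
# these would be the observed measurements.
data = [9.8, 10.1, 10.4, 9.9, 10.0, 10.2, 9.7, 10.3] * 5  # n = 40
n = len(data)
xbar = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation, divisor n - 1

z = 1.96  # z_{alpha/2} for an approximate 95% confidence level
half_width = z * s / math.sqrt(n)
print((xbar - half_width, xbar + half_width))  # large-sample 95% CI for mu

# Sample size needed for a 95% interval of width w, given a rough guess s0
# for s: solve w = 2 * z * s0 / sqrt(n) for n and round up.
s0, w = 0.16, 0.05
n_needed = math.ceil((2 * z * s0 / w) ** 2)
print(n_needed)  # 158
```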

Large-Sample Interval for a Population Proportion p

In Chapter 4 we discussed how the normal distribution can be used to approximate the binomial distribution. Now, suppose that X is a binomial random variable with parameters n and p, where n is the number of trials and p is the probability or proportion of successes. Recall that if n is sufficiently large, then X has approximately a normal distribution with mean E(X) = np and variance V(X) = np(1 − p). Now, if X = number of successes in n trials, a reasonable estimator of the proportion of successes is p̂ = X/n, the sample fraction of successes. Since p̂ is just X divided by the constant n, the distribution of p̂ is also approximately normal. It can be shown that E(p̂) = p and V(p̂) = p(1 − p)/n. Therefore, the standardized p̂, defined by

Z = (p̂ − p)/√(p(1 − p)/n),

is approximately standard normal. Thus, using the same approach as before, we find that

P(−z_{α/2} ≤ (p̂ − p)/√(p(1 − p)/n) ≤ z_{α/2}) ≈ 1 − α.

If we rearrange the inequalities in the bracket, we obtain a large-sample confidence interval for a population proportion p, with an approximate confidence level of (1 − α)100%, given by

[ p̂ + z²_{α/2}/(2n) ± z_{α/2} √( p̂(1 − p̂)/n + z²_{α/2}/(4n²) ) ] / [ 1 + z²_{α/2}/n ].

Suppose that the width of the confidence interval for p is w. Then the sample size required to obtain a confidence interval of width w is

n = [ 2 z²_{α/2} p̂(1 − p̂) − z²_{α/2} w² ± √( 4 z⁴_{α/2} p̂(1 − p̂)[ p̂(1 − p̂) − w² ] + w² z⁴_{α/2} ) ] / w².

Unfortunately, the formula for n involves the unknown p̂. It can be shown that the maximum value of p̂(1 − p̂) is attained when p̂ = 0.5. Therefore, using the value p̂ = 0.5 will ensure that the width is at most w regardless of what value of p̂ results from the sample. Alternatively, if the investigator believes strongly, based on prior experience and information, that the true value of p, say p₀, is less than 0.5, then the value p₀ can be used in place of p̂.

Examples (see class notes for solution).
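The interval above can be sketched as a small helper function; the counts used below are the ones from Problem 21, Page 298 (133 firearm-owning households out of 539):

```python
import math

def score_interval(x, n, z):
    """Large-sample (score) confidence interval for a proportion p,
    given x successes in n trials and critical value z = z_{alpha/2}."""
    phat = x / n
    denom = 1 + z**2 / n
    center = (phat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(phat * (1 - phat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Counts from Problem 21: 133 of 539 households owned at least one firearm.
lo, hi = score_interval(133, 539, 1.96)  # approximate 95% CI for p
print((lo, hi))
```

One virtue of this form over the simpler p̂ ± z_{α/2}√(p̂(1 − p̂)/n) is that the endpoints always stay inside [0, 1], even when p̂ is close to 0 or 1.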

Problem 21, Page 298
A random sample of 539 households from a certain city was selected, and it was determined that 133 of these households owned at least one firearm. Compute a 95% confidence interval for the proportion of all households in this city that own at least one firearm.

Problem 23, Page 298
The article "An Evaluation of Football Helmets Under Impact Conditions" (American Journal of Sports Medicine, 1984) reports that when each football helmet in a random sample of 37 suspension-type helmets was subjected to a certain impact test, 24 showed damage. Let p denote the proportion of all helmets of this type that would show damage when tested in the prescribed manner.
(a) Calculate a 99% CI for p.
(b) What sample size would be required for the width of a 99% CI to be at most 0.10, irrespective of p̂?

7.3 Intervals Based on a Normal Population Distribution

The interval estimates we discussed in Section 7.1 were not practical because of the assumptions underlying them. In Section 7.2 we discussed intervals for large samples, that is, when n > 30. Here, we discuss interval estimates for small samples and the assumptions under which the interval estimation method can be used. In Section 7.2 the intervals were obtained by invoking the central limit theorem, because the sample size was assumed to be large. The key was that the standardized variable

Z = (X̄ − µ)/(S/√n)

has approximately the standard normal distribution when n is large, even when the population distribution of the random sample X₁, X₂, …, Xₙ is not normal. Now, when the sample size n is small, we cannot use the central limit theorem to obtain a random variable that has an approximate standard normal distribution. However, we can show that if n is small and the random sample X₁, X₂, …, Xₙ of size n is from a normal population, then the standardized variable

T = (X̄ − µ)/(S/√n)

has a probability distribution called the Student t-distribution with n − 1 degrees of freedom, where S is the sample standard deviation.
Notice that the standardized variable has not changed; we

have simply chosen to denote it by T rather than Z in order to emphasize the point that it is no longer normally distributed. Due to the complicated mathematical expression for the probability density function of T, we shall not provide the expression in this class. In addition, to assist us in computing probabilities without actually integrating the density function, a table of values of cumulative probabilities for the t-distribution is available in most elementary statistics texts and will be made available to us in class.

Properties of t Distributions

The normal distribution is characterized by two parameters, the mean µ and the variance σ². By that we mean that once these two parameters are known, the normal distribution is completely determined. On the other hand, the t-distribution is characterized by only one parameter, called the degrees of freedom, which we shall denote by the Greek letter ν. Possible values of ν are 1, 2, 3, …. For a fixed value of ν, some of the properties or features of the density curve of the t-distribution are as follows. Let t_ν denote the density curve of the t-distribution with ν degrees of freedom. Then,

1. each t_ν curve is bell-shaped and centered at 0. The standard normal z curve also has this property.
2. each t_ν curve is more spread out than the standard normal z curve. In other words, the variance of data from a population with a t-distribution is likely to be larger than the variance of data from a population with a standard normal distribution.
3. as ν increases, the spread of the corresponding t_ν curve decreases.
4. as ν becomes larger and larger, or tends to infinity, the sequence of t_ν curves approaches the standard normal z curve.

Notation
In Chapter 4, we introduced the z_α notation, which we called the z critical value. The equivalent notation for the t-distribution is t_{α,ν} = the point on the t-axis of the t_ν curve for which the area to the right under the curve is α. We shall call t_{α,ν} a t critical value.
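Properties 2 through 4 can be seen numerically. The critical values below are standard t-table entries for α = 0.025, and the variance formula ν/(ν − 2) for ν > 2 is a known property of the t-distribution that we state here without proof:

```python
# t critical values t_{0.025, nu} taken from a standard t table, compared
# with z_{0.025} = 1.96: each exceeds 1.96 (the t curve is more spread out),
# and they shrink toward 1.96 as nu grows.
t_crit = {5: 2.571, 10: 2.228, 30: 2.042}
z_crit = 1.96
for nu, t in sorted(t_crit.items()):
    print(nu, t, t > z_crit)

# Known fact, stated without proof: for nu > 2 the variance of the t_nu
# distribution is nu / (nu - 2), which decreases toward 1 (the standard
# normal variance) as nu increases -- properties 2 and 3 above.
variances = [nu / (nu - 2) for nu in (3, 5, 10, 30)]
print(variances)  # [3.0, 1.666..., 1.25, 1.071...]
```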
The One-Sample t Confidence Interval for a Population Mean

Consider a random sample X₁, X₂, …, Xₙ of size n from a normal population, with n < 30. Let X̄ be the sample mean. Then the standardized random variable

T = (X̄ − µ)/(S/√n)

has a t-distribution with n − 1 degrees of freedom.
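Replacing z_{α/2} by the critical value t_{α/2,n−1} yields the standard one-sample t interval x̄ ± t_{α/2,n−1} S/√n. A minimal sketch with hypothetical data (the measurements are made up for illustration; the critical value t_{0.025,9} = 2.262 comes from a standard t table):

```python
import math
import statistics

# Small-sample setting: n < 30, normal population assumed.
data = [12.1, 11.9, 12.3, 12.0, 11.8, 12.2, 12.4, 11.7, 12.0, 12.1]  # n = 10
n = len(data)
xbar = statistics.mean(data)
s = statistics.stdev(data)   # sample standard deviation S
t_crit = 2.262               # t_{alpha/2, n-1}, alpha = 0.05, 9 df

half_width = t_crit * s / math.sqrt(n)
print((xbar - half_width, xbar + half_width))  # 95% t CI for mu
```

Note that because t_{0.025,9} = 2.262 > z_{0.025} = 1.96, the t interval is wider than the corresponding z interval would be; this is the price of estimating σ from a small sample.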


More information

Closed book and notes. 120 minutes. Cover page, five pages of exam. No calculators.

Closed book and notes. 120 minutes. Cover page, five pages of exam. No calculators. IE 230 Seat # Closed book and notes. 120 minutes. Cover page, five pages of exam. No calculators. Score Final Exam, Spring 2005 (May 2) Schmeiser Closed book and notes. 120 minutes. Consider an experiment

More information

Joint probability distributions: Discrete Variables. Two Discrete Random Variables. Example 1. Example 1

Joint probability distributions: Discrete Variables. Two Discrete Random Variables. Example 1. Example 1 Joint probability distributions: Discrete Variables Two Discrete Random Variables Probability mass function (pmf) of a single discrete random variable X specifies how much probability mass is placed on

More information

ENGG2430A-Homework 2

ENGG2430A-Homework 2 ENGG3A-Homework Due on Feb 9th,. Independence vs correlation a For each of the following cases, compute the marginal pmfs from the joint pmfs. Explain whether the random variables X and Y are independent,

More information

This does not cover everything on the final. Look at the posted practice problems for other topics.

This does not cover everything on the final. Look at the posted practice problems for other topics. Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry

More information

3 Modeling Process Quality

3 Modeling Process Quality 3 Modeling Process Quality 3.1 Introduction Section 3.1 contains basic numerical and graphical methods. familiar with these methods. It is assumed the student is Goal: Review several discrete and continuous

More information

Closed book and notes. 60 minutes. Cover page and four pages of exam. No calculators.

Closed book and notes. 60 minutes. Cover page and four pages of exam. No calculators. IE 230 Seat # Closed book and notes. 60 minutes. Cover page and four pages of exam. No calculators. Score Exam #3a, Spring 2002 Schmeiser Closed book and notes. 60 minutes. 1. True or false. (for each,

More information

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear

More information

Class 26: review for final exam 18.05, Spring 2014

Class 26: review for final exam 18.05, Spring 2014 Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event

More information

Preliminary Statistics Lecture 3: Probability Models and Distributions (Outline) prelimsoas.webs.com

Preliminary Statistics Lecture 3: Probability Models and Distributions (Outline) prelimsoas.webs.com 1 School of Oriental and African Studies September 2015 Department of Economics Preliminary Statistics Lecture 3: Probability Models and Distributions (Outline) prelimsoas.webs.com Gujarati D. Basic Econometrics,

More information

1 Probability and Random Variables

1 Probability and Random Variables 1 Probability and Random Variables The models that you have seen thus far are deterministic models. For any time t, there is a unique solution X(t). On the other hand, stochastic models will result in

More information

Special Discrete RV s. Then X = the number of successes is a binomial RV. X ~ Bin(n,p).

Special Discrete RV s. Then X = the number of successes is a binomial RV. X ~ Bin(n,p). Sect 3.4: Binomial RV Special Discrete RV s 1. Assumptions and definition i. Experiment consists of n repeated trials ii. iii. iv. There are only two possible outcomes on each trial: success (S) or failure

More information

Introduction to Probability

Introduction to Probability LECTURE NOTES Course 6.041-6.431 M.I.T. FALL 2000 Introduction to Probability Dimitri P. Bertsekas and John N. Tsitsiklis Professors of Electrical Engineering and Computer Science Massachusetts Institute

More information

Course information: Instructor: Tim Hanson, Leconte 219C, phone Office hours: Tuesday/Thursday 11-12, Wednesday 10-12, and by appointment.

Course information: Instructor: Tim Hanson, Leconte 219C, phone Office hours: Tuesday/Thursday 11-12, Wednesday 10-12, and by appointment. Course information: Instructor: Tim Hanson, Leconte 219C, phone 777-3859. Office hours: Tuesday/Thursday 11-12, Wednesday 10-12, and by appointment. Text: Applied Linear Statistical Models (5th Edition),

More information

Point Estimation and Confidence Interval

Point Estimation and Confidence Interval Chapter 8 Point Estimation and Confidence Interval 8.1 Point estimator The purpose of point estimation is to use a function of the sample data to estimate the unknown parameter. Definition 8.1 A parameter

More information

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions

More information

STAT515, Review Worksheet for Midterm 2 Spring 2019

STAT515, Review Worksheet for Midterm 2 Spring 2019 STAT55, Review Worksheet for Midterm 2 Spring 29. During a week, the proportion of time X that a machine is down for maintenance or repair has the following probability density function: 2( x, x, f(x The

More information

Chapter 2. Continuous random variables

Chapter 2. Continuous random variables Chapter 2 Continuous random variables Outline Review of probability: events and probability Random variable Probability and Cumulative distribution function Review of discrete random variable Introduction

More information

Review of Probability Theory

Review of Probability Theory Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving

More information

Learning Objectives for Stat 225

Learning Objectives for Stat 225 Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:

More information

1: PROBABILITY REVIEW

1: PROBABILITY REVIEW 1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following

More information

Week 9 The Central Limit Theorem and Estimation Concepts

Week 9 The Central Limit Theorem and Estimation Concepts Week 9 and Estimation Concepts Week 9 and Estimation Concepts Week 9 Objectives 1 The Law of Large Numbers and the concept of consistency of averages are introduced. The condition of existence of the population

More information

Continuous Random Variables and Probability Distributions

Continuous Random Variables and Probability Distributions Continuous Random Variables and Probability Distributions Instructor: Lingsong Zhang 1 4.1 Probability Density Functions Probability Density Functions Recall from Chapter 3 that a random variable X is

More information

Statistics for Economists Lectures 6 & 7. Asrat Temesgen Stockholm University

Statistics for Economists Lectures 6 & 7. Asrat Temesgen Stockholm University Statistics for Economists Lectures 6 & 7 Asrat Temesgen Stockholm University 1 Chapter 4- Bivariate Distributions 41 Distributions of two random variables Definition 41-1: Let X and Y be two random variables

More information

Stat 315: HW #6. Fall Due: Wednesday, October 10, 2018

Stat 315: HW #6. Fall Due: Wednesday, October 10, 2018 Stat 315: HW #6 Fall 018 Due: Wednesday, October 10, 018 Updated: Monday, October 8 for misprints. 1. An airport shuttle route includes two intersections with traffic lights. Let i be the number of lights

More information

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics. Probability Distributions Prof. Dr. Klaus Reygers (lectures) Dr. Sebastian Neubert (tutorials) Heidelberg University WS 07/8 Gaussian g(x; µ, )= p exp (x µ) https://en.wikipedia.org/wiki/normal_distribution

More information

Estadística I Exercises Chapter 4 Academic year 2015/16

Estadística I Exercises Chapter 4 Academic year 2015/16 Estadística I Exercises Chapter 4 Academic year 2015/16 1. An urn contains 15 balls numbered from 2 to 16. One ball is drawn at random and its number is reported. (a) Define the following events by listing

More information

Solutions - Final Exam

Solutions - Final Exam Solutions - Final Exam Instructors: Dr. A. Grine and Dr. A. Ben Ghorbal Sections: 170, 171, 172, 173 Total Marks Exercise 1 7 Exercise 2 6 Exercise 3 6 Exercise 4 6 Exercise 5 6 Exercise 6 9 Total 40 Score

More information

Random Variables and Expectations

Random Variables and Expectations Inside ECOOMICS Random Variables Introduction to Econometrics Random Variables and Expectations A random variable has an outcome that is determined by an experiment and takes on a numerical value. A procedure

More information

Exponential, Gamma and Normal Distribuions

Exponential, Gamma and Normal Distribuions Exponential, Gamma and Normal Distribuions Sections 5.4, 5.5 & 6.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 9-3339 Cathy Poliak,

More information

Probability. Lecture Notes. Adolfo J. Rumbos

Probability. Lecture Notes. Adolfo J. Rumbos Probability Lecture Notes Adolfo J. Rumbos October 20, 204 2 Contents Introduction 5. An example from statistical inference................ 5 2 Probability Spaces 9 2. Sample Spaces and σ fields.....................

More information

Confidence intervals CE 311S

Confidence intervals CE 311S CE 311S PREVIEW OF STATISTICS The first part of the class was about probability. P(H) = 0.5 P(T) = 0.5 HTTHHTTTTHHTHTHH If we know how a random process works, what will we see in the field? Preview of

More information

Twelfth Problem Assignment

Twelfth Problem Assignment EECS 401 Not Graded PROBLEM 1 Let X 1, X 2,... be a sequence of independent random variables that are uniformly distributed between 0 and 1. Consider a sequence defined by (a) Y n = max(x 1, X 2,..., X

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

CS145: Probability & Computing

CS145: Probability & Computing CS45: Probability & Computing Lecture 5: Concentration Inequalities, Law of Large Numbers, Central Limit Theorem Instructor: Eli Upfal Brown University Computer Science Figure credits: Bertsekas & Tsitsiklis,

More information

Definition: A random variable X is a real valued function that maps a sample space S into the space of real numbers R. X : S R

Definition: A random variable X is a real valued function that maps a sample space S into the space of real numbers R. X : S R Random Variables Definition: A random variable X is a real valued function that maps a sample space S into the space of real numbers R. X : S R As such, a random variable summarizes the outcome of an experiment

More information

37.3. The Poisson Distribution. Introduction. Prerequisites. Learning Outcomes

37.3. The Poisson Distribution. Introduction. Prerequisites. Learning Outcomes The Poisson Distribution 37.3 Introduction In this Section we introduce a probability model which can be used when the outcome of an experiment is a random variable taking on positive integer values and

More information

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416)

SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) SUMMARY OF PROBABILITY CONCEPTS SO FAR (SUPPLEMENT FOR MA416) D. ARAPURA This is a summary of the essential material covered so far. The final will be cumulative. I ve also included some review problems

More information

Continuous Random Variables. and Probability Distributions. Continuous Random Variables and Probability Distributions ( ) ( ) Chapter 4 4.

Continuous Random Variables. and Probability Distributions. Continuous Random Variables and Probability Distributions ( ) ( ) Chapter 4 4. UCLA STAT 11 A Applied Probability & Statistics for Engineers Instructor: Ivo Dinov, Asst. Prof. In Statistics and Neurology Teaching Assistant: Christopher Barr University of California, Los Angeles,

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

Continuous distributions

Continuous distributions CHAPTER 7 Continuous distributions 7.. Introduction A r.v. X is said to have a continuous distribution if there exists a nonnegative function f such that P(a X b) = ˆ b a f(x)dx for every a and b. distribution.)

More information

BASICS OF PROBABILITY

BASICS OF PROBABILITY October 10, 2018 BASICS OF PROBABILITY Randomness, sample space and probability Probability is concerned with random experiments. That is, an experiment, the outcome of which cannot be predicted with certainty,

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Semester , Example Exam 1

Semester , Example Exam 1 Semester 1 2017, Example Exam 1 1 of 10 Instructions The exam consists of 4 questions, 1-4. Each question has four items, a-d. Within each question: Item (a) carries a weight of 8 marks. Item (b) carries

More information

HT Introduction. P(X i = x i ) = e λ λ x i

HT Introduction. P(X i = x i ) = e λ λ x i MODS STATISTICS Introduction. HT 2012 Simon Myers, Department of Statistics (and The Wellcome Trust Centre for Human Genetics) myers@stats.ox.ac.uk We will be concerned with the mathematical framework

More information

CME 106: Review Probability theory

CME 106: Review Probability theory : Probability theory Sven Schmit April 3, 2015 1 Overview In the first half of the course, we covered topics from probability theory. The difference between statistics and probability theory is the following:

More information

15 Discrete Distributions

15 Discrete Distributions Lecture Note 6 Special Distributions (Discrete and Continuous) MIT 4.30 Spring 006 Herman Bennett 5 Discrete Distributions We have already seen the binomial distribution and the uniform distribution. 5.

More information

Arkansas Tech University MATH 3513: Applied Statistics I Dr. Marcel B. Finan

Arkansas Tech University MATH 3513: Applied Statistics I Dr. Marcel B. Finan 2.4 Random Variables Arkansas Tech University MATH 3513: Applied Statistics I Dr. Marcel B. Finan By definition, a random variable X is a function with domain the sample space and range a subset of the

More information

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions Introduction to Statistical Data Analysis Lecture 3: Probability Distributions James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved. 1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions

More information