Some Special Distributions (Hogg Chapter Three)

STAT 405: Mathematical Statistics I
Fall Semester 2013

Contents

1 Binomial & Related Distributions
  1.1 The Binomial Distribution
    1.1.1 Some advice on calculating (n choose k)
    1.1.2 Properties of the Binomial Distribution
  1.2 The Multinomial Distribution
  1.3 More on the Binomial Distribution
  1.4 The Negative Binomial Distribution
  1.5 The Hypergeometric Distribution

2 The Poisson Distribution
  2.1 Aside: Simulating a Poisson Process
  2.2 Poisson Process and Rate
  2.3 Connection to Binomial Distribution and Derivation of pmf

3 Gamma and Related Distributions
  3.1 Gamma (Γ) Distribution
  3.2 Exponential Distribution
  3.3 Chi-Square (χ²) Distribution
  3.4 Beta (β) Distribution

4 The Normal Distribution

5 Multivariate Normal Distribution
  5.1 Linear Algebra: Reminders and Notation
  5.2 Special Case: Independent Gaussian Random Variables
  5.3 Multivariate Distributions
  5.4 General Multivariate Normal Distribution
  5.5 Consequences of the Multivariate Normal Distribution
    5.5.1 Connection to χ² Distribution
  5.6 Marginal Distribution
  5.7 Conditional Distribution

6 The t- and F Distributions
  6.1 The t distribution
  6.2 The F distribution
  6.3 Student's Theorem

Copyright 2013, John T. Whelan, and all that

Tuesday 8 October 2013
Read Section 3.1 of Hogg

1 Binomial & Related Distributions

1.1 The Binomial Distribution

Recall our first example of a discrete random variable, the number of heads on three flips of a fair coin. In that case, we were able to count the number of outcomes in each event, e.g., there were three ways to get two heads and a tail: {HHT, HTH, THH}, and each of the eight possible outcomes had the same probability, 1/8 = 0.125. If we consider a case where the coin is weighted so that each flip has a 60% chance of being a head and only a 40% chance of being a tail, then the probability of each of the ways to get two heads and a tail is:

  P(HHT) = (0.6)(0.6)(0.4) = 0.144
  P(HTH) = (0.6)(0.4)(0.6) = 0.144
  P(THH) = (0.4)(0.6)(0.6) = 0.144

In each case, the probability is 0.144, so if X is the number of heads in three coin flips, P(X = 2) = 3 × 0.144 = 0.432. We can work out all of the probabilities in this way:

  #H   outcomes            #   prob(each outcome)        tot prob
  0    {TTT}               1   (0.4)^3 = 0.064            0.064
  1    {TTH, THT, HTT}     3   (0.6)(0.4)^2 = 0.096       0.288
  2    {THH, HTH, HHT}     3   (0.6)^2 (0.4) = 0.144      0.432
  3    {HHH}               1   (0.6)^3 = 0.216            0.216

This is an example of a binomial random variable. We have a set of n identical, independent trials, experiments which could each turn out one of two ways. We call one of those possible results "success" (e.g., heads) and the other "failure" (e.g., tails). The probability of success on a given trial is some number p ∈ [0, 1], so the non-negative integer n and real number p are parameters of the distribution. The random variable X is the number of successes in the n trials. Evidently, the possible values for X are 0, 1, 2, ..., n. The probability of x successes is

  p(x) = P(X = x) = (# of outcomes)(prob of each outcome)

We can generalize the discussion above to see that the probability for any particular sequence containing x successes, and therefore n − x failures, is p^x (1−p)^(n−x). In the simple example we just listed all of the outcomes in a particular event, but that won't be practical in general. Instead we fall back on a result from combinatorics. The number of ways of choosing x out of the n trials, which we call "n choose x", is

  (n choose x) = [n(n−1)(n−2)···(n−x+1)] / [x(x−1)···(2)(1)] = n! / [(n−x)! x!]

As a reminder, consider the number of distinct poker hands, consisting of five cards chosen from the 52-card deck. If we consider first the cards dealt out in order on the table in front of us, there are 52 possibilities for the first card, 51 for the second, 50 for the third, 49 for the fourth and 48 for the fifth. So the number of ways five cards could be dealt to us in order is 52·51·50·49·48 = 52!/47!. But that is overcounting the number of distinct poker hands, since we can pick up the five cards and rearrange them any way we like. Since we have 5 choices for the first card, 4 for the second, 3 for the third, 2 for the fourth and 1 for the fifth, any five-card poker hand can be rearranged 5·4·3·2·1 = 5! different ways. So in the list of 52!/47! ordered poker hands, each unordered hand

is represented 5! times. Thus the number of unordered poker hands is 52!/(47! 5!) = (52 choose 5). Returning to the binomial distribution, this makes the pmf

  p(x) = (n choose x) p^x (1−p)^(n−x),   x = 0, 1, 2, ..., n

1.1.1 Some advice on calculating (n choose k)

There is some art to calculating the binomial coëfficient

  (n choose k) = n! / [(n−k)! k!]

For one thing, you basically never want to actually calculate the factorials. For example, consider (100 choose 2). It's easy to calculate if we write

  (100 choose 2) = (100 · 99)/(2 · 1) = 4950

On the other hand, if we write

  (100 choose 2) = 100! / (98! 2!)

we find that 100! and 98! are enormous numbers which will overflow calculators, single precision programs, etc. A few identities which make these calculations easier:

  (n choose k) = n! / [k! (n−k)!] = (n choose n−k)
  (n choose 0) = (n choose n) = n! / (n! 0!) = 1
  (n choose 1) = (n choose n−1) = n! / [(n−1)! 1!] = n

Note that this uses the definition that 0! = 1. If k is small, you can write out the fraction, cancelling out n − k factors to leave k factors in the numerator and k factors (including 1) in the denominator. If n − k is small, you cancel out k factors to leave n − k. If k is really small you can use Pascal's triangle

  n = 0:          1
  n = 1:         1 1
  n = 2:        1 2 1
  n = 3:       1 3 3 1
  n = 4:      1 4 6 4 1

where element k of row n is (n choose k) and is created for n > 0 by the recursion relation

  (n choose k) = (n−1 choose k−1) + (n−1 choose k)

I mention this mostly as a reminder that the binomial coëfficient is what appears in the binomial expansion

  (a + b)^n = Σ_{k=0}^{n} (n choose k) a^k b^(n−k)

This can be used to show the binomial pmf is properly normalized:

  Σ_{x=0}^{n} p(x) = Σ_{x=0}^{n} (n choose x) p^x (1−p)^(n−x) = (p + [1−p])^n = 1^n = 1
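As an aside not in Hogg, here is a minimal python sketch of the two approaches just described: computing a binomial coëfficient directly (modern languages with exact integer arithmetic handle this fine) and working with logarithms of factorials when n and k are large. The helper name log_binom and the specific numbers are just for illustration.

  from math import comb, lgamma, exp

  # exact integer arithmetic; python integers do not overflow
  print(comb(100, 2))              # 4950

  # for very large n and k, work with ln of the factorials instead
  def log_binom(n, k):
      # ln C(n,k) = ln Gamma(n+1) - ln Gamma(k+1) - ln Gamma(n-k+1)
      return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

  print(exp(log_binom(100, 2)))    # approximately 4950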

As a curiosity, note that sometimes a general formula [such as the recursion relation above] might tell us to calculate some sort of impossible binomial coëfficient, like (2 choose 5) or (4 choose 6). Logically, it would seem reasonable to declare these to be zero, since there are no ways, e.g., to pick six different items from a list of four. In fact, there's a sense in which the definition with factorials extends to this. For example,

  (3 choose 4) = 3! / [4! (−1)!]

Now, the factorial of a negative number is not really defined, but if we try to extend the definition, it turns out to be, in some sense, infinite. We will, in the near future, define something known as the gamma function, usually defined by the integral

  Γ(α) = ∫_0^∞ t^(α−1) e^(−t) dt

It's easy to do the integral for α = 1 and show that Γ(1) = 1. Using either integration by parts or parametric differentiation, we can show that Γ(α + 1) = α Γ(α), from which you can show that for non-negative integer n, Γ(n + 1) = n!. If we try to use the recursion relation to find Γ(0), which would be (−1)!, we'd get

  Γ(1) = 0 · Γ(0)

which is not possible if Γ(0) is finite. So it makes some sense to replace 1/(−1)! with zero. Proceeding in this way you can show that Γ(α) also blows up if α is any negative integer. We can also use the Gamma function to extend the definition of the factorial to non-integer values, but that's a matter for another time.

Finally, there's the question of how to calculate binomial probabilities when x and n − x are really large. At this point, you're going to be using a computer anyway. There are a few tricks for dealing with the large factorials, primarily by calculating ln p(x) rather than p(x), but many statistical software packages do a lot of the work for you. For instance, in R, you can calculate the binomial pmf and/or cdf with commands like

  n <- 50
  p <- 0.5
  x <- 0:n
  px <- dbinom(x, n, p)
  Fx <- pbinom(x, n, p)

or with python

  import numpy
  from scipy.stats import binom
  n = 50
  p = 0.5
  x = numpy.arange(n + 1)
  px = binom.pmf(x, n, p)
  Fx = binom.cdf(x, n, p)

1.1.2 Properties of the Binomial Distribution

Returning to the binomial distribution, we can find its moment generating function and use it to find the mean and variance relatively quickly:

  M(t) = E(e^(tX)) = Σ_{x=0}^{n} e^(tx) p(x) = Σ_{x=0}^{n} (n choose x) (p e^t)^x (1−p)^(n−x) = (p e^t + [1−p])^n

It is often easier to work with the logarithm of the mgf, known as the cumulant generating function, which you studied in a Hogg exercise:

  ψ(t) = ln M(t) = n ln(p e^t + [1−p])

We use the chain rule to take its derivative:

  ψ′(t) = n p e^t / (p e^t + [1−p])

from which we get the mean

  E(X) = ψ′(0) = np / (p + [1−p]) = np

and differentiating again gives

  ψ″(t) = n p e^t / (p e^t + [1−p]) − n p e^t · p e^t / (p e^t + [1−p])²

from which we get the variance

  Var(X) = ψ″(0) = np − (np)p = np(1−p)

These should be familiar results; the mean is sort of obvious if you think about it, but deriving them directly requires tricky manipulations of the sums. With the mgf it's child's play.

1.2 The Multinomial Distribution

The binomial distribution can be thought of as a special case of the multinomial distribution, which consists of a set of n identical and independent experiments, each of which has k possible outcomes, with probabilities p_1, p_2, ..., p_k, with p_1 + p_2 + ··· + p_k = 1. For the binomial case, k = 2, p_1 becomes p, and p_2 is 1 − p. Define random variables X_1, X_2, ..., X_{k−1} which are the numbers of experiments out of the n total with each outcome. We could also define X_k, but it is completely determined by the others as X_k = n − X_1 − X_2 − ··· − X_{k−1}. The probability of any particular sequence of outcomes of which x_1 are of the first sort, x_2 of the second sort, etc., with x_1 + x_2 + ··· + x_k = n, is

  p_1^(x_1) p_2^(x_2) ··· p_k^(x_k)

How many different such outcomes are there? Well, there are

  (n choose x_1) = n! / [x_1! (n − x_1)!]

ways to pick the x_1 experiments with the first outcome. Once we've done that, there are n − x_1 possibilities from which to choose the x_2 outcomes of the second sort, so there are

  (n − x_1 choose x_2) = (n − x_1)! / [x_2! (n − x_1 − x_2)!]

ways to do that, or a total of

  (n choose x_1)(n − x_1 choose x_2) = [n! / (x_1!(n − x_1)!)] · [(n − x_1)! / (x_2!(n − x_1 − x_2)!)] = n! / [x_1! x_2! (n − x_1 − x_2)!]

ways to pick the experiments that end up with the first two sorts of outcomes. Continuing in this way, we find the total number of outcomes of this sort to be

  (n choose x_1)(n − x_1 choose x_2)(n − x_1 − x_2 choose x_3) ··· (n − x_1 − ··· − x_{k−2} choose x_{k−1}) = n! / [x_1! x_2! ··· x_k!]

So the joint pmf for these multinomial random variables is

  p(x_1, x_2, ..., x_{k−1}) = [n! / (x_1! x_2! ··· x_k!)] p_1^(x_1) p_2^(x_2) ··· p_k^(x_k),
      x_1 = 0, 1, ..., n;  x_2 = 0, 1, ..., n − x_1;  ...;  x_k = n − x_1 − x_2 − ··· − x_{k−1}
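As a quick numerical check (not from Hogg) of the multinomial pmf, we can compare the formula above to scipy's implementation; the values n = 10 and p = (0.5, 0.3, 0.2) are made up for illustration.

  from math import factorial
  from scipy.stats import multinomial

  n = 10
  p = [0.5, 0.3, 0.2]
  x = [5, 3, 2]                      # one possible outcome with x1 + x2 + x3 = n

  # pmf from the formula n!/(x1! x2! x3!) p1^x1 p2^x2 p3^x3
  coef = factorial(n) // (factorial(x[0]) * factorial(x[1]) * factorial(x[2]))
  print(coef * p[0]**x[0] * p[1]**x[1] * p[2]**x[2])

  # the same number from scipy
  print(multinomial.pmf(x, n=n, p=p))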

Thursday 10 October 2013
Read Section 3.2 of Hogg

1.3 More on the Binomial Distribution

Recall that if we add two independent random variables X_1 and X_2 with mgfs M_1(t) = E(e^(tX_1)) and M_2(t) = E(e^(tX_2)), we can work out the distribution of their sum Y = X_1 + X_2 by finding its mgf:

  M_Y(t) = E(e^(tY)) = E(e^(t[X_1+X_2])) = E(e^(tX_1) e^(tX_2)) = E(e^(tX_1)) E(e^(tX_2)) = M_1(t) M_2(t)

where we have used the fact that the expectation value of a product of functions of independent random variables is the product of their expectation values. Now suppose we add independent binomial random variables: X_1 has n_1 trials with a probability of p (the notation introduced by Hogg calls this a b(n_1, p)) and X_2 has n_2 trials, also with a probability of p (which Hogg would call b(n_2, p)). Their mgfs are thus

  M_1(t) = (p e^t + [1−p])^(n_1)   and   M_2(t) = (p e^t + [1−p])^(n_2)

If we call their sum Y = X_1 + X_2, its mgf is

  M_Y(t) = M_1(t) M_2(t) = (p e^t + [1−p])^(n_1+n_2)

but this is just the mgf of a b(n_1 + n_2, p) distribution. This important result also makes sense from the definition of the binomial distribution. Adding the number of successes in n_1 trials and in n_2 more trials is the same as counting the number of successes in n_1 + n_2 trials, as long as all the trials are independent, and the probability for success on each one is p.

1.4 The Negative Binomial Distribution

The binomial distribution is appropriate for describing a situation where the number of trials is set in advance. Sometimes, however, one decides to do as many trials as are needed to obtain a certain number of successes. The random variable in that situation would then be the number of trials that were required, or equivalently, the number of failures that occurred. This is known as a negative binomial random variable: given independent trials, each with probability p of success, X is the number of failures before the rth success. This probability is a little more involved to work out. To get X = x, we have to have r − 1 successes and x failures in the first x + r − 1 trials, and then a success in the last trial, so the pmf is

  p(x) = (x + r − 1 choose r − 1) p^(r−1) (1−p)^x · p = (x + r − 1 choose r − 1) p^r (1−p)^x

We can have any non-negative integer number of failures, so the pmf is defined for x = 0, 1, 2, .... You will explore this distribution on the homework, and show that the mgf is

  M(t) = p^r [1 − (1−p) e^t]^(−r)

The factor in brackets is a binomial raised to a negative power, which is where the name "negative binomial distribution" comes from. (A short numerical check of the pmf appears after the aside below.) Note that in the special case r = 1, this is the number of failures before the first success, which is the geometric distribution which you've considered on a prior homework.

As an aside, the distinction between an experiment where you've decided in advance to do n trials, and one where you've decided to stop after r successes, leads to a complication in classical statistical inference called the stopping problem. The problem arises because in classical, or frequentist, statistical inference, you have to analyze the results of your experiment in terms of what would happen if you repeated the same experiment many times, and think about how likely results like the ones you saw would be given various statistical models and parameter values. To answer such questions, you have to know what experiment you were planning to do, not just what observations you actually made. One of the advantages of Bayesian statistical inference, which asks the more direct question of how likely various models and parameter values are, given what you actually observed, is that you typically don't have to say what you'd do if you repeated the experiment.
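Here is the short numerical check of the negative binomial pmf promised above; it is not from Hogg, and the values of p, r, and x are arbitrary. Note that scipy's nbinom uses the same convention as these notes, counting failures before the rth success.

  from math import comb
  from scipy.stats import nbinom

  p, r = 0.3, 4      # success probability and required number of successes
  x = 6              # number of failures before the r-th success

  # direct formula: C(x+r-1, r-1) p^r (1-p)^x
  print(comb(x + r - 1, r - 1) * p**r * (1 - p)**x)

  # scipy's negative binomial pmf, parametrized by (r, p)
  print(nbinom.pmf(x, r, p))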

1.5 The Hypergeometric Distribution

Both the binomial and negative binomial distributions result from repeated independent trials with the same probability p of success on each trial. This is a situation that's sometimes called sampling with replacement, since it's what would happen if you put, e.g., 35 red balls and 65 white balls in a bowl, drew one at random, noted its color, put it back, mixed up the balls, and repeated. Each time there would be a probability of 35/100 of drawing a red ball. If, on the other hand, you picked the ball out and didn't put it back, the probability of drawing a red ball on the next try would change. If the first ball was red, you'd have only a 34/99 chance of the next one being red (since there are now 34 red balls and still 65 white balls in the bowl), while if the first one was white, the probability for the second one to be red would go up to 35/99. This is called sampling without replacement. In practice, making some sort of huge tree diagram for all of the conditional probabilities, draw after draw, is impractical, so instead, if you want to know the probability to draw, say, three red balls out of ten, you count up the number of ways to do it. If you pick 10 balls out of a bowl containing 100, there are 100!/(90! 10!) = (100 choose 10) ways to do that. To count the number of those ways that have 3 red balls and 7 white balls, you need to count the number of ways to pick 3 of the 35 red balls, which is (35 choose 3) = 35!/(32! 3!), times the number of ways to pick 7 of the 65 white balls, which is (65 choose 7) = 65!/(58! 7!), so the probability of drawing exactly 3 red balls out of 10 when sampling without replacement from a bowl with 35 red balls out of 100 is

  [(35 choose 3)(65 choose 7)] / (100 choose 10)

In general, if there are N balls, of which D are red, and we draw n of them without replacement, the number of red balls in our sample will be a hypergeometric random variable X whose pmf is

  p(x) = (D choose x)(N − D choose n − x) / (N choose n)

As you might guess from the name, the mgf for a hypergeometric distribution is a hypergeometric function, which doesn't simplify things much. When D and N − D are both large compared to n, the hypergeometric distribution reduces approximately to the binomial distribution. Jaynes [E. T. Jaynes, Probability Theory: The Logic of Science (Cambridge, 2003)] uses the hypergeometric distribution in many of his examples, to avoid making the simplifying assumption of sampling with replacement.
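A quick check (not from the notes) of the red-ball example above, using scipy's hypergeometric distribution; scipy's convention hypergeom.pmf takes, in order, the observed count, the population size, the number of "successes" in the population, and the number of draws.

  from math import comb
  from scipy.stats import hypergeom

  N, D, n = 100, 35, 10   # total balls, red balls, number drawn
  x = 3                   # red balls in the sample

  # direct formula: C(D,x) C(N-D, n-x) / C(N, n)
  print(comb(D, x) * comb(N - D, n - x) / comb(N, n))

  # scipy's hypergeometric pmf
  print(hypergeom.pmf(x, N, D, n))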

2 The Poisson Distribution

Consider the numerous statistical statements you get like:

1. On average 83 people per day are killed in traffic accidents in the US.
2. On average there are 367 gamma-ray bursts detected per year by orbiting satellites.
3. During a rainstorm, an average of 994 raindrops falls on a square foot of ground each minute.

(I fabricated the last few significant figures in each of these numbers to produce a concrete example.)

In each of these cases, the number of events, X, occurring in one representative interval is a discrete random variable with a probability mass function. The average number of events occurring is a parameter of this distribution, which Hogg writes as m. From the information given above, we only know the mean value of the distribution, E(X) = m. The extra piece of information that defines a Poisson random variable is that each of the discrete events to be counted is independent of the others.

2.1 Aside: Simulating a Poisson Process

To underline what this independence of events means, and further to illustrate the kind of situation described by a Poisson distribution, consider the following thought experiment. Suppose there's a field divided into a million square-foot patches, and I'm told that there is an average of 4.5 beetles per square-foot patch, with the location of each beetle independent of the others. If I wanted to simulate that situation, I could distribute 4,500,000 beetles randomly among the one million patches. I'd do this by placing the beetles one at a time, randomly choosing a patch among the million available for each beetle, without regard to where any of the other beetles were placed. (A short simulation sketch appears below, after the discussion of rates.) The number of beetles in any one patch would then be pretty close to a Poisson random variable. It wouldn't be exactly a Poisson rv, because the requirement of exactly 4,500,000 beetles total would break the independence of the different patches, but in the limit of infinitely many patches, and a correspondingly large total number of beetles, it would become an arbitrarily good approximation. In fact, the numbers of beetles in each of the one million patches would make a good statistical ensemble for approximately describing the probability distribution of the Poisson random variable.

2.2 Poisson Process and Rate

The scenarios described by a Poisson random variable are often associated with some sort of a rate: 83 deaths per day, 367 GRBs per year, 994 raindrops per square foot per minute, 4.5 beetles per square foot. We talk about a Poisson process with an associated rate α (e.g., a number of events per unit time), and if we count all the events in a certain interval of size T (e.g., a specified duration of time), it is a Poisson random variable with parameter m = αT. So for example, if the clicks on a Geiger counter are described by a Poisson process with a rate of 4.5 clicks per minute, the number of clicks in an arbitrarily chosen one-minute interval of time will be a Poisson random variable with m = 4.5. If the interval of time is 30 seconds (half a minute), the Poisson parameter will be m = (4.5/minute)(0.5 minute) = 2.25. If we count the clicks in four minutes, the number of clicks will be a Poisson random variable with m = (4.5/minute)(4 minutes) = 18. Note that m will always be a number, and will not contain a "per minute" or "per square foot", or anything. That will be part of the rate α.

2.3 Connection to Binomial Distribution and Derivation of pmf

The key information that lets us derive the pmf of a Poisson distribution is what happens if you break the interval up into smaller pieces. If there's a Poisson process at play, the number of events in each subinterval will also be a Poisson random variable, and they will all be independent of each other. So if we divide the day into 100 equal pieces of 14 minutes 24 seconds each, the number of traffic deaths in each is an independent Poisson random variable with a mean value of 0.83. Now, in practice this assumption will often not quite be true: the rate of traffic deaths is higher at some times of the day than others, some patches of ground may be more prone to be rained on because of wind patterns, etc., but we can imagine an idealized situation in which this subdivision works.
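To make the beetle thought experiment from Section 2.1 concrete, here is a minimal simulation sketch in python (not from the notes). It places 4,500,000 beetles uniformly at random among 1,000,000 patches and compares the fraction of patches with k beetles to the Poisson pmf with m = 4.5.

  import numpy as np
  from scipy.stats import poisson

  rng = np.random.default_rng(0)
  n_patches = 1_000_000
  n_beetles = 4_500_000                 # an average of 4.5 beetles per patch

  # place each beetle in a uniformly chosen patch, independently of the others
  patches = rng.integers(0, n_patches, size=n_beetles)
  counts = np.bincount(patches, minlength=n_patches)

  # empirical fraction of patches with k beetles vs. the Poisson(4.5) pmf
  for k in range(8):
      print(k, (counts == k).mean(), poisson.pmf(k, 4.5))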

Okay, so how do we get the pmf? Take the interval in question (time interval, area of ground, or whatever) and divide it into n pieces. Each one of them will have an average of m/n events. If we make n really big, so that m/n ≪ 1, the probability of getting one event in that little piece will be small, and the probability that two or more of them happen to occur in the same piece is even smaller, and we can ignore it to a good approximation. We can always make the approximation better by making n bigger. That means that the number of events in that little piece, call it Y, has a pmf of

  p_Y(0) ≈ 1 − p
  p_Y(1) ≈ p
  p_Y(y > 1) ≈ 0

In order to have E(Y) = m/n, the probability of an event occurring in that little piece has to be p = m/n. But we have now described the conditions for a binomial distribution! Each of the n tiny sub-pieces of the interval is like a trial, a piece with an event is a success, and a piece with no event is a failure. So the pmf for this Poisson random variable must be the limit of a binomial distribution as the number of trials gets large:

  p(x) = lim_{n→∞} [n! / (x!(n−x)!)] (m/n)^x (1 − m/n)^(n−x)
       = (m^x / x!) lim_{n→∞} [n! / (n^x (n−x)!)] (1 − m/n)^n (1 − m/n)^(−x)

Now, the ratio of factorials is a product of x things:

  n! / (n−x)! = n(n−1)(n−2)···(n−x+1)

The last factor is of course also the product of x things, i.e., x identical copies of

  1 / (1 − m/n) = n / (n − m)

But that means the two of them together give you

  [n! / (n−x)!] (1/n^x) (1 − m/n)^(−x) = [n/(n−m)] [(n−1)/(n−m)] ··· [(n−x+1)/(n−m)]

which is the product of x fractions, each of which goes to 1 as n goes to infinity, so we can lose that factor in the limit and get

  p(x) = (m^x / x!) lim_{n→∞} (1 − m/n)^n = (m^x / x!) e^(−m)

where we've used the exponential function

  e^α = lim_{n→∞} (1 + α/n)^n

If we recall that the exponential function also has the Maclaurin series

  e^α = Σ_{k=0}^{∞} α^k / k!

we can see that the pmf is normalized:

  Σ_{x=0}^{∞} p(x) = Σ_{x=0}^{∞} (m^x / x!) e^(−m) = e^(−m) Σ_{x=0}^{∞} m^x / x! = e^(−m) e^m = 1
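A short numerical illustration (not from the notes) of the limit just derived: for fixed m and x, the binomial pmf with p = m/n approaches the Poisson pmf as n grows. The values m = 3 and x = 5 are arbitrary.

  from scipy.stats import binom, poisson

  m, x = 3.0, 5
  for n in [10, 100, 1000, 10000]:
      print(n, binom.pmf(x, n, m / n))
  print("Poisson limit:", poisson.pmf(x, m))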

We can also use this series to get the mgf:

  M(t) = Σ_{x=0}^{∞} e^(tx) (m^x / x!) e^(−m) = e^(−m) Σ_{x=0}^{∞} (m e^t)^x / x! = e^(−m) e^(m e^t) = e^(m[e^t − 1])

To find the mean and variance, it's useful once again to use the cumulant generating function ψ(t) = ln M(t):

  ψ(t) = m(e^t − 1)

Differentiating gives us

  ψ′(t) = m e^t   and   ψ″(t) = m e^t

and so the mean is

  E(X) = ψ′(0) = m

which was the definition we started with, and the variance is

  Var(X) = ψ″(0) = m

Note that these are also the limits of the mean and variance of the binomial distribution. We can also use the mgf to show what happens when we add two independent Poisson random variables with means m_1 and m_2; the mgf of their sum is

  M(t) = e^(m_1[e^t − 1]) e^(m_2[e^t − 1]) = e^((m_1+m_2)[e^t − 1])

which is the mgf of a Poisson random variable with mean m_1 + m_2, so the sum of two Poisson rvs is itself a Poisson rv.

Tuesday 15 October 2013: No Class (Monday Schedule)

Thursday 17 October 2013
Read Section 3.3 of Hogg

3 Gamma and Related Distributions

3.1 Gamma (Γ) Distribution

We now consider our first family of continuous distributions, the Gamma distribution. A Gamma random variable has a pdf with two parameters α > 0 and β > 0:

  f(x) = [1 / (Γ(α) β^α)] x^(α−1) e^(−x/β),   0 < x < ∞

We sometimes say that such a random variable has a Γ(α, β) distribution. (Note that in an unfortunate collision of notation, Γ(α) is a function which returns a number for each value of α, while Γ(α, β) is a label we use to refer to the distribution corresponding to a choice of α and β.) As we'll see, the β parameter just changes the scale of the distribution function, but α actually changes the shape. We will show how special cases of the Gamma distribution describe situations of interest, notably the exponential distribution with rate λ, which is Γ(1, 1/λ), and the chi-squared distribution with r degrees of freedom (also known as χ²(r)), which is Γ(r/2, 2). For now, though, let's look at the general Gamma distribution. First of all, consider the constant 1/(Γ(α) β^α). The Γ(α) in the denominator is the Gamma function

  Γ(α) = ∫_0^∞ u^(α−1) e^(−u) du

which we introduced last week as a generalization of the factorial function, so that if n is a non-negative integer, Γ(n + 1) = n!.

That constant is just what we need to make sure the distribution function is normalized:

  ∫_0^∞ f(x) dx = [1 / (Γ(α) β^α)] ∫_0^∞ x^(α−1) e^(−x/β) dx
               = [1 / Γ(α)] ∫_0^∞ (x/β)^(α−1) e^(−x/β) (dx/β)
               = [1 / Γ(α)] ∫_0^∞ u^(α−1) e^(−u) du = Γ(α)/Γ(α) = 1

where we have made the change of variables u = x/β. The cdf for the Gamma distribution is

  F(x) = ∫_0^x f(y) dy = [1 / Γ(α)] ∫_0^x (y/β)^(α−1) e^(−y/β) (dy/β) = [1 / Γ(α)] ∫_0^(x/β) u^(α−1) e^(−u) du

which is usually written in terms of either the incomplete gamma function

  γ(y; α) = ∫_0^y u^(α−1) e^(−u) du

defined so that lim_{y→∞} γ(y; α) = Γ(α), or the standard Gamma cdf

  F(y; α) = [1 / Γ(α)] ∫_0^y u^(α−1) e^(−u) du

defined so that lim_{y→∞} F(y; α) = 1. One or both of these functions may be available in your favorite software package, or tabulated in a book. For instance, F(y; α) is tabulated in the back of Devore, Probability and Statistics for Engineering and the Sciences, the book you presumably used for intro Probability and Statistics. Hogg does not tabulate these, but it does include some values of the chi-squared cdf, which is a special case of the Gamma cdf.

Note that if we define a random variable Y = X/β, where X is a Γ(α, β) random variable, its pdf is

  f_Y(y) = dP/dy = (dP/dx)/(dy/dx) = f_X(βy) · β = [1 / Γ(α)] y^(α−1) e^(−y)

which is a Γ(α, 1) distribution. This is what we mean when we say β is a scale parameter. Now we turn to the mgf of the Gamma distribution, which is

  M(t) = E(e^(tX)) = [1 / (Γ(α) β^α)] ∫_0^∞ x^(α−1) e^(−x/β) e^(tx) dx = [1 / (Γ(α) β^α)] ∫_0^∞ x^(α−1) exp(−[1/β − t] x) dx

If we require t < 1/β and make the substitution u = (1/β − t) x, so that x = u/(1/β − t) = βu/(1 − βt), this becomes

  M(t) = [1 / (Γ(α) β^α)] [β/(1 − βt)]^α ∫_0^∞ u^(α−1) e^(−u) du = (1 − βt)^(−α)

If we again construct the cumulant generating function

  ψ(t) = ln M(t) = −α ln(1 − βt)

we find the derivative

  ψ′(t) = αβ / (1 − βt)

so the mean is

  E(X) = ψ′(0) = αβ

and the second derivative

  ψ″(t) = αβ² / (1 − βt)²

so the variance is

  Var(X) = ψ″(0) = αβ²

which should be familiar results from elementary probability and statistics. Even more useful than the mean and variance, the mgf can be used to show what happens when we add two independent Gamma random variables with the same scale parameter, say X_1 which is Γ(α_1, β) and X_2 which is Γ(α_2, β). The mgf of their sum is

  M(t) = M_1(t) M_2(t) = (1 − βt)^(−α_1) (1 − βt)^(−α_2) = (1 − βt)^(−[α_1+α_2])

which we see is the mgf of a Γ(α_1 + α_2, β) random variable. So if you add Gamma rvs with the same scale parameter, their sum is another Gamma rv whose shape parameter is the sum of the individual shape parameters.

3.2 Exponential Distribution

Suppose we have a Poisson process with rate λ, so that the probability mass function of the number of events occurring in any period of time of duration t is

  P(n events in t) = (λt)^n e^(−λt) / n!

We can show that, if you start counting at some moment, the time you have to wait until k more events have occurred is a Γ(k, 1/λ) random variable. In Hogg they show this by summing the Poisson distribution from 0 to k − 1, but once we know what happens when you sum Gamma random variables, all we actually have to do is show that the waiting time for the first event is a Γ(1, 1/λ) random variable. If you start waiting for the second event once the first has happened, that waiting time is another independent Γ(1, 1/λ) random variable, and so forth. The waiting time for the kth event is thus the sum of k independent Γ(1, 1/λ) random variables, which by the addition property we've just seen is a Γ(k, 1/λ) random variable.

So we turn to the question of the waiting time for the first event, and return to the Poisson distribution. In particular, evaluating the pmf at n = 0 we get

  P(no events in t) = (λt)^0 e^(−λt) / 0! = e^(−λt)

Now pick some arbitrary moment and let X be the random variable describing how long we have to wait for the next event. If X ≤ t, that means there are one or more events occurring in the interval of length t starting at that moment, so the probability of this is

  P(X ≤ t) = 1 − P(no events in t) = 1 − e^(−λt)

But that is the definition of the cumulative distribution function, so

  F(x) = P(X ≤ x) = 1 − e^(−λx)

Note that it doesn't make sense for X to take on negative values, which is good, since F(0) = 0. This means that technically,

  F(x) = 0             for x < 0
  F(x) = 1 − e^(−λx)   for 0 ≤ x < ∞

We can differentiate with respect to x to get the pdf

  f(x) = 0            for x < 0
  f(x) = λ e^(−λx)    for 0 ≤ x < ∞

This is known as the exponential distribution, and if we recall that Γ(1) = 0! = 1, we see that this is indeed the Gamma distribution where α = 1 and β = 1/λ.
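Here is a small simulation sketch (not from the notes) of the waiting-time argument above: the time until the kth event of a Poisson process with rate λ is the sum of k independent exponential waiting times, and its mean and variance should match those of a Γ(k, 1/λ) distribution. The values λ = 2 and k = 3 are arbitrary; note that scipy's gamma takes the shape a = α and scale = β.

  import numpy as np
  from scipy.stats import gamma

  rng = np.random.default_rng(1)
  lam, k = 2.0, 3
  n_sim = 100_000

  # waiting time for the k-th event = sum of k independent exponential(1/lam) gaps
  waits = rng.exponential(scale=1/lam, size=(n_sim, k)).sum(axis=1)

  # compare with the Gamma(alpha = k, beta = 1/lam) mean and variance
  print(waits.mean(), gamma.mean(a=k, scale=1/lam))   # both near k/lam = 1.5
  print(waits.var(), gamma.var(a=k, scale=1/lam))     # both near k/lam^2 = 0.75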

3.3 Chi-Square (χ²) Distribution

We turn now to another special case of the Gamma distribution. If we set α to r/2, where r is some positive integer, and β to 2, we get the pdf

  f(x) = [1 / (Γ(r/2) 2^(r/2))] x^(r/2 − 1) e^(−x/2),   0 < x < ∞

which is known as a chi-square distribution with r degrees of freedom, or χ²(r). Note that if we add independent chi-square random variables, e.g., X_1 which is χ²(r_1) = Γ(r_1/2, 2) and X_2 which is χ²(r_2) = Γ(r_2/2, 2), their sum X_1 + X_2 is Γ([r_1 + r_2]/2, 2) = χ²(r_1 + r_2). So in particular the sum of r independent χ²(1) random variables is a χ²(r) random variable. Next week, we'll see that a χ²(1) random variable is the square of a standard normal random variable, so a χ²(r) is the sum of the squares of r independent standard normal random variables.

Note that Table II in Appendix C of Hogg has some percentiles (values of x for which P(X < x) is some given value) for chi-square distributions with various numbers of degrees of freedom. This means that if you have a Gamma random variable which you can scale to be a chi-square random variable (which is basically possible if α is any integer or half-integer), you can get percentiles of that random variable.

3.4 Beta (β) Distribution

Please read about the β distribution in section 3.3 of Hogg.

Tuesday 22 October 2013
Read Section 3.4 of Hogg

4 The Normal Distribution

A random variable X follows a normal distribution, also known as a Gaussian distribution, with parameters µ and σ² > 0, known as N(µ, σ²), if its pdf is

  f(x) = [1 / (σ √(2π))] exp( −(x − µ)² / (2σ²) )

To show that this is normalized, we take the integral

  ∫_{−∞}^{∞} f(x) dx = [1 / (σ √(2π))] ∫_{−∞}^{∞} e^(−(x−µ)²/(2σ²)) dx = [1 / √(2π)] ∫_{−∞}^{∞} e^(−z²/2) dz

where we have made the substitution z = (x − µ)/σ. The integral

  I = ∫_{−∞}^{∞} e^(−z²/2) dz

is a bit tricky; there's no ordinary function whose derivative is e^(−z²/2), so we can't just do an indefinite integral and evaluate at the endpoints. But we can do the definite integral by writing

  I² = [∫_{−∞}^{∞} e^(−x²/2) dx] [∫_{−∞}^{∞} e^(−y²/2) dy] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^(−(x²+y²)/2) dx dy

If we interpret this as a double integral in Cartesian coördinates, we can change to polar coördinates r and φ, and write

  I² = ∫_0^{2π} ∫_0^∞ e^(−r²/2) r dr dφ = 2π ∫_0^∞ e^(−r²/2) r dr = 2π [−e^(−r²/2)]_0^∞ = 2π

so

  I = ∫_{−∞}^{∞} e^(−z²/2) dz = √(2π)

and the pdf does integrate to one. To get the mgf, we have to take the integral

  M(t) = [1 / (σ √(2π))] ∫_{−∞}^{∞} exp( −(x − µ)²/(2σ²) + tx ) dx

If we complete the square in the exponent, we get

  −(x − µ)²/(2σ²) + tx = −(x − [µ + tσ²])²/(2σ²) + µt + t²σ²/2

so

  M(t) = [1 / (σ √(2π))] e^(µt + t²σ²/2) ∫_{−∞}^{∞} exp( −(x − [µ + tσ²])²/(2σ²) ) dx
       = [1 / √(2π)] e^(µt + t²σ²/2) ∫_{−∞}^{∞} e^(−z²/2) dz = e^(µt + t²σ²/2)

This means that the cumulant generating function is

  ψ(t) = ln M(t) = µt + σ²t²/2

Taking derivatives gives

  ψ′(t) = µ + σ²t   and   ψ″(t) = σ²

so the mean is

  E(X) = ψ′(0) = µ

and the variance is

  Var(X) = ψ″(0) = σ²

which means that the parameters µ and σ are the mean and standard deviation of the distribution, as their names suggest. Note that this is in some sense the simplest possible distribution with a given mean and variance. For a general random variable, since ψ(0) = 0, ψ′(0) = E(X) and ψ″(0) = Var(X), the first few terms of the Maclaurin series for ψ(t) must be

  ψ(t) = t E(X) + (t²/2) Var(X) + O(t³)

Given a random variable X which follows a N(µ, σ²) distribution, we can define Z = (X − µ)/σ. Its pdf will be

  f_Z(z) = σ f_X(µ + zσ) = [1 / √(2π)] e^(−z²/2)

which is a N(0, 1) distribution, also known as a standard normal distribution. The cdf of a N(µ, σ²) random variable will be

  F(x) = [1 / (σ √(2π))] ∫_{−∞}^{x} e^(−(u−µ)²/(2σ²)) du = [1 / √(2π)] ∫_{−∞}^{(x−µ)/σ} e^(−t²/2) dt

Again, e^(−t²/2) is not the derivative of any known function, but it's useful enough that we define a function

  Φ(z) = [1 / √(2π)] ∫_{−∞}^{z} e^(−t²/2) dt

which is tabulated in lots of places. In terms of this, the cdf for a N(µ, σ²) rv is

  P(X ≤ x) = Φ((x − µ)/σ)
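For completeness, a two-line check (not from the notes) that scipy's normal cdf agrees with Φ((x − µ)/σ); the numbers µ = 10, σ = 2, x = 13 are arbitrary.

  from scipy.stats import norm

  mu, sigma, x = 10.0, 2.0, 13.0
  print(norm.cdf((x - mu) / sigma))        # Phi((x - mu)/sigma), standard normal cdf
  print(norm.cdf(x, loc=mu, scale=sigma))  # equivalently, with loc/scale arguments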

If we add two independent Gaussian random variables, X_1 and X_2, following N(µ_1, σ_1²) and N(µ_2, σ_2²) distributions, respectively, their sum has the mgf

  M(t) = M_1(t) M_2(t) = exp(tµ_1 + t²σ_1²/2) exp(tµ_2 + t²σ_2²/2) = exp( t[µ_1 + µ_2] + t²[σ_1² + σ_2²]/2 )

which is the mgf of a N(µ_1 + µ_2, σ_1² + σ_2²) distribution. In general, if we add independent random variables, their sum has a mean which is the sum of the means and a variance which is the sum of the variances, but what's notable here is that the sum obeys a normal distribution, so it's characterized only by its mean and variance.

Finally, suppose we have r independent Gaussian random variables {X_i} with means µ_i and variances σ_i². Consider the combination

  Y = Σ_{i=1}^{r} [(X_i − µ_i)/σ_i]²

We can show that this obeys a χ²(r) distribution:

  f_Y(y) = [1 / (Γ(r/2) 2^(r/2))] y^(r/2 − 1) e^(−y/2)

As we saw last time, the sum of r independent χ²(1) random variables is a χ²(r) random variable, so all we need to do is show that if X is N(µ, σ²), so that Z = (X − µ)/σ is N(0, 1), then Y = [(X − µ)/σ]² = Z² is χ²(1). Now, since

  f_Z(z) = [1 / √(2π)] e^(−z²/2),   −∞ < z < ∞

we can't quite use the usual formalism for transformation of pdfs, since the transformation Y = Z² is not invertible. But since f_Z(−z) = f_Z(z), it's not hard to see that if we define a rv W = |Z|, it must have a pdf

  f_W(w) = f_Z(w) + f_Z(−w) = 2 f_Z(w) = [2 / √(2π)] e^(−w²/2),   0 < w < ∞

and then we can use the transformation y = w², w = y^(1/2), to work out

  f_Y(y) = dP/dy = (dP/dw)(dw/dy) = f_W(y^(1/2)) (1/2) y^(−1/2) = [1 / √(2π)] y^(−1/2) e^(−y/2),   0 < y < ∞

If we recall the χ²(1) pdf from last time, it was

  f(y) = [1 / (Γ(1/2) 2^(1/2))] y^(1/2 − 1) e^(−y/2),   0 < y < ∞

so the pdf of Y = Z² is the χ²(1) pdf, if the value of Γ(1/2) is √π. Now, if we think about it, that has to be the case, in order for the two pdfs to be normalized, but we can work out the value directly. Recall the Gamma function

  Γ(α) = ∫_0^∞ t^(α−1) e^(−t) dt

which is a finite positive number for any α > 0. For positive integer n, we know that Γ(n + 1) = n!. Thus

  Γ(1/2) = ∫_0^∞ t^(−1/2) e^(−t) dt

We can show that the integral is well-behaved at the lower limit of t = 0, and evaluate it, by changing variables to u = √(2t), so that t = u²/2 and du = dt/√(2t); thus

  Γ(1/2) = √2 ∫_0^∞ e^(−u²/2) du = √2 · (1/2)√(2π) = √π

as expected. We've used the symmetry of the integrand to say that

  ∫_0^∞ e^(−u²/2) du = (1/2) ∫_{−∞}^{∞} e^(−u²/2) du = √(2π)/2
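A quick numerical sanity check (not from the notes) of the two facts just derived: Γ(1/2) = √π, and the square of a standard normal sample looks like a χ²(1) sample. The sample size and random seed are arbitrary.

  import numpy as np
  from math import gamma, pi, sqrt
  from scipy.stats import chi2, kstest

  # Gamma(1/2) should equal sqrt(pi)
  print(gamma(0.5), sqrt(pi))

  # squares of standard normal draws should be consistent with chi-square(1)
  rng = np.random.default_rng(2)
  z = rng.standard_normal(100_000)
  print(kstest(z**2, chi2(df=1).cdf))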

Thursday 24 October 2013
Read Section 3.5 of Hogg

5 Multivariate Normal Distribution

5.1 Linear Algebra: Reminders and Notation

If A is an m × n matrix with elements A_ij (with i = 1, ..., m labelling the row and j = 1, ..., n the column), and B is an n × p matrix with elements B_jk, then their product C = AB is an m × p matrix, as shown in Figure 1, so that

  C_ik = Σ_{j=1}^{n} A_ij B_jk

If A is an m × n matrix, B = Aᵀ (the transpose) is an n × m matrix with elements B_ij = A_ji.

If v is an n-element column vector (which is an n × 1 matrix) and A is an m × n matrix, w = Av is an m-element column vector (i.e., an m × 1 matrix), with elements

  w_i = Σ_{j=1}^{n} A_ij v_j

If u is an n-element column vector, then uᵀ is an n-element row vector (a 1 × n matrix), uᵀ = (u_1  u_2  ···  u_n). If u and v are n-element column vectors, uᵀv is a number, known as the inner product:

  uᵀv = u_1 v_1 + u_2 v_2 + ··· + u_n v_n = Σ_{i=1}^{n} u_i v_i

Figure 1: Expansion of the product C = AB, showing C_ik = Σ_{j=1}^{n} A_ij B_jk = A_i1 B_1k + A_i2 B_2k + ··· + A_in B_nk. (The explicit m × p array of sums is not reproduced here.)

If v is an m-element column vector, and w is an n-element column vector, A = vwᵀ is an m × n matrix (the outer product) with elements

  A_ij = v_i w_j

If M and N are n × n matrices, the determinant obeys det(MN) = det(M) det(N). If M is an n × n matrix (known as a square matrix), the inverse matrix M⁻¹ is defined by

  M⁻¹ M = 1_{n×n} = M M⁻¹

where 1_{n×n} is the n × n identity matrix (ones on the diagonal, zeroes everywhere else). If M⁻¹ exists, we say M is invertible.

If M is a real, symmetric n × n matrix, so that Mᵀ = M, i.e., M_ji = M_ij, there is a set of n orthonormal eigenvectors {v_1, v_2, ..., v_n} with real eigenvalues {λ_1, λ_2, ..., λ_n}, so that M v_i = λ_i v_i. Orthonormal means

  v_iᵀ v_j = δ_ij = 1 if i = j,  0 if i ≠ j

where we have introduced the Kronecker delta symbol δ_ij. The eigenvalue decomposition means

  M = Σ_{i=1}^{n} λ_i v_i v_iᵀ

18 The determinant is detm n i λ i If none of the eigenvalues {λ i } are zero, M is invertible, and the inverse matrix is M n i λ i v i v T i 5 If all of the eigenvalues {λ i } are positive, we say M is positive definite If none of the eigenvalues {λ i } are negative, we say M is positive semi-definite 5 Special Case: Independent Gaussian Random Variables Before considering the general multivariate normal distribution, consider the case of n independent normally-distributed random variables {X i } with means {µ i } and variances {σ i } The pdf for X i, the ith random variable, is and its mgf is f i x i exp x i µ i σ i π σ i M i t i exp t i µ i + t i σi If we consider the random variables X i to be the elements of a random vector X X X X n 55 its expectation value is µ µ µ n EX µ and its variance-covariance matrix is σ σ CovX Σ σn which is diagonal because the different X i s are independent of each other and therefore have zero covariance We can thus write the joint mgf for these random variables as n n [ Mt M i t i exp t i µ i + ] t i σi i i exp t T µ + 58 tt Σt We can also write the joint pdf fx n f i x i i n i πσ i exp n x i µ i 59 in matrix form if we consider a few operations on the matrix Σ First, since it s a diagonal matrix, its determinant is just the product of its diagonal entries: det Σ i σ i n σi 5 i 8

and, for that matter,

  det(2πΣ) = Π_{i=1}^{n} 2πσ_i²

Also, we can invert the matrix to get

  Σ⁻¹ = diag(1/σ_1², 1/σ_2², ..., 1/σ_n²)

so

  Σ_{i=1}^{n} (x_i − µ_i)²/σ_i² = (x − µ)ᵀ Σ⁻¹ (x − µ)

which makes the pdf for the random vector X

  f(x) = [1 / √(det(2πΣ))] exp( −(x − µ)ᵀ Σ⁻¹ (x − µ)/2 )

The generalization from n independent normal random variables to an n-dimensional multivariate normal distribution is to use the same matrix form for M(t) and just to replace Σ, which was a diagonal matrix with positive diagonal entries, with a general symmetric positive semi-definite matrix. So one change is to allow Σ to have off-diagonal entries, and another is to allow it to have zero eigenvalues. If Σ is positive definite, i.e., its eigenvalues are all positive, we can use the matrix expression for the pdf f(x) as well. As we'll see, if Σ has some zero eigenvalues, we won't be able to define a pdf for the random vector X.

5.3 Multivariate Distributions

Recall that a set of n random variables X_1, X_2, ..., X_n can be combined into a random vector X with expectation value

  E(X) = µ = (µ_1, µ_2, ..., µ_n)ᵀ

and variance-covariance matrix

  Σ = Cov(X) = E( [X − µ][X − µ]ᵀ )

whose diagonal elements are the variances Var(X_i) and whose off-diagonal elements are the covariances Cov(X_i, X_j). The variance-covariance matrix must be positive semi-definite, i.e., have no negative eigenvalues. To see why that is the case, let {λ_i} be the eigenvalues and {v_i} be the orthonormal eigenvectors, so that Σ = Σ_{i=1}^{n} λ_i v_i v_iᵀ. For each i define a random variable X̃_i = v_iᵀ X. It has mean E(X̃_i) = v_iᵀ µ and variance

  Var(X̃_i) = E( [v_iᵀ(X − µ)]² ) = E( v_iᵀ(X − µ)(X − µ)ᵀ v_i ) = v_iᵀ E( (X − µ)(X − µ)ᵀ ) v_i = v_iᵀ Σ v_i = λ_i v_iᵀ v_i = λ_i

Since the variance of a random variable must be non-negative, Σ cannot have any negative eigenvalues. Incidentally, we can see that the different random variables {X̃_i} are uncorrelated, since

  Cov(X̃_i, X̃_j) = E( v_iᵀ(X − µ)(X − µ)ᵀ v_j ) = v_iᵀ E( (X − µ)(X − µ)ᵀ ) v_j = v_iᵀ Σ v_j = λ_j v_iᵀ v_j = λ_i δ_ij

Remember that Cov(X̃_i, X̃_j) = 0 does not necessarily imply that X̃_i and X̃_j are independent. It will, however, turn out to be the case for normally distributed random variables. Note that if we assemble the {X̃_i} into a column vector,

  X̃ = (X̃_1, X̃_2, ..., X̃_n)ᵀ = Γ X

where the matrix Γ is made up out of the components of the orthonormal eigenvectors {v_i}: its rows are v_1ᵀ, v_2ᵀ, ..., v_nᵀ. This matrix is not symmetric, but it is orthogonal, meaning that ΓᵀΓ = 1. We can see this from

  (ΓΓᵀ)_ij = v_iᵀ v_j = δ_ij

so that ΓΓᵀ = 1 (and likewise ΓᵀΓ = 1). This matrix Γ can be thought of as a transformation from the original basis to the eigenbasis for Σ. One effect of this is that it diagonalizes Σ:

  ΓΣΓᵀ = diag(λ_1, λ_2, ..., λ_n) = Λ

Finally, recall that the moment generating function is defined as

  M(t) = E(e^(tᵀX))

and that if we define ψ(t) = ln M(t), then

  ∂ψ/∂t_i |_{t=0} = µ_i

and

  ∂²ψ/(∂t_i ∂t_j) |_{t=0} = Cov(X_i, X_j)

This means that if we do a Maclaurin expansion of ψ(t) we get, in general,

  ψ(t) = tᵀµ + tᵀΣt/2 + ···

where the terms indicated by ··· have three or more powers of t.

5.4 General Multivariate Normal Distribution

We define a multivariate normal random vector X as a random vector having the moment generating function

  M(t) = exp( tᵀµ + tᵀΣt/2 )

We refer to the distribution as N_n(µ, Σ). Note that this is equivalent to starting with the Maclaurin series for ψ(t) = ln M(t) and cutting it off after the quadratic term. We start with the mgf rather than the pdf because it applies whether the variance-covariance matrix Σ is positive definite or only positive semi-definite, i.e., whether or not it has one or more zero eigenvalues.

To see what happens if one or more of the eigenvalues is zero, we use the orthonormal eigenvectors {v_i} of Σ to combine the random variables in X into n uncorrelated random variables {X̃_i}, where X̃_i = v_iᵀX, which have means E(X̃_i) = v_iᵀµ and variances Var(X̃_i) = λ_i. If we combine the {X̃_i} into a random vector

  X̃ = Γ X

where Γ is the orthogonal matrix made up out of eigenvector components (its rows are the v_iᵀ), X̃ has mean E(X̃) = Γµ and variance-covariance matrix

  Cov(X̃) = Λ = ΓΣΓᵀ = diag(λ_1, λ_2, ..., λ_n)

The random vector X̃ also follows a multivariate normal distribution, in this case N_n(Γµ, Λ). To show this, we'll show the more general result, which appears as a theorem in Hogg: if X is a N_n(µ, Σ) multivariate normal random vector, A is an m × n constant matrix and b is an m-element column vector, then the random vector Y = AX + b also obeys a multivariate normal distribution. Note that this works whether m is equal to, less than, or greater than n! We prove this using the mgf. The mgf for Y is

  M_Y(t) = E[exp(tᵀY)] = E[exp(tᵀAX + tᵀb)] = e^(tᵀb) E[exp([Aᵀt]ᵀX)]

Now here is the key step, which Hogg doesn't elaborate on: t is an m-element column vector, A is an m × n matrix, so its transpose Aᵀ is an n × m matrix, and the combination Aᵀt is an n-element column vector, whose transpose is the n-element row vector

  [Aᵀt]ᵀ = tᵀA

Therefore, the expectation value in the last line above is just the mgf for the original multivariate normal random vector X

evaluated at the argument Aᵀt:

  E[exp([Aᵀt]ᵀX)] = M_X(Aᵀt) = exp( [Aᵀt]ᵀµ + [Aᵀt]ᵀΣ[Aᵀt]/2 ) = exp( tᵀAµ + tᵀAΣAᵀt/2 )

This makes the mgf for Y equal to e^(tᵀb) times this, or

  M_Y(t) = exp( tᵀ[Aµ + b] + tᵀAΣAᵀt/2 )

which is the mgf for a normal random vector with mean Aµ + b and variance-covariance matrix AΣAᵀ, i.e., one that obeys a N_m(Aµ + b, AΣAᵀ) distribution.

So, we return to the random vector X̃ = ΓX, which we now see is a multivariate normal random vector with mean Γµ and diagonal variance-covariance matrix Λ. Its mgf is

  M_X̃(t) = exp( tᵀΓµ + tᵀΛt/2 ) = exp( Σ_{i=1}^{n} [t_i v_iᵀµ + λ_i t_i²/2] ) = Π_{i=1}^{n} exp( t_i v_iᵀµ + λ_i t_i²/2 ) = Π_{i=1}^{n} M_X̃i(t_i)

which is the mgf of n independent random variables. There are two possibilities: either Σ (and thus Λ) is positive definite, which means all of the {λ_i} are positive, or one or more of the {λ_i} are zero. In the first case, we have the special case we considered before, n independent normally-distributed random variables, so the joint pdf is

  f_X̃(ξ) = Π_{i=1}^{n} f_X̃i(ξ_i) = [1 / √(det(2πΛ))] exp( −(ξ − Γµ)ᵀ Λ⁻¹ (ξ − Γµ)/2 )

We can then do a multivariate transformation to get the pdf for X = Γ⁻¹X̃ = ΓᵀX̃. The Jacobian of the transformation is Γᵀ, whose determinant is either 1 or −1, because

  1 = det(1) = det(ΓΓᵀ) = (det Γ)²

This is true for any orthogonal matrix. This means |det Γ| = 1, and the pdf for X is simply

  f_X(x) = f_X̃(Γx)

If we also note that det Λ = Π_{i=1}^{n} λ_i = det Σ, we see that

  f_X(x) = [1 / √(det(2πΛ))] exp( −(Γx − Γµ)ᵀ Λ⁻¹ (Γx − Γµ)/2 ) = [1 / √(det(2πΣ))] exp( −(x − µ)ᵀ ΓᵀΛ⁻¹Γ (x − µ)/2 )

But the combination ΓᵀΛ⁻¹Γ is just the inverse of Σ, because

  Λ = ΓΣΓᵀ   implies   Σ⁻¹ = (ΓᵀΛΓ)⁻¹ = ΓᵀΛ⁻¹Γ

so we find that, for arbitrary positive definite symmetric Σ,

  f_X(x) = [1 / √(det(2πΣ))] exp( −(x − µ)ᵀ Σ⁻¹ (x − µ)/2 )

which is exactly the generalization we expected from the n-independent-random-variable case.
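A short check (not from the notes) that the matrix formula just derived matches scipy's multivariate normal pdf; the two-dimensional µ, Σ, and evaluation point are made up for illustration.

  import numpy as np
  from scipy.stats import multivariate_normal

  mu = np.array([1.0, -2.0])
  Sigma = np.array([[2.0, 0.6],
                    [0.6, 1.0]])
  x = np.array([0.5, -1.0])

  # pdf from the formula exp(-(x-mu)^T Sigma^{-1} (x-mu)/2) / sqrt(det(2 pi Sigma))
  d = x - mu
  quad = d @ np.linalg.solve(Sigma, d)
  print(np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(2 * np.pi * Sigma)))

  # scipy's pdf; the two numbers should agree
  print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))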

On the other hand, if Σ has one or more zero eigenvalues, so that det Σ = 0 and Σ⁻¹ is not defined, that pdf won't make sense. In that case, consider the random variable X̃_i corresponding to the zero eigenvalue λ_i = 0. Its mgf is

  E(e^(t_i X̃_i)) = M_X̃i(t_i) = exp( t_i v_iᵀµ + λ_i t_i²/2 ) = exp( t_i v_iᵀµ )

but the only way that is possible is if X̃_i is always equal to v_iᵀµ, i.e., X̃_i is actually a discrete random variable with pmf

  P(X̃_i = ξ_i) = 1 if ξ_i = v_iᵀµ,  0 otherwise

This is the limit of a normal distribution as its variance goes to zero.

Returning to the case where Σ is positive definite, so that each X̃_i is an independent N(v_iᵀµ, λ_i) random variable, we can construct the corresponding standard normal random variable

  Z_i = λ_i^(−1/2) (X̃_i − v_iᵀµ) = λ_i^(−1/2) v_iᵀ (X − µ)

which we could combine into a N_n(0, 1) random vector Z̃ = (Z_1, Z_2, ..., Z_n)ᵀ. However, it's actually more convenient to combine them into a different N_n(0, 1) random vector

  Z = Σ_{i=1}^{n} v_i Z_i = ΓᵀZ̃ = Σ_{i=1}^{n} v_i λ_i^(−1/2) v_iᵀ (X − µ) = Σ^(−1/2) (X − µ)

with pdf

  f_Z(z) = (2π)^(−n/2) e^(−zᵀz/2)

Hogg uses this as the starting point for deriving the pdf for the multivariate normal random vector X, with the factor of

  det(Σ^(−1/2)) = 1 / √(det Σ)

coming from the Jacobian determinant associated with the transformation.

Tuesday 29 October 2013
Read Section 3.6 of Hogg

5.5 Consequences of the Multivariate Normal Distribution

Recall that if X is a multivariate normal random vector, obeying a N_n(µ, Σ) distribution, this is equivalent to saying X has the mgf

  M_X(t) = exp( tᵀµ + tᵀΣt/2 )

Its mean is E(X) = µ and its variance-covariance matrix is Cov(X) = Σ. If the variance-covariance matrix is invertible, then X has the probability density function

  f_X(x) = [1 / √(det(2πΣ))] exp( −(x − µ)ᵀ Σ⁻¹ (x − µ)/2 )

5.5.1 Connection to χ² Distribution

We also showed that we could make a random vector

  Z = Σ^(−1/2) (X − µ)

which obeyed a N_n(0, 1) distribution, i.e., where the {Z_i} were independent standard normal random variables. Previously, we showed that if we took the sum of the squares of n independent standard normal random variables, the resulting random variable obeyed a chi-square distribution with n degrees of freedom. So in this case we can construct

  Y = Σ_{i=1}^{n} Z_i² = ZᵀZ = [Σ^(−1/2)(X − µ)]ᵀ [Σ^(−1/2)(X − µ)] = (X − µ)ᵀ Σ⁻¹ (X − µ)

and it will be a χ²(n) random variable. This is Theorem 3.5.4 of Hogg. Note that if Σ is diagonal, so that {X_i} are n independent random variables, with X_i being N(µ_i, σ_i²), the chi-square random variable reduces to

  Y = Σ_{i=1}^{n} [(X_i − µ_i)/σ_i]²   if Σ is diagonal
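Here is a small simulation sketch (not from the notes) of Theorem 3.5.4: for samples from a multivariate normal, the quadratic form (X − µ)ᵀΣ⁻¹(X − µ) should follow a χ² distribution with n degrees of freedom. The three-dimensional µ and Σ are made up for illustration.

  import numpy as np
  from scipy.stats import chi2, kstest, multivariate_normal

  rng = np.random.default_rng(3)
  mu = np.array([1.0, -2.0, 0.5])
  A = rng.standard_normal((3, 3))
  Sigma = A @ A.T + np.eye(3)          # a positive definite covariance matrix

  X = multivariate_normal(mean=mu, cov=Sigma).rvs(size=50_000, random_state=rng)
  d = X - mu
  Y = np.einsum('ij,jk,ik->i', d, np.linalg.inv(Sigma), d)

  print(kstest(Y, chi2(df=3).cdf))     # should be consistent with chi-square(3)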

5.6 Marginal Distribution

One last question to consider is, what if we split the n random variables in the random vector X into two groups: the first m, which we collect into a random vector X_1, and the last p = n − m, which we collect into a random vector X_2, so that X = (X_1, X_2)ᵀ, where X_1 = (X_1, ..., X_m)ᵀ and X_2 = (X_{m+1}, ..., X_n)ᵀ. We'd like to know what the marginal distributions for X_1 and X_2 are, and also the conditional distributions for X_1 given X_2 and for X_2 given X_1. As a bit of bookkeeping, we partition the mean vector µ = (µ_1, µ_2)ᵀ and the variance-covariance matrix

  Σ = [ Σ_11  Σ_12 ]
      [ Σ_21  Σ_22 ]

in the same way. µ_1 is an m-element column vector, µ_2 is a p-element column vector, Σ_11 is an m × m symmetric matrix, Σ_22 is a p × p symmetric matrix, Σ_12 is an m × p matrix (i.e., m rows and p columns), and Σ_21 is a p × m matrix. Since Σ is symmetric, we must have

  Σᵀ = [ Σ_11ᵀ  Σ_21ᵀ ]  =  Σ
       [ Σ_12ᵀ  Σ_22ᵀ ]

so Σ_21 = Σ_12ᵀ. Now, it's pretty easy to get the marginal distributions using the mgf. Partitioning t in the usual way, we see

  tᵀµ = t_1ᵀµ_1 + t_2ᵀµ_2

and

  tᵀΣt = t_1ᵀΣ_11 t_1 + t_1ᵀΣ_12 t_2 + t_2ᵀΣ_21 t_1 + t_2ᵀΣ_22 t_2 = t_1ᵀΣ_11 t_1 + 2 t_1ᵀΣ_12 t_2 + t_2ᵀΣ_22 t_2

where in the last step we've used the fact that since t_2ᵀΣ_21 t_1 is a 1 × 1 matrix, i.e., a number, it is its own transpose:

  t_2ᵀΣ_21 t_1 = (t_2ᵀΣ_21 t_1)ᵀ = t_1ᵀΣ_21ᵀ t_2 = t_1ᵀΣ_12 t_2

Thus the joint mgf is

  M(t) = M(t_1, t_2) = exp( t_1ᵀµ_1 + t_2ᵀµ_2 + [t_1ᵀΣ_11 t_1 + 2 t_1ᵀΣ_12 t_2 + t_2ᵀΣ_22 t_2]/2 )

The mgf for X_1 is thus

  M_1(t_1) = M(t_1, 0) = exp( t_1ᵀµ_1 + t_1ᵀΣ_11 t_1/2 )

which means X_1 is a N_m(µ_1, Σ_11) multivariate normal random vector, and likewise the mgf for X_2 is

  M_2(t_2) = M(0, t_2) = exp( t_2ᵀµ_2 + t_2ᵀΣ_22 t_2/2 )

so X_2 is N_p(µ_2, Σ_22). Hogg also shows this as a special case of the more general result that AX + b is a normal random vector, where A is an m × n constant matrix and b is an m-element constant column vector. We see that

  M_1(t_1) M_2(t_2) = exp( t_1ᵀµ_1 + t_2ᵀµ_2 + t_1ᵀΣ_11 t_1/2 + t_2ᵀΣ_22 t_2/2 )

which is only equal to M(t_1, t_2) if the cross-part Σ_12 of the variance-covariance matrix is zero. So X_1 and X_2 are independent if and only if Σ_12 = 0_{m×p}.

5.7 Conditional Distribution

Finally, we'd like to consider the conditional distribution of X_1 given that X_2 = x_2. Hogg proves that the distribution is a multivariate normal, specifically

  N_m( µ_1 + Σ_12 Σ_22⁻¹ [x_2 − µ_2],  Σ_11 − Σ_12 Σ_22⁻¹ Σ_21 )

by using the mgf, but their proof assumes you already know the form of the distribution. The proof assumes that Σ_22 is positive definite, and in particular that Σ_22⁻¹ exists. We might like to try to work this out rather than starting with the answer, and so we could divide the pdf

  f(x) = f(x_1, x_2) = [1 / √(det(2πΣ))] exp( −(x − µ)ᵀ Σ⁻¹ (x − µ)/2 )

by the marginal pdf

  f_2(x_2) = [1 / √(det(2πΣ_22))] exp( −(x_2 − µ_2)ᵀ Σ_22⁻¹ (x_2 − µ_2)/2 )

to get the conditional pdf f_{1|2}(x_1|x_2) = f(x_1, x_2)/f_2(x_2). Working out the details of this is a little nasty, though, since it requires a block decomposition of Σ⁻¹, which exists, but is kind of complicated. For example,

  [Σ⁻¹]_11 = [Σ_11 − Σ_12 Σ_22⁻¹ Σ_21]⁻¹

It's not hard to see that what comes out will be a Gaussian in x_1 minus something, though.

Still, for simplicity we can limit the explicit demonstration to the case where m = 1 and p = 1, so n = 2. Then all of the blocks in the decomposition of the 2 × 2 matrices are just numbers, specifically

  µ = (µ_1, µ_2)ᵀ

and

  Σ = [ Var(X_1)        Cov(X_1, X_2) ]  =  [ σ_1²      ρσ_1σ_2 ]
      [ Cov(X_1, X_2)   Var(X_2)      ]     [ ρσ_1σ_2   σ_2²    ]

Familiar results from matrix algebra tell us that

  det Σ = σ_1² σ_2² (1 − ρ²)

and

  Σ⁻¹ = [1 / (σ_1² σ_2² (1 − ρ²))] [ σ_2²       −ρσ_1σ_2 ]
                                   [ −ρσ_1σ_2   σ_1²     ]

We know that the correlation coefficient ρ obeys −1 ≤ ρ ≤ 1; in order for Σ to be positive definite, we require |ρ| < 1. You will show on the homework that the joint pdf for this bivariate normal distribution is

  f(x_1, x_2) = [1 / (2π σ_1 σ_2 √(1 − ρ²))]
                × exp( −[ (x_1 − µ_1)²/σ_1² − 2ρ(x_1 − µ_1)(x_2 − µ_2)/(σ_1σ_2) + (x_2 − µ_2)²/σ_2² ] / [2(1 − ρ²)] )

and since the marginal distribution for X_2 is

  f_2(x_2) = [1 / (σ_2 √(2π))] exp( −(x_2 − µ_2)²/(2σ_2²) )

the conditional pdf will be

  f_{1|2}(x_1|x_2) = f(x_1, x_2) / f_2(x_2)
                   = [1 / (σ_1 √(2π(1 − ρ²)))] exp( −[x_1 − µ_1 − ρ(σ_1/σ_2)(x_2 − µ_2)]² / [2σ_1²(1 − ρ²)] )

which is a normal distribution with mean µ_1 + ρ(σ_1/σ_2)(x_2 − µ_2) and variance σ_1²(1 − ρ²).
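A simulation sketch (not from the notes) of the bivariate conditional result: drawing many (X_1, X_2) pairs and keeping only those with X_2 close to a fixed x_2, the retained X_1 values should have mean µ_1 + ρ(σ_1/σ_2)(x_2 − µ_2) and variance σ_1²(1 − ρ²). All numerical values are arbitrary.

  import numpy as np

  rng = np.random.default_rng(4)
  mu1, mu2 = 1.0, -1.0
  s1, s2, rho = 2.0, 1.5, 0.7

  Sigma = np.array([[s1**2, rho*s1*s2],
                    [rho*s1*s2, s2**2]])
  X = rng.multivariate_normal([mu1, mu2], Sigma, size=500_000)

  x2 = 0.5
  sel = np.abs(X[:, 1] - x2) < 0.05          # approximate conditioning on X2 = x2
  cond = X[sel, 0]
  print(cond.mean(), mu1 + rho * (s1 / s2) * (x2 - mu2))   # conditional mean
  print(cond.var(), s1**2 * (1 - rho**2))                  # conditional variance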

Thursday 31 October 2013
Read Section 4.1 of Hogg

6 The t- and F Distributions

Our final distributions are two distributions related to the normal and chi-square distributions, which are very useful in statistical inference.

6.1 The t distribution

If Z is a standard normal random variable [N(0, 1)] and V is a chi-square random variable with r degrees of freedom [χ²(r)], and Z and V are independent, the combination

  T = Z / √(V/r)

is a random variable following a t-distribution with r degrees of freedom; its pdf is

  f_T(t) ∝ (1 + t²/r)^(−(r+1)/2),   −∞ < t < ∞

where we have not written the explicit normalization constant, in the interest of notational simplicity. This pdf can be derived by the change-of-variables formalism. If r is very large, the t-distribution is approximately the same as the standard normal distribution. As you'll show on the homework, if r = 1, it's the Cauchy distribution. Note that the moment generating function for the t-distribution doesn't exist, since for n ≥ r, the integral

  ∫_{−∞}^{∞} |t|^n (1 + t²/r)^(−(r+1)/2) dt

diverges, as its integrand becomes, up to a constant, t^(n−r−1) for large t.

6.2 The F distribution

If U is a χ²(r_1) random variable and V is a χ²(r_2) random variable, and U and V are independent, the combination

  F = (U/r_1) / (V/r_2)

obeys an F distribution. Again its pdf can be worked out by the change-of-variables method, and is, up to the normalization constant,

  f_F(x) ∝ x^(r_1/2 − 1) (1 + [r_1/r_2] x)^(−(r_1+r_2)/2),   0 < x < ∞

6.3 Student's Theorem

We can illustrate the usefulness of the t-distribution, also known as Student's t-distribution, by considering a result by William S. Gosset, writing under the pseudonym of "Student" while working in the Guinness brewery in Dublin. Suppose we have n iid normal random variables, each with mean µ and variance σ², i.e., X is a random vector with mean

  E(X) = (µ, µ, ..., µ)ᵀ = µ e

where we have defined the n-element column vector e of all ones, and variance-covariance matrix

  Cov(X) = σ² 1_{n×n}

We know from elementary statistics that the sample mean

  X̄ = (1/n) Σ_{i=1}^{n} X_i

and sample variance

  S² = [1/(n−1)] Σ_{i=1}^{n} (X_i − X̄)²

can be used to estimate the mean µ and variance σ², respectively, i.e., E(X̄) = µ and E(S²) = σ², and also that Var(X̄) = σ²/n. Student's theorem says that the following things are also true:

1. X̄ has a N(µ, σ²/n) distribution.
2. X̄ and S² are independent random variables.
3. (n − 1)S²/σ² has a χ²(n − 1) distribution.
4. The random variable T = (X̄ − µ)/(S/√n) has a t-distribution with n − 1 degrees of freedom.


More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Sampling Distributions

Sampling Distributions In statistics, a random sample is a collection of independent and identically distributed (iid) random variables, and a sampling distribution is the distribution of a function of random sample. For example,

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

1 Solution to Problem 2.1

1 Solution to Problem 2.1 Solution to Problem 2. I incorrectly worked this exercise instead of 2.2, so I decided to include the solution anyway. a) We have X Y /3, which is a - function. It maps the interval, ) where X lives) onto

More information

Stat410 Probability and Statistics II (F16)

Stat410 Probability and Statistics II (F16) Stat4 Probability and Statistics II (F6 Exponential, Poisson and Gamma Suppose on average every /λ hours, a Stochastic train arrives at the Random station. Further we assume the waiting time between two

More information

3 Continuous Random Variables

3 Continuous Random Variables Jinguo Lian Math437 Notes January 15, 016 3 Continuous Random Variables Remember that discrete random variables can take only a countable number of possible values. On the other hand, a continuous random

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

Things to remember when learning probability distributions:

Things to remember when learning probability distributions: SPECIAL DISTRIBUTIONS Some distributions are special because they are useful They include: Poisson, exponential, Normal (Gaussian), Gamma, geometric, negative binomial, Binomial and hypergeometric distributions

More information

Lecture 11. Multivariate Normal theory

Lecture 11. Multivariate Normal theory 10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances

More information

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ). .8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics

More information

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Brief Review of Probability

Brief Review of Probability Maura Department of Economics and Finance Università Tor Vergata Outline 1 Distribution Functions Quantiles and Modes of a Distribution 2 Example 3 Example 4 Distributions Outline Distribution Functions

More information

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows. Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage

More information

Formulas for probability theory and linear models SF2941

Formulas for probability theory and linear models SF2941 Formulas for probability theory and linear models SF2941 These pages + Appendix 2 of Gut) are permitted as assistance at the exam. 11 maj 2008 Selected formulae of probability Bivariate probability Transforms

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics. Probability Distributions Prof. Dr. Klaus Reygers (lectures) Dr. Sebastian Neubert (tutorials) Heidelberg University WS 07/8 Gaussian g(x; µ, )= p exp (x µ) https://en.wikipedia.org/wiki/normal_distribution

More information

Discrete Distributions

Discrete Distributions Discrete Distributions STA 281 Fall 2011 1 Introduction Previously we defined a random variable to be an experiment with numerical outcomes. Often different random variables are related in that they have

More information

1: PROBABILITY REVIEW

1: PROBABILITY REVIEW 1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following

More information

Class 26: review for final exam 18.05, Spring 2014

Class 26: review for final exam 18.05, Spring 2014 Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event

More information

Probability Background

Probability Background CS76 Spring 0 Advanced Machine Learning robability Background Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu robability Meure A sample space Ω is the set of all possible outcomes. Elements ω Ω are called sample

More information

The Multivariate Gaussian Distribution

The Multivariate Gaussian Distribution The Multivariate Gaussian Distribution Chuong B. Do October, 8 A vector-valued random variable X = T X X n is said to have a multivariate normal or Gaussian) distribution with mean µ R n and covariance

More information

Continuous random variables

Continuous random variables Continuous random variables Continuous r.v. s take an uncountably infinite number of possible values. Examples: Heights of people Weights of apples Diameters of bolts Life lengths of light-bulbs We cannot

More information

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER / Probability

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER / Probability ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER 2 2017/2018 DR. ANTHONY BROWN 5.1. Introduction to Probability. 5. Probability You are probably familiar with the elementary

More information

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable Distributions of Functions of Random Variables 5.1 Functions of One Random Variable 5.2 Transformations of Two Random Variables 5.3 Several Random Variables 5.4 The Moment-Generating Function Technique

More information

Lecture 5: Moment generating functions

Lecture 5: Moment generating functions Lecture 5: Moment generating functions Definition 2.3.6. The moment generating function (mgf) of a random variable X is { x e tx f M X (t) = E(e tx X (x) if X has a pmf ) = etx f X (x)dx if X has a pdf

More information

Continuous Random Variables

Continuous Random Variables Continuous Random Variables Recall: For discrete random variables, only a finite or countably infinite number of possible values with positive probability. Often, there is interest in random variables

More information

T has many other desirable properties, and we will return to this example

T has many other desirable properties, and we will return to this example 2. Introduction to statistics: first examples 2.1. Introduction. The basic problem of statistics is to draw conclusions about unknown distributions of random variables from observed values. These conclusions

More information

This does not cover everything on the final. Look at the posted practice problems for other topics.

This does not cover everything on the final. Look at the posted practice problems for other topics. Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Multivariate Gaussian Distribution. Auxiliary notes for Time Series Analysis SF2943. Spring 2013

Multivariate Gaussian Distribution. Auxiliary notes for Time Series Analysis SF2943. Spring 2013 Multivariate Gaussian Distribution Auxiliary notes for Time Series Analysis SF2943 Spring 203 Timo Koski Department of Mathematics KTH Royal Institute of Technology, Stockholm 2 Chapter Gaussian Vectors.

More information

The Multivariate Gaussian Distribution [DRAFT]

The Multivariate Gaussian Distribution [DRAFT] The Multivariate Gaussian Distribution DRAFT David S. Rosenberg Abstract This is a collection of a few key and standard results about multivariate Gaussian distributions. I have not included many proofs,

More information

Midterm Exam 1 Solution

Midterm Exam 1 Solution EECS 126 Probability and Random Processes University of California, Berkeley: Fall 2015 Kannan Ramchandran September 22, 2015 Midterm Exam 1 Solution Last name First name SID Name of student on your left:

More information

STAT 450. Moment Generating Functions

STAT 450. Moment Generating Functions STAT 450 Moment Generating Functions There are many uses of generating functions in mathematics. We often study the properties of a sequence a n of numbers by creating the function a n s n n0 In statistics

More information

Continuous Distributions

Continuous Distributions A normal distribution and other density functions involving exponential forms play the most important role in probability and statistics. They are related in a certain way, as summarized in a diagram later

More information

Common probability distributionsi Math 217 Probability and Statistics Prof. D. Joyce, Fall 2014

Common probability distributionsi Math 217 Probability and Statistics Prof. D. Joyce, Fall 2014 Introduction. ommon probability distributionsi Math 7 Probability and Statistics Prof. D. Joyce, Fall 04 I summarize here some of the more common distributions used in probability and statistics. Some

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 17-27 Review Scott Sheffield MIT 1 Outline Continuous random variables Problems motivated by coin tossing Random variable properties 2 Outline Continuous random variables Problems

More information

t x 1 e t dt, and simplify the answer when possible (for example, when r is a positive even number). In particular, confirm that EX 4 = 3.

t x 1 e t dt, and simplify the answer when possible (for example, when r is a positive even number). In particular, confirm that EX 4 = 3. Mathematical Statistics: Homewor problems General guideline. While woring outside the classroom, use any help you want, including people, computer algebra systems, Internet, and solution manuals, but mae

More information

STAT2201. Analysis of Engineering & Scientific Data. Unit 3

STAT2201. Analysis of Engineering & Scientific Data. Unit 3 STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random

More information

The Multivariate Normal Distribution. In this case according to our theorem

The Multivariate Normal Distribution. In this case according to our theorem The Multivariate Normal Distribution Defn: Z R 1 N(0, 1) iff f Z (z) = 1 2π e z2 /2. Defn: Z R p MV N p (0, I) if and only if Z = (Z 1,..., Z p ) T with the Z i independent and each Z i N(0, 1). In this

More information

We introduce methods that are useful in:

We introduce methods that are useful in: Instructor: Shengyu Zhang Content Derived Distributions Covariance and Correlation Conditional Expectation and Variance Revisited Transforms Sum of a Random Number of Independent Random Variables more

More information

Part 3: Parametric Models

Part 3: Parametric Models Part 3: Parametric Models Matthew Sperrin and Juhyun Park August 19, 2008 1 Introduction There are three main objectives to this section: 1. To introduce the concepts of probability and random variables.

More information

Random variables. DS GA 1002 Probability and Statistics for Data Science.

Random variables. DS GA 1002 Probability and Statistics for Data Science. Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities

More information

Statistics 3657 : Moment Generating Functions

Statistics 3657 : Moment Generating Functions Statistics 3657 : Moment Generating Functions A useful tool for studying sums of independent random variables is generating functions. course we consider moment generating functions. In this Definition

More information

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 8. For any two events E and F, P (E) = P (E F ) + P (E F c ). Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 Sample space. A sample space consists of a underlying

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2 Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate

More information

This exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text.

This exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text. TEST #3 STA 536 December, 00 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. You will have access to a copy

More information

The Exciting Guide To Probability Distributions Part 2. Jamie Frost v1.1

The Exciting Guide To Probability Distributions Part 2. Jamie Frost v1.1 The Exciting Guide To Probability Distributions Part 2 Jamie Frost v. Contents Part 2 A revisit of the multinomial distribution The Dirichlet Distribution The Beta Distribution Conjugate Priors The Gamma

More information

Continuous Distributions

Continuous Distributions Chapter 3 Continuous Distributions 3.1 Continuous-Type Data In Chapter 2, we discuss random variables whose space S contains a countable number of outcomes (i.e. of discrete type). In Chapter 3, we study

More information

APPM/MATH 4/5520 Solutions to Problem Set Two. = 2 y = y 2. e 1 2 x2 1 = 1. (g 1

APPM/MATH 4/5520 Solutions to Problem Set Two. = 2 y = y 2. e 1 2 x2 1 = 1. (g 1 APPM/MATH 4/552 Solutions to Problem Set Two. Let Y X /X 2 and let Y 2 X 2. (We can select Y 2 to be anything but when dealing with a fraction for Y, it is usually convenient to set Y 2 to be the denominator.)

More information

1 Proof techniques. CS 224W Linear Algebra, Probability, and Proof Techniques

1 Proof techniques. CS 224W Linear Algebra, Probability, and Proof Techniques 1 Proof techniques Here we will learn to prove universal mathematical statements, like the square of any odd number is odd. It s easy enough to show that this is true in specific cases for example, 3 2

More information

Convergence in Distribution

Convergence in Distribution Convergence in Distribution Undergraduate version of central limit theorem: if X 1,..., X n are iid from a population with mean µ and standard deviation σ then n 1/2 ( X µ)/σ has approximately a normal

More information

Lecture 10: Powers of Matrices, Difference Equations

Lecture 10: Powers of Matrices, Difference Equations Lecture 10: Powers of Matrices, Difference Equations Difference Equations A difference equation, also sometimes called a recurrence equation is an equation that defines a sequence recursively, i.e. each

More information

IEOR 4701: Stochastic Models in Financial Engineering. Summer 2007, Professor Whitt. SOLUTIONS to Homework Assignment 9: Brownian motion

IEOR 4701: Stochastic Models in Financial Engineering. Summer 2007, Professor Whitt. SOLUTIONS to Homework Assignment 9: Brownian motion IEOR 471: Stochastic Models in Financial Engineering Summer 27, Professor Whitt SOLUTIONS to Homework Assignment 9: Brownian motion In Ross, read Sections 1.1-1.3 and 1.6. (The total required reading there

More information

Chapter 1 Review of Equations and Inequalities

Chapter 1 Review of Equations and Inequalities Chapter 1 Review of Equations and Inequalities Part I Review of Basic Equations Recall that an equation is an expression with an equal sign in the middle. Also recall that, if a question asks you to solve

More information

Contents 1. Contents

Contents 1. Contents Contents 1 Contents 6 Distributions of Functions of Random Variables 2 6.1 Transformation of Discrete r.v.s............. 3 6.2 Method of Distribution Functions............. 6 6.3 Method of Transformations................

More information

STAT/MATH 395 A - PROBABILITY II UW Winter Quarter Moment functions. x r p X (x) (1) E[X r ] = x r f X (x) dx (2) (x E[X]) r p X (x) (3)

STAT/MATH 395 A - PROBABILITY II UW Winter Quarter Moment functions. x r p X (x) (1) E[X r ] = x r f X (x) dx (2) (x E[X]) r p X (x) (3) STAT/MATH 395 A - PROBABILITY II UW Winter Quarter 07 Néhémy Lim Moment functions Moments of a random variable Definition.. Let X be a rrv on probability space (Ω, A, P). For a given r N, E[X r ], if it

More information

Nonparametric hypothesis tests and permutation tests

Nonparametric hypothesis tests and permutation tests Nonparametric hypothesis tests and permutation tests 1.7 & 2.3. Probability Generating Functions 3.8.3. Wilcoxon Signed Rank Test 3.8.2. Mann-Whitney Test Prof. Tesler Math 283 Fall 2018 Prof. Tesler Wilcoxon

More information

ACM 116: Lectures 3 4

ACM 116: Lectures 3 4 1 ACM 116: Lectures 3 4 Joint distributions The multivariate normal distribution Conditional distributions Independent random variables Conditional distributions and Monte Carlo: Rejection sampling Variance

More information

STA205 Probability: Week 8 R. Wolpert

STA205 Probability: Week 8 R. Wolpert INFINITE COIN-TOSS AND THE LAWS OF LARGE NUMBERS The traditional interpretation of the probability of an event E is its asymptotic frequency: the limit as n of the fraction of n repeated, similar, and

More information

Lecture 2: Review of Basic Probability Theory

Lecture 2: Review of Basic Probability Theory ECE 830 Fall 2010 Statistical Signal Processing instructor: R. Nowak, scribe: R. Nowak Lecture 2: Review of Basic Probability Theory Probabilistic models will be used throughout the course to represent

More information

Systems of Linear ODEs

Systems of Linear ODEs P a g e 1 Systems of Linear ODEs Systems of ordinary differential equations can be solved in much the same way as discrete dynamical systems if the differential equations are linear. We will focus here

More information

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER /2018

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER /2018 ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER 1 2017/2018 DR. ANTHONY BROWN 1. Arithmetic and Algebra 1.1. Arithmetic of Numbers. While we have calculators and computers

More information

8 Laws of large numbers

8 Laws of large numbers 8 Laws of large numbers 8.1 Introduction We first start with the idea of standardizing a random variable. Let X be a random variable with mean µ and variance σ 2. Then Z = (X µ)/σ will be a random variable

More information

Chapter 7: Special Distributions

Chapter 7: Special Distributions This chater first resents some imortant distributions, and then develos the largesamle distribution theory which is crucial in estimation and statistical inference Discrete distributions The Bernoulli

More information

Chapter 2. Discrete Distributions

Chapter 2. Discrete Distributions Chapter. Discrete Distributions Objectives ˆ Basic Concepts & Epectations ˆ Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric Distributions ˆ Introduction to the Maimum Likelihood Estimation

More information

Notes on Random Vectors and Multivariate Normal

Notes on Random Vectors and Multivariate Normal MATH 590 Spring 06 Notes on Random Vectors and Multivariate Normal Properties of Random Vectors If X,, X n are random variables, then X = X,, X n ) is a random vector, with the cumulative distribution

More information

Probability (Devore Chapter Two)

Probability (Devore Chapter Two) Probability (Devore Chapter Two) 1016-345-01: Probability and Statistics for Engineers Fall 2012 Contents 0 Administrata 2 0.1 Outline....................................... 3 1 Axiomatic Probability 3

More information

4 Moment generating functions

4 Moment generating functions 4 Moment generating functions Moment generating functions (mgf) are a very powerful computational tool. They make certain computations much shorter. However, they are only a computational tool. The mgf

More information