FW 544: Computer Lab - Probability basics in R


During this laboratory, students will be taught the properties and uses of several continuous and discrete statistical distributions that are commonly used in ecological models. The students will learn how to generate random data from each distribution using R software and how to develop simple simulation models by passing randomly generated values from one distribution to another. This will provide students with the understanding of probability distributions that will be needed to quantify uncertainty and comprehend the basics of Bayesian probability and Monte Carlo simulation. Laboratory exercises will evaluate the ability of students to build simple simulation models that use randomly generated data to approximate ecological processes, such as survival or the occurrence of a disturbance.

Overview of probability and random variables

We introduced random (stochastic) variables and statistical distributions last week. Here we will define these ideas somewhat more formally and get into the details of some important statistical distributions. Probability (P) can be thought of as a measure of the uncertainty in a random outcome. If we say that event X occurs with P=1 then we are certain about X; if we say P=0 then we are certain that X does not occur; and if we say P=0.5 we are equally uncertain about whether X occurs or not. The value or outcome X referred to is a random variable, as distinguished from a deterministic variable whose values may vary, but do so in a predictable or deterministic manner. A probability distribution or statistical distribution (or distribution for short) is a model that describes the relationship between values of a random variable and the probabilities of assuming these values. The basic types of distributions are discrete and continuous. Discrete distributions model outcomes that occur in discrete classes or integer values; examples include the Bernoulli, Binomial, and Poisson, all discussed in more detail below. Continuous distributions model outcomes that take on continuous (generally

real) values and include the Uniform, Normal, Beta, and Gamma distributions, also discussed in more detail below.

The probability density function for a distribution describes the probability that the random variable takes on particular values (for a discrete distribution) or is in the neighborhood of a value (for a continuous distribution). The density is often written as f(x). For example, for a discrete random variable (e.g., from a Poisson distribution) we may write f(4)=0.45, indicating that the value 4 is taken with probability 0.45. For a discrete distribution f(x) will sum to 1 over the support (the region of x where f(x)>0) of the distribution. For example, if we have a binomial distribution with parameters n=5 and p=0.2, the distribution has support for x = 0, 1, 2, 3, 4, 5 with f(0)+f(1)+f(2)+f(3)+f(4)+f(5) = 0.328+0.410+0.205+0.051+0.006+0.000 = 1. The density for continuous distributions follows a similar idea, but because the support is continuous (and thus uncountable) f(x) is not directly interpretable as a point probability. However, by analogy to the discrete distribution, f(x) integrates to one over the support of f(x).

The probability distribution function (or cumulative distribution function) represents the probability that the random variable x is less than or equal to a particular value, F(x) = Prob(X ≤ x). For discrete distributions, F(x) is readily obtained by summation, e.g., for the binomial example: F(3) = f(0)+f(1)+f(2)+f(3) = 0.328+0.410+0.205+0.051 = 0.993.

Calculation of F(x) for continuous distributions is trickier and requires integration. By definition

F(x) = ∫ (from -∞ to x) f(v) dv

where the lower limit may sometimes be higher (e.g., if the support starts at zero). Usually these computations are done by computer functions (or looked up in standard tables). Once F(x) is available (for either discrete or continuous distributions) we can easily ask questions like "what is the probability that x is between a and b?", since

Prob(a ≤ X ≤ b) = ∫ (from a to b) f(v) dv = ∫ (from -∞ to b) f(v) dv - ∫ (from -∞ to a) f(v) dv = F(b) - F(a).

So, for example, if we have a Normal with mean 0 and standard deviation 1, F(2) ≈ 0.977 and F(1) ≈ 0.841, so Prob(1 ≤ X ≤ 2) = F(2) - F(1) ≈ 0.136.

We can reverse the idea of distributions and, for a given probability level of a distribution function, obtain the value of x, or quantile, associated with that value. The quantiles are essentially found by inverting the distribution function and solving for x, though for discrete distributions they can easily be gotten by examination and interpolating between values. In practice we will get the quantiles of standard distributions using built-in functions in R. To take the example of the normal distribution (mean=0, sd=1), the quantiles associated with F(x) = 0.01, 0.5, and 0.99 are -2.33, 0.00, and 2.33, respectively.

You will often encounter the term moments, which refers to a number of important functions of distributions. The most important moments are the mean and the variance. The mean and variance are formally defined in terms of the density function as

E(x) = µ = Σ x f(x)

for discrete distributions and

E(x) = µ = ∫ v f(v) dv

for continuous distributions, where in both cases summation or integration is over the support of x. The variance of x follows from the definition of expectation and the relationship

V(x) = σ² = E[(x - µ)²].

We generally estimate these population moments by their sample equivalents, the sample mean and variance,

µ̂ = x̄ = Σ x_i / n and σ̂² = s² = Σ (x_i - x̄)² / (n - 1),

where the sums run over i = 1, ..., n and n is the size of a random sample. The Normal Distribution is an example where the distribution's parameters - the constants that determine the behavior of the distribution and what it will predict about the data - are familiar: the parameters of the Normal are just the mean (µ) and the variance (σ²). We will introduce parameters for other distributions when we consider the distributions in detail, below.

Random number generation

The idea of random number generation is to produce a value (or a list/sample of values) of a random variable x, given some assumptions about the distribution of x and its parameters. For example, we may wish to obtain a simulated sample of 100 values that come from a normal distribution with mean 5 and standard deviation 10. Depending on whether the random variable is discrete or continuous and the complexity of the distribution function, there are a variety of procedures to generate random variables. Most of the common ones rely on being able to find the inverse distribution function, that is, the function F⁻¹(U) that, given a value for U, the cumulative probability of x, returns the value for x. The idea is then to generate a uniform random variable between 0 and

1 (the range for a probability) and then solve x = F⁻¹(U) to get x. Thus, many random number generators start from the capacity to generate a uniform random number, which can then be used to create random variables from other distributions. In practice, R goes through these steps for you, but we will illustrate them for a few simple examples so that you can see that there is often more than one way to obtain a simulated random variable. Technically, we are not generating true random numbers with these procedures, but rather computer-generated sequences of numbers that behave like random numbers, known as pseudorandom numbers. The exact means by which pseudorandom numbers are generated is an advanced topic beyond the scope of this course, and has been the subject of intensive development and refinement over the years. Suffice it to say that some pseudorandom number generators perform better (i.e., act like "the real deal") than others, so it is important to be sure that you are using a generator that has been thoroughly tested. Fortunately for us, the developers of R and the R user community have thoroughly vetted the pseudorandom number generators in R, so you can be confident when you use these procedures that the results will be essentially random.

Probability distributions in R

R provides a very convenient way to calculate and plot many common statistical distributions and related functions and to generate random variables, so we will perform most of these tasks using built-in R functions. In a few cases we'll be able to see how to build functions from scratch (or nearly so), which may help you to generalize these principles. R code for all the examples is accumulated and saved in an R script file available on Blackboard.

Uniform Distribution

Density, distribution, and quantiles

Perhaps the simplest distribution is the continuous Uniform (or Rectangular) Distribution, which assumes that values of x over the support of f(x) occur with equal probability. The parameters of the Uniform are simply the lower and upper bounds for

x, so that x is equally likely to be anywhere inside the interval a ≤ x ≤ b, but cannot occur outside the interval (i.e., the support is entirely within the interval). Formally, the density for x is

f(x; a, b) = 1/(b - a) for a ≤ x ≤ b, and 0 for x < a or x > b.

The distribution function is simply

F(x) = (x - a)/(b - a) for a ≤ x ≤ b.

The mean of the uniform is (a + b)/2, which is just the midpoint between the minimum (a) and maximum (b), the parameters of the distribution, while the variance is (b - a)²/12. The density is easily implemented in R by the command

>dunif(x,a,b)

where a and b are the parameters and x is a value or list of values. So for a simple example we can compute and plot the density for Uniform(a=2, b=8) over the range x from 0 to 10.

#generate 1000 equally spaced values between 0 and 10
>x<-(0:1000)*0.01
#compute the uniform density for each value of x
>density<-dunif(x,2,8)

Alternatively, we could have written a short function in R to do the same thing:

#user-defined density
>my_dunif<-function(x,a,b){1/(b-a)*(x>=a & x<=b)}
>d<-my_dunif(x,2,8)

Either approach should produce a plot from

>plot(x,density)

like this

Notice what happens to the density when x<2 or x>8. Likewise we can produce and plot a distribution function for x by

>distrib<-punif(x,2,8)
>plot(x,distrib)

producing

Quantiles at specified probability levels are produced by the qunif() command, for example:

#quantiles at standard probability levels
>prob_levels<-c(0,.05,.25,.5,.75,.95,1)
>quants<-qunif(prob_levels,2,8)
>quants
[1] 2.0 2.3 3.5 5.0 6.5 7.7 8.0

Likelihood function

We are used to thinking of probability functions as describing the probability of an outcome x given the underlying model and parameter values (e.g., Uniform(2,8) above). We can turn this idea around, though, and ask the question of how likely a given

parameter value is, given the data we have and an underlying model. In this way of thinking, the data are fixed and the model parameter(s) is (are) variable. Mathematically, the calculation is the same, but we are just varying different quantities. In the example we just considered, we can ask the question: how likely, given a=2 and a value of x=5, are integer values of the parameter b in the range 3 to 12?

>a<-2
>x<-5
>b<-3:12
>b
 [1]  3  4  5  6  7  8  9 10 11 12
>like<-dunif(x,a,b)
>like
 [1] 0.0000000 0.0000000 0.3333333 0.2500000 0.2000000 0.1666667 0.1428571 0.1250000
 [9] 0.1111111 0.1000000

We see that there is no likelihood that b is 3 or 4 (obviously ruled out by the value x=5), but that, given the single observation x=5, we can't rule out b being 6, 7, 8 or even higher. We will come back to the likelihood, and just how we use data to estimate parameters, in a later lab.

Random number generation

It is very easy to generate random uniform numbers in R using the runif() function. The first argument of the function specifies the number of values you want, and the next two specify the minimum and maximum (a,b) parameters. By default a=0 and b=1, so runif(100), for example, would produce 100 uniform random numbers between 0 and 1, something that is often the starting point for simulating other, more complicated distributions. To take a specific case, suppose we want to generate 100 uniform random numbers between 5.5 and 10.4.

#generating n Uniform(a,b) random numbers
>n<-100
>a<-5.5

>b<-10.4
>x<-runif(n,a,b)

would produce a list of numbers (x) with these characteristics. You can calculate the sample mean from the simulated data

>mean(x)

and confirm that, while this gets close to the distribution mean of (a+b)/2, it's not exact - why is that?

Normal Distribution

Density, distribution, and quantiles

The Normal distribution is perhaps the most familiar statistical distribution. It is symmetric about the mean, with the familiar bell-shaped curve, and is used to model continuous, real values with theoretical range from negative to positive infinity. It is the limiting distribution of many test statistics and functions and is commonly used as an approximation, even when the data are thought to follow some other distribution, often after transformation to reduce skewness or discontinuities in the data. The normal density is determined by the parameters µ and σ (σ > 0) as

f(x; µ, σ) = [1/(σ√(2π))] exp(-(x - µ)²/(2σ²)),  -∞ < x < ∞.

For example, the Normal density function over (-50, 50) for µ=5 and σ=15 is produced by

#normal distribution
#generate equally spaced values between -50 and 50
>x<-(-5000:5000)*0.01
>mu<-5
>sigma<-15

>density<-dnorm(x,mu,sigma)
>plot(x,density)

We can produce a comparable distribution function by

#distribution function
>distrib<-pnorm(x,mu,sigma)
>plot(x,distrib)

producing

Specified probability quantiles are easily obtained from the qnorm() function, for example

>prob_levels<-c(0.001,.05,.25,.5,.75,.95,0.999)
>quants<-qnorm(prob_levels,mu,sigma)
>quants
[1] -41.35 -19.67  -5.12   5.00  15.12  29.67  51.35

Equivalently, we could say that we are 90% confident that x is between -19.67 and 29.67, with 10% probability (5% in each tail) outside this range. Notice that in the density dnorm() and distribution pnorm() functions, we passed the data as a list to the function, with scalar (1-dimensional) values of the parameters. Generally

speaking, any of these function arguments can be lists, and it will make sense below (under the likelihood function) to reverse which ones are.

Likelihood function

Again, we can turn the model around and ask the question: how likely is a specific parameter value, given an observation (or a sample of observations)? To keep things simple for the normal, let's assume that we've observed the values x=5 and x=10, and assume that the standard deviation is fixed at 1. Assuming the normal model, how likely are various values of µ (say between 2 and 16)? We can compute a likelihood for each data value with dnorm(). First, let's make life simpler by introducing an R function that will generate a regular sequence at a specified interval, seq(). We use this to produce values for mu in the range of 2 to 16, at 0.5 spacing (finer if we wish), and then feed them into the likelihoods for x=5 and x=10.

#likelihood
>mu<-seq(2,16,0.5)
>like1<-dnorm(5,mu,1)
>like2<-dnorm(10,mu,1)

At this point, we can recognize that, assuming that the observations of x are independent, we can multiply their likelihoods, or add them on the log scale, to get a joint (log) likelihood for the data.

>loglike<-dnorm(5,mu,1,log=TRUE)+dnorm(10,mu,1,log=TRUE)

Finally, we can examine our log likelihood, see which value is biggest, and see which value of mu produced that log likelihood. R has a nice built-in indexing trick that will do this, which works like this:

>mu[loglike==max(loglike)]

which basically says "find the index of loglike associated with the biggest value and then tell me what the corresponding mu value is at that same index." In this example, the

result is 7.5, which (not coincidentally) is the arithmetic mean of 5 and 10. What we just did is a very crude (but sometimes effective) way to get the maximum likelihood estimate under a specified model, something we'll explore more in a later lab.

Random number generation

The easiest way to generate random Normal numbers is by using the built-in R function rnorm().

#Generate 100 random numbers for mu=5 and sigma=10
#method 1
n<-100
mu<-5
sigma<-10
x<-rnorm(n,mu,sigma)

The second (and just as valid) way is to first generate 100 random Uniform(0,1) numbers, and then treat these as probabilities to be passed to the quantile (inverse distribution) function:

#method 2
#first generate 100 random uniform(0,1) deviates
U<-runif(100)
#now treat these as probability values in qnorm(), which functions as the inverse
#distribution function to return values of x given U.
x<-qnorm(U,mu,sigma)

You should be able to test these approaches out and convince yourself that they produce equivalent results.
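For instance, a minimal sketch of such a check (the seed and sample size here are arbitrary choices for illustration) is to generate a large sample with each method and compare the sample means and standard deviations:

#sketch: compare the two approaches on a large sample
set.seed(1)
x1<-rnorm(10000,mu,sigma)        #method 1: built-in generator
x2<-qnorm(runif(10000),mu,sigma) #method 2: inverse-CDF transform of Uniform(0,1)
c(mean(x1),mean(x2))             #both should be near mu = 5
c(sd(x1),sd(x2))                 #both should be near sigma = 10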

Poisson Distribution

The Poisson Distribution is a very important discrete distribution that models outcomes that take on non-negative integer values (0, 1, 2, ...). Examples include counts of animals, plants, etc., where the process generating the counts in space is random in the sense that counts are not clustered or separated except by chance.

Density, distribution, and quantiles

The Poisson Distribution is specified by the single parameter λ, which is equal to both the population mean and variance. Thus, sometimes the ratio of the sample mean to the variance is used as evidence (or lack thereof) of Poisson assumptions, with values of this ratio ~1 taken as support for a Poisson count model. The density function of the Poisson is given by

f(x; λ) = λ^x e^(-λ) / x!,  x = 0, 1, 2, 3, ...

where e is the base of the natural logarithm, λ>0, and x! denotes the factorial function x(x-1)(x-2)...1. The distribution function is simply given by summation of the density over the discrete values of x:

F(x) = Σ (from k=0 to x) λ^k e^(-λ) / k!,  x = 0, 1, 2, 3, ...

The Poisson density and distribution are easily implemented in R, for example for λ=5:

#poisson distribution
#generate a sequence between 0 and 20
>x<-0:20
>lambda<-5
>density<-dpois(x,lambda)
>plot(x,density,"h")

#distribution function
>distrib<-ppois(x,lambda)
>plot(x,distrib,"h")

Likewise, standard quantiles are easily computed:

> #quantiles
> prob_levels<-c(0.001,.05,.25,.5,.75,.95,0.999)
> quants<-qpois(prob_levels,lambda)
> quants
[1]  0  2  3  5  6  9 13
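As an aside, the mean-to-variance ratio of roughly 1 mentioned above can be seen with a quick simulation (a minimal sketch; the seed and sample size are arbitrary, and rpois() is introduced formally below):

#sketch: mean-to-variance ratio for simulated Poisson counts
set.seed(1)
y<-rpois(1000,lambda)   #lambda = 5 from above
mean(y)/var(y)          #should be close to 1 for Poisson data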

Likelihood function

As with the other distributions we have considered, the likelihood function is formed from the density, but reversing the roles of parameters and data, with the latter now fixed and the former varying over some specified range. We will cheat here, because we know that given the data λ cannot be huge (say >15) and we know that it has to be >0, so we will only look at the likelihood in that range:

> #likelihood
> lambda<-seq(0.01,15.01,0.01)
> x<-8
> like<-dpois(x,lambda)
> loglike<-dpois(x,lambda,log=TRUE)
> plot(lambda,like)

We can also use a device similar to what we used for the normal to get a fairly good approximation of the maximum likelihood value for λ

> #find the maximum over list of lambdas
> lambda[loglike==max(loglike)]
[1] 8

which is not surprising: given the simplicity of this model (mean=variance) and the single observation x=8, we expect λ to be around 8.

Random number generation

Given the discrete nature of the random variable in the Poisson distribution, there are several options for generating random variables, some of them quite simple, others not so simple but more flexible. We illustrate all 3 with the example of generating 100 random values from a Poisson(10) distribution.

Method 1 - built-in R function

First, we can of course rely on the standard, built-in R function rpois().

#Generate 100 random numbers for lambda=10
#method 1
n<-100
lambda<-10
x<-rpois(n,lambda)

Method 2 - Uniform deviate, quantile (inverse distribution function)

The second method also relies on an R function, the quantile or inverse distribution function, but performs the calculations by first computing 100 random Uniform(0,1) numbers and then transforming them with the inverse distribution (quantile) function.

#method 2
>n<-100
>lambda<-10
>U<-runif(100)
>x<-qpois(U,lambda)

Method 3 - Uniform deviate and interpolation from the distribution

In the third approach we have built a random number generator based directly on the cumulative discrete distribution F(x). As in Method 2 we generate 100 random Uniform variables and then use these and the definition of F(x) to obtain values of x (see Evans et al. 2000); this requires a user-defined function (pfun) to map the continuous values of U into discrete values of x:

#method 3
>n<-100
>lambda<-10
#generate F(x) from 0 to 50
>values<-0:50

>F<-ppois(values,10)
#define the function to do interpolation from F(x)
>pfun<-function(f,u,v)
>{
>  x<-array(0,c(length(u)))
>  for (j in 1:length(u))
>  {
>    for (i in 1:50)
>    {
>      if (f[i]<=u[j] & u[j]<f[i+1]) {x[j]<-v[i+1]}
>    }
>  }
>  x
>}
>#generate the values of U and x
>U<-runif(n)
>x<-pfun(F,U,values)

You can confirm that all 3 methods give similar results using large values for n, and that the last two methods give identical results for the same vector of Uniform numbers. However, Method 3 is much slower than Methods 1 or 2, which simply confirms that (usually) the built-in functions in R tend to be more computationally efficient than what beginning users can build. Building a function like this on your own, though, does illustrate that it can be done, and this can be handy in situations where no built-in function exists in R. For example, suppose we have a discrete distribution F(x) without a known mathematical form, but for which we can write out numerical values of F(x). A simple example of this is where we use the quantile function to summarize the data from a sample into an empirical distribution function and treat these values as F(x). We can then use an approach such as Method 3 to simulate values from this distribution, even though we have no idea of its

mathematical form. We'll return to these ideas later when we get more deeply into simulation in a later lab.

Bernoulli Distribution / Binomial Distribution

The Bernoulli Distribution is the natural distribution for modeling outcomes that can occur in 1 of 2 classes, such as success or failure, lived or died, heads or tails, male or female. The Bernoulli Distribution has a single parameter p that describes the probability of a success (however it is defined). The Binomial Distribution defines the number of successes that occur in n independent Bernoulli trials, each with the same probability of success p. The Binomial is thus based on summing Bernoulli distributions, and has 2 parameters, n and p. Because these distributions are so closely related we will consider them together below.

Density, distribution, and quantiles

The Bernoulli random variable x takes on 2 possible values, either 1 (indicating success) or 0 (failure), and has a single parameter, p, denoting the probability of success. The probability density function is written as

f(x; p) = p^x (1 - p)^(1-x),  x = 0, 1

which simplifies to f(0; p) = 1 - p and f(1; p) = p. Note that we assume that there are only 2 possible outcomes, a success with probability p and a failure with probability 1 - p, and that by definition the probability that it is either a success or a failure adds to 1. The mean of the Bernoulli distribution is E(x) = µ = p and the variance is Var(x) = p(1 - p). The Binomial distribution is closely related, with the Binomial variable x defined as the number of successes in n independent Bernoulli trials, each with probability p of success. The Binomial thus has 2 parameters (n and p), though one of these (n) ordinarily is known and will not be estimated from data. The Binomial density function is

f(x; n, p) = (n choose x) p^x (1 - p)^(n-x),  x = 0, 1, ..., n
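As a quick check of this formula, a minimal sketch (the values n=5 and p=0.2 here echo the binomial example from the overview and are otherwise arbitrary) shows that the density computed directly from the expression above sums to 1 over its support:

#sketch: the Binomial density, computed from the formula, sums to 1
n<-5
p<-0.2
x<-0:n
f<-choose(n,x)*p^x*(1-p)^(n-x)  #density from the formula; choose(n,x) is "n choose x"
sum(f)                          #equals 1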

The Binomial distribution function is

F(x; n, p) = Σ (from k=0 to x) (n choose k) p^k (1 - p)^(n-k),  x = 0, 1, ..., n.

The mean and variance are given by E(x) = µ = np and Var(x) = np(1 - p). The Binomial density and distribution are easily implemented in R by the dbinom() and pbinom() functions (there is no separate Bernoulli function in R, the Bernoulli simply being a Binomial with a single trial, n=1), e.g., for a Bernoulli with p=0.4:

>#Bernoulli
>p<-0.4
>x<-0:1
>density<-dbinom(x,1,p)
>distrib<-pbinom(x,1,p)
>plot(x,density,"h",ylim=c(0,1))
>plot(x,distrib,"h",ylim=c(0,1))

This produces plots for the density and distribution of:

Taking a Binomial with p=0.4 and n=10 trials we have

>#Binomial
>n<-10
>p<-0.4
>x<-0:n
>density<-dbinom(x,n,p)
>distrib<-pbinom(x,n,p)
>plot(x,density,"h",ylim=c(0,1))
>plot(x,distrib,"h",ylim=c(0,1))

This produces plots

and

Quantiles at specified probability levels are easy to produce using the qbinom() function, e.g.,

>n<-10
>p<-0.4
> #quantiles
> prob_levels<-c(0.001,.05,.25,.5,.75,.95,0.999)
> quants<-qbinom(prob_levels,n,p)
> quants
[1] 0 2 3 4 5 7 9

Likelihood function

As with other distributions, we can reverse the roles of the data and the parameters and now treat the parameters as variables. In the case of either the Bernoulli or the Binomial there is generally only one parameter of interest, since we usually know how many trials there are. Take a case where we have 10 trials and we observe 4 successes. We can examine the likelihood over the range p = (0,1) and try a brute force maximization as before:

> #Likelihood

> p<-seq(0,1,0.001)
> n<-10
> x<-4
> like<-dbinom(x,n,p)
> loglike<-dbinom(x,n,p,log=TRUE)
> plot(p,like)
> plot(p,loglike)
> #find the maximum over the list of p values
> p[loglike==max(loglike)]
[1] 0.4

The results suggest that a value of p = 0.4 maximizes the log likelihood. However, notice how flat the log likelihood function is, with many values of p larger and smaller than 0.4 returning similar values. This suggests that the data (4 successes in only 10 trials) provide relatively poor information about the parameter value. We will return to this point when we consider estimation in more depth later in the course.

Random number generation

Generating random numbers for the Bernoulli and Binomial is quite easy and can be accomplished with either a simple random uniform number generator or with the built-in function rbinom(). The first approach computes a Bernoulli outcome by simply comparing a Uniform(0,1) random number to p; if U > p then x = 0, otherwise x = 1.

>#generating bernoulli random variables
>#specify p
>#specify n_reps
>p<-0.35
>n_reps<-100
>#method 1
>x<-(runif(n_reps)<p)*1
>#method 2
>x<-rbinom(n_reps,1,p)

Generating Binomial random variables can be accomplished by generating a series of n Bernoulli variables and then summing these.

#generating Binomial random variables
#specify n
#specify p
#specify n_reps
n_reps<-100
n<-10
p<-0.35

#method 1
x<-array(0,c(n_reps))
for (i in 1:n_reps)
{
x[i]<-sum(runif(n)<=p)*1
}

Alternatively, you can directly use the rbinom() function in R:

#method 2
x<-rbinom(n_reps,n,p)

The advantage of the first approach is that sometimes we will not want to assume that the parameter p remains constant, but instead allow it to vary from sample to sample (or even among Bernoulli trials within a sample). In such cases we can still simulate or model the data, but no longer under Binomial assumptions (which require p to be constant). We will look at an example of this in a bit.

Multinomial Distribution

The Multinomial Distribution is similar to the Binomial, but instead of modeling outcomes that occur in 2 ways ("success" or "failure"), the outcomes can occur in 3 or more ways. For example, suppose that an animal can die, and if it lives can either reproduce or not reproduce, and that these are the only possibilities. If we assign the probabilities to these events as p1 = probability of death, p2 = probability of living and reproducing, and p3 = probability of living and not reproducing, by definition p1 + p2 + p3 = 1. Thus, if we know 2 of the 3 probabilities we know the 3rd by subtraction, e.g., p3 = 1 - p1 - p2. In general, if we have k categories of outcomes we have k-1 probabilities to describe them, with the last obtained by subtraction. Like the Binomial, the Multinomial is built from a series of n independent trials, each with the same probabilities describing the outcomes. The random variable x is now a vector, denoting the number of the n trials that fall into each category. For example, if we have

100 animals, the outcomes might be 25 die, 50 live and reproduce, and 25 live but do not reproduce. The Multinomial density is

f(x; n, p) = [n! / (x1! x2! ... xk!)] p1^x1 p2^x2 ... pk^xk.

Because of its multivariate nature it is difficult to visualize the density, but density and distribution values are readily computed in R. For example, the density for a 3-category multinomial with 10 trials is calculated by

> #example
> n<-10
> p<-c(.25,.5,.25)
> x<-c(1,5,4)
> density<-dmultinom(x,n,p)
> density
[1] 0.03845215

Random Multinomial variables are generated by the rmultinom() function. For instance, to generate 20 instances of the above 10-trial trinomial we would use:

> #Random variables
> rmultinom(20,10,p)

which returns a 3 x 20 matrix of counts, one column per instance, with each column summing to the 10 trials.

Beta Distribution

The Beta Distribution is a continuous distribution that models a random variable x that can take on values in the range 0 ≤ x ≤ 1. The Beta is therefore appropriate for modeling the distribution of probability values, and in particular for modeling heterogeneity in probabilities. The parameters α and β (or a and b) control the location and shape of the Beta Distribution. The Uniform(0,1) distribution is a special case of the Beta with α=β=1. The Beta Distribution assumes additional importance as a natural (or conjugate) distribution for the Binomial, describing uncertainty in the Binomial parameter p before (prior) and after (posterior) data collection. Finally, the Beta and the Binomial can be combined hierarchically in a model (the Beta-Binomial) in which the random outcome is binary, but the process describing success is heterogeneous. We will return to both of these themes in later labs.

Density, distribution, and quantiles

The mathematical form of the Beta density is

f(x; α, β) = [Γ(α+β) / (Γ(α)Γ(β))] x^(α-1) (1-x)^(β-1);  0 ≤ x ≤ 1;  α, β > 0

where Γ(c) is the Gamma function

Γ(c) = ∫ (from 0 to ∞) exp(-u) u^(c-1) du.

The kernel of the Beta (the part that involves the random variable x) is actually quite simple and, not coincidentally, resembles the Binomial density:

x^(α-1) (1-x)^(β-1).

The mean of the Beta distribution is

E(x) = µ = α/(α+β)

and the variance is

V(x) = αβ / [(α+β)²(α+β+1)].

We can use the relationship between the mean and variance and the parameters to solve for parameter estimates via the Method of Moments. More later. The Beta variable x is sometimes interpreted as modeling the probability of success based on previously observing α-1 successes and β-1 failures. The Beta distribution function is obtained by integrating the density from 0 to x. Both the density and the distribution can easily be evaluated in R using the dbeta() and pbeta() functions. For example, for α=10 and β=15 we can produce density and distribution values over the range of x.

#Beta distribution
>a<-10
>b<-15
>x<-seq(0,1,0.001)
>#density
>density<-dbeta(x,a,b)
>distrib<-pbeta(x,a,b)
>plot(x,density)
>plot(x,distrib)

This code will produce a plot of the density

and of the distribution

Notice that the density is centered near 0.4, but takes on a fairly wide range, indicating that Beta(10,15) would be appropriate for modeling a success probability that averages about 0.4 but exhibits heterogeneity. We will return to this theme later when we consider the Beta-Binomial distribution. Standard quantiles of the Beta are easily produced with the qbeta() function, for example

> #quantiles
> prob_levels<-c(0.001,.05,.25,.5,.75,.95,0.999)
> quants<-qbeta(prob_levels,a,b)
> quants
[1]

Likelihood function

As in our previous examples, we can consider the data (x) as fixed (observed) and treat the parameters as variables, producing a likelihood function. Note that with the Beta distribution, like the Normal, we have 2 parameters, so we have to find a combination of α and β that maximizes the likelihood. The likelihood is easy to compute using R, for example if we observe x=0.4:

>#Likelihood
>a<-seq(5,25,0.001)
>b<-seq(10,30,0.001)
>x<-0.4
>like<-dbeta(x,a,b)
>loglike<-dbeta(x,a,b,log=TRUE)

It is a bit trickier to use brute force methods to get the maximum, and instead we will use graphical methods to get an approximation. Because the parameter space is 2-dimensional, we need to display the likelihood in 3 dimensions. The scatterplot3d function in R will produce a 3-D scatterplot:

>library(scatterplot3d)
>scatterplot3d(a,b,loglike)

The graph indicates that the log likelihood has a maximum at around α=20 and β=25. Graphical methods become cumbersome (and inaccurate) for 2 or more parameters; we will consider more exact methods for maximizing the likelihood in a later chapter.

Random number generation

Random number generation is easily performed in R using the rbeta() function. For example, the following code will generate 100 Beta(10,15) random variables.

>#Random betas
>n<-100
>a<-10
>b<-15
>#Method 1
>x<-rbeta(n,a,b)

If α and β are integers, the following code can also be used to generate Beta random variables from Gamma random variables, which in turn are generated by a log transformation from Uniform random numbers:

>#Method 2 (if a and b are integers)
>x<-array(0,c(n))
>for (i in 1:n)
>{
>#g1 and g2 are Gamma random variables
>g1<-sum(-log(runif(a)))
>g2<-sum(-log(runif(b)))
>x[i]<-g1/(g1+g2)
>}

Gamma

The Gamma distribution is a continuous distribution that has 2 parameters, here denoted b (scale) and c (shape), with x taking on nonnegative values (0 ≤ x < ∞). The Gamma is important in statistics because several other important distributions, such as the Chi-square and Exponential, are special cases. We also saw above how Gamma distributions can be used to generate Beta random variables. However, most of our interest in the Gamma will be because of its special relationship to the Poisson distribution, both for modeling heterogeneity in the Poisson parameter and as a conjugate distribution for the Poisson in Bayesian analysis.

Density, distribution, and quantiles

The density function of the Gamma is

f(x; b, c) = [1/(b Γ(c))] (x/b)^(c-1) exp(-x/b);  0 ≤ x < ∞;  b > 0, c > 0.

The distribution function of the Gamma is given by integration from 0 to x:

F(x; b, c) = ∫ (from 0 to x) [1/(b Γ(c))] (v/b)^(c-1) exp(-v/b) dv;  0 ≤ x < ∞;  b > 0, c > 0.

The mean and variance of the Gamma are related to the parameters in a straightforward way by

E(x) = bc and

V(x) = b²c.

As we will see, these relationships lead to easy (but not particularly optimal) parameter estimation by the Method of Moments. The density and distribution are easily generated in R using the dgamma() and pgamma() functions. For example, we can plot the density and distribution for Gamma(b=1, c=5) by

>#Gamma distribution
>c<-5
>b<-1
>x<-seq(0,10,0.001)
>#density - note R gamma functions use inverse scale (rate) = 1/b
>density<-dgamma(x,c,1/b)
>distrib<-pgamma(x,c,1/b)
>plot(x,density)
>plot(x,distrib)

This produces a density over the range of 0 to 10 of

and a distribution of

Quantiles are produced by the qgamma() function. For the same parameter values we can produce several quantiles by

> #quantiles
> prob_levels<-c(0.001,.05,.25,.5,.75,.95,0.999)
> quants<-qgamma(prob_levels,c,1/b)
> quants
[1]

This indicates, for example, that the median (0.5 quantile) of Gamma(1,5) is around 4.7, and that 99.9% of the data can be expected to lie below the largest quantile shown.

Likelihood function

As with other distributions, we can form the likelihood by considering the data (x) as fixed and allowing the parameter values to vary. For example, suppose we observe x=5; we can plot the log likelihood versus values of b and c.

>#likelihood
>b<-seq(0.01,1,0.001)

>c<-b*10
>x<-5
>like<-dgamma(x,c,1/b)
>loglike<-dgamma(x,c,1/b,log=TRUE)
>library(scatterplot3d)
>scatterplot3d(c,b,loglike)

By eyeballing this graphic, we can see that values of c near 7 and b near 0.7 appear to maximize the log likelihood.

Random number generation

We present 2 methods for producing random numbers from the Gamma distribution. The easiest is the built-in rgamma() function. For example, to generate 1000 Gamma(1,5) random variables:

>#random number generation
>#method 1
>n<-1000
>c<-5
>b<-1

>x<-rgamma(n,c,1/b)

Gamma variables can be generated directly from Uniform(0,1) random variables via a log transformation (we used this approach already for Beta random variables), if the parameter c has an integer value:

>#method 2 - if c is integer
>n<-1000
>c<-5
>b<-1
>x<-array(0,c(n))
>for (i in 1:n)
>{
>x[i]<-b*sum(-log(runif(c)))
>}

Estimation methods

Fundamentally, all estimation methods are based on considering the sample data x as known, and then using the statistical model to derive values of the parameters based on the data. We will consider 2 approaches: the Method of Moments and Maximum Likelihood, with most emphasis on the second of these.

Method of Moments

The Method of Moments is very simple and can provide reasonable estimates of parameters in some situations. The basic steps are:

1. Determine the population moments (expected value, variance, etc.) as functions of the parameter(s).
2. Set the population moments equal to the sample (data-based) moments.
3. Solve for the parameter(s) as functions of the data.
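As a minimal illustration of these steps (a sketch with arbitrary simulated data), consider the Poisson: its only population moment needed is E(x) = λ, so the moment estimator is simply the sample mean.

#sketch: Method of Moments for the Poisson
set.seed(1)
counts<-rpois(50,4)       #simulated counts with true lambda = 4
lambda_hat<-mean(counts)  #set E(x) = lambda equal to x-bar and solve
lambda_hat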

To take a very simple case, consider a Binomial experiment where we have 10 independent Bernoulli trials, we observe 6 successes, and we wish to estimate p, the probability of success (assumed homogeneous among trials). The population moment is

E(x) = np.

Setting the population moment equal to the sample moment (in this case, simply x) provides

x = np

and solving for p provides

p̂ = x/n = 6/10 = 0.6.

A somewhat more complicated example involves the Beta distribution and 2 moments: the mean and the variance. Recall that for the Beta the mean and variance are

E(x) = α/(α+β) and V(x) = αβ / [(α+β)²(α+β+1)].

Because there are 2 unknowns (α, β) and 2 equations, we should be able to solve for the parameters, and we can. First, we equate the population moments with the sample moments

x̄ = α/(α+β) and s² = αβ / [(α+β)²(α+β+1)].

Then we solve for α and β:

α̂ = x̄{[x̄(1-x̄)]/s² - 1} and β̂ = (1-x̄){[x̄(1-x̄)]/s² - 1}.

We have written a small R function to provide these calculations:

> beta.mom<-function(mean,sd){
+ v<-sd**2
+ x<-mean
+ a<-x*(x*(1-x)/v-1)
+ b<-(1-x)*(x*(1-x)/v-1)
+ c(a,b)
+ }

For example, if we have a sample mean of 0.5 and SD of 0.1 in our data, the program provides

> beta.mom(.5,.1)
[1] 12 12

or estimates of α̂ = 12, β̂ = 12. We can confirm that these correspond to the moments by plugging them into the population moment formulas:

> beta.stats<-function(a,b){
+ x<-a/(a+b)
+ v<-a*b/((a+b)^2*(a+b+1))
+ c(x,sqrt(v))
+ }
> beta.stats(12,12)
[1] 0.5 0.1

which returns the correct mean and SD. Unfortunately, the Method of Moments can produce bizarre results. For example

> beta.mom(.1,.4)
[1] -0.04375 -0.39375

However, both parameters of the Beta must be positive numbers, so the Method of Moments in this case does not work. The Beta Method of Moments behaves well in many cases, but can easily produce inadmissible values for the parameters, as just illustrated. The Beta example illustrates one drawback of the Method of Moments, which is that it sometimes can produce nonsensical results (outside the admissible parameter space). The method also does not necessarily provide a way to assess parameter confidence (variance, confidence intervals). Finally, the Method of Moments does not share some of the desirable properties of the next method, such as sufficiency, minimum variance, and asymptotic normality. For this reason, most practitioners use the Method of Moments only as a method for quick approximation, if at all.

For completeness, here's a function for estimating the Gamma parameters using the Method of Moments. Remember that this is quick and dirty and could give negative (incorrect) values.

> gamma.mom<-function(mu,sd){
+ v<-sd**2
+ c=v/mu        #scale = variance/mean
+ b=(mu/sd)^2   #shape = (mean/sd)^2
+ c(b,c)        #returns the shape first and the scale second
+ }
> ## gamma MOM using mean 7 and sd of 11
> theta<-gamma.mom(7,11)
> ## again take note of use of inverse scale or rate
> ## lets see how close we were
> mean(x<-rgamma(10000,theta[1],1/theta[2]))
[1]
> sd(x)
[1]

Maximum Likelihood

Maximum likelihood methods have several advantages not necessarily shared by other approaches, and therefore are favored in much of statistics. Generally speaking, maximum likelihood estimators (MLEs):

- Are asymptotically (i.e., with large samples) unbiased
- Are asymptotically Normally distributed
- Have (asymptotically) minimum variance (i.e., variance at least as small as that of any other estimator)
- Provide variance estimates directly as part of estimation

The basic idea of MLE is simple: given the data, we consider the parameter(s) to be unknown variables; the density function now behaves instead as a likelihood function. We then solve for the parameter values that maximize the likelihood function, given the data values. There are several ways to do this:

- By graphing the likelihood function against candidate parameter values
- By brute-force searching over the parameter space
- By exact solution using The Calculus
- By numerical optimization methods

We can illustrate all these approaches by taking a simple case involving the Binomial distribution. Suppose we conduct 100 Bernoulli trials and observe 40 successes. For example, the 100 trials could be 100 nests that we have discovered and have followed from initiation to success (fledging) or failure. Because we know the number of trials (n=100) we will focus on estimating the probability of success. The statistical model is

f(x; n, p) = (n choose x) p^x (1 - p)^(n-x).

However, we now know that n=100 and x=40, so we will recast this as a likelihood function

L(p; x=40, n=100) = (100 choose 40) p^40 (1 - p)^60.

Now the task is to find a value for p that maximizes this function. Usually, it will be more convenient to work with the natural logarithm of the likelihood function. Because the logarithmic transformation is monotonic, if we find the value of p that maximizes log(L(p)) we've also found the value that maximizes L(p). For this example the log of the likelihood is

ln L(p; x=40, n=100) = ln(100 choose 40) + 40 ln p + 60 ln(1 - p)

or in general (for any integers n and x ≤ n)

ln L(p; x, n) = ln(n choose x) + x ln p + (n - x) ln(1 - p).

As noted, there are several ways we can go about finding the maximum of this function, which we visited briefly in the previous lab. The first method is based on graphing the likelihood and log likelihood. Rather than use the built-in dbinom() function, we have binomial_likelihood R script.r to graph the likelihood and log likelihood. We do this for

2 reasons: first, we want students to see explicitly what the likelihood function and its log look like, and second, we are going to do some mathematical manipulation in a minute that would not be easy using the built-in R function.

Graphical approach

When we plot L(p; x, n) vs. p we get a curve centered about a value of p ~ 0.4. Similarly, the log likelihood seems to peak around 0.4. So, p = 0.4 is looking to be a good candidate for the MLE.

Brute force

As we saw earlier, we can fairly easily find the maximum by brute force if 1) we have a single parameter (p in this case) and 2) the parameter is constrained over a reasonable range (0 to 1 here). Using our explicit code and the list-maximize trick, we get the following:

> #Brute force
> #Likelihood
> p<-seq(0,1,0.001)

> n<-100
> x<-40
> binomial_like<-function(x,n,p_){
+ like=log(choose(n,x))+x*log(p_)+(n-x)*log(1-p_) #choose() evaluates n choose x
+ return(like)
+ }
> loglike<-binomial_like(x,n,p)
> #find the maximum over the list of p values
> p[loglike==max(loglike)]
[1] 0.4

Again, this confirms that p=0.4 appears to be viable as the MLE.

Exact approach using The Calculus

The Calculus provides an exact solution to the likelihood maximization under certain conditions. In particular, if the likelihood is continuous and twice differentiable, then a necessary condition for L(p*) to be a maximum is that the first derivative with respect to p is zero. If the second derivative is negative, then this assures that L(p*) is a maximum and not a minimum. For the Binomial likelihood this is best approached by operating with the log likelihood. The first derivative of the log likelihood is

d ln L(p; x=40, n=100) / dp = 40/p - 60/(1 - p).

Setting this to zero yields

40/p = 60/(1 - p)

and with a little algebra

p̂ = 40/100 = 0.4.

More generally,

d ln L(p; x, n) / dp = x/p - (n - x)/(1 - p),

and setting x/p = (n - x)/(1 - p) and solving gives

p̂ = x/n.

We can confirm graphically that the derivative becomes zero at p=0.4.
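We can also check this numerically; a minimal sketch (uniroot() is a general root-finder in base R, and the search interval below is an arbitrary choice that avoids dividing by zero at 0 or 1) finds where the derivative crosses zero:

#sketch: numerically find where the derivative of the log likelihood is zero
score<-function(p,x=40,n=100){x/p-(n-x)/(1-p)}
uniroot(score,interval=c(0.01,0.99))$root   #very close to 0.4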

Direct solution of the log-likelihood equations by algebra is possible for many statistical models and their parameters. In addition to the Binomial parameter p, the Poisson parameter λ can be estimated in this way, as can the Normal parameters µ and σ, although the analysis becomes more complicated when 2 or more parameters are involved. For example, estimation of the Normal parameters µ and σ requires taking partial derivatives of the log-likelihood with respect to each parameter and setting each of these equations to zero. Solution of these equations for µ and σ then provides the estimates

µ̂ = Σ x_i / n and σ̂² = Σ (x_i - x̄)² / n

(sums over i = 1, ..., n). Astute students will notice that the second formula differs slightly from the usual sample variance

s² = Σ (x_i - x̄)² / (n - 1).

The reason is that σ̂² (the MLE) is slightly biased for small samples, and use of n - 1 reduces this bias.

Numerical methods

Explicit formulas for MLEs exist and are readily computed for many common statistical models. However, as models become more complex (more parameters and structure) it can be difficult or impossible to obtain algebraic solutions for the MLEs. Fortunately, high-speed computers are capable of solving the likelihood equations via numerical approaches. These approaches really are a special application of optimization approaches that we will consider in more detail later. They generally require the following:

- A mathematical expression (or computer code) for computing the log-likelihood for a given parameter value

- An initial guess for the parameter value (sometimes based on simple statistics from the data)
- A means of searching to see if improvements (higher log-likelihood values) can be made by changing the parameter value
- A stopping rule to determine that the parameter value has converged on the apparent MLE

Gradient descent methods and Newton's Method are 2 of the more familiar (and simpler) optimization methods. Both require the ability to evaluate 1st and 2nd derivatives (partial derivatives if there is more than 1 parameter) with respect to each candidate parameter value (or combination of values). The derivatives can be either explicitly written (i.e., algebraic) or computed via approximations. We have Newtons method script.r that applies Newton's Method to solving for the MLE of the Binomial parameter p. The basic steps are simple:

1. Start with an initial value for p, p0.
2. Compute the gradient evaluated at the current value of p (pi): g(pi) = d ln L(pi) / dp
3. Compute g'(pi) = d² ln L(pi) / dp²
4. Update p by pi+1 = pi - g(pi) / g'(pi)
5. Return to Step 2 and repeat until convergence

Convergence can be evaluated by seeing how much (or little) p changes and/or by determining that g(pi) is sufficiently close to zero (i.e., differs from zero by some specified small amount). In the example code (n=100, x=30), p is initialized at 0.1 and converges rapidly to 0.3.
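A minimal sketch of these steps for the Binomial case (written here for illustration; the starting value and tolerance are arbitrary, and this is not the Newtons method script.r file itself) looks like this:

#sketch: Newton's Method for the Binomial MLE of p
n<-100
x<-30
p<-0.1                                      #step 1: initial value
for (i in 1:50)
{
  g<-x/p-(n-x)/(1-p)                        #step 2: gradient (first derivative)
  g_prime<- -x/p^2-(n-x)/(1-p)^2            #step 3: second derivative
  p_new<-p-g/g_prime                        #step 4: Newton update
  if (abs(p_new-p)<1e-8) {p<-p_new; break}  #step 5: stop when the change is tiny
  p<-p_new
}
p                                           #converges to x/n = 0.3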

R also has a built-in optimization function, optimize(), that performs maximization or minimization of a specified function. The attached code applies this function to the above binomial example.

MLE for higher-dimensioned problems

In principle, exactly the same approaches used for single-parameter models extend to models with multiple parameters. However, both graphical and brute force approaches become cumbersome beyond about 2 parameters (try visualizing a 4-dimensional graph!) and are generally eschewed in favor of either direct or numerical solution of a system of likelihood equations.

Example - Normal likelihood

We can take the example of the Normal likelihood and a sample x of n observations. Assuming that the data are independent, the joint likelihood is formed by the product of n likelihoods:

L(µ, σ; x) = ∏ (from i=1 to n) [1/(σ√(2π))] exp(-(x_i - µ)²/(2σ²))

and the log-likelihood is

log L(µ, σ; x) = -n log(σ) - (n/2) log(2π) - Σ (x_i - µ)²/(2σ²).

The partial derivatives of the log-likelihood with respect to the parameters simplify to

∂ log L(µ, σ; x)/∂µ = n(x̄ - µ)/σ² = 0

∂ log L(µ, σ; x)/∂σ = -n/σ + Σ (x_i - µ)²/σ³ = 0.

These equations can be solved directly by

µ̂ = Σ x_i / n and σ̂² = Σ (x_i - x̄)² / n,

or by trial and error, gradient methods, Newton's Method, or other numerical methods. Application of Newton's Method and other derivative-based methods requires evaluation of the matrix of partial second derivatives

I = [ ∂²lnL/∂µ²    ∂²lnL/∂µ∂σ
      ∂²lnL/∂σ∂µ   ∂²lnL/∂σ²  ]

The matrix I is sometimes known as the Hessian, and its negative as the (observed) Information Matrix. The vector of first partial derivatives is

G = [ ∂lnL/∂µ, ∂lnL/∂σ ].

Solutions to the likelihood equations occur when G = 0; the inverse of -I (evaluated at the solution) provides the estimated variance-covariance matrix, with the variances on the diagonal and the covariances on the off-diagonal. This same approach applies to an MLE problem of any dimension, with the sizes of G and I determined by the number of parameters (k, so G is of length k and I is k x k). The optim() procedure in R can be generalized to solve for the MLEs for more complicated likelihoods involving multiple parameters. In optimize.r we perform ML optimization for Binomial, Normal, and Beta examples. Note that optim() performs

by minimization, so to get the maximum likelihood we compute the negative log likelihood and then find the parameter values that minimize that function. The argument method="BFGS" specifies the use of a quasi-Newton method (similar to Newton's Method above), and hessian=TRUE specifies that the algorithm will produce the Hessian matrix, which we can then use to get the variance-covariance matrix.

R built-in functions

As you may have guessed, R users have created several packages that can be used to fit distributions to data directly; one example is the fitdistr() function in the MASS package.

> library(MASS)
> beta.data<-c(0.05,0.2,0.03,0.4,0.15)
> fitdistr(x=beta.data,"beta",start=list(shape1=1,shape2=1))
     shape1      shape2
  (        )  (        )
Warning messages:
1: In densfun(x, parm[1], parm[2], ...) : NaNs produced
2: In densfun(x, parm[1], parm[2], ...) : NaNs produced
>
> gammer.dater<-c(3,.1,17,1,0.5,1.3,.01)
> fitdistr(x=gammer.dater,"gamma")
     shape        rate
  (        )  (        )

We will get more use out of these functions as the course progresses.

Writing simulation programs in R

We have already done a great deal of simulation with individual distributions in R; here we will focus on putting things together into more complicated analyses, and on efficiency.
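As a small taste of what this looks like (a minimal sketch with arbitrary parameter values), we can pass randomly generated values from one distribution to another, for example simulating a count of offspring from a Poisson and then the number surviving from a Binomial:

#sketch: passing random values between distributions (offspring and survival)
set.seed(1)
n_pairs<-100
lambda<-3                                       #mean number of offspring per pair
p_survive<-0.6                                  #probability each offspring survives
offspring<-rpois(n_pairs,lambda)                #Poisson number of offspring
survivors<-rbinom(n_pairs,offspring,p_survive)  #Binomial survival given the offspring count
mean(survivors)                                 #close to lambda*p_survive = 1.8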


Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Solutions Thursday, September 19 What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Due Thursday, September 19, in class What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

More information

Lecture 4: Training a Classifier

Lecture 4: Training a Classifier Lecture 4: Training a Classifier Roger Grosse 1 Introduction Now that we ve defined what binary classification is, let s actually train a classifier. We ll approach this problem in much the same way as

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

A Primer on Statistical Inference using Maximum Likelihood

A Primer on Statistical Inference using Maximum Likelihood A Primer on Statistical Inference using Maximum Likelihood November 3, 2017 1 Inference via Maximum Likelihood Statistical inference is the process of using observed data to estimate features of the population.

More information

Univariate Normal Distribution; GLM with the Univariate Normal; Least Squares Estimation

Univariate Normal Distribution; GLM with the Univariate Normal; Least Squares Estimation Univariate Normal Distribution; GLM with the Univariate Normal; Least Squares Estimation PRE 905: Multivariate Analysis Spring 2014 Lecture 4 Today s Class The building blocks: The basics of mathematical

More information

Chapter 4: An Introduction to Probability and Statistics

Chapter 4: An Introduction to Probability and Statistics Chapter 4: An Introduction to Probability and Statistics 4. Probability The simplest kinds of probabilities to understand are reflected in everyday ideas like these: (i) if you toss a coin, the probability

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Matematisk statistik allmän kurs, MASA01:A, HT-15 Laborationer

Matematisk statistik allmän kurs, MASA01:A, HT-15 Laborationer Lunds universitet Matematikcentrum Matematisk statistik Matematisk statistik allmän kurs, MASA01:A, HT-15 Laborationer General information on labs During the rst half of the course MASA01 we will have

More information

Exponential Families

Exponential Families Exponential Families David M. Blei 1 Introduction We discuss the exponential family, a very flexible family of distributions. Most distributions that you have heard of are in the exponential family. Bernoulli,

More information

STT 315 Problem Set #3

STT 315 Problem Set #3 1. A student is asked to calculate the probability that x = 3.5 when x is chosen from a normal distribution with the following parameters: mean=3, sd=5. To calculate the answer, he uses this command: >

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

Likelihood and Bayesian Inference for Proportions

Likelihood and Bayesian Inference for Proportions Likelihood and Bayesian Inference for Proportions September 9, 2009 Readings Hoff Chapter 3 Likelihood and Bayesian Inferencefor Proportions p.1/21 Giardia In a New Zealand research program on human health

More information

8.5 Taylor Polynomials and Taylor Series

8.5 Taylor Polynomials and Taylor Series 8.5. TAYLOR POLYNOMIALS AND TAYLOR SERIES 50 8.5 Taylor Polynomials and Taylor Series Motivating Questions In this section, we strive to understand the ideas generated by the following important questions:

More information

This exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text.

This exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text. TEST #3 STA 5326 December 4, 214 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. (You will have access to

More information

Robustness and Distribution Assumptions

Robustness and Distribution Assumptions Chapter 1 Robustness and Distribution Assumptions 1.1 Introduction In statistics, one often works with model assumptions, i.e., one assumes that data follow a certain model. Then one makes use of methodology

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

DIFFERENTIAL EQUATIONS

DIFFERENTIAL EQUATIONS DIFFERENTIAL EQUATIONS Basic Concepts Paul Dawkins Table of Contents Preface... Basic Concepts... 1 Introduction... 1 Definitions... Direction Fields... 8 Final Thoughts...19 007 Paul Dawkins i http://tutorial.math.lamar.edu/terms.aspx

More information

Review of Discrete Probability (contd.)

Review of Discrete Probability (contd.) Stat 504, Lecture 2 1 Review of Discrete Probability (contd.) Overview of probability and inference Probability Data generating process Observed data Inference The basic problem we study in probability:

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

Eco517 Fall 2004 C. Sims MIDTERM EXAM

Eco517 Fall 2004 C. Sims MIDTERM EXAM Eco517 Fall 2004 C. Sims MIDTERM EXAM Answer all four questions. Each is worth 23 points. Do not devote disproportionate time to any one question unless you have answered all the others. (1) We are considering

More information

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices.

Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. 1. What is the difference between a deterministic model and a probabilistic model? (Two or three sentences only). 2. What is the

More information

MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION

MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION THOMAS MAILUND Machine learning means different things to different people, and there is no general agreed upon core set of algorithms that must be

More information

Distribution Fitting (Censored Data)

Distribution Fitting (Censored Data) Distribution Fitting (Censored Data) Summary... 1 Data Input... 2 Analysis Summary... 3 Analysis Options... 4 Goodness-of-Fit Tests... 6 Frequency Histogram... 8 Comparison of Alternative Distributions...

More information

Week 2: Review of probability and statistics

Week 2: Review of probability and statistics Week 2: Review of probability and statistics Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED

More information

Varieties of Count Data

Varieties of Count Data CHAPTER 1 Varieties of Count Data SOME POINTS OF DISCUSSION What are counts? What are count data? What is a linear statistical model? What is the relationship between a probability distribution function

More information

DS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling

DS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling DS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling Due: Tuesday, May 10, 2016, at 6pm (Submit via NYU Classes) Instructions: Your answers to the questions below, including

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Lecture No. # 33 Probability Models using Gamma and Extreme Value

More information

Practical Algebra. A Step-by-step Approach. Brought to you by Softmath, producers of Algebrator Software

Practical Algebra. A Step-by-step Approach. Brought to you by Softmath, producers of Algebrator Software Practical Algebra A Step-by-step Approach Brought to you by Softmath, producers of Algebrator Software 2 Algebra e-book Table of Contents Chapter 1 Algebraic expressions 5 1 Collecting... like terms 5

More information

Introduction to Maximum Likelihood Estimation

Introduction to Maximum Likelihood Estimation Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:

More information

Estimation of Quantiles

Estimation of Quantiles 9 Estimation of Quantiles The notion of quantiles was introduced in Section 3.2: recall that a quantile x α for an r.v. X is a constant such that P(X x α )=1 α. (9.1) In this chapter we examine quantiles

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006 Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312 This draft, April 20, 2006 1 1 A Brief Review of Hypothesis Testing and Its Uses values and pure significance tests (R.A. Fisher)

More information

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Questions?! C. Porciani! Estimation & forecasting! 2! Cosmological parameters! A branch of modern cosmological research focuses

More information

Just Enough Likelihood

Just Enough Likelihood Just Enough Likelihood Alan R. Rogers September 2, 2013 1. Introduction Statisticians have developed several methods for comparing hypotheses and for estimating parameters from data. Of these, the method

More information

Human-Oriented Robotics. Probability Refresher. Kai Arras Social Robotics Lab, University of Freiburg Winter term 2014/2015

Human-Oriented Robotics. Probability Refresher. Kai Arras Social Robotics Lab, University of Freiburg Winter term 2014/2015 Probability Refresher Kai Arras, University of Freiburg Winter term 2014/2015 Probability Refresher Introduction to Probability Random variables Joint distribution Marginalization Conditional probability

More information

Some general observations.

Some general observations. Modeling and analyzing data from computer experiments. Some general observations. 1. For simplicity, I assume that all factors (inputs) x1, x2,, xd are quantitative. 2. Because the code always produces

More information

Week 1 Quantitative Analysis of Financial Markets Distributions A

Week 1 Quantitative Analysis of Financial Markets Distributions A Week 1 Quantitative Analysis of Financial Markets Distributions A Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 October

More information

CPSC 340: Machine Learning and Data Mining

CPSC 340: Machine Learning and Data Mining CPSC 340: Machine Learning and Data Mining MLE and MAP Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. 1 Admin Assignment 4: Due tonight. Assignment 5: Will be released

More information

Mixture distributions in Exams MLC/3L and C/4

Mixture distributions in Exams MLC/3L and C/4 Making sense of... Mixture distributions in Exams MLC/3L and C/4 James W. Daniel Jim Daniel s Actuarial Seminars www.actuarialseminars.com February 1, 2012 c Copyright 2012 by James W. Daniel; reproduction

More information

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain 0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher

More information

3.4 Complex Zeros and the Fundamental Theorem of Algebra

3.4 Complex Zeros and the Fundamental Theorem of Algebra 86 Polynomial Functions 3.4 Complex Zeros and the Fundamental Theorem of Algebra In Section 3.3, we were focused on finding the real zeros of a polynomial function. In this section, we expand our horizons

More information

Fourier and Stats / Astro Stats and Measurement : Stats Notes

Fourier and Stats / Astro Stats and Measurement : Stats Notes Fourier and Stats / Astro Stats and Measurement : Stats Notes Andy Lawrence, University of Edinburgh Autumn 2013 1 Probabilities, distributions, and errors Laplace once said Probability theory is nothing

More information

Introduction to Machine Learning. Lecture 2

Introduction to Machine Learning. Lecture 2 Introduction to Machine Learning Lecturer: Eran Halperin Lecture 2 Fall Semester Scribe: Yishay Mansour Some of the material was not presented in class (and is marked with a side line) and is given for

More information

Topic 17: Simple Hypotheses

Topic 17: Simple Hypotheses Topic 17: November, 2011 1 Overview and Terminology Statistical hypothesis testing is designed to address the question: Do the data provide sufficient evidence to conclude that we must depart from our

More information

Inferring from data. Theory of estimators

Inferring from data. Theory of estimators Inferring from data Theory of estimators 1 Estimators Estimator is any function of the data e(x) used to provide an estimate ( a measurement ) of an unknown parameter. Because estimators are functions

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation Guy Lebanon February 19, 2011 Maximum likelihood estimation is the most popular general purpose method for obtaining estimating a distribution from a finite sample. It was

More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

Bayesian Estimation An Informal Introduction

Bayesian Estimation An Informal Introduction Mary Parker, Bayesian Estimation An Informal Introduction page 1 of 8 Bayesian Estimation An Informal Introduction Example: I take a coin out of my pocket and I want to estimate the probability of heads

More information

Likelihood and Bayesian Inference for Proportions

Likelihood and Bayesian Inference for Proportions Likelihood and Bayesian Inference for Proportions September 18, 2007 Readings Chapter 5 HH Likelihood and Bayesian Inferencefor Proportions p. 1/24 Giardia In a New Zealand research program on human health

More information

Quantitative Understanding in Biology 1.7 Bayesian Methods

Quantitative Understanding in Biology 1.7 Bayesian Methods Quantitative Understanding in Biology 1.7 Bayesian Methods Jason Banfelder October 25th, 2018 1 Introduction So far, most of the methods we ve looked at fall under the heading of classical, or frequentist

More information

Statistical Models. David M. Blei Columbia University. October 14, 2014

Statistical Models. David M. Blei Columbia University. October 14, 2014 Statistical Models David M. Blei Columbia University October 14, 2014 We have discussed graphical models. Graphical models are a formalism for representing families of probability distributions. They are

More information

Math 123, Week 2: Matrix Operations, Inverses

Math 123, Week 2: Matrix Operations, Inverses Math 23, Week 2: Matrix Operations, Inverses Section : Matrices We have introduced ourselves to the grid-like coefficient matrix when performing Gaussian elimination We now formally define general matrices

More information

Probability theory and inference statistics! Dr. Paola Grosso! SNE research group!! (preferred!)!!

Probability theory and inference statistics! Dr. Paola Grosso! SNE research group!!  (preferred!)!! Probability theory and inference statistics Dr. Paola Grosso SNE research group p.grosso@uva.nl paola.grosso@os3.nl (preferred) Roadmap Lecture 1: Monday Sep. 22nd Collecting data Presenting data Descriptive

More information

Probability. Table of contents

Probability. Table of contents Probability Table of contents 1. Important definitions 2. Distributions 3. Discrete distributions 4. Continuous distributions 5. The Normal distribution 6. Multivariate random variables 7. Other continuous

More information

STA 2201/442 Assignment 2

STA 2201/442 Assignment 2 STA 2201/442 Assignment 2 1. This is about how to simulate from a continuous univariate distribution. Let the random variable X have a continuous distribution with density f X (x) and cumulative distribution

More information

Lecture 1: Probability Fundamentals

Lecture 1: Probability Fundamentals Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability

More information

Discrete probability distributions

Discrete probability distributions Discrete probability s BSAD 30 Dave Novak Fall 08 Source: Anderson et al., 05 Quantitative Methods for Business th edition some slides are directly from J. Loucks 03 Cengage Learning Covered so far Chapter

More information

Expectation, Variance and Standard Deviation for Continuous Random Variables Class 6, Jeremy Orloff and Jonathan Bloom

Expectation, Variance and Standard Deviation for Continuous Random Variables Class 6, Jeremy Orloff and Jonathan Bloom Expectation, Variance and Standard Deviation for Continuous Random Variables Class 6, 8.5 Jeremy Orloff and Jonathan Bloom Learning Goals. Be able to compute and interpret expectation, variance, and standard

More information

Common ontinuous random variables

Common ontinuous random variables Common ontinuous random variables CE 311S Earlier, we saw a number of distribution families Binomial Negative binomial Hypergeometric Poisson These were useful because they represented common situations:

More information

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 41 Pulse Code Modulation (PCM) So, if you remember we have been talking

More information

HOMEWORK #4: LOGISTIC REGRESSION

HOMEWORK #4: LOGISTIC REGRESSION HOMEWORK #4: LOGISTIC REGRESSION Probabilistic Learning: Theory and Algorithms CS 274A, Winter 2019 Due: 11am Monday, February 25th, 2019 Submit scan of plots/written responses to Gradebook; submit your

More information