
Exact Statistical Inference in Parametric Models

Audun Sektnan
December 2016

Specialization Project
Department of Mathematical Sciences, Norwegian University of Science and Technology
Supervisor: Professor Bo Henry Lindqvist


Summary

In this report we look at ways to construct exact statistical tests for parametric models, by generating samples with certain properties. A number of algorithms are implemented in R; most of the work replicates results from either Lindqvist and Taraldsen (2007) or Lockhart et al. (2007). The generated samples are verified against calculated theoretical results, or by comparison with results from other algorithms, whenever this is possible. Some examples are listed for each algorithm. These are later used in goodness-of-fit tests, for evaluating whether or not a certain data set fits well with either the exponential distribution or the gamma distribution.


Contents

Summary
1 Introduction
  1.1 Sufficiency
  1.2 Generation of co-sufficient samples
  1.3 Exponential distribution, using Algorithm 1
  1.4 Verification
  1.5 Truncated exponential distribution
  1.6 Conditions for using Algorithm 1
  1.7 Examples
2 Algorithm 2
  2.1 Jeffreys prior
  2.2 Truncated exponential distribution, using Algorithm 2
  2.3 Convergence diagnostics
  2.4 Verification - Gibbs algorithm
  2.5 Verification - Exact distribution
3 Gibbs sampler - Gamma distribution
  3.1 Transformation of variables
  3.2 Generation of new value
  3.3 Algorithm - Gibbs sampler
  3.4 Verification of rejection step
  3.5 Examples
  3.6 Convergence diagnostics
4 Application: Goodness-Of-Fit Tests
  4.1 Kolmogorov-Smirnov
  4.2 Cramer-von Mises criterion
  4.3 Other tests
  4.4 Examples
Bibliography

Chapter 1

Introduction

One important part of statistical inference is testing whether or not a random sample comes from a specific probability distribution. Such goodness-of-fit (gof) tests on parametric distributions are often based on the normal distribution and are asymptotic, or certain parameters in the distributions have to be specified, as in the Kolmogorov-Smirnov test. Here we will look at ways to formulate exact tests for certain parametric models, in which there is no need to make any inference on the values of the parameters in the distribution. An important statistical concept related to this is that of sufficiency.

1.1 Sufficiency

Consider a random sample $X = (X_1, X_2, \ldots, X_n)$ from a known probability distribution $f_X(x;\theta)$ with an unknown parameter $\theta$, possibly multidimensional. If the goal is to estimate $\theta$ or some function $g(\theta)$, one would calculate some statistic $T(x)$ as an estimator, using the realization $x = (x_1, x_2, \ldots, x_n)$. These estimates might be equal for different realizations $x$ and $y$, and one might wonder if it is possible to summarize the data from the sample $x$ in such a way that no useful information is lost. This is the concept of sufficient statistics, which is defined as follows (Casella and Berger, 2002):

Definition 1.1 A statistic $T(X)$ is a sufficient statistic for $\theta$ if the conditional distribution of the sample $X$ given the value of $T(X)$ does not depend on $\theta$.

From this it follows that any statistic calculated for estimating a function $g(\theta)$ only depends on the value of the sufficient statistic.

1.2 Generation of co-sufficient samples

The term co-sufficient samples (conditional-sufficient samples) is taken from Lockhart et al. (2007) (from now on referred to as Loc07 for simplicity), and refers to samples drawn from the conditional distribution of $X = (X_1, \ldots, X_n)$ given the value of the sufficient statistic $T$:
$$f_{X \mid T}(x \mid T = t; \theta) = f_{X \mid T}(x \mid T = t),$$
assuming that $X$ has a certain probability distribution $f_X(x;\theta)$. This conditional distribution does not depend on the parameter $\theta$, which follows from the definition of a sufficient statistic. Generation of such samples is in general not straightforward. The paper Lindqvist and Taraldsen (2007) (from here on referred to as Lin07 for simplicity) discusses three different algorithms for generating co-sufficient samples, which we will follow here. The first one, Algorithm 1, is rather simple, but works only when certain conditions are fulfilled, which will be discussed later. The algorithm is as follows:

Algorithm 1

Input: A random sample $x = (x_1, \ldots, x_n)$. Requires that you can generate a random vector $U = (U_1, \ldots, U_n)$ from a known density $f_U(u)$, and that you have functions $\chi(u,\theta)$ and $\tau(u,\theta)$ such that $(\chi(U,\theta), \tau(U,\theta)) \sim (X, T)$ when $\theta$ is known.
1: Calculate the value of the sufficient statistic $t$ from the random sample $x = (x_1, \ldots, x_n)$.
2: Draw $U = (U_1, \ldots, U_n)$ from the density $f_U(u)$.
3: Solve the equation $\tau(U,\theta) = t$ for $\theta$; denote the unique solution as $\hat{\theta}(U, t)$.
4: Return $X_t(U) = \chi\!\left(U, \hat{\theta}(U, t)\right)$.

In general it is difficult to find the exact theoretical distribution for the co-sufficient samples. The joint distribution, conditioned on the value of $T$, will be of the form
$$f_{X_1,\ldots,X_n \mid T}(x_1, \ldots, x_n \mid t) = \frac{f_{X_1,\ldots,X_n, T}(x_1, \ldots, x_n, t)}{f_T(t)}, \qquad (1.1)$$
which in most cases will be a complicated expression. The marginal distributions $f_{X_i \mid T}(x_i \mid t)$ can be found by integrating out the rest of the variables $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$. These marginal distributions are the same for all $i$, so when we later check whether the algorithms produce samples from the correct distribution, we can look at the histogram of the entire sample, not just the component $x_i$.

1.3 Exponential distribution, using Algorithm 1

Assume that a random sample $X = (X_1, X_2, \ldots, X_n)$ comes from an exponential distribution, defined as
$$f_X(x;\theta) = \theta e^{-\theta x}, \quad x > 0, \qquad (1.2)$$

where the parameter $\theta$ is not known. A sufficient statistic for the exponential distribution is just the sum
$$T(X) = \sum_{i=1}^{n} X_i. \qquad (1.3)$$
This can be shown using the factorization theorem. If $U_i \sim \mathrm{Exp}(1)$ independent and identically distributed (iid) for all $i = 1, \ldots, n$, then the functions
$$\chi(U,\theta) = \left( \frac{U_1}{\theta}, \ldots, \frac{U_n}{\theta} \right), \qquad \tau(U,\theta) = \frac{\sum_{i=1}^{n} U_i}{\theta}$$
will have the correct distribution $(\chi(U,\theta), \tau(U,\theta)) \sim (X, T)$, according to Lin07. Algorithm 1 can in this case be written as follows:

Algorithm 1 - Exponential distribution

Input: A random sample $x = (x_1, \ldots, x_n)$.
1: Calculate $t = \sum_{i=1}^{n} x_i$.
2: Draw $U = (U_1, \ldots, U_n)$, where $U_i \sim \mathrm{Exp}(1)$ for $i = 1, \ldots, n$.
3: Calculate $\hat{\theta}(u, t) = \sum_{i=1}^{n} u_i / t$.
4: Return $X_t(u) = \left( u_1 / \hat{\theta}, \ldots, u_n / \hat{\theta} \right)$.
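A minimal R sketch of this algorithm might look as follows; the function name and the example parameters are our own choices, not part of the report.

```r
# Algorithm 1 for the exponential distribution: one co-sufficient sample.
# 'x' is the observed sample; the output has the same sum as x.
cosuff_exp <- function(x) {
  n <- length(x)
  t <- sum(x)                      # step 1: sufficient statistic
  u <- rexp(n, rate = 1)           # step 2: U_i ~ Exp(1)
  theta_hat <- sum(u) / t          # step 3: solve tau(u, theta) = t
  u / theta_hat                    # step 4: chi(u, theta_hat)
}

# Example: a sample of size 5000 from Exp(5), as in Example 1.1 below
set.seed(1)
x <- rexp(5000, rate = 5)
x_star <- cosuff_exp(x)
all.equal(sum(x_star), sum(x))     # the sufficient statistic is preserved
```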

Next we try out this algorithm using three different input samples.

Example 1.1 First a sample of size 5000 is drawn from an exponential distribution with parameter $\theta = 5$. The histogram from this sample is shown to the left in Figure 1.1, with the histogram for the co-sufficient sample to the right. This co-sufficient sample is from a conditional distribution that is not exponential, but since $n$ is so large, this distribution is close to the distribution of the original sample.

Figure 1.1: Original sample from an exponential distribution with $\theta = 5$ (left), and one co-sufficient sample assuming an exponential distribution (right).

Example 1.2 A sample of size 5000 is drawn from a uniform distribution between 0 and 1. The histogram from this sample is shown in Figure 1.2, as well as the histogram of one co-sufficient sample. Although the generated sample has the same value of the sufficient statistic $T$, it is easy to see from the histograms that the original sample did not come from an exponential distribution.

Figure 1.2: Original sample from a uniform distribution between 0 and 1 (left), and one co-sufficient sample assuming an exponential distribution (right).

Example 1.3 Next, the original sample is from a lognormal distribution with location parameter $\mu = 0.5$ and scale parameter $\sigma = 0.5$. The histogram from this sample is shown in Figure 1.3, as well as the histogram of one co-sufficient sample. Here the histograms are more similar, but it is clear that the mode of the lognormal distribution is not at $x = 0$, as appears to be the case for the distribution of the co-sufficient sample.

Figure 1.3: Original sample from a lognormal distribution with $(\mu, \sigma) = (0.5, 0.5)$ (left), and one co-sufficient sample assuming an exponential distribution (right).

1.4 Verification

How do we know that the co-sufficient samples generated in Examples 1.1 to 1.3 are correct? In the case of the exponential distribution, it turns out that it is possible to calculate the joint conditional distribution given by equation (1.1), and also the marginal distributions. The sufficient statistic is a sum of $n$ iid exponentially distributed variables with rate parameter $\theta$, and is therefore gamma-distributed with probability density
$$f_T(t) = \mathrm{Gamma}(n, \theta) = \frac{\theta^n}{\Gamma(n)} t^{n-1} e^{-\theta t}, \quad t > 0.$$

Here the gamma distribution is parametrized with the rate parameter instead of the scale parameter. Now we can write
$$f_{X_1,\ldots,X_n, T}(x_1, \ldots, x_n, t) = f_{T \mid X_1,\ldots,X_n}(t \mid x_1, \ldots, x_n)\, f_{X_1,\ldots,X_n}(x_1, \ldots, x_n).$$
Next,
$$f_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_{X_i}(x_i) = \prod_{i=1}^{n} \theta e^{-\theta x_i} = \theta^n e^{-\theta \sum_{i=1}^{n} x_i},$$
for $x_1, \ldots, x_n > 0$, and zero elsewhere. Next,
$$f_{T \mid X_1,\ldots,X_n}(t \mid x_1, \ldots, x_n) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} x_i = t \\ 0 & \text{else.} \end{cases}$$
Hence, putting these results into equation (1.1) and cancelling some factors, we have that the joint conditional distribution can be written as
$$f_{X_1,\ldots,X_n \mid T}(x_1, \ldots, x_n \mid t) = \begin{cases} \dfrac{\Gamma(n)}{t^{n-1}} & \text{if } \sum_{i=1}^{n} x_i = t \\ 0 & \text{else.} \end{cases}$$
We are interested in the marginal distributions, to test if Algorithm 1 works for the exponential distribution. The marginal distribution of $x_1$, conditioned on $t$, is found by integrating out

$x_2, \ldots, x_n$. This can be done as follows:
$$\begin{aligned}
f_{X_1 \mid T}(x_1 \mid t) &= \frac{\Gamma(n)}{t^{n-1}} \int_0^{t-x_1} \int_0^{t-x_1-x_2} \cdots \int_0^{t-x_1-\cdots-x_{n-2}} \mathrm{d}x_{n-1} \cdots \mathrm{d}x_3\, \mathrm{d}x_2 \\
&= \frac{\Gamma(n)}{t^{n-1}} \int_0^{t-x_1} \cdots \int_0^{t-x_1-\cdots-x_{n-3}} \left( t - x_1 - x_2 - \cdots - x_{n-2} \right) \mathrm{d}x_{n-2} \cdots \mathrm{d}x_2 \\
&= \frac{\Gamma(n)}{t^{n-1}} \int_0^{t-x_1} \cdots \int_0^{t-x_1-\cdots-x_{n-4}} \frac{\left( t - x_1 - x_2 - \cdots - x_{n-3} \right)^2}{2} \mathrm{d}x_{n-3} \cdots \mathrm{d}x_2.
\end{aligned}$$
This pattern continues, and the final result is
$$f_{X_1 \mid T}(x_1 \mid t) = \frac{\Gamma(n)}{t^{n-1}} \frac{(t - x_1)^{n-2}}{(n-2)!},$$
which simplifies to
$$f_{X_1 \mid T}(x_1 \mid t) = \frac{n-1}{t^{n-1}} (t - x_1)^{n-2}, \quad x_1 \in [0, t],\ n \ge 2. \qquad (1.4)$$
This marginal distribution can now be used to check if Algorithm 1 produces co-sufficient samples from the correct distribution, in the case of the exponential distribution. Note that this marginal conditional distribution does not depend on the parameter $\theta$, because we are conditioning on a sufficient statistic. Figure 1.4 shows the results for three different input values of $t$ and $n$. In all three cases we see that the histograms of the generated data fit well with the theoretical distribution specified by equation (1.4). When $n = 2$ we see that the distribution is a constant function, and when $n = 3$ it is a linear function. In any case, for $n > 2$ the distribution has a maximum at $x_1 = 0$ and is a decreasing function between 0 and $t$.
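As an illustration (our own sketch, with arbitrary choices of $n$ and $t$), the generated samples can be compared with the density (1.4) by reusing the cosuff_exp function sketched above:

```r
# Check Algorithm 1 for the exponential case against equation (1.4).
set.seed(2)
n <- 5; t <- 2
x0 <- rep(t / n, n)                          # any input sample with sum t
x1_draws <- replicate(1e5, cosuff_exp(x0)[1])

hist(x1_draws, breaks = 50, freq = FALSE,
     main = "First component of co-sufficient samples", xlab = "x1")
curve((n - 1) / t^(n - 1) * (t - x)^(n - 2), from = 0, to = t,
      col = "blue", add = TRUE)              # theoretical marginal (1.4)
```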

Figure 1.4: Histograms of $x_1$ for three different values of $n$, using Algorithm 1 on the exponential distribution, as well as the corresponding theoretical densities.

1.5 Truncated exponential distribution

Assume that a random sample $X = (X_1, X_2, \ldots, X_n)$ comes from a truncated exponential distribution, defined as
$$f_X(x;\theta) = \begin{cases} \dfrac{\theta e^{\theta x}}{e^{\theta} - 1} & \text{if } \theta \in \mathbb{R} \setminus \{0\} \\ 1 & \text{if } \theta = 0, \end{cases} \quad \text{for } 0 \le x \le 1, \qquad (1.5)$$
where the parameter $\theta \in \mathbb{R}$ is not known. This distribution is obtained by truncating the exponential distribution at $x = 1$, and is actually a valid probability distribution for all $\theta \in \mathbb{R}$, not just for positive values (this distribution is used in Lin07). A sufficient statistic for the truncated exponential distribution is the sum
$$T(X) = \sum_{i=1}^{n} X_i, \qquad (1.6)$$
just as in the case of an exponential sample. This can easily be seen in the same way as for the exponential distribution. To use Algorithm 1 we need to generate a random vector $U$ and find the functions $\chi(u,\theta)$ and $\tau(u,\theta)$ such that $(\chi(U,\theta), \tau(U,\theta)) \sim (X, T)$ when $\theta$ is known. This can be done by inversion, because the cumulative distribution function for a random variable from

the truncated exponential distribution is
$$F_X(x;\theta) = \begin{cases} \dfrac{1 - e^{\theta x}}{1 - e^{\theta}} & \text{if } \theta \in \mathbb{R} \setminus \{0\} \\ x & \text{if } \theta = 0, \end{cases} \quad \text{for } 0 \le x \le 1.$$
Solving $u = F_X(x;\theta)$ for $x$, assuming $\theta \ne 0$, leads to
$$x = \frac{\ln\!\left( 1 + \left( e^{\theta} - 1 \right) u \right)}{\theta}.$$
So, if $U_i \sim \mathrm{Unif}[0,1]$, then this leads to $x_i \sim f_X(x;\theta)$ as defined in equation (1.5). We can therefore choose the functions
$$\chi(U,\theta) = \left( \frac{\ln\!\left( 1 + \left( e^{\theta} - 1 \right) U_1 \right)}{\theta}, \ldots, \frac{\ln\!\left( 1 + \left( e^{\theta} - 1 \right) U_n \right)}{\theta} \right), \qquad \tau(U,\theta) = \sum_{i=1}^{n} \frac{\ln\!\left( 1 + \left( e^{\theta} - 1 \right) U_i \right)}{\theta}.$$
We can now use Algorithm 1 on the truncated exponential distribution. Solving for $\hat{\theta}$ in step 3 is done numerically, using the uniroot function in R.

Algorithm 1 - Truncated exponential distribution

Input: A random sample $x = (x_1, \ldots, x_n)$.
1: Calculate $t = \sum_{i=1}^{n} x_i$ from the random sample $x = (x_1, \ldots, x_n)$.
2: Draw $U = (U_1, \ldots, U_n)$, where $U_i \sim \mathrm{Unif}[0,1]$ for $i = 1, \ldots, n$.
3: Solve
$$\sum_{i=1}^{n} \frac{\ln\!\left( 1 + \left( e^{\theta} - 1 \right) u_i \right)}{\theta} = t.$$
This must be done numerically. Denote the unique solution as $\hat{\theta}(U, t)$.
4: Return
$$X_t(u) = \left( \frac{\ln\!\left( 1 + \left( e^{\hat{\theta}} - 1 \right) u_1 \right)}{\hat{\theta}}, \ldots, \frac{\ln\!\left( 1 + \left( e^{\hat{\theta}} - 1 \right) u_n \right)}{\hat{\theta}} \right).$$

In the truncated exponential case, it is known that the distribution of a co-sufficient sample of size $n$ is the distribution of $n$ independent uniformly distributed variables between 0 and 1, given their sum. The reason for this is as follows: the conditional distribution of $(X_1, \ldots, X_n) \mid (T = t)$ should be independent of $\theta$, because this is what defines a sufficient statistic. It follows that we can choose the parameter value $\theta = 0$ in equation (1.5), which means the $X_i$'s are uniformly distributed between 0 and 1. Hence, the co-sufficient sample $(X_1, \ldots, X_n) \mid (T = t)$ has the distribution of $n$ independent uniform random variables between 0 and 1, given their sum. This result can be used to verify whether the co-sufficient samples generated using Algorithm 1 have the correct distribution. The simplest case is $n = 2$, where $X_1 \mid (X_1 + X_2 = t)$ can be shown to be uniformly distributed between 0 and $t$ for $0 \le t \le 1$, and uniformly distributed between $(t - 1)$ and 1 for $1 \le t \le 2$. Similarly, $X_2$ conditioned on $T$ will have the same distribution as $X_1$ conditioned on $T$. A small R sketch of the procedure is given below, and we then test the algorithm in Example 1.4.
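A minimal R sketch of Algorithm 1 for the truncated exponential distribution is given below; the helper names, the guard near $\theta = 0$, and the search interval for uniroot are our own choices.

```r
# tau(u, theta) - t for the truncated exponential distribution
tau_minus_t <- function(theta, u, t) {
  if (abs(theta) < 1e-10) return(sum(u) - t)   # continuous limit at theta = 0
  sum(log(1 + (exp(theta) - 1) * u)) / theta - t
}

# Algorithm 1 (truncated exponential): one attempted co-sufficient sample
cosuff_truncexp <- function(x, interval = c(-50, 50)) {
  t <- sum(x)
  u <- runif(length(x))
  # step 3: solve tau(u, theta) = t numerically
  theta_hat <- uniroot(function(th) tau_minus_t(th, u, t), interval)$root
  log(1 + (exp(theta_hat) - 1) * u) / theta_hat   # step 4
}
```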

Example 1.4 Co-sufficient samples, assuming a truncated exponential distribution, are generated using Algorithm 1. The input sample is (0.5, 0.5), which is chosen to make the resulting sufficient statistic equal to 1. Hence, a histogram of $x_1$ conditioned on $t = 1$ should be approximately constant between 0 and 1, as would be the case for a uniform variable between 0 and 1. The histogram of $x_1$ from the generated samples is shown in Figure 1.5.

Figure 1.5: Histogram of $x_1$ using Algorithm 1 on the truncated exponential distribution, as well as the theoretical distribution (blue line).

Clearly the distribution of the generated co-sufficient samples is not uniform, so there must be something wrong with the procedure. As mentioned earlier, Algorithm 1 has some conditions that must hold for the generated samples to have the correct distribution, and it turns out that not all of these are fulfilled in the case of the truncated exponential distribution.

1.6 Conditions for using Algorithm 1

The following must hold in order for Algorithm 1 to work:

Uniqueness: The equation $\tau(u,\theta) = t$ has to have a unique solution $\hat{\theta}(U, t)$.

Pivotal condition: The function $\tau(u,\theta)$ depends on $u$ only through a function $r(u)$ that does not depend on $\theta$.

Independence condition: The output sample $\chi(u, \hat{\theta})$ is independent of $\tau(u,\theta)$ for some value of $\theta$.

It is clear that the pivotal condition does not hold in the case of the truncated exponential distribution, and that is why Algorithm 1 produces samples from the wrong distribution. The way to overcome this is to put a prior distribution $\pi(\theta)$ on the parameter $\theta$, which will be considered in the next chapter.

1.7 Examples

We will work with two different data sets:

Ball bearing data: Failure data for 23 ball bearings, measured in millions of revolutions to fatigue failure, gathered from the lecture notes for the course TMA4275: Lifetime Analysis at NTNU (TMA4275, 2016).

Premier League data: The number of points for the football team finishing last in the Premier League, during the period studied (Altomfotball.no, 2016).

Algorithm 1 is used to generate co-sufficient samples from these data sets, assuming an exponential distribution. The histograms are shown in Figure 1.6 and Figure 1.7. In both cases we see that the co-sufficient samples seem to have a different shape than the data. In Chapter 4 we will analyze these data sets for gof.

Figure 1.6: Histogram of the co-sufficient samples generated from the ball bearing data (left) and the histogram of the original data (right), assuming an exponential distribution.

Figure 1.7: Histogram of the co-sufficient samples generated from the Premier League data (left) and the histogram of the original data (right), assuming an exponential distribution.

Chapter 2

Algorithm 2

In this chapter we look at the extension of Algorithm 1 to the case where the conditions mentioned in the previous chapter do not hold. Algorithm 2 is described in Lin07, and here we follow the application of this algorithm to the truncated exponential distribution.

2.1 Jeffreys prior

Algorithm 2 involves choosing a prior for $\theta$, for instance Jeffreys prior. This is given as the square root of the Fisher information. The truncated exponential distribution is given by equation (1.5), so we get
$$\ln f_X(x;\theta) = \ln\theta + \theta x - \ln\!\left(e^{\theta} - 1\right).$$
Differentiating this expression twice gives
$$\frac{\partial}{\partial\theta} \ln f_X(x;\theta) = \frac{1}{\theta} + x - \frac{e^{\theta}}{e^{\theta} - 1},$$
$$\frac{\partial^2}{\partial\theta^2} \ln f_X(x;\theta) = -\frac{1}{\theta^2} + \frac{e^{\theta}}{\left(e^{\theta} - 1\right)^2},$$

and so
$$I(\theta) = -\frac{\partial^2}{\partial\theta^2} \ln f_X(x;\theta) = \frac{1}{\theta^2} - \frac{e^{\theta}}{\left(e^{\theta} - 1\right)^2}.$$
Hence, Jeffreys prior is in this case
$$\pi(\theta) = \sqrt{\frac{1}{\theta^2} - \frac{e^{\theta}}{\left(e^{\theta} - 1\right)^2}}. \qquad (2.1)$$
This can be used as a prior in Algorithm 2, described later.

2.2 Truncated exponential distribution, using Algorithm 2

The algorithm requires that you can generate a random vector $U = (U_1, \ldots, U_n)$ from a known density $f_U(u)$, and that you have functions $\chi(u,\theta)$ and $\tau(u,\theta)$ such that $(\chi(U,\theta), \tau(U,\theta)) \sim (X, T)$ when $\theta$ is known. Now the parameter is considered a random variable $\Theta$, with a chosen prior distribution $\pi(\theta)$, independent of $U$. Denote by $W_t(u)$ the density of $\tau(\Theta, u)$ evaluated at $t$. The algorithm generates a random vector $V$ from a distribution proportional to $W_t(u) f_U(u)$, in contrast to Algorithm 1, where a random vector $U$ was generated from the distribution $f_U(u)$. Solving $\tau(u,\theta) = t$ in terms of $\theta$ and denoting this solution as $\hat{\theta}(u, t)$ (it must be unique), the density of $\tau(\Theta, u)$ at $t$ can be written as
$$W_t(u) = \frac{\pi(\theta)}{\left| \det\!\left( \partial_\theta \tau(u,\theta) \right) \right|} \Bigg|_{\theta = \hat{\theta}(u,t)}. \qquad (2.2)$$
In the case of the truncated exponential distribution, Lin07 uses ordinary inversion, where $f_{U_i}(u_i)$ is the standard uniform distribution:
$$U_i \sim \mathrm{Unif}[0,1]. \qquad (2.3)$$
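For use in the algorithm below, Jeffreys prior (2.1) can be coded directly; a small sketch (the function name is our own):

```r
# Jeffreys prior (2.1) for the truncated exponential model (theta != 0).
# Evaluated naively, the expression suffers from numerical cancellation
# very close to theta = 0, where the true limit is sqrt(1/12).
jeffreys_prior <- function(theta) {
  sqrt(1 / theta^2 - exp(theta) / (exp(theta) - 1)^2)
}
```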

The functions used in the algorithm are
$$\chi(u,\theta) = \left( \frac{\log\!\left(1 + (e^{\theta} - 1)u_1\right)}{\theta}, \ldots, \frac{\log\!\left(1 + (e^{\theta} - 1)u_n\right)}{\theta} \right), \qquad \tau(u,\theta) = \sum_{i=1}^{n} \frac{\log\!\left(1 + (e^{\theta} - 1)u_i\right)}{\theta}.$$
In this case one can write
$$\partial_\theta \tau(u,\theta) \Big|_{\theta = \hat{\theta}(u,t)} = \frac{e^{\hat{\theta}(u,t)}}{\hat{\theta}(u,t)} \sum_{i=1}^{n} \frac{u_i}{1 + \left(e^{\hat{\theta}(u,t)} - 1\right) u_i} - \frac{t}{\hat{\theta}(u,t)}. \qquad (2.4)$$
The algorithm generating the $V$'s is a Markov chain Monte Carlo (MCMC) algorithm, where the proposal is $U = (U_1, \ldots, U_n)$ with $U_i \sim \mathrm{Unif}[0,1]$. This proposal is also used as initialization.

Algorithm 2

Input: A random sample $x = (x_1, \ldots, x_n)$ and the number of iterations $m$.
1: Calculate $t = \sum_{i=1}^{n} x_i$.
2: Initialize $v_1$ by drawing $n$ random variables iid from the standard uniform distribution.
For $j = 2, 3, \ldots, m$:
3: Generate a random sample $U = (U_1, \ldots, U_n)$, where $U_i \sim \mathrm{Unif}[0,1]$.
4: Solve $\sum_{i=1}^{n} \log\!\left(1 + (e^{\theta} - 1)u_i\right)/\theta = t$ numerically for $\theta$, and denote the solution $\hat{\theta}(u, t)$.
5: Calculate the ratio
$$\alpha = \frac{W_t(u) f_U(u)}{W_t(v_{j-1}) f_U(v_{j-1})},$$
using equations (2.2), (2.3) and (2.4), where $v_{j-1}$ is the sample generated at the previous iteration.
6: Draw $z \sim \mathrm{Unif}[0,1]$.
7: If $z < \alpha$ set $v_j = u$; if not, set $v_j = v_{j-1}$.
End iteration
8: Return $v = (v_1, \ldots, v_m)$.

The unique weights used are the values of the density $W_t(u)$ for each accepted sample $u$. The priors used here are Jeffreys prior, given by equation (2.1), and the simpler prior function $\pi(\theta) = 1/\theta$. The prior $1/\theta$ goes towards infinity when $\theta \to 0$, and both priors are numerically delicate near $\theta = 0$. This will affect the performance of the algorithm, and it is therefore chosen that the acceptance probability $\alpha$ is set to zero whenever $\hat{\theta}(u, t)$ or $\hat{\theta}(v_{j-1}, t)$ is less than 0.5.
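A hedged R sketch of Algorithm 2 for this model is given below. It reuses the jeffreys_prior and tau_minus_t helpers sketched earlier and solves step 4 with uniroot; the function names and numerical choices are our own, and the safeguard described above (setting $\alpha$ to zero for small $\hat{\theta}$) is left out of the sketch for brevity.

```r
# Density W_t(u) of tau(Theta, u) at t, equation (2.2): prior / |d tau / d theta|.
W_t <- function(u, t, theta_hat, prior = jeffreys_prior) {
  dtau <- exp(theta_hat) / theta_hat * sum(u / (1 + (exp(theta_hat) - 1) * u)) -
    t / theta_hat                                   # equation (2.4)
  prior(theta_hat) / abs(dtau)
}

# Algorithm 2: independence Metropolis-Hastings with Unif(0,1)^n proposals.
algorithm2 <- function(x, m, prior = jeffreys_prior) {
  n <- length(x); t <- sum(x)
  v <- matrix(NA, nrow = m, ncol = n)
  v[1, ] <- runif(n)                                # step 2: initialization
  theta_v <- uniroot(function(th) tau_minus_t(th, v[1, ], t), c(-50, 49))$root
  w_v <- W_t(v[1, ], t, theta_v, prior)
  for (j in 2:m) {
    u <- runif(n)                                   # step 3: proposal
    theta_u <- uniroot(function(th) tau_minus_t(th, u, t), c(-50, 49))$root
    w_u <- W_t(u, t, theta_u, prior)
    if (runif(1) < w_u / w_v) {                     # steps 5-7 (f_U terms cancel)
      v[j, ] <- u; w_v <- w_u
    } else {
      v[j, ] <- v[j - 1, ]
    }
  }
  v
}
```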

Figure 2.1 shows the distribution of $x_1$ from the co-sufficient samples, in the case when the input sample is (0.5, 0.5) and the prior distribution is $1/\theta$.

Figure 2.1: Histogram of co-sufficient samples generated using Algorithm 2 (left), and the histogram of unique weights, using $1/\theta$ as prior (right). Sample size $n = 2$ and sum $t = 1$.

Figure 2.2 shows the distribution of $x_1$ from the co-sufficient samples, in the case when the input sample is (0.5, 0.5) and the prior distribution on $\theta$ is Jeffreys prior. We observe that the weights are more spread out in the case of $\pi(\theta) = 1/\theta$ than in the case of Jeffreys prior. In both cases, however, we see that the histogram of $x_1$ looks to be distributed correctly for this input sample, and that there does not seem to be a big difference in the rate of convergence. The same number of iterations $m$ is used in both cases.

Figure 2.2: Histogram of co-sufficient samples generated using Algorithm 2 (left), and the histogram of unique weights, using Jeffreys prior (right). Sample size $n = 2$ and sum $t = 1$.

2.3 Convergence diagnostics

Algorithm 2 is an MCMC algorithm that will generate samples that are correlated. To analyze whether the algorithm converges and how the data are correlated, we plot the trace of the first 1000 values of $x_1$ and the sample autocorrelation function (acf) of both $x_1$ and $x_2$ from the generated

samples in the previous section, using Jeffreys prior. This is shown in Figure 2.3. The acfs appear to decay exponentially, but go very slowly towards zero, meaning that there is a high degree of correlation in the co-sufficient samples. The trace plot looks reasonable, and the acceptance rate was in this case 63.3%, so we can conclude that Algorithm 2 seems to converge in a decent way.

2.4 Verification - Gibbs algorithm

The distribution of the co-sufficient samples for the truncated exponential distribution is, as shown earlier, that of $n$ independent uniformly distributed random variables between 0 and 1, given their sum. It turns out that samples from this distribution can be generated using a Gibbs algorithm as follows (taken from Lindqvist and Rannestad (2011)).

Figure 2.3: Trace plot of the first 1000 iterations and acfs for the generated samples using Algorithm 2 on the truncated exponential distribution.

Algorithm - Gibbs algorithm

Input: A random sample $x = (x_1, \ldots, x_n)$ and a chosen number of iterations $M$.
1: Calculate the sum of the original sample, $t = \sum_{i=1}^{n} x_i$.
2: Initialize $X_i^0 = t/n$, so that all the sample points have the same value.
Iterate $m$ from 1 to $M$:
3: Draw two integers $i < j$ from $\{1, \ldots, n\}$, and compute $a = X_i^m + X_j^m$.
4: If $a \le 1$, draw $X_i^{m+1} \sim \mathrm{Unif}[0, a]$; if not, draw $X_i^{m+1} \sim \mathrm{Unif}[a - 1, 1]$.
5: Calculate $X_j^{m+1} = a - X_i^{m+1}$. Set the remaining $n - 2$ points of $X^{m+1}$ equal to those of $X^m$.
End iteration

A minimal R sketch of this algorithm is given below. The algorithm can be used to verify whether Algorithm 2 works for the truncated exponential distribution, also in the general case of a sample of size $n$. Figure 2.4 compares histograms of co-sufficient samples generated using Algorithm 2 and the Gibbs algorithm, for three different values of $n$ and $t$, using the same number of generations for both methods. From this it seems that Algorithm 2 is working properly.
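The sketch below is our own implementation of the steps above; the function name is an assumption.

```r
# Gibbs algorithm for n independent Unif(0,1) variables conditioned on their sum.
# Returns an M x n matrix of (correlated) co-sufficient samples.
gibbs_uniform_sum <- function(x, M) {
  n <- length(x); t <- sum(x)
  X <- rep(t / n, n)                       # step 2: start with equal values
  out <- matrix(NA, nrow = M, ncol = n)
  for (m in 1:M) {
    ij <- sort(sample(n, 2))               # step 3: pick i < j
    a <- X[ij[1]] + X[ij[2]]
    xi <- if (a <= 1) runif(1, 0, a) else runif(1, a - 1, 1)   # step 4
    X[ij[1]] <- xi
    X[ij[2]] <- a - xi                     # step 5
    out[m, ] <- X
  }
  out
}
```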

2.5 Verification - Exact distribution

In Chapter 1 we calculated the theoretical distribution of the co-sufficient samples. We try the same method here, but limit ourselves to the cases $n = 2$ and $n = 3$, because the distribution gets quite difficult to calculate for higher $n$. First the parameter $\theta$ is chosen to be zero. We can do this because the co-sufficient samples have a distribution that is independent of $\theta$, so one can choose any value for this parameter. This means that we are looking for the distribution of $n$ independent random variables uniformly distributed between 0 and 1, given their sum. Following the procedure used for the exponential case in Chapter 1, we first note that the distribution of the sufficient statistic is (WolframMathWorld, 2016)
$$f_T(t) = \frac{1}{2(n-1)!} \sum_{k=0}^{n} (-1)^k \binom{n}{k} (t - k)^{n-1} \operatorname{sgn}(t - k).$$
Hence, the joint conditional distribution can be written as
$$f_{X_1,\ldots,X_n \mid T}(x_1, \ldots, x_n \mid t) = \begin{cases} \dfrac{2(n-1)!}{\sum_{k=0}^{n} (-1)^k \binom{n}{k} (t - k)^{n-1} \operatorname{sgn}(t - k)} & \text{if } \sum_{i=1}^{n} x_i = t \text{ and } 0 \le x_i \le 1,\ i = 1, \ldots, n \\ 0 & \text{else.} \end{cases}$$
To find the marginal distribution of $X_1$ given $T$, we need to integrate out $x_2, \ldots, x_n$. It turns out that this is more difficult than in the case of the exponential distribution, because of the restriction $0 \le x_i \le 1$, $i = 1, \ldots, n$. Therefore we only look at the case $n = 3$, where we get
$$f_{X_1 \mid T}(x_1 \mid t) = \frac{2(n-1)!}{\sum_{k=0}^{n} (-1)^k \binom{n}{k} (t - k)^{n-1} \operatorname{sgn}(t - k)} \int_{\max\{0,\, t - x_1 - 1\}}^{\min\{1,\, t - x_1\}} \mathrm{d}x_2,$$
where the limits follow from the restrictions. Solving the integral and inserting $n = 3$, we get
$$f_{X_1 \mid T}(x_1 \mid t) = \frac{4 \left( \min\{1, t - x_1\} - \max\{0, t - x_1 - 1\} \right)}{\sum_{k=0}^{3} (-1)^k \binom{3}{k} (t - k)^2 \operatorname{sgn}(t - k)}. \qquad (2.5)$$

This marginal distribution can now be used to test Algorithm 2 for the truncated exponential distribution in the case $n = 3$. This is done for two values of the sufficient statistic, $t = 1.5$ and $t = 2.5$, and the histograms of the generated samples, compared with the theoretical density function from equation (2.5), are shown in Figure 2.5. The same number of generations is used in both cases, and we see that the generated samples seem to be correctly distributed, at least for $n = 3$.
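For reference, the density (2.5) is easy to code; the small sketch below (function name and vectorization are our own choices) can be overlaid on histograms such as those in Figure 2.5.

```r
# Marginal density (2.5) of X1 given T = t, for n = 3 uniforms on (0,1).
dmarg_n3 <- function(x1, t) {
  k <- 0:3
  denom <- sum((-1)^k * choose(3, k) * (t - k)^2 * sign(t - k))
  num <- 4 * (pmin(1, t - x1) - pmax(0, t - x1 - 1))
  ifelse(x1 >= 0 & x1 <= 1 & num > 0, num / denom, 0)
}

curve(dmarg_n3(x, t = 1.5), from = 0, to = 1)   # compare with Figure 2.5 (left)
```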

Figure 2.4: Histogram of the co-sufficient samples using Algorithm 2 (left), compared to the samples obtained by using the Gibbs algorithm (right), for three different values of $n$ and $t$ (panels for $n = 2$, $n = 3$ and $n = 5$).

Figure 2.5: Histogram of the co-sufficient samples using Algorithm 2 for two different values of the sufficient statistic $t$ (sample size $n = 3$, sums $t = 1.5$ and $t = 2.5$), as well as the theoretical distribution.

Chapter 3

Gibbs sampler - Gamma distribution

A method for generating co-sufficient samples for the gamma distribution, using a Gibbs sampler, is given in Loc07. This is an alternative to the methods previously discussed. Here we will try to replicate the method used.

3.1 Transformation of variables

The density of a gamma distributed variable is
$$f_X(x) = \frac{1}{\beta^{\alpha} \Gamma(\alpha)} x^{\alpha - 1} e^{-x/\beta}, \quad x \ge 0,$$
where $\alpha$ and $\beta$ are the shape and scale parameters, respectively. If $(X_1, \ldots, X_n)$ is a random sample where the $X_i$ are drawn iid from a gamma distribution, then $T = (s, p)$ is a sufficient statistic, where
$$s = \sum_{i=1}^{n} X_i, \qquad p = \prod_{i=1}^{n} X_i.$$

In the algorithm below we need the joint density of $(s, p, X_3, \ldots, X_n)$, which can be found using the multivariate transformation formula
$$f_{s,p,X_3,\ldots,X_n}(s, p, x_3, \ldots, x_n) = \frac{f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)}{|J(s, p, x_3, \ldots, x_n)|},$$
where $J(s, p, x_3, \ldots, x_n)$ is the Jacobian determinant of the transformation. This is done in Loc07, and the result is
$$f_{s,p,X_3,\ldots,X_n}(s, p, x_3, \ldots, x_n) = \frac{p^{\alpha - 1} e^{-s/\beta}}{\beta^{n\alpha} (\Gamma(\alpha))^n} \cdot \frac{2}{\prod_{i=3}^{n} x_i \sqrt{\left(s - \sum_{i=3}^{n} x_i\right)^2 - 4p \big/ \prod_{i=3}^{n} x_i}}. \qquad (3.1)$$

3.2 Generation of new value

The first step in the algorithm will be to generate a new value for $x_n$, denoted $x^*$, conditioned on $(s, p, x_3, \ldots, x_{n-1})$. Using equation (3.1), it can be seen that this conditional distribution, denoted $f_c(x_n)$, must be proportional to
$$f_c(x_n) \propto \frac{1}{x_n \sqrt{\left(s - \sum_{i=3}^{n} x_i\right)^2 - 4p \big/ \prod_{i=3}^{n} x_i}}.$$
In Loc07 they mention two ways to generate this value: either by using a rejection algorithm, or by calculating the numerical cdf and then using inversion. Here we will use the former of these two approaches. As done in Loc07, we define $C = s - \tilde{s}$ and $D = p/\tilde{p}$, where $\tilde{s} = \sum_{i=3}^{n-1} x_i$ and $\tilde{p} = \prod_{i=3}^{n-1} x_i$. It can be seen that the conditional distribution can be written as
$$f_c(x_n) \propto \frac{1}{x_n \sqrt{(C - x_n)^2 - 4D/x_n}}.$$
Transforming $v = x_n / C$, this leads to a conditional density for $v$ of the form
$$f_c(v) = \frac{K}{\sqrt{v}\, \sqrt{v(1 - v)^2 - c}}, \qquad c = \frac{4D}{C^3}, \qquad (3.2)$$

where $K$ is found by normalization using numerical integration. The first step is to find out where the density is non-zero, which is where the value of $h(v) = v(1 - v)^2 - c$ is positive. This is a polynomial of order 3, and the interval $[a, b]$ is found numerically. Because $h(v)$ is continuous, with limit $-\infty$ when $v \to -\infty$, limit $+\infty$ when $v \to +\infty$, and $h'(v) = 0$ for $v = 1/3$ and $v = 1$, it follows that $a$ is less than $1/3$, $b$ is between $1/3$ and 1, and the last root $d$ is larger than 1. The function $h(v)$ is positive on the interval $(a, b)$, as well as on $(d, \infty)$, but because all the $x_i$'s are positive, the new value must satisfy $x_n^* < x_1 + x_2 + x_n = C$, and hence we get the restriction $v < 1$. So $v \in (a, b)$ is the interval where the density in equation (3.2) is non-zero. Next we use the beta distribution as a proposal, which has density
$$q(x) = \frac{1}{B(\alpha_b, \beta_b)} x^{\alpha_b - 1} (1 - x)^{\beta_b - 1}, \quad 0 < x < 1.$$
Here $B(\alpha_b, \beta_b)$ is the Beta function, and $\alpha_b$ and $\beta_b$ are the parameters of the distribution. Next we define $v = a + x(b - a)$ to transform this distribution to the same interval as the target distribution. This leads to
$$q(v) = \frac{(v - a)^{\alpha_b - 1} (b - v)^{\beta_b - 1} (b - a)^{1 - \alpha_b - \beta_b}}{B(\alpha_b, \beta_b)}, \quad a < v < b.$$
Here the parameters in the beta distribution are chosen to be $\alpha_b = 0.5$ and $\beta_b = 0.5$. To use a rejection algorithm it is necessary to find a constant $k \ge 1$ such that
$$k\, q(v) \ge f_c(v) \quad \forall\, v \in [a, b]. \qquad (3.3)$$

This is done numerically, by finding the maximum of
$$\frac{f_c(v)}{q(v)} = \pi K \sqrt{\frac{(v - a)(b - v)}{v\left(v(1 - v)^2 - c\right)}}$$
on this interval, having used that $B\!\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = \pi$. This will be possible if the limits when $v \to a$ and $v \to b$ exist. Noting that
$$\lim_{v \to a} \frac{(v - a)(b - v)}{v\left(v(1 - v)^2 - c\right)} = \frac{b - a}{a}\, \lim_{v \to a} \frac{v - a}{v(1 - v)^2 - c} = \frac{b - a}{a\left(3a^2 - 4a + 1\right)},$$
having used L'Hôpital's rule, and similarly
$$\lim_{v \to b} \frac{(v - a)(b - v)}{v\left(v(1 - v)^2 - c\right)} = -\frac{b - a}{b\left(3b^2 - 4b + 1\right)},$$
we get
$$\lim_{v \to a} \frac{f_c(v)}{q(v)} = \pi K \sqrt{\frac{b - a}{a\left(3a^2 - 4a + 1\right)}}, \qquad \lim_{v \to b} \frac{f_c(v)}{q(v)} = \pi K \sqrt{-\frac{b - a}{b\left(3b^2 - 4b + 1\right)}}.$$
These limits will be finite as long as $a \ne \tfrac{1}{3}$ and $b \ne 1$, respectively. Hence there must be a $k$ as required in equation (3.3).
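A hedged R sketch of this rejection step is given below; the function name, the use of polyroot and optimize, the safety factor on $k$, and the numerical normalization are our own choices, and the sketch assumes the usual case where $h(v)$ has two roots in $(0, 1)$.

```r
# Rejection sampling of v from the density (3.2), given C and D.
sample_v <- function(C, D, alpha_b = 0.5, beta_b = 0.5) {
  cc <- 4 * D / C^3
  # roots of h(v) = v^3 - 2v^2 + v - c; keep the two smallest (a < b < 1)
  r <- sort(Re(polyroot(c(-cc, 1, -2, 1))))
  a <- r[1]; b <- r[2]
  f_un <- function(v) 1 / sqrt(v * (v * (1 - v)^2 - cc))    # unnormalized f_c
  K <- 1 / integrate(f_un, a, b)$value                      # normalizing constant
  q <- function(v) dbeta((v - a) / (b - a), alpha_b, beta_b) / (b - a)
  # rejection constant: maximize f_c / q over (a, b), with a small safety margin
  k <- 1.01 * optimize(function(v) K * f_un(v) / q(v), c(a, b), maximum = TRUE)$objective
  repeat {
    v <- a + (b - a) * rbeta(1, alpha_b, beta_b)            # proposal on (a, b)
    if (runif(1) < K * f_un(v) / (k * q(v))) return(v)
  }
}

# Example corresponding to Section 3.4: x = (1,2,3,4,5), replacing x5 = 5
x <- 1:5
C <- sum(x) - sum(x[3:4]); D <- prod(x) / prod(x[3:4])
x_new <- C * sample_v(C, D)
```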

Algorithm - Gibbs sampler

Input: A random sample $x = (x_1, \ldots, x_n)$.
Iterate $n - 2$ times:
1: Generate a new value for $x_n$, denoted $x^*$, conditioned on $(s, p, x_3, \ldots, x_{n-1})$. This is done using a rejection algorithm, as described earlier.
2: Replace $x_n$ by $x^*$.
3: Rotate the sample one step to the left and relabel. See Figure 3.1.
4: Recalculate $C$ and $D$ based on the new sample. These values will be used in Step 1 of the next iteration.
End iteration
5: Calculate the values of $x_1$ and $x_2$.

Figure 3.1: Illustration of how the Gibbs sampler algorithm works (new value, rotate, relabel). Here the entries in bold are the newly generated values.

The last step is done by solving the equations
$$x_1 + x_2 = s - \sum_{i=3}^{n} x_i, \qquad x_1 x_2 = \frac{p}{\prod_{i=3}^{n} x_i}.$$

These equations have the solution
$$x = \frac{\left(s - \sum_{i=3}^{n} x_i\right) \pm \sqrt{\left(s - \sum_{i=3}^{n} x_i\right)^2 - 4p \big/ \prod_{i=3}^{n} x_i}}{2}.$$
The order of $x_1$ and $x_2$ is decided by flipping a coin.

3.4 Verification of rejection step

Let $x = (1, 2, 3, 4, 5)$ be a sample where we want to generate a new value $v$ from the density in equation (3.2), to replace $x_5 = 5$ after multiplying by $C$. The constant $c$ is calculated from the sample, and the theoretical density function can be found by numerically finding $K$ and the zeros $a$ and $b$ of the term inside the square root, which give the interval where $f_c(v)$ is non-zero. Now we use the rejection step algorithm to generate values for $v$, and the corresponding histogram is shown in Figure 3.2. The blue line is the theoretical distribution, and from this it seems reasonable to assume that the rejection step algorithm works fine.

Figure 3.2: Histogram of generated values from the rejection step, as well as the theoretical distribution function.

3.5 Examples

Now we try out the Gibbs sampler on two data sets, and see how the generated samples are distributed compared to the original samples. These are the same data sets that were used in Chapter 1, and both of them will be analyzed for gof in the next chapter.

Example 3.1 - Ball bearing data The Gibbs sampler is used to generate co-sufficient samples, using the ball bearing data as input. The histograms of the co-sufficient samples and the original sample are shown in Figure 3.3.

Figure 3.3: Histogram of the co-sufficient samples generated from the ball bearing data (left) and the histogram of the original data (right).

Example 3.2 - Premier League data Now we try the Premier League data as input, and generate co-sufficient samples. The histogram of all these samples, as well as the histogram for the original sample, are shown in Figure 3.4.

Figure 3.4: Histogram of the co-sufficient samples generated from the Premier League data (left) and the histogram of the original data (right).

3.6 Convergence diagnostics

The Gibbs sampler is an MCMC algorithm that in general will generate correlated samples. To analyze whether the algorithm converges and how the data are correlated, we plot the trace plot of $x_1$ and the sample acfs of $x_1$ and $x_3$ from the generated samples in Example 3.2. This is shown in Figure 3.5. The trace plots for the other components are similar. The ACF seems to become insignificant for lags above 5. The ACFs for all the $x_i$'s are very similar, except for $x_1$ and $x_2$, which both look like the one plotted for $x_1$ in Figure 3.5. The reason for this must be that the Gibbs sampler algorithm generates $x_1$ and $x_2$ in a different way than the rest of the sample. In any case, we can conclude that the Gibbs sampler seems to converge, at least for this data set.

Figure 3.5: Trace plot of the first 1000 iterations of $x_1$ and acfs for the generated samples in Example 3.2.

Chapter 4

Application: Goodness-Of-Fit Tests

Co-sufficient samples can be used to test if a sample comes from a particular distribution. Several tests are described below, and these can be used on the co-sufficient samples by calculating the corresponding test statistic for each generated sample, and comparing how these values are distributed with the value of the test statistic for the original sample. The distribution of the test statistic is the same for both the co-sufficient samples and the original sample, under the assumption that the original sample comes from the particular distribution.

4.1 Kolmogorov-Smirnov

The Kolmogorov-Smirnov gof test is a simple test for analyzing whether or not a sample is from a particular probability distribution. It is based on calculating the maximum distance between the empirical cumulative distribution function (cdf) and the theoretical cdf for the proposed distribution (Handbook). Because the cdf is a non-decreasing function, it turns out that the test statistic can be written as
$$D = \max_{1 \le i \le N} \left( F(Y_i) - \frac{i - 1}{N},\ \frac{i}{N} - F(Y_i) \right), \qquad (4.1)$$
where $F(y)$ is the theoretical cdf and the $Y_i$ are the sample points in increasing order. The maximum likelihood estimate for $\theta$ is used in the theoretical cdf.
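As an illustration (our own sketch, not code from the report), the statistic (4.1) for the exponential null hypothesis can be computed as follows, using the fact that the maximum likelihood estimate of the rate is $n / \sum_i x_i$:

```r
# Kolmogorov-Smirnov statistic (4.1) for an exponential null hypothesis,
# with the rate replaced by its maximum likelihood estimate.
ks_stat_exp <- function(y) {
  n <- length(y)
  Fi <- pexp(sort(y), rate = n / sum(y))   # theoretical cdf at ordered points
  max(Fi - (seq_len(n) - 1) / n, seq_len(n) / n - Fi)
}
```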

Figure 4.1 illustrates how this test works. Here one co-sufficient sample is generated using the ball bearing data as input sample, assuming an exponential distribution, and it is clear that the empirical cdf of the co-sufficient sample matches the theoretical cdf better than is the case for the original sample.

Figure 4.1: Theoretical and empirical cdfs used in the Kolmogorov-Smirnov gof test (theoretical cdf, empirical cdf of the sample, and empirical cdf of the co-sufficient sample).

The Kolmogorov-Smirnov test is used as a one-sided test, rejecting the null hypothesis if the value of the statistic is large.

4.2 Cramer-von Mises criterion

An alternative to Kolmogorov-Smirnov is the one-sample case of the Cramer-von Mises criterion. The statistic used is defined as (Encyclopediaofmath, 2016)
$$\omega^2 = \int \left[ F_n(x) - F(x) \right]^2 \mathrm{d}F(x),$$
where $F_n(x)$ is the empirical distribution function and $F(x)$ is the theoretical cdf for the particular distribution that is assumed under the null hypothesis. If $x_1, \ldots, x_n$ are the sample points, in

increasing order, then the statistic in the one-sample case is given by
$$T = n\omega^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left[ \frac{2i - 1}{2n} - F(x_i) \right]^2.$$
The maximum likelihood estimates for $\alpha$ and $\beta$ are used in the theoretical cdf. The Cramer-von Mises criterion is used as a one-sided test, rejecting the null hypothesis if the value of the statistic is large. A small sketch of how the statistic and the Monte Carlo p-value are computed is given after the list of simple test statistics below.

4.3 Other tests

We also use some very simple test statistics, just to compare the p-values calculated with the ones calculated for Kolmogorov-Smirnov (KS) and Cramer-von Mises (CM). These are named T1, T2, T3, T4 and T5, and are defined as follows:

T1: Maximum of sample
T2: Median of sample
T3: Proportion of sample less than the mean of the sample
T4: Sum of the squares of the components of the sample
T5: Maximum of sample divided by minimum of sample
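To make the testing procedure concrete, the sketch below (our own illustration; the function names and the p-value convention are assumptions, not code from the report) computes the Cramer-von Mises statistic for the exponential null and a Monte Carlo p-value as the proportion of co-sufficient samples whose statistic is at least as extreme as the observed one.

```r
# Cramer-von Mises statistic for the exponential null, with ML estimate of the rate
cvm_stat_exp <- function(y) {
  n <- length(y)
  Fi <- pexp(sort(y), rate = n / sum(y))
  1 / (12 * n) + sum(((2 * seq_len(n) - 1) / (2 * n) - Fi)^2)
}

# Monte Carlo p-value: proportion of co-sufficient samples whose statistic is
# at least as large as the statistic of the observed data.
gof_pvalue <- function(x, stat = cvm_stat_exp, n_rep = 10000, gen = cosuff_exp) {
  t_obs <- stat(x)
  t_rep <- replicate(n_rep, stat(gen(x)))
  mean(t_rep >= t_obs)
}
```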

4.4 Examples

Example: Ball bearing data, exponential distribution Using the ball bearing data as input, we assume the null hypothesis that the data come from an exponential distribution, and generate co-sufficient samples using Algorithm 1. These are used to calculate p-values, and the values are listed in Table 4.1. Figure 4.2 illustrates how the value of the test statistic is distributed for both Kolmogorov-Smirnov and Cramer-von Mises, calculated from the co-sufficient samples. The vertical line is the test statistic for the original sample, and it is easy to see a big difference between the co-sufficient samples and the original sample. Hence the p-values are very small, and the null hypothesis is rejected using any reasonable significance level. We can therefore conclude that the ball bearing data set cannot be described well by an exponential distribution.

Figure 4.2: Histogram of test statistics for the co-sufficient samples for Kolmogorov-Smirnov (left) and Cramer-von Mises (right), as well as the value of the test statistics for the ball bearing sample (red line), under the assumption of an exponential distribution.

Example: Ball bearing data, gamma distribution Using the ball bearing data as input, we assume the null hypothesis that the data come from a gamma distribution, and generate co-sufficient samples using the Gibbs sampler algorithm. These are used to calculate p-values, and the values are listed in Table 4.1.

Table 4.1: Calculated p-values for the ball bearing data.

              KS    CM    T1    T2    T3    T4    T5
Exponential
Gamma

The null hypothesis is not rejected, and we conclude that the ball bearing data can be described quite well by the gamma distribution. This seems quite reasonable when looking at Figure 3.3,

because the co-sufficient samples and the original sample look to have similar shapes.

Example: Premier League data Now we test the same null hypotheses for the Premier League data, and the results are shown in Table 4.2, assuming first an exponential distribution and then a gamma distribution. When the p-value is exactly zero, it means that the test statistic for the original sample is more extreme than for any of the generated co-sufficient samples. It is clear that the data do not come from an exponential distribution, which is not surprising considering how the histogram of the data looks (see Figure 1.7). The last row in Table 4.2 also shows that the data do not fit well with a gamma distribution. The p-values for the Kolmogorov-Smirnov test and the Cramer-von Mises test are both below 0.05, so we can reject the null hypothesis that the data come from a gamma distribution when using a significance level of 5%. Looking at Figure 3.4, we observe that an important difference between the histograms is that the co-sufficient samples tail off for increasing values of $x$, but the original sample has a quite clear cut-off. This is because it is unlikely for the team finishing last in the Premier League to have substantially more than 35 points, since it would require many teams to be very close in the number of points, and no team to be particularly worse than the rest. It is probably this property that makes the p-values so small.

Table 4.2: Calculated p-values for the Premier League data.

              KS    CM    T1    T2    T3    T4    T5
Exponential
Gamma

Bibliography

Altomfotball.no (2016). [Online; accessed 18-December-2016].

Casella, G. and Berger, R. L. (2002). Statistical Inference. 2nd edition.

Encyclopediaofmath (2016). Cramér-von Mises criterion. encyclopediaofmath.org/index.php/cram%c3%a9r-von_mises_test. [Online; accessed 18-October-2016].

Handbook, E. S. Engineering Statistics Handbook, Kolmogorov-Smirnov goodness-of-fit test.

Lindqvist, B. H. and Rannestad, B. (2011). Monte Carlo exact goodness-of-fit tests for nonhomogeneous Poisson processes. Applied Stochastic Models in Business and Industry, 27(3).

Lindqvist, B. H. and Taraldsen, G. (2007). Conditional Monte Carlo based on sufficient statistics with applications. Advances in Statistical Modeling and Inference: Essays in Honor of Kjell A. Doksum.

Lockhart, R. A., O'Reilly, F. J., and Stephens, M. A. (2007). Use of the Gibbs sampler to obtain conditional tests, with applications. Biometrika, 94(4).

TMA4275 (2016). Lifetime analysis. TMA4275-Slides pdf. [Online; accessed 18-December-2016].

WolframMathWorld (2016). html. [Online; accessed 16-December-2016].


More information

1 Probability theory. 2 Random variables and probability theory.

1 Probability theory. 2 Random variables and probability theory. Probability theory Here we summarize some of the probability theory we need. If this is totally unfamiliar to you, you should look at one of the sources given in the readings. In essence, for the major

More information

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap Patrick Breheny December 6 Patrick Breheny BST 764: Applied Statistical Modeling 1/21 The empirical distribution function Suppose X F, where F (x) = Pr(X x) is a distribution function, and we wish to estimate

More information

Advanced Statistical Modelling

Advanced Statistical Modelling Markov chain Monte Carlo (MCMC) Methods and Their Applications in Bayesian Statistics School of Technology and Business Studies/Statistics Dalarna University Borlänge, Sweden. Feb. 05, 2014. Outlines 1

More information

7. Estimation and hypothesis testing. Objective. Recommended reading

7. Estimation and hypothesis testing. Objective. Recommended reading 7. Estimation and hypothesis testing Objective In this chapter, we show how the election of estimators can be represented as a decision problem. Secondly, we consider the problem of hypothesis testing

More information

16 : Markov Chain Monte Carlo (MCMC)

16 : Markov Chain Monte Carlo (MCMC) 10-708: Probabilistic Graphical Models 10-708, Spring 2014 16 : Markov Chain Monte Carlo MCMC Lecturer: Matthew Gormley Scribes: Yining Wang, Renato Negrinho 1 Sampling from low-dimensional distributions

More information

Stat 5102 Notes: Markov Chain Monte Carlo and Bayesian Inference

Stat 5102 Notes: Markov Chain Monte Carlo and Bayesian Inference Stat 5102 Notes: Markov Chain Monte Carlo and Bayesian Inference Charles J. Geyer April 6, 2009 1 The Problem This is an example of an application of Bayes rule that requires some form of computer analysis.

More information

Modified Kolmogorov-Smirnov Test of Goodness of Fit. Catalonia-BarcelonaTECH, Spain

Modified Kolmogorov-Smirnov Test of Goodness of Fit. Catalonia-BarcelonaTECH, Spain 152/304 CoDaWork 2017 Abbadia San Salvatore (IT) Modified Kolmogorov-Smirnov Test of Goodness of Fit G.S. Monti 1, G. Mateu-Figueras 2, M. I. Ortego 3, V. Pawlowsky-Glahn 2 and J. J. Egozcue 3 1 Department

More information

Distribution Fitting (Censored Data)

Distribution Fitting (Censored Data) Distribution Fitting (Censored Data) Summary... 1 Data Input... 2 Analysis Summary... 3 Analysis Options... 4 Goodness-of-Fit Tests... 6 Frequency Histogram... 8 Comparison of Alternative Distributions...

More information

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede Hypothesis Testing: Suppose we have two or (in general) more simple hypotheses which can describe a set of data Simple means explicitly defined, so if parameters have to be fitted, that has already been

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

MCMC Methods: Gibbs and Metropolis

MCMC Methods: Gibbs and Metropolis MCMC Methods: Gibbs and Metropolis Patrick Breheny February 28 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/30 Introduction As we have seen, the ability to sample from the posterior distribution

More information

Three examples of a Practical Exact Markov Chain Sampling

Three examples of a Practical Exact Markov Chain Sampling Three examples of a Practical Exact Markov Chain Sampling Zdravko Botev November 2007 Abstract We present three examples of exact sampling from complex multidimensional densities using Markov Chain theory

More information

Introduction to Bayesian Methods. Introduction to Bayesian Methods p.1/??

Introduction to Bayesian Methods. Introduction to Bayesian Methods p.1/?? to Bayesian Methods Introduction to Bayesian Methods p.1/?? We develop the Bayesian paradigm for parametric inference. To this end, suppose we conduct (or wish to design) a study, in which the parameter

More information

CPSC 531: Random Numbers. Jonathan Hudson Department of Computer Science University of Calgary

CPSC 531: Random Numbers. Jonathan Hudson Department of Computer Science University of Calgary CPSC 531: Random Numbers Jonathan Hudson Department of Computer Science University of Calgary http://www.ucalgary.ca/~hudsonj/531f17 Introduction In simulations, we generate random values for variables

More information

Lecture 1: August 28

Lecture 1: August 28 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random

More information

Machine Learning for Data Science (CS4786) Lecture 24

Machine Learning for Data Science (CS4786) Lecture 24 Machine Learning for Data Science (CS4786) Lecture 24 Graphical Models: Approximate Inference Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ BELIEF PROPAGATION OR MESSAGE PASSING Each

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

The comparative studies on reliability for Rayleigh models

The comparative studies on reliability for Rayleigh models Journal of the Korean Data & Information Science Society 018, 9, 533 545 http://dx.doi.org/10.7465/jkdi.018.9..533 한국데이터정보과학회지 The comparative studies on reliability for Rayleigh models Ji Eun Oh 1 Joong

More information

Non-parametric Inference and Resampling

Non-parametric Inference and Resampling Non-parametric Inference and Resampling Exercises by David Wozabal (Last update. Juni 010) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend surfing

More information

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods Tomas McKelvey and Lennart Svensson Signal Processing Group Department of Signals and Systems Chalmers University of Technology, Sweden November 26, 2012 Today s learning

More information

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible

More information

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Lecture No. # 38 Goodness - of fit tests Hello and welcome to this

More information

Doing Bayesian Integrals

Doing Bayesian Integrals ASTR509-13 Doing Bayesian Integrals The Reverend Thomas Bayes (c.1702 1761) Philosopher, theologian, mathematician Presbyterian (non-conformist) minister Tunbridge Wells, UK Elected FRS, perhaps due to

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Winter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo

Winter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo Winter 2019 Math 106 Topics in Applied Mathematics Data-driven Uncertainty Quantification Yoonsang Lee (yoonsang.lee@dartmouth.edu) Lecture 9: Markov Chain Monte Carlo 9.1 Markov Chain A Markov Chain Monte

More information

MARKOV CHAIN MONTE CARLO

MARKOV CHAIN MONTE CARLO MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with

More information

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006 Astronomical p( y x, I) p( x, I) p ( x y, I) = p( y, I) Data Analysis I Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK 10 lectures, beginning October 2006 4. Monte Carlo Methods

More information

36-463/663Multilevel and Hierarchical Models

36-463/663Multilevel and Hierarchical Models 36-463/663Multilevel and Hierarchical Models From Bayes to MCMC to MLMs Brian Junker 132E Baker Hall brian@stat.cmu.edu 1 Outline Bayesian Statistics and MCMC Distribution of Skill Mastery in a Population

More information

Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics

Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics The candidates for the research course in Statistics will have to take two shortanswer type tests

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Independent Events. Two events are independent if knowing that one occurs does not change the probability of the other occurring

Independent Events. Two events are independent if knowing that one occurs does not change the probability of the other occurring Independent Events Two events are independent if knowing that one occurs does not change the probability of the other occurring Conditional probability is denoted P(A B), which is defined to be: P(A and

More information

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions

More information

Foundations of Statistical Inference

Foundations of Statistical Inference Foundations of Statistical Inference Jonathan Marchini Department of Statistics University of Oxford MT 2013 Jonathan Marchini (University of Oxford) BS2a MT 2013 1 / 27 Course arrangements Lectures M.2

More information

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Statistical Methods in Particle Physics Lecture 1: Bayesian methods Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

General Bayesian Inference I

General Bayesian Inference I General Bayesian Inference I Outline: Basic concepts, One-parameter models, Noninformative priors. Reading: Chapters 10 and 11 in Kay-I. (Occasional) Simplified Notation. When there is no potential for

More information

Katsuhiro Sugita Faculty of Law and Letters, University of the Ryukyus. Abstract

Katsuhiro Sugita Faculty of Law and Letters, University of the Ryukyus. Abstract Bayesian analysis of a vector autoregressive model with multiple structural breaks Katsuhiro Sugita Faculty of Law and Letters, University of the Ryukyus Abstract This paper develops a Bayesian approach

More information

Lecture Notes 3 Convergence (Chapter 5)

Lecture Notes 3 Convergence (Chapter 5) Lecture Notes 3 Convergence (Chapter 5) 1 Convergence of Random Variables Let X 1, X 2,... be a sequence of random variables and let X be another random variable. Let F n denote the cdf of X n and let

More information

Hypothesis testing: theory and methods

Hypothesis testing: theory and methods Statistical Methods Warsaw School of Economics November 3, 2017 Statistical hypothesis is the name of any conjecture about unknown parameters of a population distribution. The hypothesis should be verifiable

More information

Eco517 Fall 2013 C. Sims MCMC. October 8, 2013

Eco517 Fall 2013 C. Sims MCMC. October 8, 2013 Eco517 Fall 2013 C. Sims MCMC October 8, 2013 c 2013 by Christopher A. Sims. This document may be reproduced for educational and research purposes, so long as the copies contain this notice and are retained

More information

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007 Bayesian inference Fredrik Ronquist and Peter Beerli October 3, 2007 1 Introduction The last few decades has seen a growing interest in Bayesian inference, an alternative approach to statistical inference.

More information

STAT STOCHASTIC PROCESSES. Contents

STAT STOCHASTIC PROCESSES. Contents STAT 3911 - STOCHASTIC PROCESSES ANDREW TULLOCH Contents 1. Stochastic Processes 2 2. Classification of states 2 3. Limit theorems for Markov chains 4 4. First step analysis 5 5. Branching processes 5

More information

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined

More information

Introduction to Markov Chain Monte Carlo & Gibbs Sampling

Introduction to Markov Chain Monte Carlo & Gibbs Sampling Introduction to Markov Chain Monte Carlo & Gibbs Sampling Prof. Nicholas Zabaras Sibley School of Mechanical and Aerospace Engineering 101 Frank H. T. Rhodes Hall Ithaca, NY 14853-3801 Email: zabaras@cornell.edu

More information

Stat 5102 Notes: Markov Chain Monte Carlo and Bayesian Inference

Stat 5102 Notes: Markov Chain Monte Carlo and Bayesian Inference Stat 5102 Notes: Markov Chain Monte Carlo and Bayesian Inference Charles J. Geyer March 30, 2012 1 The Problem This is an example of an application of Bayes rule that requires some form of computer analysis.

More information

Chapter 6. Estimation of Confidence Intervals for Nodal Maximum Power Consumption per Customer

Chapter 6. Estimation of Confidence Intervals for Nodal Maximum Power Consumption per Customer Chapter 6 Estimation of Confidence Intervals for Nodal Maximum Power Consumption per Customer The aim of this chapter is to calculate confidence intervals for the maximum power consumption per customer

More information