Monte Carlo conditioning on a sufficient statistic

Seminar, UC Davis, 24 April 2008

Bo Henry Lindqvist, Norwegian University of Science and Technology, Trondheim.
Joint work with Gunnar Taraldsen, NTNU.

Outline

- Definition of sufficiency
- Sufficiency in goodness-of-fit testing
- Conditional sampling given a sufficient statistic: basic algorithm
- Conditional sampling given a sufficient statistic: weighted sampling
- The Euclidean case
- Relation to Bayesian and fiducial statistics
- Other applications and concluding remarks

Sufficient statistics

$(X, T)$ is a pair of random vectors with joint distribution indexed by $\theta$. Typically, $X = (X_1, \ldots, X_n)$ is a sample and $T = T(X_1, \ldots, X_n)$ is a statistic.

$T$ is assumed to be sufficient for $\theta$ compared to $X$, meaning that the conditional distribution of $X$ given $T = t$ does not depend on $\theta$, or equivalently, that conditional expectations $E\{\phi(X) \mid T = t\}$ do not depend on the value of $\theta$.

Useful criterion (Neyman's factorization theorem): $T = (T_1, \ldots, T_k)$ is sufficient for $\theta$ compared to $X = (X_1, \ldots, X_n)$ if the joint density can be factorized as
$$f(x \mid \theta) = h(x)\, g(T(x) \mid \theta),$$
i.e.
$$f(x_1, \ldots, x_n \mid \theta) = h(x_1, \ldots, x_n)\, g\big(T_1(x_1, \ldots, x_n), \ldots, T_k(x_1, \ldots, x_n) \mid \theta\big).$$

Sufficiency

Applications: construction of optimal estimators and tests, elimination of nuisance parameters, goodness-of-fit testing.

Motivation for the present Monte Carlo approach: it is usually difficult to derive the conditional distributions analytically, so simulation methods are sought rather than formulas for conditional densities.

Goal: to sample $X$ conditionally given $T = t$.

Goodness-of-fit testing

$H_0$: the observation $X$ comes from a particular distribution indexed by $\theta$.

Suppose a test statistic $W \equiv W(X)$ is given, with large values expected when $H_0$ is violated. Let $T \equiv T(X)$ be a sufficient statistic under the null model.

Conditional test: reject $H_0$, conditionally given $T = t$, when $W \ge k(t)$, where the critical value $k(t)$ is such that $P_{H_0}(W \ge k(t) \mid T = t) = \alpha$. Here $k(t)$ is found from the conditional distribution of $W$ given $T = t$, which in principle is known.

Equivalently, we can calculate the conditional p-value $p_{\mathrm{obs}} = P_{H_0}(W \ge w_{\mathrm{obs}} \mid T = t)$, where $w_{\mathrm{obs}}$ is the observed value of $W(X)$, and reject $H_0$ if $p_{\mathrm{obs}} \le \alpha$.

Remark: this test is also unconditionally an $\alpha$-level test.
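
In practice the conditional null distribution of $W$ is rarely available in closed form, and the p-value is approximated by Monte Carlo once a routine for sampling $X$ given $T = t$ is available (the topic of the rest of the talk). A minimal sketch in Python; the sampler `draw_x_given_t`, the statistic `w`, and all names here are placeholders of ours, not from the talk:

```python
def mc_conditional_p_value(w, draw_x_given_t, t, w_obs, m, rng):
    """Monte Carlo estimate of p_obs = P_{H0}(W >= w_obs | T = t).

    w              -- test statistic W(X)
    draw_x_given_t -- routine returning one draw of X given T = t
    m              -- number of conditional samples
    """
    hits = sum(w(draw_x_given_t(t, rng)) >= w_obs for _ in range(m))
    return hits / m
```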

References

Point of departure:
- Engen, S. and Lillegård, M. (1997). Stochastic simulations conditioned on sufficient statistics. Biometrika.
- Lindqvist, B. H., Taraldsen, G., Lillegård, M. and Engen, S. (2003). A counterexample to a claim about stochastic simulations. Biometrika.

Our papers:
- Lindqvist, B. H. and Taraldsen, G. (2005). Monte Carlo conditioning on a sufficient statistic. Biometrika.
- Lindqvist, B. H. and Taraldsen, G. (2007). Conditional Monte Carlo based on sufficient statistics with applications. In: Advances in Statistical Modeling and Inference. Essays in Honor of Kjell A. Doksum (ed. Vijay Nair). World Scientific, Singapore.

Recent literature:
- Diaconis, P., Chen, Y., Holmes, S. and Liu, J. S. (2005). Sequential Monte Carlo methods for statistical analysis of tables. Journal of the American Statistical Association.
- Langsrud, Ø. (2005). Rotation tests. Statistics and Computing.
- Lockhart, R. A., O'Reilly, F. J. and Stephens, M. A. (2007). Use of the Gibbs sampler to obtain conditional tests, with applications. Biometrika.

General setup

Given $(X, T)$, with $T$ sufficient for $\theta$ compared to $X$.

Basic assumption: there are a random vector $U$ with known distribution and known functions $\chi(\cdot, \cdot)$, $\tau(\cdot, \cdot)$ such that
$$(\chi(U, \theta), \tau(U, \theta)) \sim_\theta (X, T),$$
i.e. equality in distribution for each $\theta$. Interpretation: these are ways of simulating $(X, T)$ for given $\theta$.

EXAMPLE: EXPONENTIAL SAMPLES. $X = (X_1, \ldots, X_n)$ are i.i.d. from $\mathrm{Exp}(\theta)$, i.e. with hazard rate $\theta$. Then $T = \sum_{i=1}^n X_i$ is sufficient for $\theta$. Let $U = (U_1, \ldots, U_n)$ be i.i.d. $\mathrm{Exp}(1)$ variables. Then
$$\chi(U, \theta) = (U_1/\theta, \ldots, U_n/\theta), \qquad \tau(U, \theta) = \sum_{i=1}^n U_i/\theta.$$

Conditional sampling of X given T = t

EXAMPLE (continued). We want to sample $X = (X_1, X_2, \ldots, X_n)$ conditionally on $T = \sum X_i = t$ for given $t$.

Idea: draw $U = (U_1, \ldots, U_n)$ i.i.d. $\mathrm{Exp}(1)$. Recall
$$\chi(U, \theta) = (U_1/\theta, \ldots, U_n/\theta) \sim_\theta X, \qquad \tau(U, \theta) = \sum_{i=1}^n U_i/\theta \sim_\theta T.$$

Solve: $\tau(U, \theta) = t \;\Rightarrow\; \theta = \hat\theta(U, t) = \sum U_i / t$.

Conditional sample:
$$X_t(U) = \chi(U, \hat\theta(U, t)) = \left( \frac{t U_1}{\sum_{i=1}^n U_i}, \ldots, \frac{t U_n}{\sum_{i=1}^n U_i} \right),$$
which is known to have the correct distribution!
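
A minimal Python sketch of this closed-form conditional sampler (NumPy assumed; the function name is ours):

```python
import numpy as np

def conditional_exp_sample(n, t, rng):
    """One draw of X = (X_1,...,X_n), i.i.d. Exp(theta), given sum(X) = t:
    draw U i.i.d. Exp(1) and return chi(U, theta_hat) = t * U / sum(U)."""
    u = rng.exponential(size=n)
    return t * u / u.sum()

rng = np.random.default_rng(0)
x = conditional_exp_sample(n=5, t=10.0, rng=rng)
print(x, x.sum())   # the components sum to t = 10 by construction
```

Plugged into the p-value sketch above as `draw_x_given_t = lambda t, rng: conditional_exp_sample(n, t, rng)`, this gives an exact conditional goodness-of-fit test for the exponential model.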

Algorithm 1: Conditional sampling of X given T = t

The algorithm used in the example can be described more generally as follows. Recall: $\chi(U, \theta) \sim_\theta X$, $\tau(U, \theta) \sim_\theta T$.

ALGORITHM 1
- Generate $U$ from the known density $f(u)$.
- Solve $\tau(U, \theta) = t$ for $\theta$. The (unique) solution is $\hat\theta(U, t)$.
- Return $X_t(U) = \chi\{U, \hat\theta(U, t)\}$.

(A numerical sketch follows.)
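
When no closed-form root is available, $\hat\theta(U, t)$ can be found numerically. A generic sketch of Algorithm 1 under its own assumptions (a unique root and a known bracketing interval for $\theta$); SciPy's `brentq` is our choice of solver, and the driver reproduces the closed-form exponential sampler above:

```python
import numpy as np
from scipy.optimize import brentq

def algorithm1_draw(chi, tau, sample_u, t, bracket, rng):
    """One draw of X_t = chi{U, theta_hat(U, t)} per Algorithm 1."""
    u = sample_u(rng)
    theta_hat = brentq(lambda th: tau(u, th) - t, *bracket)  # unique root assumed
    return chi(u, theta_hat)

# Exponential example: tau(u, theta) = sum(u)/theta, chi(u, theta) = u/theta.
rng = np.random.default_rng(0)
x = algorithm1_draw(chi=lambda u, th: u / th,
                    tau=lambda u, th: u.sum() / th,
                    sample_u=lambda r: r.exponential(size=5),
                    t=10.0, bracket=(1e-9, 1e9), rng=rng)
print(x, x.sum())   # sums to t = 10
```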

Problems with Algorithm 1

Algorithm 1 does not in general give samples from the correct distribution, even when $\hat\theta$ is unique.

Moreover, there may not be a unique solution $\hat\theta$ for $\theta$ of $\tau(u, \theta) = t$. For discrete distributions the solutions for $\theta$ of the equation $\tau(u, \theta) = t$ are typically intervals. For continuous distributions there may be a finite number of solutions, depending on $u$.

What may go wrong with Algorithm 1?

Assume that for each fixed $u$ and $t$ the equation $\tau(u, \theta) = t$ has the unique solution $\theta = \hat\theta(u, t)$. Under Algorithm 1 we obtained conditional samples by $X_t = \chi\{U, \hat\theta(U, t)\}$.

A tentative proof that this gives samples from the conditional distribution of $X$ given $T = t$ runs as follows. Let $\phi$ be any function. For all $\theta$ and $t$ we can formally write
$$E\{\phi(X) \mid T = t\} = E[\phi\{\chi(U, \theta)\} \mid \tau(U, \theta) = t] = E[\phi\{\chi(U, \theta)\} \mid \hat\theta(U, t) = \theta]$$
$$= E\big(\phi[\chi\{U, \hat\theta(U, t)\}] \mid \hat\theta(U, t) = \theta\big) = E\big(\phi[\chi\{U, \hat\theta(U, t)\}]\big) \equiv E\{\phi(X_t)\}$$
(the conditioning can be dropped in the last step since, by sufficiency, the conditional expectation is the same for all $\theta$). If correct, this would imply that $X_t$ has the correct distribution, i.e. $X_t$ is distributed according to the conditional distribution of $X$ given $T = t$.

The key equality: a possible Borel paradox

The key equality is
$$E[\phi\{\chi(U, \theta)\} \mid \tau(U, \theta) = t] = E[\phi\{\chi(U, \theta)\} \mid \hat\theta(U, t) = \theta].$$
It follows apparently from the equivalence of the events $\{\tau(U, \theta) = t\}$ and $\{\hat\theta(U, t) = \theta\}$.

This is unproblematic if these events have positive probability; otherwise the equality may be invalid due to a Borel paradox. The equality holds if the two events can be described by the same function of $U$ (the same $\sigma$-algebra).

SUFFICIENT CONDITION FOR ALGORITHM 1

The pivotal condition: assume that $\tau(u, \theta)$ depends on $u$ only through a function $r(u)$, where we have the unique representation $r(u) = v(\theta, t)$ by solving $\tau(u, \theta) = t$. Thus $v(\theta, T)$ is a pivot in the classical sense.

IN THE EXAMPLE: $\tau(U, \theta) = \sum_{i=1}^n U_i/\theta \equiv r(U)/\theta$, so $v(\theta, t) = \theta t$.

Algorithms 2 and 3 for weighted conditional sampling of X given T = t

It turns out that a weighted sampling scheme is needed in the general case. Let $\Theta$ be a random variable with some conveniently chosen distribution, with $\Theta$ and $U$ independent. The key result is that the conditional distribution of $X$ given $T = t$ is the same as that of $\chi(U, \Theta)$ given $\tau(U, \Theta) = t$.

Notation: for fixed $u$, let $t \mapsto W_t(u)$ be the density of $\tau(u, \Theta)$.

ALGORITHM 2 (assume the equation $\tau(u, \theta) = t$ has the unique solution $\theta = \hat\theta(u, t)$)
- Generate $V$ from a density proportional to $W_t(u) f(u)$.
- Return $X_t(V) = \chi(V, \hat\theta(V, t))$.

ALGORITHM 3 (general case)
- Generate $V$ from a density proportional to $W_t(u) f(u)$ and let the result be $V = v$.
- Generate $\Theta_t$ from the conditional distribution of $\Theta$ given $\tau(v, \Theta) = t$.
- Return $X_t(V) = \chi(V, \Theta_t)$.

(A practical way to draw from the weighted density is sketched below.)
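
Sampling $V$ exactly from a density proportional to $W_t(u) f(u)$ is usually inconvenient. One standard surrogate (our suggestion, not part of the talk) is sampling-importance-resampling: draw many $U$'s from $f$ and resample one with probability proportional to $W_t(U)$. A sketch:

```python
import numpy as np

def algorithm2_sir(chi, theta_hat, w_t, sample_u, t, m, rng):
    """Approximate one draw from Algorithm 2 by sampling-importance-resampling.

    w_t(u, t) -- the weight W_t(u), needed only up to a constant factor.
    """
    us = [sample_u(rng) for _ in range(m)]
    w = np.array([w_t(u, t) for u in us])
    v = us[rng.choice(m, p=w / w.sum())]    # resample proportionally to W_t
    return chi(v, theta_hat(v, t))
```

The resample is only approximately from the weighted density for finite m; alternatively the weights can be used directly in a self-normalized estimate of $E\{\phi(X) \mid T = t\}$, as on the next slide.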

The weight function W_t(u) in the Euclidean case

$X = (X_1, \ldots, X_n)$ has distribution depending on a $k$-dimensional parameter $\theta$; $T(X)$ is a $k$-dimensional sufficient statistic. Choose a density $\pi(\theta)$ for $\Theta$; let $f(u)$ be the density of $U$. Recall that $W_t(u)$ is the density of $\tau(u, \Theta)$.

The standard transformation formula, using that $\tau(u, \theta) = t \Leftrightarrow \theta = \hat\theta(u, t)$, gives
$$W_t(u) = \pi\{\hat\theta(u, t)\}\, \big|\det \partial_t \hat\theta(u, t)\big| = \frac{\pi\{\hat\theta(u, t)\}}{\big|\det \partial_\theta \tau(u, \theta)\big|_{\theta = \hat\theta(u, t)}},$$
and further
$$E\{\phi(X) \mid T = t\} = \frac{\int \phi[\chi\{u, \hat\theta(u, t)\}]\, \dfrac{\pi\{\hat\theta(u, t)\}}{\left|\det \partial_\theta \tau(u, \theta)\right|_{\theta = \hat\theta(u, t)}}\, f(u)\, du}{\int \dfrac{\pi\{\hat\theta(u, t)\}}{\left|\det \partial_\theta \tau(u, \theta)\right|_{\theta = \hat\theta(u, t)}}\, f(u)\, du} = \frac{E_U\!\left( \phi[\chi\{U, \hat\theta(U, t)\}]\, \dfrac{\pi\{\hat\theta(U, t)\}}{\left|\det \partial_\theta \tau(U, \theta)\right|_{\theta = \hat\theta(U, t)}} \right)}{E_U\!\left( \dfrac{\pi\{\hat\theta(U, t)\}}{\left|\det \partial_\theta \tau(U, \theta)\right|_{\theta = \hat\theta(U, t)}} \right)}.$$
This can be computed by simulation using a pseudo-sample from the distribution of $U$.

Example: truncated exponential

$X = (X_1, \ldots, X_n)$ are i.i.d. on $[0, 1]$ with density
$$f(x, \theta) = \begin{cases} \theta e^{\theta x} / (e^\theta - 1) & \text{if } \theta \neq 0 \\ 1 & \text{if } \theta = 0 \end{cases} \qquad \text{for } 0 \le x \le 1.$$
$T = \sum_{i=1}^n X_i$ is sufficient compared to $X$.

The conditional distribution of $X$ given $T = t$ is that of $n$ independent uniform $[0, 1]$ random variables given their sum (for which there seems to be no simple expression).

Simulation of data: let $U = (U_1, U_2, \ldots, U_n)$ be i.i.d. uniform on $[0, 1]$. Inverting the cumulative distribution function gives
$$\chi(U, \theta) = \left( \frac{\log\{1 + (e^\theta - 1) U_1\}}{\theta}, \ldots, \frac{\log\{1 + (e^\theta - 1) U_n\}}{\theta} \right), \qquad \tau(U, \theta) = \sum_{i=1}^n \frac{\log\{1 + (e^\theta - 1) U_i\}}{\theta}.$$
The equation $\tau(u, \theta) = t$ has a unique solution $\theta = \hat\theta(u, t)$. However, $X_t$ from Algorithm 1 does not have the correct distribution (check e.g. the case $n = 2$).

Computation

Thus for the computation of $E\{\phi(X) \mid T = t\}$ we need
$$\partial_\theta \tau(u, \theta)\big|_{\theta = \hat\theta(u, t)} = \frac{e^{\hat\theta(u, t)}}{\hat\theta(u, t)} \sum_{i=1}^n \frac{u_i}{1 + (e^{\hat\theta(u, t)} - 1) u_i} - \frac{t}{\hat\theta(u, t)},$$
to be substituted in
$$E\{\phi(X) \mid T = t\} = \frac{E_U\!\left( \phi[\chi\{U, \hat\theta(U, t)\}]\, \dfrac{\pi\{\hat\theta(U, t)\}}{\left|\partial_\theta \tau(U, \theta)\right|_{\theta = \hat\theta(U, t)}} \right)}{E_U\!\left( \dfrac{\pi\{\hat\theta(U, t)\}}{\left|\partial_\theta \tau(U, \theta)\right|_{\theta = \hat\theta(U, t)}} \right)}.$$
In principle we can use any choice of $\pi$ for which the integrals exist. A good choice is the Jeffreys prior
$$\pi(\theta) = \left\{ \frac{1}{\theta^2} - \frac{e^\theta}{(e^\theta - 1)^2} \right\}^{1/2}.$$
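
Putting the last two slides together, a runnable sketch of the full computation (the root-finding brackets, guard against cancellation near $\theta = 0$, and Monte Carlo size are our choices; as a check, symmetry gives $E(X_1 \mid T = t) = t/n$):

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)

def tau(u, th):
    # tau(u, theta) = sum_i log{1 + (e^theta - 1) u_i} / theta, theta != 0
    return np.log1p(np.expm1(th) * u).sum() / th

def theta_hat(u, t):
    # Unique root of tau(u, theta) = t; tau -> sum(u) as theta -> 0 and tau
    # is increasing in theta, so the root's sign is that of t - sum(u).
    # The bracket (-500, 500) suits moderate n and t; widen if needed.
    g = lambda th: tau(u, th) - t
    return brentq(g, 1e-8, 500.0) if t > u.sum() else brentq(g, -500.0, -1e-8)

def dtau(u, t, th):
    # partial_theta tau(u, theta) at theta = theta_hat(u, t) (slide formula)
    return (np.exp(th) / th) * (u / (1.0 + np.expm1(th) * u)).sum() - t / th

def jeffreys(th):
    # pi(theta) = {1/theta^2 - e^theta/(e^theta - 1)^2}^{1/2}; the limit at
    # theta = 0 is (1/12)^{1/2}, used here to dodge cancellation for tiny theta
    if abs(th) < 1e-4:
        return np.sqrt(1.0 / 12.0)
    return np.sqrt(1.0 / th**2 - np.exp(th) / np.expm1(th) ** 2)

def cond_expectation(phi, n, t, m=5000):
    # Self-normalized estimate of E{phi(X) | T = t} with weights
    # pi(theta_hat) / |dtau/dtheta| evaluated at theta_hat.
    num = den = 0.0
    for _ in range(m):
        u = rng.uniform(size=n)
        th = theta_hat(u, t)
        x = np.log1p(np.expm1(th) * u) / th      # chi(u, theta_hat)
        w = jeffreys(th) / abs(dtau(u, t, th))
        num += w * phi(x)
        den += w
    return num / den

print(cond_expectation(lambda x: x[0], n=5, t=2.0))   # approx t/n = 0.4
```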

Distribution of weights

Recall:
$$W_t(u) = \frac{\pi\{\hat\theta(u, t)\}}{\big|\partial_\theta \tau(u, \theta)\big|_{\theta = \hat\theta(u, t)}}.$$

Discussion of method

Disadvantage: we need to solve the equation
$$\sum_{i=1}^n \log\{1 + (e^\theta - 1) U_i\} = \theta t$$
at each step to find $\hat\theta(U, t)$.

Quicker: use a Gibbs algorithm to simulate $X = (X_1, \ldots, X_n)$ i.i.d. uniform on $[0, 1]$ given $\sum_{i=1}^n X_i = t$ (a code sketch follows):
- Start with $X_i^0 = t/n$ for $i = 1, \ldots, n$.
- Given $(X_1^m, \ldots, X_n^m)$ with $\sum_{i=1}^n X_i^m = t$: draw integers $i < j$ randomly and compute $a = X_i^m + X_j^m$.
- Draw $X_i^{m+1}$ uniform on $[0, a]$ if $a \le 1$, and uniform on $[a - 1, 1]$ if $a > 1$. Let $X_j^{m+1} = a - X_i^{m+1}$.
- Continue with $m \leftarrow m + 1$.
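
A direct Python transcription of this Gibbs sampler (the update rule and starting point are from the slide; the number of sweeps and the seed are our choices):

```python
import numpy as np

def gibbs_uniform_given_sum(n, t, sweeps, rng):
    """Simulate X_1,...,X_n i.i.d. uniform[0,1] given sum(X) = t (0 < t < n)
    by the pairwise Gibbs update from the slide."""
    x = np.full(n, t / n)                    # start on the constraint set
    for _ in range(sweeps):
        i, j = rng.choice(n, size=2, replace=False)
        a = x[i] + x[j]
        # Resample X_i uniformly over the values that keep both X_i and
        # X_j = a - X_i inside [0, 1].
        lo, hi = (0.0, a) if a <= 1.0 else (a - 1.0, 1.0)
        x[i] = rng.uniform(lo, hi)
        x[j] = a - x[i]
    return x

rng = np.random.default_rng(2)
x = gibbs_uniform_given_sum(n=5, t=2.0, sweeps=1000, rng=rng)
print(x, x.sum())    # the sum stays exactly t throughout
```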

Relationship to Bayesian and fiducial distributions

Simple case: suppose $\theta$ is one-dimensional and $\tau(u, \theta)$ is strictly monotone in $\theta$ for fixed $u$. Then the distribution of $\hat\theta(U, t)$ (i.e. of the $\theta$ solving $\tau(U, \theta) = t$) corresponds to Fisher's fiducial distribution.

Lindley (1958): the distribution of $\hat\theta(U, t)$ is a posterior distribution for some (possibly improper) prior distribution for $\theta$ if and only if $T$, or a transformation of it, has a location distribution.

Fraser (1961), multiparameter case: the fiducial distribution is a posterior distribution if the sample and parameter sets are transformation groups, and the distributions are given by means of density functions with respect to right Haar measure.

The cases above essentially correspond to the cases where Algorithm 1 can be used (the pivotal property).

Example: multivariate normal distribution

Generation of multivariate normal samples conditional on the sample mean and empirical covariance matrix. $X = (X_1, \ldots, X_n)$ is a sample from $N_p(\mu, \Sigma)$; $T = (\bar X, S)$ is sufficient compared to $X$, with
$$\bar X = n^{-1} \sum_{i=1}^n X_i, \qquad S = (n - 1)^{-1} \sum_{i=1}^n (X_i - \bar X)(X_i - \bar X)'.$$
Reparameterize from $(\mu, \Sigma)$ to $\theta \equiv (\mu, A)$, where $\Sigma = AA'$ is the Cholesky decomposition. Simulate by letting $U = (U_1, \ldots, U_n)$ be i.i.d. $N_p(0, I)$:
$$\chi(U, \theta) = (\mu + AU_1, \ldots, \mu + AU_n), \qquad \tau(U, \theta) = (\mu + A\bar U, \; A S_U A'),$$
where $\bar U$ and $S_U$ are defined in the same way as $\bar X$ and $S$. The pivotal condition holds, and the desired conditional sample, given $t = (\bar x, s)$, is
$$\chi\{U, \hat\theta(U, t)\} = \left( \bar x + c L_U^{-1}(U_1 - \bar U), \ldots, \bar x + c L_U^{-1}(U_n - \bar U) \right),$$
where $s = cc'$ and $S_U = L_U L_U'$ are Cholesky decompositions.
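
A Python sketch of this construction (NumPy's lower-triangular Cholesky matches the $\Sigma = AA'$ convention; $n > p$ is needed so that $S_U$ is nonsingular):

```python
import numpy as np

def conditional_mvn_sample(n, xbar, s, rng):
    """One N_p(mu, Sigma) sample of size n conditioned on sample mean xbar
    and sample covariance s: x_i = xbar + c L_U^{-1} (U_i - Ubar), where
    s = c c' and S_U = L_U L_U' are Cholesky decompositions (requires n > p)."""
    p = len(xbar)
    u = rng.standard_normal((n, p))          # U_1,...,U_n i.i.d. N_p(0, I)
    ubar = u.mean(axis=0)
    s_u = np.cov(u, rowvar=False)            # (n-1)^{-1} sum (U_i-Ubar)(U_i-Ubar)'
    m = np.linalg.cholesky(s) @ np.linalg.inv(np.linalg.cholesky(s_u))
    return xbar + (u - ubar) @ m.T           # rows are the conditioned X_i

rng = np.random.default_rng(3)
s = np.array([[2.0, 0.5], [0.5, 1.0]])
x = conditional_mvn_sample(n=10, xbar=np.array([1.0, -1.0]), s=s, rng=rng)
print(x.mean(axis=0))                        # equals xbar exactly
print(np.cov(x, rowvar=False))               # equals s exactly
```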

Other applications

- Inverse Gaussian samples given the sufficient statistics: the standard algorithm for generation of inverse Gaussian variates leads to multiple roots of $\tau(u, \theta) = t$. Algorithm 3 must be used.
- Type II censored exponential samples: the equation $\tau(u, \theta) = t$ has one or no solutions for $\theta$. Algorithm 3 can be used.
- Discrete distributions (e.g. the Poisson distribution, logistic regression): the solutions for $\theta$ of the equation $\tau(u, \theta) = t$ are typically intervals. Algorithm 3 is essentially used.

Concluding remarks

The idea of weighted sampling (Algorithm 2) is similar to the classical conditional Monte Carlo approach of Trotter and Tukey (1956). Trotter and Tukey essentially consider the case of a unique solution of the equation $\tau(u, \theta) = t$, but work without assuming sufficiency of the conditioning variable.

The approach presented here can also be used for the computation of conditional expectations $E\{\phi(X) \mid T = t\}$ in non-statistical problems: construct an artificial statistical model for which the conditioning variable $T$ is sufficient. E.g., use an exponential model such as
$$f(x, \theta) = c(\theta) h(x) e^{\theta T(x)},$$
where $h(x)$ is the density of $X$, so that $T(X)$ is sufficient for $\theta$.

Methods based on conditional distributions given sufficient statistics are not in widespread use; the literature is scarce even for the normal and multinormal distributions. However, there seems to be increasing interest in the recent literature.


More information

The comparative studies on reliability for Rayleigh models

The comparative studies on reliability for Rayleigh models Journal of the Korean Data & Information Science Society 018, 9, 533 545 http://dx.doi.org/10.7465/jkdi.018.9..533 한국데이터정보과학회지 The comparative studies on reliability for Rayleigh models Ji Eun Oh 1 Joong

More information

JEREMY TAYLOR S CONTRIBUTIONS TO TRANSFORMATION MODEL

JEREMY TAYLOR S CONTRIBUTIONS TO TRANSFORMATION MODEL 1 / 25 JEREMY TAYLOR S CONTRIBUTIONS TO TRANSFORMATION MODELS DEPT. OF STATISTICS, UNIV. WISCONSIN, MADISON BIOMEDICAL STATISTICAL MODELING. CELEBRATION OF JEREMY TAYLOR S OF 60TH BIRTHDAY. UNIVERSITY

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Conditional confidence interval procedures for the location and scale parameters of the Cauchy and logistic distributions

Conditional confidence interval procedures for the location and scale parameters of the Cauchy and logistic distributions Biometrika (92), 9, 2, p. Printed in Great Britain Conditional confidence interval procedures for the location and scale parameters of the Cauchy and logistic distributions BY J. F. LAWLESS* University

More information

Lecture 3. Inference about multivariate normal distribution

Lecture 3. Inference about multivariate normal distribution Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods Tomas McKelvey and Lennart Svensson Signal Processing Group Department of Signals and Systems Chalmers University of Technology, Sweden November 26, 2012 Today s learning

More information

Bayesian Ingredients. Hedibert Freitas Lopes

Bayesian Ingredients. Hedibert Freitas Lopes Normal Prior s Ingredients Hedibert Freitas Lopes The University of Chicago Booth School of Business 5807 South Woodlawn Avenue, Chicago, IL 60637 http://faculty.chicagobooth.edu/hedibert.lopes hlopes@chicagobooth.edu

More information

The formal relationship between analytic and bootstrap approaches to parametric inference

The formal relationship between analytic and bootstrap approaches to parametric inference The formal relationship between analytic and bootstrap approaches to parametric inference T.J. DiCiccio Cornell University, Ithaca, NY 14853, U.S.A. T.A. Kuffner Washington University in St. Louis, St.

More information

SAMPLING ALGORITHMS. In general. Inference in Bayesian models

SAMPLING ALGORITHMS. In general. Inference in Bayesian models SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be

More information

Mathematical statistics

Mathematical statistics October 1 st, 2018 Lecture 11: Sufficient statistic Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation

More information

Censoring mechanisms

Censoring mechanisms Censoring mechanisms Patrick Breheny September 3 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Fixed vs. random censoring In the previous lecture, we derived the contribution to the likelihood

More information

Strong Lens Modeling (II): Statistical Methods

Strong Lens Modeling (II): Statistical Methods Strong Lens Modeling (II): Statistical Methods Chuck Keeton Rutgers, the State University of New Jersey Probability theory multiple random variables, a and b joint distribution p(a, b) conditional distribution

More information

Summary of Extending the Rank Likelihood for Semiparametric Copula Estimation, by Peter Hoff

Summary of Extending the Rank Likelihood for Semiparametric Copula Estimation, by Peter Hoff Summary of Extending the Rank Likelihood for Semiparametric Copula Estimation, by Peter Hoff David Gerard Department of Statistics University of Washington gerard2@uw.edu May 2, 2013 David Gerard (UW)

More information

STAT 830 Hypothesis Testing

STAT 830 Hypothesis Testing STAT 830 Hypothesis Testing Richard Lockhart Simon Fraser University STAT 830 Fall 2018 Richard Lockhart (Simon Fraser University) STAT 830 Hypothesis Testing STAT 830 Fall 2018 1 / 30 Purposes of These

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information