ST 740: Multiparameter Inference

1 ST 740: Multiparameter Inference
Alyson Wilson, Department of Statistics, North Carolina State University
September 23, 2013

2 Multinomial Model
Generalization of the binomial model to the case where observations can take more than two possible values. Sampling distribution: multinomial with parameters (θ_1, ..., θ_k), the probabilities associated with each of the k possible outcomes.
Example: In a survey, respondents may Strongly Agree, Agree, Disagree, Strongly Disagree, or have No Opinion when presented with a statement such as "8am is too early to have class."

3 Multinomial Model: Sampling Distribution
Formally, y = (y_1, y_2, ..., y_k) is a k-vector of counts of the number of observations for each outcome, and θ_j is the probability of the jth outcome, with Σ_{j=1}^k θ_j = 1 and Σ_{j=1}^k y_j = n. The sampling distribution has the form
f(y | θ) ∝ ∏_{j=1}^k θ_j^{y_j}

4 Multinomial Model: Conjugate Prior Distribution
The conjugate prior distribution for the multinomial sampling distribution is the Dirichlet distribution, a multivariate generalization of the beta distribution. With α_j > 0 and α_0 = Σ_{j=1}^k α_j,
π(θ | α) ∝ ∏_{j=1}^k θ_j^{α_j − 1}
E[θ_j] = α_j / α_0
Var[θ_j] = α_j(α_0 − α_j) / (α_0^2 (α_0 + 1))
Cov(θ_i, θ_j) = −α_i α_j / (α_0^2 (α_0 + 1))

5 Multinomial Model: Other Interesting Properties of the Dirichlet Distribution
The marginal distribution of any subvector of parameters is also Dirichlet: e.g., (θ_i, θ_j, 1 − θ_i − θ_j) is Dirichlet(α_i, α_j, α_0 − α_i − α_j).
The marginal distribution of θ_j is Beta(α_j, α_0 − α_j).
The marginal distribution of θ_i + θ_j is Beta(α_i + α_j, α_0 − α_i − α_j).
Common choices of noninformative prior distribution set α_j = 0, 0.5, or 1 for all j. α_0 is commonly thought of as the prior sample size.

6 Multinomial Model: Conjugate Prior Distribution
The posterior distribution is also Dirichlet:
π(θ | α, y) ∝ ∏_{j=1}^k θ_j^{α_j + y_j − 1}

7 Multinomial Model
Example: In a survey, respondents may Strongly Agree, Agree, Disagree, Strongly Disagree, or have No Opinion when presented with a statement such as "8am is too early to have class." Our sampling distribution is multinomial, with parameters θ_SA, θ_A, θ_D, θ_SD, and θ_NO, where θ_SA + θ_A + θ_D + θ_SD + θ_NO = 1. Our prior distribution on the parameters is a noninformative Dirichlet(0.5, 0.5, 0.5, 0.5, 0.5),
π(θ) ∝ θ_SA^{−0.5} θ_A^{−0.5} θ_D^{−0.5} θ_SD^{−0.5} θ_NO^{−0.5}

8 Multinomial Model
We observe n = 29 responses, y = (15, 8, 2, 2, 2). Our posterior distribution is Dirichlet(15.5, 8.5, 2.5, 2.5, 2.5),
π(θ | y) ∝ θ_SA^{14.5} θ_A^{7.5} θ_D^{1.5} θ_SD^{1.5} θ_NO^{1.5}
The two probabilities we are interested in are the probability that a student Strongly Agrees that 8am is too early for class, and the probability that a student either Disagrees or Strongly Disagrees that 8am is too early for class.

9 Multinomial Model
This means that we are interested in the marginal posterior distribution of θ_SA, π(θ_SA | y), and the posterior distribution of θ_D + θ_SD, π(θ_D + θ_SD | y). We can find these marginal distributions in two ways.
First, we can find the marginal distributions analytically using properties of the Dirichlet distribution. In particular,
θ_SA | y ~ Beta(α_SA + y_SA, α_0 + n − α_SA − y_SA) = Beta(15.5, 16)
θ_D + θ_SD | y ~ Beta(α_D + α_SD + y_D + y_SD, α_0 + n − α_D − α_SD − y_D − y_SD) = Beta(5, 26.5)
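
As a quick check of the analytic marginals above, here is a minimal base-R sketch (the Beta parameters are the ones derived on this slide; the 0.5 cutoff is an arbitrary illustration) that computes posterior means, 95% equal-tail credible intervals, and the posterior probability that a majority of students Strongly Agree.

# Analytic marginals from the Dirichlet(15.5, 8.5, 2.5, 2.5, 2.5) posterior
a_sa <- 15.5; b_sa <- 16     # theta_SA | y ~ Beta(15.5, 16)
a_ds <- 5;    b_ds <- 26.5   # theta_D + theta_SD | y ~ Beta(5, 26.5)

a_sa / (a_sa + b_sa)                       # posterior mean of P(Strongly Agree)
a_ds / (a_ds + b_ds)                       # posterior mean of P(Disagree or Strongly Disagree)
qbeta(c(0.025, 0.975), a_sa, b_sa)         # 95% credible interval for P(Strongly Agree)
qbeta(c(0.025, 0.975), a_ds, b_ds)         # 95% credible interval for P(D or SD)
pbeta(0.5, a_sa, b_sa, lower.tail = FALSE) # posterior P(theta_SA > 0.5)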

10 Generating Random Samples from Dirichlet Distributions
Second, we can simulate directly from the joint posterior Dirichlet distribution (in R, rdirichlet from the LearnBayes package).
Gamma method:
1. Draw x_1, x_2, ..., x_k independently, with x_j ~ Gamma(α_j + y_j, δ) for any common rate δ.
2. Set θ_j = x_j / Σ_{i=1}^k x_i.
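
The sketch below draws from the joint posterior both ways (assumptions: the LearnBayes package is available for the first option; the posterior parameters are those of the survey example; δ = 1) and recovers the marginal summaries by simulation.

# Posterior Dirichlet parameters for the survey example: prior 0.5's plus counts
alpha_post <- c(0.5, 0.5, 0.5, 0.5, 0.5) + c(15, 8, 2, 2, 2)
k <- length(alpha_post)

# Option 1: rdirichlet from the LearnBayes package
# library(LearnBayes)
# theta <- rdirichlet(10000, alpha_post)

# Option 2: the gamma method in base R
x <- matrix(rgamma(10000 * k, shape = rep(alpha_post, each = 10000), rate = 1),
            nrow = 10000)                 # column j holds Gamma(alpha_j + y_j, 1) draws
theta <- x / rowSums(x)                   # each row is one draw of (theta_SA, ..., theta_NO)

mean(theta[, 1])                                    # approximates E[theta_SA | y]
quantile(theta[, 3] + theta[, 4], c(0.025, 0.975))  # interval for theta_D + theta_SD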

11 Marginal Posterior Distributions
[Figure: marginal posterior densities of P(Strongly Agree) and P(Strongly Disagree or Disagree).]

12 Helpful Hint for Multiparameter Priors
A useful idea is that of the induced prior. Suppose that I have a sampling distribution f(y | α, β) that depends on two parameters, α and β. However, my prior information is about a function of α and β, say βΓ(1 + 1/α), the mean of a Weibull(α, β) distribution. We want to specify a prior distribution π(α, β) that reflects our prior information. Specifying a prior π(α, β) induces (specifies) a prior on βΓ(1 + 1/α).

13 Weibull Parameters
# Draws from the prior on (alpha, beta); val1-val4 are the gamma prior hyperparameters
alpha <- rgamma(10000, val1, val2)   # prior draws of the Weibull shape
beta <- rgamma(10000, val3, val4)    # prior draws of the Weibull scale
mw <- beta * gamma(1 + 1/alpha)      # induced prior draws of the Weibull mean
simy <- rweibull(10000, alpha, beta) # draws from the prior predictive distribution
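
Plotting the induced draws, e.g. hist(mw) for the Weibull mean or hist(simy) for a prior predictive observation, shows what prior the chosen hyperparameters val1-val4 imply; they can then be adjusted until the induced prior matches the available prior information.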

14 Constraints on the Parameters
Start with an unconstrained problem and Bayes Theorem:
π(θ | y) = f(y | θ)π(θ) / ∫ f(y | θ)π(θ) dθ
Suppose we want to find the posterior distribution when θ ∈ Θ_C, where ∫_{Θ_C} π(θ) dθ > 0. Define the constrained prior density as
π_C(θ) = π(θ) / ∫_{Θ_C} π(θ) dθ,   θ ∈ Θ_C

17 Constraints on the Parameter
Using Bayes Theorem, for θ ∈ Θ_C,
π(θ | y, θ ∈ Θ_C) = f(y | θ) π_C(θ) / ∫_{Θ_C} f(y | θ) π_C(θ) dθ
 = f(y | θ) π(θ) / ∫_{Θ_C} f(y | θ) π(θ) dθ   (the constant ∫_{Θ_C} π(θ) dθ cancels)
 = [f(y | θ) π(θ) / ∫ f(y | θ) π(θ) dθ] / ∫_{Θ_C} [f(y | θ) π(θ) / ∫ f(y | θ) π(θ) dθ] dθ
 = π(θ | y) / ∫_{Θ_C} π(θ | y) dθ
General idea: do the unconstrained problem, truncate, renormalize.

19 Constraints on the Parameter: Example
We are trying to estimate the probability that Barry Bonds hits a home run in 2001. We observe 73 home runs in 476 at bats. Our prior distribution for p is Beta(1, 5), restricted to p ∈ (0, 0.15).
[Figure: constrained prior density of p.]

20 Constraints on the Parameter: Example
The unconstrained posterior distribution is Beta(1 + 73, 5 + 476 − 73) = Beta(74, 408). The constrained posterior is this Beta density truncated to (0, 0.15) and renormalized.
[Figure: constrained posterior density of p.]
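
A minimal sketch of the truncate-and-renormalize recipe for this example, in base R (the grid spacing and number of draws are arbitrary choices):

# Unconstrained posterior: p | y ~ Beta(74, 408)
a <- 1 + 73
b <- 5 + 476 - 73

# Renormalizing constant: unconstrained posterior mass on the constraint region (0, 0.15)
norm_const <- pbeta(0.15, a, b)

# Constrained posterior density on a grid
p_grid <- seq(0.001, 0.149, length.out = 500)
dens_c <- dbeta(p_grid, a, b) / norm_const

# Equivalent simulation: sample the unconstrained posterior, keep draws in (0, 0.15)
draws <- rbeta(1e5, a, b)
draws_c <- draws[draws < 0.15]
mean(draws_c)   # constrained posterior mean of p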

21 Less Nice Example
Suppose that our sampling distribution is Gamma(α, β) and we choose independent marginal prior distributions α ~ Gamma(2, 1) and β ~ Gamma(5, 1). Then
π(α, β | y) ∝ [β^{nα} / Γ(α)^n] exp(−β Σ y_i) (∏ y_i)^{α−1} exp(−(α + β)) α β^4
 ∝ [β^{nα+4} / Γ(α)^n] exp(−β(Σ y_i + 1)) α exp(−α) (∏ y_i)^{α−1}
We want to make posterior inferences about α, β, and the predictive distribution for the next observation.

22 Predictive Distribution
The predictive distribution is the distribution of the next observation conditional on the data observed so far; the uncertainty about the parameters is integrated out:
f(y_{n+1} | y) = ∫∫ f(y_{n+1} | α, β) π(α, β | y) dα dβ
where
f(y_{n+1} | α, β) = [β^α / Γ(α)] exp(−β y_{n+1}) y_{n+1}^{α−1}

23 Contour Plot of Posterior Distribution
[Figure: contour plot of the joint posterior π(α, β | y) over α and β.]

24 Joint, Marginal, and Conditional Distributions
Posterior:
π(α, β | y) ∝ [β^{nα+4} / Γ(α)^n] exp(−β(Σ y_i + 1)) α exp(−α) (∏ y_i)^{α−1}
Conditionals:
π(α | β, y) is hard.
β | α, y ~ Gamma(nα + 5, Σ y_i + 1)
Marginals:
π(β | y) is hard.
π(α | y) ∝ [α exp(−α) / Γ(α)^n] [Γ(nα + 5) / (Σ y_i + 1)^{nα+5}] (∏ y_i)^{α−1}
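
The factorization of the posterior into π(α | y) and the Gamma conditional for β suggests a simple composition sampler: evaluate π(α | y) on a grid, sample α from the grid, draw β | α, y from its Gamma conditional, and then draw y_{n+1} ~ Gamma(α, β) to approximate the predictive distribution. A minimal base-R sketch, assuming a data vector y and an arbitrary grid for α:

y <- rgamma(20, shape = 2, rate = 0.5)   # placeholder data; replace with the observed sample
n <- length(y)
S <- sum(y)
L <- sum(log(y))

# log pi(alpha | y), up to an additive constant
log_post_alpha <- function(a) {
  log(a) - a - n * lgamma(a) + lgamma(n * a + 5) - (n * a + 5) * log(S + 1) + (a - 1) * L
}

# Grid approximation to the marginal posterior of alpha
a_grid <- seq(0.1, 10, length.out = 2000)
lp <- log_post_alpha(a_grid)
w <- exp(lp - max(lp))
w <- w / sum(w)

# Composition sampling: alpha from the grid, then beta | alpha, y, then a predictive draw
m <- 10000
alpha <- sample(a_grid, m, replace = TRUE, prob = w)
beta  <- rgamma(m, shape = n * alpha + 5, rate = S + 1)
y_next <- rgamma(m, shape = alpha, rate = beta)   # Monte Carlo draws from f(y_{n+1} | y)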
