
ISyE8843A, Brani Vidakovic   Handout 8

Hierarchical Bayes and Empirical Bayes. ML II Method

Hierarchical Bayes and Empirical Bayes are related by their goals, but quite different in the methods by which these goals are achieved. The attribute "hierarchical" refers mostly to the modeling strategy, while "empirical" refers to the methodology. Both methods are concerned with specifying the distribution at the prior level: hierarchical Bayes does so via Bayesian inference involving additional levels of hierarchy (hyperpriors and hyperparameters), while empirical Bayes uses the data more directly. In expanding Bayesian models and inference to more complex problems, going beyond the simple likelihood-prior-posterior scheme, a hierarchy of models may be needed. The parameters of interest then enter the model via their realizations, which are modeled much as if they were measurements. The common name "parameter population distribution" is indicative of the nature of the approach.

1 Hierarchical Bayesian Analysis

Hierarchical Bayesian analysis is a convenient representation of a Bayesian model, in particular of the prior π, via a conditional hierarchy of so-called hyperpriors π_1, ..., π_{n+1},

π(θ) = ∫ π_1(θ|θ_1) π_2(θ_1|θ_2) ··· π_n(θ_{n-1}|θ_n) π_{n+1}(θ_n) dθ_1 dθ_2 ··· dθ_n.   (1)

Operationally, the model

[x|θ] ~ f(x|θ), [θ|θ_1] ~ π_1(θ|θ_1), [θ_1|θ_2] ~ π_2(θ_1|θ_2), ..., [θ_{n-1}|θ_n] ~ π_n(θ_{n-1}|θ_n), [θ_n] ~ π_{n+1}(θ_n)   (2)

is equivalent to the model

[x|θ] ~ f(x|θ), [θ] ~ π(θ),

as far as inference on θ is concerned; a small simulation check of this equivalence is sketched below. Notice that in the hierarchy of data, parameters and hyperparameters,

X → θ → θ_1 → θ_2 → ··· → θ_n,

X and the θ_i are independent given θ. That means

[X | θ, θ_1, ...] =_d [X | θ],   [θ_i | θ, X] =_d [θ_i | θ],

where =_d denotes equality in distribution. The joint distribution [X, θ, θ_1, ..., θ_n], which by definition is

[X, θ, θ_1, ..., θ_n] = [X | θ, θ_1, ..., θ_n] [θ | θ_1, ..., θ_n] [θ_1 | θ_2, ..., θ_n] ··· [θ_{n-1} | θ_n] [θ_n],

can therefore be represented as

[X, θ, θ_1, ..., θ_n] = [X | θ] [θ | θ_1] [θ_1 | θ_2] ··· [θ_{n-1} | θ_n] [θ_n];

thus, to fully specify the model, only the neighbouring conditionals [X|θ], [θ|θ_1], [θ_1|θ_2], ..., [θ_{n-1}|θ_n] and the closure distribution [θ_n] are needed. Why then decompose the prior as in (1) and use the model (2)? Here are some of the reasons:

- Modeling requirements may lead to a hierarchy in the prior, for example Bayesian models in meta-analysis;
- The prior information may be separated into a structural part and a subjective/noninformative part at a higher level of the hierarchy;
- Robustness and objectivity: let the data talk about the hyperparameters;
- Computational issues (utilizing hidden mixtures, mixture priors, missing data, MCMC format).
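The equivalence of the hierarchical representation (2) and the collapsed prior (1) is easy to check by simulation. The following is a minimal sketch (not part of the original handout) for a two-stage normal hierarchy, θ_1 ~ N(β, τ²) and θ|θ_1 ~ N(θ_1, σ²), whose collapsed prior is N(β, σ² + τ²); the numerical values are arbitrary and purely illustrative.

import numpy as np

rng = np.random.default_rng(0)
beta, tau2, s2 = 0.0, 2.0, 1.0          # arbitrary illustrative hyperparameters (assumptions)

# theta sampled through the hierarchy (2): theta_1 ~ N(beta, tau2), theta | theta_1 ~ N(theta_1, s2)
theta1 = rng.normal(beta, np.sqrt(tau2), size=100_000)
theta_hier = rng.normal(theta1, np.sqrt(s2))

# theta sampled from the collapsed prior (1): theta ~ N(beta, s2 + tau2)
theta_coll = rng.normal(beta, np.sqrt(s2 + tau2), size=100_000)

print(theta_hier.mean(), theta_hier.var())    # both approximately 0 and 3
print(theta_coll.mean(), theta_coll.var())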

Sometimes it is not computationally feasible to carry out the analysis by reducing the sequence of hyperpriors in (2) to a single prior as in (1). Rather, the Bayes rule is obtained (by using Fubini's theorem) as a repeated integral with respect to more convenient conditional distributions. Here is the result involving the model f, the conditional prior π_1(θ|θ_1), and the hyperprior π_2(θ_1). Suppose the hierarchical model is given as [X|θ] ~ f(x|θ), [θ|θ_1] ~ π_1(θ|θ_1), and [θ_1] ~ π_2(θ_1). Then the posterior distribution can be written as

π(θ|x) = ∫_{Θ_1} π(θ|x, θ_1) π(θ_1|x) dθ_1.

The densities under the integral are

π(θ|x, θ_1) = f(x|θ) π_1(θ|θ_1) / m_1(x|θ_1)   and   π(θ_1|x) = m_1(x|θ_1) π_2(θ_1) / m(x),

where m_1(x|θ_1) = ∫_Θ f(x|θ) π_1(θ|θ_1) dθ is the conditional marginal likelihood and m(x) = ∫_{Θ_1} m_1(x|θ_1) π_2(θ_1) dθ_1 is the marginal. Consequently, for any function h of the parameter,

E^{θ|x} h(θ) = E^{θ_1|x} [ E^{θ|θ_1, x} h(θ) ].   (3)

Example: Suppose you poll n people about their favorite presidential candidate and X of them favor candidate A. The likelihood is [X|p] ~ Bin(n, p), and the proportion p is the parameter of interest. You believe that the proportion is close to 1/2, but you are not quite sure. An appropriate prior on p is the Beta Be(k, k), k ∈ N (see Figure 1(a)); it is symmetric about 1/2, but you are reluctant to specify the natural number k. Thus, [p|k] ~ Be(k, k). Finally, you put a hyperprior on k, π_2(k) ∝ 1/(k(2k - 1)). The hyperprior probability mass function is in fact

π_2(k) = 1 / (2 log(2) k (2k - 1)),   k = 1, 2, ....

What is the Bayes estimator of p if n = 20 and X = 12? The unconditional prior on p is

π(p) = Σ_{k=1}^∞ [ p^{k-1} (1-p)^{k-1} / B(k, k) ] · 1 / (2 log(2) k (2k - 1)) = (1 - |1 - 2p|) / (4 log(2) p (1-p)).   (4)

This prior, depicted in Figure 1(b), is symmetric about 1/2. The posterior does not have a closed form (it can be expressed in terms of special functions), so the Bayes rule has to be found by numerical integration,

δ^π(12) = ∫_0^1 p f(12|p) π(p) dp / ∫_0^1 f(12|p) π(p) dp.
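A quick numerical sketch of this direct route (assuming n = 20 and X = 12, and mirroring the Mathematica code in the Appendix); the code below is illustrative and not part of the original handout.

import numpy as np
from scipy import integrate
from scipy.stats import binom

n, x = 20, 12

def prior(p):
    # unconditional prior (4): (1 - |1 - 2p|) / (4 log(2) p (1 - p))
    return (1.0 - np.abs(1.0 - 2.0 * p)) / (4.0 * np.log(2.0) * p * (1.0 - p))

num = integrate.quad(lambda p: p * binom.pmf(x, n, p) * prior(p), 0, 1, points=[0.5])[0]
den = integrate.quad(lambda p: binom.pmf(x, n, p) * prior(p), 0, 1, points=[0.5])[0]
print(num / den)    # approximately 0.58, slightly below the MLE 12/20 = 0.6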

Figure 1: (a) Beta(k, k) densities for k = 1, 2, 3, 5, 10, and 20. (b) The unconditional prior π(p) from (4).

Mathematica programming is particularly convenient in this case, and the integrals have an exact solution for given n and X (see the Appendix). The above approach to producing the Bayes estimator is direct: the unconditional prior is obtained as a mixture of Betas, i.e., we utilized the transition from the hierarchy in (2) to the single prior in (1).

Now we consider the same problem by utilizing the hierarchical Bayes representation discussed previously and equation (3). The hierarchical Bayes approach is as follows. The Bayes rule is δ^π(x) = E^{k|x} E^{p|k,x} p, where the expectation E^{k|x} is taken with respect to π_1(k|x) and E^{p|k,x} with respect to π_1(p|k, x). The distribution of [p|k, x] is again Beta,

π_1(p|k, x) ∝ p^x (1-p)^{n-x} · p^{k-1} (1-p)^{k-1},

i.e., Beta with parameters x + k and n - x + k, so the conditional expectation of p is

E^{p|k,x} p = (x + k) / ((x + k) + (n - x + k)) = (x + k) / (2k + n).

The distribution π_1(k|x) = m_1(x|k) π_2(k) / m(x) does not have a closed form. Here m_1(x|k) = ∫ f(x|p) π_1(p|k) dp is Beta-Binomial,

m_1(x|k) = Γ(2k)/(Γ(k)Γ(k)) · Γ(n+1)/(Γ(x+1)Γ(n-x+1)) · Γ(k+x)Γ(k+n-x)/Γ(2k+n).

Of course, m_1(x|k) can be simplified a bit more. The marginal m(x) is Σ_{k=1}^∞ m_1(x|k) / (2 log(2) k (2k-1)), and it does not have a closed form, although it can be expressed in terms of hypergeometric pFq special functions.¹ The Bayes rule is then

δ^π(x) = E^{k|x} E^{p|k,x} p = E^{k|x} (x+k)/(2k+n) = [ Σ_{k=1}^∞ ((x+k)/(2k+n)) m_1(x|k)/(2 log(2) k (2k-1)) ] / [ Σ_{k=1}^∞ m_1(x|k)/(2 log(2) k (2k-1)) ],

which Mathematica expresses as a ratio of two 3F2 hypergeometric functions with argument z = 1, i.e., as an infinite sum of ratios of Gamma functions. This ratio is easily evaluated numerically; see the Mathematica notebook.

¹ The hypergeometric pFq function is defined as pFq([a_1, ..., a_p], [b_1, ..., b_q], z) = Σ_{k=0}^∞ [ (a_1)_k ··· (a_p)_k / ((b_1)_k ··· (b_q)_k) ] z^k / k!, where (a)_n = a(a+1)···(a+n-1) = Γ(a+n)/Γ(a).
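A numerical sketch of this hierarchical route, truncating the infinite sum over k at an assumed cutoff; again illustrative, not from the handout.

import numpy as np
from scipy.special import betaln, comb

n, x = 20, 12
K = 5000                                  # truncation point for the infinite sum (assumption)
k = np.arange(1, K + 1)

# Beta-Binomial marginal m1(x|k) and hyperprior pi2(k) = 1/(2 log 2 * k * (2k - 1)), on the log scale
log_m1 = np.log(comb(n, x)) + betaln(x + k, n - x + k) - betaln(k, k)
log_pi2 = -np.log(2.0 * np.log(2.0) * k * (2.0 * k - 1.0))

w = np.exp(log_m1 + log_pi2)
w /= w.sum()                              # truncated posterior pmf of k given x

print(np.sum(w * (x + k) / (2.0 * k + n)))    # E^{k|x} E^{p|k,x} p; agrees with the direct route

Both routes describe the same posterior, so they produce the same estimate.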

For n = 20 and X = 12 the estimate is δ^π(12) ≈ 0.58. Notice the slight shrinkage toward 1/2 compared with the MLE p̂ = 12/20 = 0.6.

You may have noticed that even in a simple hierarchy, as in the previous example, the Bayesian analysis may not be computationally easy. This is true even when we have a perfect conjugate structure at the different levels of the hierarchy and normal models. The following theorem is adapted from Berger (1985) and is illustrated on the IQ adventures of our old friend Jeremy. Berger (1985), Section 4.6, contains an excellent account of hierarchical models with detailed proofs. The following model, in addition to its educational value, can be quite useful as a modeling tool if one believes in normality at the various stages of the hierarchy.

Assume that X = (X_1, ..., X_p) is a p-dimensional observation with multivariate normal likelihood f(x|θ) = MVN_p(θ, σ_f² I). The parameter θ = (θ_1, ..., θ_p) is of interest, and the positive scalar σ_f² is assumed known. The first-stage prior on θ, π_1(θ|µ_π, σ_π²), is multivariate normal MVN_p(µ, σ_π² I), where µ = µ_π 1 and 1 is the p × 1 vector of ones. The hyperparameter in this case is two-dimensional, λ = (µ_π, σ_π²). To complete the model, assume π_2(λ) = π_2(µ_π) π_2(σ_π²), with π_2(µ_π) = N(β, τ²) and an appropriate π_2(σ_π²). Of course β, τ², and possible parameters in π_2(σ_π²) are assumed known, i.e., the hierarchy stops here. The following theorem gives an explicit (up to a univariate numerical integration) Bayes estimator of θ and its covariance matrix.

Theorem 1. The posterior mean is δ^π(x) = E^{θ|x} θ = E^{σ_π²|x} δ_1(x, σ_π²), where

δ_1(x, σ_π²) = x - [σ_f²/(σ_f² + σ_π²)] (x - x̄ 1) - [σ_f²/(σ_π² + σ_f² + pτ²)] (x̄ - β) 1.   (5)

The posterior covariance matrix is

V = E^{σ_π²|x} [ σ_f² I - σ_f⁴/(σ_π² + σ_f²) I + σ_f⁴ τ²/((σ_π² + σ_f²)(pτ² + σ_π² + σ_f²)) J + (δ_1(x, σ_π²) - δ^π(x))(δ_1(x, σ_π²) - δ^π(x))' ],

where J is the p × p matrix of ones. The distribution of [σ_π²|x], needed for the above expectations, satisfies

π_1(σ_π²|x) ∝ exp{ -(1/2) [ s²/(σ_π² + σ_f²) + p(x̄ - β)²/(pτ² + σ_π² + σ_f²) ] } / [ (σ_π² + σ_f²)^{(p-1)/2} (pτ² + σ_π² + σ_f²)^{1/2} ] · π_2(σ_π²),

where s² = Σ_{i=1}^p (x_i - x̄)².

The expectation with respect to [σ_π²|x] has to be carried out numerically. It is important that the marginal posterior π_1(σ_π²|x) be proper (integrate to a finite value). This is true whenever the closure prior π_2(σ_π²) is bounded and p ≥ 3.
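A numerical sketch of Theorem 1 with the flat closure prior π_2(σ_π²) = 1. The five scores below are hypothetical placeholders, not the actual data, which are processed in jeremy.nb; the hyperparameter values follow the exercise.

import numpy as np
from scipy import integrate

x = np.array([110.0, 105.0, 96.0, 92.0, 98.0])    # hypothetical scores (assumption)
sigma_f2, beta, tau2 = 80.0, 110.0, 120.0          # values from the exercise below
p = len(x)
xbar = x.mean()
s2 = np.sum((x - xbar) ** 2)

def delta1(v):
    # conditional posterior mean (5), given sigma_pi^2 = v
    return (x - sigma_f2 / (sigma_f2 + v) * (x - xbar)
              - sigma_f2 / (sigma_f2 + v + p * tau2) * (xbar - beta))

def post_v(v):
    # marginal posterior of sigma_pi^2, up to a constant, with the flat closure prior
    q = s2 / (v + sigma_f2) + p * (xbar - beta) ** 2 / (p * tau2 + v + sigma_f2)
    return np.exp(-0.5 * q) / ((v + sigma_f2) ** ((p - 1) / 2.0)
                               * np.sqrt(p * tau2 + v + sigma_f2))

c = integrate.quad(post_v, 0, np.inf)[0]
theta_hat = np.array([integrate.quad(lambda v: delta1(v)[i] * post_v(v), 0, np.inf)[0]
                      for i in range(p)]) / c
print(theta_hat)    # coordinatewise shrinkage of the scores toward a compromise of xbar and beta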

Exercise: Suppose that Jeremy, our IQ-concerned fellow, has taken 5 IQ tests in the last 5 years and has obtained a vector of scores X = (X_1, ..., X_5). Assume that each measurement is X_i ~ N(θ_i, 80), i = 1, ..., 5, and that the θ_i represent realizations of a random variable describing Jeremy's true ability. Unlike before, we assume that his true ability changes randomly in time; however, the underlying law from which such time-varying abilities are generated is common. Thus, θ_i ~ N(µ_π, σ_π²), i = 1, ..., 5. Finally, the model is closed by assuming that µ_π ~ N(110, 120) and π_2(σ_π²) = 1. Find the Bayes estimator of θ and the covariance matrix of the estimate.

Solution: The estimator of θ can be found coordinatewise. Theorem 1 is directly applicable with σ_f² = 80, β = 110, τ² = 120, and p = 5. The numerical values of θ̂ and of the posterior covariance matrix V are produced by the MATHEMATICA program jeremy.nb on the web site.

Figure 2: Marginal posterior π_1(σ_π²|x) when π_2(σ_π²) = 1 in the IQ example.

Figure 2 shows π_1(σ_π²|x) when π_2(σ_π²) = 1. Notice that even though the closure prior π_2(σ_π²) is improper, the marginal posterior is a proper density, although quite flat.

2 Empirical Bayes. ML II Method

Empirical Bayes has several formulations. The original formulation of empirical Bayes assumes that past values of X_i and of the corresponding parameters θ_i are known to the statistician, who then, on the basis of the current observation X_{n+1}, tries to make inference about the unobserved θ_{n+1}. Of course, the parameters θ_i are seldom known. However, it may be assumed that the past (and current) θ's are realizations from the same unknown prior distribution. Empirical Bayes is an approach to inference in which the observations are used to select the prior, usually via the marginal distribution. Once the prior is specified, the inference proceeds in a standard Bayesian fashion. The use of the data to estimate the prior, in addition to their subsequent use in the inference, is criticized by subjectivists, who consider the prior information exogenous to the observations. The repeated use of the data is also loaded with perils, since it can lead to underestimating modeling errors: any data set will be complacent with a model that used the same data to specify some of its features.

An excellent and comprehensive monograph on empirical Bayes methodology is Maritz and Lwin (1989). Empirical Bayes and compound decision problems were introduced by Robbins in the 1950's, but the first implicit empirical Bayes approach goes back to von Mises (1943).

Figure 3: (a) Herbert Robbins, 1915-2001. (b) Richard von Mises, 1883-1953.

Von Mises was considering the problem of testing drinking water for a particular bacterial contamination. In a single experiment 5 small batches of water are taken. The experiment is positive if at least one batch is positive, i.e., contains at least one bacterium. Denote by θ the probability that a single batch contains at least one bacterium, and let X be the number of positive batches among the 5 taken. Here

m(x) = ∫_0^1 C(5, x) θ^x (1-θ)^{5-x} π(θ) dθ,

and π(θ) is not specified. Von Mises was interested in P(θ ≥ θ_0 | X = x) for an appropriate threshold θ_0. For the rest of the story and its solution, consult von Mises (1943). We modify von Mises' problem into an exercise and provide a step-by-step solution.

Suppose that in 16 independent experiments, each containing 5 batches of water, the numbers of batches found positive for bacteria were

number of positive batches:  0   1   2   3
number of experiments:       7   5   3   1

Consider these 16 measurements historic and coming from Bin(5, θ_i) distributions, where the θ_i are different; for example, the experiments may have been conducted at different places. Take a 17th experiment and measure X_17. Assume now that θ_i, i = 1, ..., 17, have a Beta(α, β) distribution. Evaluate P(θ_17 ≤ θ_0 | X_17 = x_17) for a given threshold θ_0. Here are the steps in the solution:

(i) Find the marginal for X_i in experiment i. The result is Beta-Binomial,

P(X_i = k | α, β) = C(5, k) B(k + α, 5 - k + β) / B(α, β),   k = 0, 1, ..., 5.

What is critical here is that this distribution is free of the variable θ_i (pun not intended).

(ii) Estimate α and β from the marginal. The method of moments is most straightforward. The theoretical moments

E(X_i) = 5α/(α + β)   and   E(X_i²) = 5α(5α + 5 + β)/((α + β)(α + β + 1))

are replaced by the empirical moments based on the historic data, m_1 = (1/16) Σ_{i=1}^{16} X_i = 0.875 and m_2 = (1/16) Σ_{i=1}^{16} X_i² = 1.625, and the two equations are solved for α and β. The solutions are

α̂ = m_1 (m_2 - 5 m_1) / (m_1 (4 m_1 + 5) - 5 m_2) = 3.5,
β̂ = (5 - m_1)(m_2 - 5 m_1) / (m_1 (4 m_1 + 5) - 5 m_2) = 16.5.

(iii) Express P(θ_17 ≤ θ_0 | X_17 = x_17) in terms of incomplete Beta functions with the estimated hyperparameters α̂ and β̂: the posterior of θ_17 is Be(α̂ + x_17, β̂ + 5 - x_17), so the probability is the ratio of an incomplete Beta function to the complete one, computed in MATHEMATICA as Beta[θ_0, α̂ + x_17, β̂ + 5 - x_17]/Beta[α̂ + x_17, β̂ + 5 - x_17].
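A compact numerical sketch of steps (ii)-(iii); the values of x_17 and θ_0 below are only illustrative assumptions.

import numpy as np
from scipy.stats import beta

counts = np.repeat([0, 1, 2, 3], [7, 5, 3, 1])   # the 16 historic counts of positive batches
m1, m2 = counts.mean(), np.mean(counts ** 2)     # 0.875 and 1.625

den = m1 * (4 * m1 + 5) - 5 * m2
a_hat = m1 * (m2 - 5 * m1) / den                 # 3.5
b_hat = (5 - m1) * (m2 - 5 * m1) / den           # 16.5

# posterior of theta_17 given X_17 = x17 is Be(a_hat + x17, b_hat + 5 - x17)
x17, theta0 = 1, 0.2                             # illustrative values (assumptions)
print(a_hat, b_hat)
print(beta.cdf(theta0, a_hat + x17, b_hat + 5 - x17))   # P(theta_17 <= theta0 | X_17 = x17)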

Unlike the von Mises example, the original Robbins formulation of empirical Bayes was nonparametric. In the following example the Bayes rule with respect to an unknown prior is first expressed in terms of the marginal distribution. Nonparametric empirical Bayes then uses the historic data to estimate the marginal distribution in a nonparametric fashion, and the estimated marginal is plugged into the formal Bayes rule.

Example: Assume that X|θ ~ Poi(θ) and that π(θ) is to be specified. The Bayes rule is

δ^π(x) = ∫ θ (θ^x/x!) e^{-θ} π(θ) dθ / ∫ (θ^x/x!) e^{-θ} π(θ) dθ = (x + 1) ∫ (θ^{x+1}/(x+1)!) e^{-θ} π(θ) dθ / ∫ (θ^x/x!) e^{-θ} π(θ) dθ = (x + 1) m_π(x + 1)/m_π(x),

i.e., the Bayes rule depends on the unknown prior only through the marginal (prior predictive) distribution m_π(x).

Let X_i|θ_i ~ Poi(θ_i), i = 1, ..., n + 1, where all the θ_i are parameters having the same distribution. We are interested in estimating θ_{n+1}, using X_{n+1} and X_1, ..., X_n. The estimator (X_1 + ··· + X_{n+1})/(n + 1) cannot be used, since the underlying θ's are different; thus the MLE of θ_{n+1} is simply X_{n+1}. The (nonparametric) empirical Bayes rule replaces m_π by its empirical counterpart,

δ^π(x_{n+1}; x_1, ..., x_n) = (x_{n+1} + 1) m̂_n(x_{n+1} + 1)/m̂_n(x_{n+1}),   where   m̂_n(x) = #{i ≤ n : x_i = x}/n.

The performance of the estimator can be improved if the estimated marginals m̂_n are smoothed. The MATLAB file eb.m demonstrates the NPEB estimator. A description of this m-file and of the simulations in it will be added soon; in the meantime, please take a look at the file, since it is annotated.

Now consider the same setup, X_i|θ_i ~ Poi(θ_i), but this time with the prior distribution of the θ's assumed known up to a hyperparameter: the θ_i have an exponential distribution with density π(θ_i|λ) = λ e^{-λ θ_i} 1(θ_i ≥ 0), λ ∈ R_+. The Negative Binomial is the marginal distribution for a Poisson likelihood and a Gamma prior; when the shape parameter of the Gamma prior equals 1, the Negative Binomial becomes the Geometric distribution (see an earlier handout),

m_π(x_i) = (1/(1 + λ))^{x_i} · λ/(1 + λ).

The MLE of λ based on x_1, ..., x_n is λ̂_MLE = 1/x̄. The Bayes estimator of θ_{n+1} is (x_{n+1} + 1)/(λ + 1), because of the Poisson/Gamma conjugate structure, so the parametric empirical Bayes estimator is

δ(x_{n+1}; x_1, ..., x_n) = (x̄/(x̄ + 1)) (x_{n+1} + 1).

When n → ∞, m̂_n(x) → m_π(x) in probability, and the empirical Bayes rule is consistent, δ^π(x; x_1, ..., x_n) → δ^π(x).
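A small simulation sketch in the spirit of eb.m and of Figure 4 below, comparing the MLE, the nonparametric EB rule and the parametric EB rule; all simulation settings are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
n, lam = 200, 0.5                         # illustrative settings (assumptions)
theta = rng.exponential(scale=1.0 / lam, size=n + 1)
x = rng.poisson(theta)                    # x[:n] are historic, x[n] is the current observation

hist, x_new = x[:n], x[n]

def m_hat(t):
    # empirical marginal based on the historic data
    return float(np.mean(hist == t))

mle = x_new
npeb = (x_new + 1) * m_hat(x_new + 1) / max(m_hat(x_new), 1.0 / n)   # Robbins' rule (guarded denominator)
peb = hist.mean() / (hist.mean() + 1.0) * (x_new + 1)                # parametric EB with lambda_hat = 1/xbar

print("MLE:", mle, " NPEB:", round(npeb, 3), " PEB:", round(peb, 3),
      " true theta:", round(theta[n], 3))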

Figure 4: Comparison of MLE, NPEB, and PEB. The circle is the true parameter θ.

Exercise: Let X|θ ~ Geom(1 - θ) with f(x|θ) = θ^x (1 - θ), x = 0, 1, 2, ..., and let the prior π(θ) be unknown. Show that δ^π(x) = m_π(x + 1)/m_π(x). For X|θ ~ NB(n, θ) with f(x|θ) = [Γ(n + x)/(Γ(x + 1)Γ(n))] θ^x (1 - θ)^n, the Bayes rule is δ^π(x) = [(x + 1)/(n + x)] m_π(x + 1)/m_π(x).

James-Stein Estimator and its EB Justification. Consider the estimation of θ in the model X ~ MVN_p(θ, I) under squared error loss L(θ, a) = Σ_i (θ_i - a_i)². For p = 1 and 2 the estimator θ̂ = X is admissible (and unique minimax), i.e., no estimator has uniformly better risk. However, for p ≥ 3, X is neither unique minimax nor admissible. A better estimator is

δ_JS(X) = (1 - (p - 2)/Σ_{i=1}^p X_i²) X,

known as the James-Stein estimator. An empirical Bayes justification of δ_JS(X) is provided next. Suppose that θ has prior distribution θ ~ MVN_p(0, τ² I), where the hyperparameter τ² is not known and will be estimated from the sample, in this case the single observation X. The Bayes rule under squared error loss is

δ_B(X) = (τ²/(1 + τ²)) X = (1 - 1/(1 + τ²)) X.

The marginal (prior predictive) distribution of X is MVN_p(0, (1 + τ²) I). For such X, the random variable

T = Σ_{i=1}^p X_i²/(1 + τ²)

has a χ²_p distribution.

9 has χ p distribution. Then, Thus E T = = = = E + τ p i= X i t Γ(p/ ) Γ(p/) p/ p. = p t p/ e t/ Γ(p/) dt p/ E p p i= X i t p/ e t dt Γ(p/ )p/ = + τ. Therefore, the method-of-moments estimator of +τ is δ EB (X) = p p, which yields an empirical Bayes estimator i= X i ( p ) p X. i= X i.. ML II We already have seen the spirit of ML II method in Parametric Empirical Bayes. The ML II approach was proposed by I. J. Good, a statistician that was in the team who broke German code in the WWII. The idea is to mimic the maximum likelihood estimation at the marginal level: Select a prior π that maximizes m π (x), given the data. Figure 5: I. J. Good, born 96 Exercise: Suppose that Bayesian have chosen a prior π but wants to look at all priors close to π. One such family of priors, close to π is contamination family, Γ = {( ɛ)π + ɛq, q Q}. Suppose that Q is a family of all distributions. Determine empirical Bayes choice. 9

2.1 ML II

We have already seen the spirit of the ML II method in parametric empirical Bayes. The ML II approach was proposed by I. J. Good, a statistician who was on the team that broke the German codes in WWII. The idea is to mimic maximum likelihood estimation at the level of the marginal: select the prior π that maximizes m_π(x), given the data.

Figure 5: I. J. Good, born 1916.

Exercise: Suppose that a Bayesian has chosen a prior π_0 but wants to look at all priors close to π_0. One such family of priors close to π_0 is the ε-contamination family, Γ = {(1 - ε)π_0 + εq, q ∈ Q}. Suppose that Q is the family of all distributions. Determine the empirical Bayes (ML II) choice.

Hint: Observe that for π ∈ Γ the marginal is m_π(x) = (1 - ε) m_0(x) + ε m_q(x). Also, for any model f(x|θ), a weighted average cannot exceed the mode, i.e., ∫ f(x|θ) q(θ) dθ ≤ f(x|θ̂_MLE) = ∫ f(x|θ) δ_{θ̂_MLE}(dθ), where δ_{θ̂_MLE} is the point mass distribution concentrated at θ̂_MLE. Thus the ML II (empirical Bayes) choice is π̂(θ) = (1 - ε) π_0(θ) + ε δ_{θ̂_MLE}(θ).
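As a parametric illustration of the ML II recipe (choose the hyperparameters that maximize the marginal likelihood), the following sketch fits µ and τ² for the normal-normal model that also appears in Exercise 2 below; the data are simulated and purely illustrative.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(3)
x = rng.normal(1.5, 2.0, size=50)        # illustrative data; marginally x_i ~ N(mu, 1 + tau^2)

def neg_log_marginal(par):
    mu, log_tau2 = par
    return -np.sum(norm.logpdf(x, loc=mu, scale=np.sqrt(1.0 + np.exp(log_tau2))))

res = minimize(neg_log_marginal, x0=np.array([0.0, 0.0]))
mu_hat, tau2_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, tau2_hat)                                            # ML II hyperparameters
print(x.mean(), max(np.mean((x - x.mean()) ** 2) - 1.0, 0.0))      # closed-form ML II solution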

Exercises

1. Assume X|θ is exponential E(1/θ) with density f(x|θ) = (1/θ) e^{-x/θ}, x ≥ 0. Let F be the cdf corresponding to f. Assume a prior π(θ) on θ. Let m_π(x) = ∫_Θ f(x|θ) π(θ) dθ be the marginal and M_π(x) = ∫_0^x m_π(t) dt the corresponding c.d.f.
(a) Show that θ = (1 - F(x|θ))/f(x|θ).
(b) Show that the Bayes estimator of θ with respect to π is δ(x) = (1 - M_π(x))/m_π(x). [Hint: You will need a version of Fubini's theorem (Tonelli's theorem) to change the order of integration; Tonelli's theorem allows the change when the integrands are nonnegative.]
(c) Suppose you observe X_i|θ_i ~ E(1/θ_i), i = 1, ..., n + 1. Explain how you would estimate θ_{n+1} in the empirical Bayes fashion, using the result in (b).

2. Assume [X|θ] ~ N(θ, 1) and [θ|µ, τ²] ~ N(µ, τ²).
(a) Find the marginal distribution of X.
(b) What are the moment-matching estimators of µ and τ² if a sample X_i ~ N(θ_i, 1), i = 1, ..., n, is available? [Hint: Find the moments of the marginal and be careful, the estimator of the variance needs to be nonnegative.]
(c) Propose an empirical Bayes estimator of θ based on the considerations in (b).
(d) What modifications in (b) are needed if you use the ML II estimators of µ and τ²?
[Solution sketch for the moment matching: the marginal is N(µ, 1 + τ²), µ̂ = X̄, and the estimator of 1 + τ² is s², so τ̂² = max{0, s² - 1}. For the ML II estimate, τ̂² = max{0, ((n-1)/n) s² - 1}.]

3. If the data X_i ~ f(x_i|θ) cannot be reduced by a sufficient statistic, a so-called pseudo-Bayes approach is possible. Let T be an estimator of θ whose distribution g(t|θ) is known, and let π be the adopted prior. Instead of finding the Bayes rule

δ^π(x_1, ..., x_n) = ∫_Θ θ Π_{i=1}^n f(x_i|θ) π(θ) dθ / ∫_Θ Π_{i=1}^n f(x_i|θ) π(θ) dθ,

one finds the pseudo-Bayes rule

δ_*^π(t) = ∫_Θ θ g(t|θ) π(θ) dθ / ∫_Θ g(t|θ) π(θ) dθ.

Suppose that you have the model X_i ~ N(θ, 1), θ ~ N(µ, τ²). For n large, the distribution of the sample median m = Med(X_1, X_2, ..., X_n) is approximately N(θ, π/(2n)). Write down the pseudo-Bayes estimator of θ based on the median and its normal approximation.

4. Another way to justify the Stein shrinkage estimator is to mimic Exercise 2. Suppose you have the model X ~ MVN_p(θ, I), θ ~ MVN_p(0, τ² I).
(i) Find the marginal distribution of X.
(ii) Show that in (i) the MLE of τ² is τ̂² = Σ_i x_i²/p - 1 if Σ_i x_i² > p, and τ̂² = 0 otherwise.
(iii) Replacing the MLE in the Bayes estimator

δ^π(x) = (τ²/(1 + τ²)) x,

the truncated James-Stein estimator is obtained,

δ_EB(x) = (1 - p/Σ_i x_i²)_+ x,

where (x)_+ = max{x, 0} is the positive part of x.

References

[1] Berger, J. (1985). Statistical Decision Theory and Bayesian Analysis, Second Edition. Springer-Verlag, New York.
[2] Maritz, J. S. and Lwin, T. (1989). Empirical Bayes Methods, Second Edition. Chapman and Hall, New York.
[3] Robert, C. (2001). The Bayesian Choice, Second Edition. Springer-Verlag, New York.
[4] von Mises, R. (1943). On the correct use of Bayes' formula. Ann. Math. Statist., 13.

Appendix

Mathematica code for finding the hierarchical Bayes estimator in the example about the political poll:

Integrate[p Gamma[n + 1]/(Gamma[x + 1] Gamma[n - x + 1]) p^x (1 - p)^(n - x)
    (1 - Abs[1 - 2 p])/(4 Log[2] p (1 - p)) /. {n -> 20, x -> 12}, {p, 0, 1}] /
  Integrate[Gamma[n + 1]/(Gamma[x + 1] Gamma[n - x + 1]) p^x (1 - p)^(n - x)
    (1 - Abs[1 - 2 p])/(4 Log[2] p (1 - p)) /. {n -> 20, x -> 12}, {p, 0, 1}] // N
