
1 Bayesian Statistics. Adrian Raftery and Jeff Gill. One-day course for the American Sociological Association, August 15, 2002.

2 Outline

1. Bayes's theorem
2. Bayesian estimation
   - One-parameter case
   - Conjugate priors
   - Noninformative priors
   - Multiparameter case
   - Integrating out parameters
   - Asymptotic approximations
   - When is Bayes useful?
   - Example: regression in macrosociology
3. Bayesian testing and model selection
   - Bayesian testing: Bayes factors
   - Bayesian model selection: posterior model probabilities
   - Bayesian model averaging: accounting for model uncertainty
   - Examples
4. Further reading

3 Purposes of Statistics

Scientific inference:
- Find causes
- Quantify effects
- Compare competing (causal) theories

Prediction:
- Policy-making
- Forecasting (e.g. future population, results of legislation)
- Control of processes
- Decision-making

4 Standard (frequentist) Statistics

Estimation is based on finding a good point estimate and assessing its performance under repetitions of the experiment (or survey) that gave rise to the data.

The best point estimate is often the maximum likelihood estimator (MLE). In large samples, for regular models, this is the most efficient estimator (i.e. the one with the smallest mean squared error). In relatively simple models, the MLE is often the obvious estimator; for example, for estimating the mean of a normal distribution, the MLE is just the sample mean.

For testing one hypothesis against another within which it is nested (i.e. of which it is a special case), the best test is often the likelihood ratio test. Standard statistical methods for testing nonnested models against one another, or for choosing among many models, are not well developed.

5 Bayesian Statistics

Based on the idea of expressing uncertainty about the (unknown) state of nature in terms of probability. You start with a probability distribution reflecting your current state of knowledge. When new data become available, you update your probability distribution in light of the new data. In a probability framework, there is only one way to do this: via Bayes's theorem.

This solves many of the technical problems of standard statistics: nonregular models, testing nonnested models, choosing among many models. It also provides a way of incorporating external information (outside the current data set).

6 The key idea is subjective probability: the current distribution of the state of nature reflects your opinion. This has been criticized as non-scientific. However, it turns out that when there is a moderate amount of evidence, even people who disagree violently initially end up in substantial agreement, so long as they follow Bayes's theorem. And if there isn't enough evidence, it's reasonable for people who disagreed to start with to go on disagreeing (although not as much as at first).

7 Bayes's Theorem: Notation

Bayes's theorem relates to the problem of adjudicating between competing hypotheses given observations.

Suppose $B$ is an event, i.e. something that either happens or doesn't. Suppose $A_1, \dots, A_K$ are other events that form a partition. This means that their union is the certain event (i.e. at least one of them is sure to be the case), and their pairwise intersections are empty. Mathematically,
$$A_1 \cup A_2 \cup \cdots \cup A_K = \Omega, \qquad A_i \cap A_j = \emptyset \ (i \neq j),$$
where $\Omega$ is the certain event and $\emptyset$ is the null event.

$A_1, \dots, A_K$ can be thought of as competing hypotheses to explain the observed event, $B$.

8 Bayes's Theorem

Bayes's Theorem: In that case, the conditional probability of $A_i$ given $B$ is
$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{P(B)}.$$

To calculate $P(B)$, we may need a further result, the Law of Total Probability. The overall, or marginal, probability of the event $B$ can be expressed in terms of the probabilities of the $A_j$ and the conditional probabilities of $B$ given each of the $A_j$'s, as follows:
$$P(B) = \sum_{j=1}^{K} P(A_j)\,P(B \mid A_j).$$

9 Bayes's Theorem: An Example

Example 1: An item is produced in 3 different factories, $F_1$, $F_2$, $F_3$. The proportions produced in the 3 factories, and the proportions defective in each, are as follows:

Factory   % produced   % defective
   1          50            2
   2          30            3
   3          20            4

An item is purchased and found to be defective; call this event $D$. What is the probability that it was from factory 1?

First, we find the overall probability of a defective, $P(D)$, from the Law of Total Probability:
$$P(D) = .50 \times .02 + .30 \times .03 + .20 \times .04 = .027.$$

Then Bayes's theorem tells us the probability that the

10 item was from factory 1:
$$P(F_1 \mid D) = \frac{P(F_1)\,P(D \mid F_1)}{P(D)} = \frac{.50 \times .02}{.027} = .37.$$

This makes intuitive sense: before we found out that the item was defective, we knew that the probability it was from factory 1 was .50. Then we found out it was defective. Factory 1 has a lower rate of defectives than the other two, so finding out that the item was defective made it less likely to be from factory 1, i.e. gave it a probability lower than .50. And indeed, so it is: .37 instead of .50.

Another version of Bayes's theorem:
$$P(A_i \mid B) \propto P(A_i)\,P(B \mid A_i),$$
where $\propto$ means "proportional to." To implement this, we calculate $P(A_i)\,P(B \mid A_i)$ for each $i$, add them up, and then divide each by the sum so that they add up to 1 (which they have to, because they're

11 probabilities of a partition).

Example 1 (ctd): $P(F_1)\,P(D \mid F_1) = .50 \times .02 = .010$; $P(F_2)\,P(D \mid F_2) = .30 \times .03 = .009$; $P(F_3)\,P(D \mid F_3) = .20 \times .04 = .008$. These sum to .027, so
$$P(F_1 \mid D) = \frac{.010}{.027} = .37, \quad P(F_2 \mid D) = \frac{.009}{.027} = .33, \quad P(F_3 \mid D) = \frac{.008}{.027} = .30.$$

Another way of looking at this is that $F_1$, $F_2$, $F_3$ are the possible states of nature, and that $D$ is the data (datum). We then use the data to decide how likely the different states of nature are relative to one another. This is the idea that underlies Bayesian statistics.

$P(D \mid F_i)$ is the probability of the data given the state of nature $F_i$; this is called the likelihood of $F_i$. $P(F_i)$ is the probability that the item was from $F_i$ before we knew whether or not it was defective, i.e. before we observed the data; this is called the prior probability of $F_i$.
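As a quick check on Example 1, the "proportional to" form of Bayes's theorem takes only a few lines of Python. This is a minimal sketch using the production and defect rates given above:

```python
import numpy as np

# Prior: proportion of items produced by each factory
prior = np.array([0.50, 0.30, 0.20])
# Likelihood: probability an item is defective, given its factory
like = np.array([0.02, 0.03, 0.04])

# Bayes's theorem via the proportional form: multiply prior by
# likelihood, then normalize so the terms sum to 1.
unnorm = prior * like          # .010, .009, .008
marginal = unnorm.sum()        # P(defective) = .027
posterior = unnorm / marginal  # .37, .33, .30

print(marginal, posterior)
```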

12 $P(D)$ is called the marginal probability of the data or, for reasons we will see later, the integrated likelihood. $P(F_i \mid D)$ is called the posterior probability of $F_i$ given $D$. The set of posterior probabilities is called the posterior distribution of the state of nature. In Bayesian statistics, all inference is based on the posterior distribution.

13 Bayesian Estimation of One Parameter

Now we consider the situation where the state of nature is a parameter to be estimated, denoted by $\theta$. For now, we'll just consider the case where $\theta$ is one-dimensional, i.e. where there's only one parameter. An example is the mean of a distribution.

This is like the factories-and-defectives Example 1, but with the difference that the possible states of nature form a continuum, at least approximately, instead of a small number of discrete values. The same basic theory applies, though, with probabilities replaced by probability densities and sums replaced by integrals.

We assume that for each possible value of $\theta$, we know what $p(y \mid \theta)$ is, where $y$ denotes the data. As before, this is called the likelihood. We also assume that we have a probability density function (pdf), $p(\theta)$, that tells us the relative probability of each value of $\theta$ before observing the data.

14 As before, this is called the prior distribution. It can come from prior knowledge. Often it's specified roughly, so that the prior distribution covers the range of plausible values and is fairly flat over that range. We'll see that there is a sense in which the precise form of the prior distribution doesn't matter too much for estimation. We'll give examples in a bit.

Bayes's Theorem for Parameter Estimation, Version 1: The posterior distribution of $\theta$ given data $y$ is
$$p(\theta \mid y) = \frac{p(y \mid \theta)\,p(\theta)}{p(y)} = \frac{\text{likelihood} \times \text{prior}}{\text{integrated likelihood}},$$
where
$$p(y) = \int p(y \mid \theta)\,p(\theta)\,d\theta$$
(the integral being over all values of $\theta$) is the integrated likelihood.

15 Version 2:
$$p(\theta \mid y) \propto p(y \mid \theta)\,p(\theta),$$
i.e. posterior $\propto$ likelihood $\times$ prior.

This gives the posterior distribution only up to a multiplicative constant, but often this is enough, and it avoids the difficulty of evaluating the integrated likelihood (also called the normalizing constant in this context).

16 Example: Normal Mean with Known Variance and One Observation

Example 2 (Box and Tiao 1973): Two physicists, A and B, are trying to estimate a physical constant, $\mu$. They each have prior views based on their professional experience, their reading of the literature, and so on. We will approximate each one's prior distribution of $\mu$ by a normal distribution, $\mu \sim N(\mu_0, \tau_0^2)$.

Suppose now that an unbiased method of experimental measurement is available, and that an observation $y$ made by this method approximately follows a normal distribution with mean $\mu$ and variance $\sigma^2$, where $\sigma^2$ is known from calibration studies. Then the likelihood is
$$p(y \mid \mu) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{(y - \mu)^2}{2\sigma^2}\right\}.$$

Then it can be shown that the posterior distribution of

17 $\mu$ given $y$ is also a normal distribution, with mean $\mu_1$ and variance $\tau_1^2$ such that
$$\mu_1 = \frac{\tau_0^{-2}\,\mu_0 + \sigma^{-2}\,y}{\tau_0^{-2} + \sigma^{-2}} \qquad \text{and} \qquad \tau_1^{-2} = \tau_0^{-2} + \sigma^{-2}.$$

The reciprocal of the variance of a distribution is often called its precision, because the bigger the variance, the lower the precision. Thus $\tau_0^{-2}$ is the prior precision, and $\sigma^{-2}$ is the observation precision. The posterior mean is a weighted average of the prior mean and the observation, with the weights being proportional to the associated precisions.

18 This is an appealing result. The posterior precision is the sum of the prior and observation precisions, reflecting the fact that the two sources of information are pooled together.
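The precision-weighted update is easy to code. Here is a minimal sketch of the one-observation update; the numerical values are illustrative, not from the slides:

```python
def normal_update(mu0, tau0, x, sigma):
    """Posterior mean and sd for mu given one observation x ~ N(mu, sigma^2)
    and prior mu ~ N(mu0, tau0^2)."""
    prior_prec = 1.0 / tau0**2         # prior precision, tau0^{-2}
    obs_prec = 1.0 / sigma**2          # observation precision, sigma^{-2}
    post_prec = prior_prec + obs_prec  # precisions add
    post_mean = (prior_prec * mu0 + obs_prec * x) / post_prec
    return post_mean, post_prec ** -0.5

# Hypothetical numbers: a prior centered at 900 with sd 20,
# and one measurement of 850 with measurement sd 40.
print(normal_update(mu0=900.0, tau0=20.0, x=850.0, sigma=40.0))
```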

19 Normal Mean with Multiple Observations

Now suppose that, instead of one measurement, we have $n$ independent measurements made with the same experimental method, $y_1, \dots, y_n$. Then $y_1, \dots, y_n$ are conditionally independent given $\mu$. This means that, if we knew $\mu$, knowing the value of $y_1$ would tell us nothing about $y_2$, and similarly for any pair of values. (Is this true if we don't know $\mu$? Why?) $y_1, \dots, y_n$ are also said to be exchangeable.

The likelihood is obtained by multiplying the likelihoods for the individual $y_i$'s:
$$p(y_1, \dots, y_n \mid \mu) = \prod_{i=1}^{n} p(y_i \mid \mu).$$
It can be shown that this is proportional (as a function of $\mu$) to a normal density with mean $\bar{y}$ and standard deviation $\sigma/\sqrt{n}$.

20 Then the posterior distribution is again normal, with mean $\mu_n$ and variance $\tau_n^2$ such that
$$\mu_n = \frac{\tau_0^{-2}\,\mu_0 + n\sigma^{-2}\,\bar{y}}{\tau_0^{-2} + n\sigma^{-2}} \qquad \text{and} \qquad \tau_n^{-2} = \tau_0^{-2} + n\sigma^{-2}.$$

Thus the posterior mean is again a weighted average of the prior mean and the mean of the data. The weight associated with the mean of the data is proportional to the number of data points, while the weight associated with the prior remains constant as the amount of data increases. Thus, with large samples the prior matters very little. This is a very general result for Bayesian statistics, and helps to justify its use.
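Extending the earlier sketch to $n$ observations only changes the data precision from $\sigma^{-2}$ to $n\sigma^{-2}$. The loop at the end illustrates how the prior's influence fades as $n$ grows; all numbers are hypothetical:

```python
import numpy as np

def normal_update_n(mu0, tau0, y, sigma):
    """Posterior mean and sd for mu given y_1..y_n iid N(mu, sigma^2),
    sigma known, and prior mu ~ N(mu0, tau0^2)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    prior_prec = 1.0 / tau0**2
    data_prec = n / sigma**2  # the data's weight grows with n
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * mu0 + data_prec * y.mean()) / post_prec
    return post_mean, post_prec ** -0.5

# With more data the posterior mean moves toward the sample mean
# and the posterior sd shrinks (illustrative numbers).
rng = np.random.default_rng(0)
for n in (5, 50, 500):
    y = rng.normal(860.0, 40.0, size=n)
    print(n, normal_update_n(mu0=900.0, tau0=20.0, y=y, sigma=40.0))
```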

21 Inference: Summarizing the Posterior Distribution

In Bayesian statistics, the posterior distribution is "all ye know on Earth, and all ye need to know." It tells us the probability that the parameter of interest lies in any interval, given all our current information. A plot of the posterior density is often useful.

Point estimation: The search for a point estimate is meaningless except in the context of a specific decision (and most decisions don't call for point estimates). A numerical value can be useful for saying where the center of the distribution is. The posterior mode (the most likely value) is the most intuitive summary, but often the posterior mean is the most easily available. The posterior mode and mean are usually close together, but not always. Example: estimating the size of a hard-to-find population (the

22 number of homeless, the number of unregistered guns in America, etc.).

Interval estimation: The most intuitive interval estimate is formed by the 2.5th and 97.5th percentiles of the posterior distribution for a 95% interval (and similarly for other intervals). There are other proposals in the Bayesian literature, like the highest posterior density region, but in my view these do not have much scientific interest.

Roughly summarizing the posterior distribution: Often, in practice, the posterior mean and posterior standard deviation are reported. These are like the MLE and standard error, and are often close to them numerically. Posterior mean ± 2 posterior standard deviations is a rough 95% confidence interval.
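When the posterior is available as a sample of simulated draws (as with the simulation methods discussed later), these summaries are one-liners. A sketch, using a stand-in posterior sample:

```python
import numpy as np

# Summaries of a posterior from a simulated sample: posterior mean and sd,
# and a central 95% interval from the 2.5th and 97.5th percentiles.
# These draws are a stand-in for output from direct or MCMC simulation.
rng = np.random.default_rng(42)
draws = rng.normal(1.2, 0.4, size=100_000)

print("mean:", draws.mean(), "sd:", draws.std())
print("95% interval:", np.percentile(draws, [2.5, 97.5]))
```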

23 Conjugate Priors

In the physical-constant example, the prior was normal, and the posterior was too. So the data updated the parameters of the prior distribution, but not its form. This can be very useful in practical work. A prior distribution that has this property is called a conjugate prior. Often priors of this form are flexible enough to represent prior knowledge fairly well. Most priors used in applied Bayesian work are conjugate.

24 Examples of Conjugate Priors

Some examples of conjugate priors for one-parameter models:

Model                        Prior distribution
Normal with known variance   Normal (for the mean)
Normal with known mean       Gamma (for the precision, the reciprocal of the variance)
Binomial                     Beta
Poisson                      Gamma
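For instance, with a binomial model and a Beta prior the update is pure bookkeeping: a Beta(a, b) prior combined with s successes in n trials gives a Beta(a + s, b + n - s) posterior. A minimal sketch with hypothetical counts:

```python
from scipy import stats

# Beta prior + binomial likelihood: the posterior is Beta again,
# with the prior parameters updated by the observed counts.
a, b = 1.0, 1.0  # Beta(1, 1) prior (uniform)
s, n = 7, 20     # hypothetical data: 7 successes in 20 trials

posterior = stats.beta(a + s, b + (n - s))  # Beta(8, 14)
print(posterior.mean())                     # posterior mean of the proportion
print(posterior.interval(0.95))             # central 95% posterior interval
```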

25 Noninformative Priors

There have been many efforts to find priors that carry no information, or noninformative priors. In general, this has turned out to be a modern version of the Philosopher's Stone.

There are some very simple problems for which there are agreed reference priors. One example is the normal mean problem, for which a flat prior is often used. This is an improper prior, i.e. it does not integrate to 1, because it is constant over the whole real line; instead, it integrates to infinity. Nevertheless, the resulting posterior distribution is proper.

When there is more than one parameter, though, noninformative priors turn out to be very informative about some aspects of the problem, in an unexpected way.

26 Improper noninformative priors can lead to paradoxes and strange behavior, and should be used with extreme caution. The current trend in applied Bayesian statistical work is towards informative and, if necessary, spread-out but proper prior distributions.

27 More Than One Parameter

Suppose that we have two parameters in the model, $\theta_1$ and $\theta_2$. One example is the normal distribution (mean and variance). Then we have a joint prior distribution, $p(\theta_1, \theta_2)$; often, the parameters are independent a priori. We also have a joint likelihood, $p(y \mid \theta_1, \theta_2)$. And so we have a joint posterior distribution, exactly as in the one-parameter case:
$$p(\theta_1, \theta_2 \mid y) \propto p(y \mid \theta_1, \theta_2)\,p(\theta_1, \theta_2).$$

Usually, we're interested in the parameters individually. To get the posterior distribution of $\theta_1$ on its own, for example, we must integrate out $\theta_2$, as follows:
$$p(\theta_1 \mid y) = \int p(\theta_1, \theta_2 \mid y)\,d\theta_2. \tag{1}$$
This follows from the Law of Total Probability. (1) is called the marginal posterior distribution of $\theta_1$. We can then summarize the posterior distribution of

28 $\theta_1$ in the same way as when there's only one parameter (posterior mean or mode, posterior standard deviation, posterior percentiles, plot of the posterior density). The same approach holds when there are more than two parameters (e.g. in regression); then the integral in (1) is a multiple integral over all the parameters except $\theta_1$.
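When the integral in (1) has no closed form but there are only two parameters, it can be approximated on a grid: evaluate the joint posterior, then sum over the nuisance parameter. A sketch for a toy normal model with a flat prior; the data, grids, and prior are assumptions for illustration:

```python
import numpy as np
from scipy import stats

# Toy model: y_i ~ N(mu, sigma^2) with a flat prior on (mu, sigma).
y = np.array([4.2, 5.1, 3.8, 4.9, 5.5])
mu_grid = np.linspace(2.0, 8.0, 301)
sigma_grid = np.linspace(0.2, 4.0, 301)
M, S = np.meshgrid(mu_grid, sigma_grid, indexing="ij")

# Log joint posterior up to a constant: log likelihood (+ flat log prior).
logpost = stats.norm.logpdf(y[:, None, None], loc=M, scale=S).sum(axis=0)
joint = np.exp(logpost - logpost.max())  # unnormalized joint posterior

marg_mu = joint.sum(axis=1)                            # integrate out sigma, as in (1)
marg_mu /= marg_mu.sum() * (mu_grid[1] - mu_grid[0])   # normalize to a density

print(mu_grid[marg_mu.argmax()])  # marginal posterior mode of mu
```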

29 Integrating Out Other Parameters

Sometimes the integral in (1) can be evaluated analytically (i.e. a mathematical expression can be found for it in terms of $\theta_1$). Often it cannot, especially when there are many parameters. Here are some ways to evaluate it:

Analytical evaluation: This is the best, if it can be done.

Asymptotic approximation: Approximately, in large samples, for regular models, the posterior distribution is multivariate normal with mean at the MLE and the same covariance matrix as the MLE, i.e. the inverse of the Fisher information matrix. Then the marginal posterior distribution of each parameter is just normal, with variance equal to the corresponding diagonal element of the inverse Fisher information matrix.

Direct simulation: Sometimes it is possible to simulate from the posterior distribution directly,

30 even if it is hard to integrate analytically. Then you can simulate a big sample from the joint posterior and just strip out the $\theta_1$ values. This gives you a sample from the marginal posterior distribution of $\theta_1$, which can be used to estimate the posterior mean, standard deviation, percentiles, and so on.

This is the case for the normal distribution with both mean and variance unknown. Then the posterior distribution has the form
$$\tau \mid y \sim \text{Gamma}, \qquad \mu \mid \tau, y \sim \text{Normal},$$
where $\mu$ is the mean and $\tau$ is the precision (the reciprocal of the variance). This can be simulated using an algorithm such as the following. Repeat many times:

1. Simulate a value of $\tau$ from the gamma distribution. This can be done directly using available software.

31 2. Simulate a value of $\mu$ from the normal distribution, using the value of $\tau$ simulated in step 1. (See the sketch below.)

Markov chain Monte Carlo (MCMC) simulation: See Jeff Gill's lecture this afternoon.
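The sketch below implements the two steps under the standard noninformative prior $p(\mu, \tau) \propto 1/\tau$ (an assumption; the slides do not specify the prior, and conjugate priors work similarly). Under that prior, $\tau \mid y \sim \text{Gamma}((n-1)/2,\ \text{rate} = \sum_i (y_i - \bar{y})^2 / 2)$ and $\mu \mid \tau, y \sim N(\bar{y}, 1/(n\tau))$:

```python
import numpy as np
rng = np.random.default_rng(1)

# Direct simulation from the posterior of (mu, tau) for normal data with
# both mean and variance unknown, under the prior p(mu, tau) ~ 1/tau
# (an assumption for this sketch).
y = rng.normal(10.0, 2.0, size=30)  # hypothetical data
n, ybar = len(y), y.mean()
ss = ((y - ybar) ** 2).sum()

# Step 1: simulate the precision tau from its gamma marginal posterior.
tau = rng.gamma(shape=(n - 1) / 2, scale=2.0 / ss, size=100_000)
# Step 2: simulate mu from its normal conditional posterior given tau.
mu = rng.normal(loc=ybar, scale=1.0 / np.sqrt(n * tau))

# The mu values alone are a sample from the marginal posterior of mu.
print(mu.mean(), np.percentile(mu, [2.5, 97.5]))
```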

32 When is Bayes Better?

We have seen that Bayesian statistics gives very similar results to standard statistics when three conditions hold:

1. The model is regular (i.e., roughly, the MLE is asymptotically normal, which requires, for example, that the likelihood be smooth and that the amount of information about each parameter increase with the sample size);
2. There's at least a moderate amount of data; and
3. We're doing estimation, rather than testing or model selection.

Bayesian statistics takes more work in standard situations, because you have to assess the prior and investigate sensitivity to it. Thus, when these three conditions hold, Bayesian statistics involves more work than standard statistics (mostly MLE and asymptotic standard errors) but yields similar results.

33 So it doesn't seem too worthwhile in this case. Bayesian statistics can be better in other situations:

Irregular models: The Bayesian solution is immediate; Bayesian statistics doesn't need regularity conditions to work. Examples include estimating population size, change-point models, and hierarchical models (see Jeff's lecture).

Not much data: Here standard methods can give bad answers, and prior information can help a lot. Examples abound in macrosociology.

Testing and model selection: Here Bayesian solutions seem more general and avoid many difficulties with standard methods (nonnested models, many models, failure to consider power when setting significance levels).

34 Example: Bayesian Inference in Comparative Research (Western and Jackman, 1994, APSR)

Problems in comparative research (macrosociology):
- Few cases (e.g. the 23 OECD countries)
- Quite a few parameters in regressions
- Collinearity

Result: weak inferences.

35 Example: Explaining Union Density

Data: 20 democratic countries
Dependent variable: union density
Independent variables: left government, labor-force size, economic concentration
Method: linear regression

36 Bayesian Model Selection

How probable is a model given the data, conditional on the set of models considered, $M_1, \dots, M_K$?

Posterior model probability given data $y$:
$$P(M_k \mid y) = \frac{p(y \mid M_k)\,P(M_k)}{\sum_{l=1}^{K} p(y \mid M_l)\,P(M_l)}.$$

Integrated likelihood of a model:
$$p(y \mid M_k) = \int p(y \mid \theta_k, M_k)\,p(\theta_k \mid M_k)\,d\theta_k = \int \text{likelihood} \times \text{prior}.$$
This comes from the Law of Total Probability.

37 Bayesian Model Selection (ctd)

Posterior odds for $M_0$ against $M_1$:
$$\frac{P(M_0 \mid y)}{P(M_1 \mid y)} = \underbrace{\frac{p(y \mid M_0)}{p(y \mid M_1)}}_{\text{Bayes factor }(B_{01})} \times \underbrace{\frac{P(M_0)}{P(M_1)}}_{\text{prior odds}},$$
where $P(M_k)$ is the prior probability of $M_k$ (often taken to be equal across models).
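In practice the integrated likelihoods are tiny numbers, so posterior model probabilities are best computed on the log scale. A minimal sketch; the log integrated likelihoods and the helper function are hypothetical:

```python
import numpy as np

def posterior_model_probs(log_marglik, prior_probs=None):
    """Posterior model probabilities from log integrated likelihoods,
    computed on the log scale for numerical stability."""
    log_marglik = np.asarray(log_marglik, dtype=float)
    K = len(log_marglik)
    prior = np.full(K, 1.0 / K) if prior_probs is None else np.asarray(prior_probs)
    logw = log_marglik + np.log(prior)
    w = np.exp(logw - logw.max())  # subtract the max to avoid underflow
    return w / w.sum()

# Hypothetical log integrated likelihoods for two models
logml0, logml1 = -152.3, -150.1
print(posterior_model_probs([logml0, logml1]))
print(np.exp(logml0 - logml1))  # Bayes factor B_01 = p(y|M0) / p(y|M1)
```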

38 Properties

Theorem 1: For two nested models, model choice based on the Bayes factor minimizes the Total Error Rate (= Type I error rate + Type II error rate), on average over data sets drawn from the prior.

This suggests a different interpretation of the prior: the set of parameter values over which we would like good performance (cf. simulation studies).

39 Bayesian Model Averaging

Suppose $\Delta$ is a quantity of interest which has the same interpretation over all the models considered, e.g. an observable quantity that can be predicted, at least asymptotically. Then, if there are several models, its posterior distribution is a weighted average over the models:
$$p(\Delta \mid y) = \sum_{k=1}^{K} p(\Delta \mid y, M_k)\,P(M_k \mid y).$$

40 Estimation via Bayesian Model Averaging

Estimation: The BMA estimate of a parameter $\theta$ is
$$\hat{\theta}_{\mathrm{BMA}} = \sum_{k=1}^{K} \hat{\theta}_k\,P(M_k \mid y),$$
where $\hat{\theta}_k$ denotes the posterior mean of $\theta$ under model $M_k$ (often approximately the MLE).

Theorem 2: $\hat{\theta}_{\mathrm{BMA}}$ minimizes MSE among point estimators, where MSE is calculated over data sets drawn from the prior.
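Given posterior model probabilities and per-model estimates, the BMA estimate is just a weighted average. A sketch with hypothetical numbers, where the estimate is 0 under a model that excludes the parameter:

```python
import numpy as np

# BMA point estimate: weight each model's estimate of theta by the
# posterior model probability. All values here are hypothetical.
post_probs = np.array([0.55, 0.35, 0.10])  # P(M_k | y)
theta_hat = np.array([0.42, 0.00, 0.38])   # estimates by model

theta_bma = post_probs @ theta_hat
print(theta_bma)  # 0.269
```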

41 Comments on Bayesian Model Selection/Averaging

- Deals easily with multiple (more than two) models
- Deals easily with nonnested models
- For significance tests, gives a way of choosing the size of the test to balance power and significance; the threshold increases slowly with $n$
- Deals with model uncertainty (data mining)
- Point null hypotheses approximate interval nulls, so long as the width of the interval is less than about 1/2 standard error (Berger and Delampady 1987)

42 The BIC Approximation

$$\mathrm{BIC} = 2\log(\text{maximized likelihood}) - (\text{number of parameters}) \times \log n.$$

Theorem 3:
$$2\log B_{10} - (\mathrm{BIC}_1 - \mathrm{BIC}_0) = O(1),$$
i.e. BIC approximates twice the log Bayes factor to within $O(1)$, no matter what the prior is. The $O(1)$ term is unimportant in large samples, so $(\mathrm{BIC}_1 - \mathrm{BIC}_0)/(2\log B_{10}) \to 1$, so that BIC is consistent. (Cox and Hinkley 1978, Schwarz 1978)
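A sketch of the BIC computation in this slide's convention (larger is better), with hypothetical log likelihoods; the difference BIC_1 - BIC_0 then approximates 2 log B_10:

```python
import numpy as np

def bic(loglik, n_params, n):
    """BIC in this slide's convention: 2 * maximized log likelihood
    minus (number of parameters) * log(n); larger is better."""
    return 2.0 * loglik - n_params * np.log(n)

# Hypothetical fits of a null model M0 and a larger model M1
# to n = 200 observations.
bic0 = bic(loglik=-612.4, n_params=3, n=200)
bic1 = bic(loglik=-604.9, n_params=5, n=200)
print(bic1 - bic0)  # roughly 2 * log B_10, the Bayes factor for M1 vs M0
```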

43 The BIC Approximation (ctd)

Theorem 4: If the prior on the parameter under $M_1$ is $\theta \sim N(\hat{\theta}_0,\, I_1^{-1})$, where $I_1$ is the expected information matrix for one observation (the unit information prior, UIP), then
$$2\log B_{10} = \mathrm{BIC}_1 - \mathrm{BIC}_0 + O(n^{-1/2}),$$
i.e. the approximation is much better for the UIP (Kass and Wasserman 1995).

What if the prior is wrong? [4 slides here: UIP plot, criticism, BF vs. prior sd plot, response.]

44 Small Simulation Study

Based on the Weakliem example:
- Formulate the table as a loglinear model with ANOVA parametrization
- Keep the main effects constant at their values in the Weakliem data
- Set the log-odds ratio to 0 (independence), or to LOR $\neq$ 0 (Weakliem's recommendation)

[Figure 1: distribution of the simulated odds ratios (proportion vs. odds ratio).]

45 Tests Assessed

- LRT at the 5% level
- BIC
- BF: default GLIB prior (scale = 1.65)
- BF: right prior

46 Tests: Total Error Rates

Total error rate = Type I error rate + Type II error rate.

Test                Total error rate (×1000)
LRT 5%              163
BIC                 160
BF: default GLIB    154
BF: right prior     153

47 Calibration of Tests

Of those data sets for which the p-value fell between .01 and .05 (one star), what proportion actually had an odds ratio different from 1? We might hope for somewhere in the region of 95%-99%. Actually, it was 39%.

48 Calibration of Bayes Factors

Of those data sets for which the posterior probability of an association was between 50% and 95% (weak to positive evidence), what proportion actually had an odds ratio different from 1? We might hope for somewhere in the region of 50%-95% (halfway = 73%). Actually, it was:
- BIC: 94%
- GLIB default: 71%
- GLIB right prior: 73%

49 When BIC and a 5% Test Disagree: Is BIC Really Too Conservative?

Consider those data sets for which a 5% test rejects independence (i.e. $p < .05$) but BIC does not (i.e. BIC favors independence). If BIC were really too conservative, we would expect an association to be present in most of these cases, probably not far from 95% of them. Actually, it was present in only 48%

50 of these cases.

51 Estimators

Full model:
1. MLE
2. Bayes: GLIB
3. Bayes: right prior

Model selection:
4. 5% LRT + MLE
5. BIC + MLE
6. Bayes: GLIB
7. Bayes: right prior

BMA:
8. BMA: BIC + MLE
9. BMA: GLIB
10. BMA: right prior

52 Estimators: MSEs

Total MSE (×1000; only three values were recoverable from the source):

Estimator                      Total MSE
Full model
  1. MLE
  2. Bayes: GLIB
  3. Bayes: right prior         48
Model selection
  4. 5% LRT + MLE
  5. BIC + MLE
  6. Bayes: GLIB
  7. Bayes: right prior         34
BMA
  8. BMA: BIC + MLE
  9. BMA: GLIB
  10. BMA: right prior          32

53 Estimation: Comments

Overall, BMA beat model selection, which beat the full model, with different trade-offs between the MSEs under the two models.

The right prior beat (slightly) the GLIB default, which beat BIC + MLE, which beat LRT + MLE, which beat the full model; the full model was less good for both MLE and Bayes.

This can guide the choice of $n$ in BIC. E.g. for event history models, it is better to choose the number of events than the number of individuals, or of exposure times.

54 The Hazelrigg-Garnier Data Revisited

[Table: cross-national mobility data for 16 countries: Australia, Belgium, France, Hungary, Italy, Japan, Philippines, Spain, United States, West Germany, West Malaysia, Yugoslavia, Denmark, Finland, Norway, Sweden. The cell counts were not recoverable from the source.]

55 The Quasi-Symmetry Model

- Accounts for 99.7% of the deviance under independence
- Theoretically grounded
- No easily discernible patterns in the residuals

BUT the deviance is significant on 16 d.f., so an apparently good model is rejected.

BIC seems to resolve the dilemma: BIC favors the QS model. A more refined analysis using Weakliem's prior for the association parameter gives the same conclusion, with a more exact Bayes factor (from GLIB). The conclusion is insensitive to the prior standard deviation.

56 Further Model Search

One should continue to search for better models if the deviance from the BIC-best model is big enough:

#   Model            Deviance   d.f.   BIC
1   Independence
2   Quasi-symmetry
3   Saturated
4   Explanatory
5   Farm asymmetry

(The numeric columns were not recoverable from the source.) Weakliem's preferred model is #5, which is also preferred by BIC, but rejected by a 5% significance test.

57 Concluding Remarks

Bayes factors seem to perform well as tests (in terms of total error rate), and this seems fairly robust to the prior used. They also seem well calibrated. In the small example considered, Bayes factors based on good priors did better than BIC, which did better than a 5% LRT. The GLIB default prior had performance similar to the optimal prior.

For estimation, BMA did better in MSE terms than model-selection estimators, which did better than estimation under the full model. These results were robust to the prior, and BIC did almost as well as more exact Bayes factors.

58 Concluding Remarks (ctd)

When the model doesn't hold, we can assess methods using out-of-sample predictive performance. BMA has consistently done better than model selection methods, Bayesian or non-Bayesian (e.g. Volinsky et al 1995).

It's important to assess whether any of the models considered fit the data well. Diagnostics are useful to suggest better models, but do not necessarily rule out the use of a model that is better than the others by Bayes factors. Even if a Bayes factor prefers one model to another, the search for better models should continue (as in the Hazelrigg-Garnier example).

59 Papers and Software

Research: Bayesian model selection and BMA.
BMA Homepage: volinsky/bma.html

60 Further Reading: Books

Introductory: Peter Lee (1989). Bayesian Statistics: An Introduction.
Theory: José Bernardo and Adrian Smith (1994). Bayesian Theory.
Applied: Andrew Gelman et al. (1995). Bayesian Data Analysis.

61 Further Reading: Review Articles

Bayesian estimation: W. Edwards, H. Lindman and L. J. Savage (1963). Bayesian statistical inference for psychological research. Psychological Review 70, 193-242.
Bayesian testing: R. E. Kass and A. E. Raftery (1995). Bayes factors. Journal of the American Statistical Association 90, 773-795.
Bayesian model selection: A. E. Raftery (1995). Bayesian model selection in social research (with discussion). Sociological Methodology 25, 111-163.
Bayesian model averaging: J. A. Hoeting et al. (1999). Bayesian model averaging: a tutorial (with discussion). Statistical Science 14, 382-417.


More information

Part III. A Decision-Theoretic Approach and Bayesian testing

Part III. A Decision-Theoretic Approach and Bayesian testing Part III A Decision-Theoretic Approach and Bayesian testing 1 Chapter 10 Bayesian Inference as a Decision Problem The decision-theoretic framework starts with the following situation. We would like to

More information

All models are wrong but some are useful. George Box (1979)

All models are wrong but some are useful. George Box (1979) All models are wrong but some are useful. George Box (1979) The problem of model selection is overrun by a serious difficulty: even if a criterion could be settled on to determine optimality, it is hard

More information

Statistical inference

Statistical inference Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall

More information

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming

More information

HYPOTHESIS TESTING: FREQUENTIST APPROACH.

HYPOTHESIS TESTING: FREQUENTIST APPROACH. HYPOTHESIS TESTING: FREQUENTIST APPROACH. These notes summarize the lectures on (the frequentist approach to) hypothesis testing. You should be familiar with the standard hypothesis testing from previous

More information

Introduction to Applied Bayesian Modeling. ICPSR Day 4

Introduction to Applied Bayesian Modeling. ICPSR Day 4 Introduction to Applied Bayesian Modeling ICPSR Day 4 Simple Priors Remember Bayes Law: Where P(A) is the prior probability of A Simple prior Recall the test for disease example where we specified the

More information

Seminar über Statistik FS2008: Model Selection

Seminar über Statistik FS2008: Model Selection Seminar über Statistik FS2008: Model Selection Alessia Fenaroli, Ghazale Jazayeri Monday, April 2, 2008 Introduction Model Choice deals with the comparison of models and the selection of a model. It can

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty

More information

Comparison of Three Calculation Methods for a Bayesian Inference of Two Poisson Parameters

Comparison of Three Calculation Methods for a Bayesian Inference of Two Poisson Parameters Journal of Modern Applied Statistical Methods Volume 13 Issue 1 Article 26 5-1-2014 Comparison of Three Calculation Methods for a Bayesian Inference of Two Poisson Parameters Yohei Kawasaki Tokyo University

More information

Invariant HPD credible sets and MAP estimators

Invariant HPD credible sets and MAP estimators Bayesian Analysis (007), Number 4, pp. 681 69 Invariant HPD credible sets and MAP estimators Pierre Druilhet and Jean-Michel Marin Abstract. MAP estimators and HPD credible sets are often criticized in

More information

Prior information, but no MCMC:

Prior information, but no MCMC: Physikalisch-Technische Bundesanstalt Braunschweig and Berlin National Metrology Institute Prior information, but no MCMC: A Bayesian Normal linear regression case study K. Klauenberg, G. Wübbeler, B.

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 12: Frequentist properties of estimators (v4) Ramesh Johari ramesh.johari@stanford.edu 1 / 39 Frequentist inference 2 / 39 Thinking like a frequentist Suppose that for some

More information