Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples
2 Superpopulation Modeling: Estimating parameters
Various principles: least squares, method of moments, maximum likelihood
Sketch main ideas of maximum likelihood, an important approach that underlies statistical inference for many common models:
- Linear and nonlinear regression
- Generalized linear models (logistic, Poisson regression)
- Repeated measures models (SAS PROC MIXED, NLMIXED)
- Survival analysis: proportional hazards models
3 Likelihood methods
Statistical model + data yields the likelihood
Two general approaches based on the likelihood:
- maximum likelihood inference (for large samples)
- Bayesian inference (better for small samples): log(likelihood) + log(prior) = log(posterior)
Methods do not require rectangular data sets and can be applied to incomplete data
4 Definition of Likelihood
Data Y; the statistical model yields a probability density f(Y | θ) for Y with unknown parameters θ
The likelihood function is then a function of θ:
L(θ | Y) = const × f(Y | θ)
The loglikelihood is often easier to work with:
ℓ(θ | Y) = log L(θ | Y) = const + log f(Y | θ)
Constants can depend on the data but not on the parameter θ
5 Example: Normal sample
Y = (y_1, ..., y_n), a univariate iid normal sample; θ = (μ, σ²)
f(Y | μ, σ²) = (2πσ²)^(-n/2) exp( -(1/(2σ²)) Σ_{i=1}^n (y_i - μ)² )
ℓ(μ, σ² | Y) = -(n/2) ln σ² - (1/(2σ²)) Σ_{i=1}^n (y_i - μ)²
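As a quick numerical check of the loglikelihood above, here is a minimal Python sketch (the sample is hypothetical) that evaluates ℓ up to its constant and confirms that, as a function of μ, it is maximized at the sample mean:

```python
import math

def normal_loglik(mu, sigma2, y):
    """Loglikelihood of an iid N(mu, sigma2) sample, up to an additive constant."""
    n = len(y)
    return -0.5 * n * math.log(sigma2) - sum((yi - mu) ** 2 for yi in y) / (2 * sigma2)

y = [4.1, 5.2, 3.9, 6.0, 4.8]   # hypothetical sample
ybar = sum(y) / len(y)
# As a function of mu, the loglikelihood is maximized at the sample mean:
print(normal_loglik(ybar, 1.0, y) > normal_loglik(ybar + 0.5, 1.0, y))   # True
```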
6 Example: Multinomial sample
Y = (y_1, ..., y_n), a univariate K-category multinomial sample
n_j = number of y_i equal to j (j = 1, ..., K); θ = (θ_1, ..., θ_{K-1}); θ_K = 1 - Σ_{j=1}^{K-1} θ_j
f(Y | θ_1, ..., θ_{K-1}) = [ n! / (n_1! ... n_K!) ] Π_{j=1}^{K-1} θ_j^{n_j} × θ_K^{n_K}
ℓ(θ_1, ..., θ_{K-1} | Y) = Σ_{j=1}^{K-1} n_j log θ_j + n_K log θ_K
7 Maximum Likelihood Estimate
The maximum likelihood (ML) estimate θ̂ of θ maximizes the likelihood, or equivalently the loglikelihood:
L(θ̂ | Y) ≥ L(θ | Y) for all θ
The ML estimate is the value of the parameter that makes the data most likely
The ML estimate is not necessarily unique, but is for many regular problems given enough data
8 Computing the ML estimate
In regular problems, the ML estimate can be found by solving the likelihood equation
S(θ | Y) = 0
where S is the score function, defined as the first derivative of the loglikelihood:
S(θ | Y) = ∂ log L(θ | Y) / ∂θ
For some models (e.g. multiple linear regression) the likelihood equation has an explicit solution; for others (e.g. logistic regression) numerical optimization methods are needed
9 Normal Examples
Univariate normal sample Y = (y_1, ..., y_n), θ = (μ, σ²):
μ̂ = ȳ = (1/n) Σ_{i=1}^n y_i,  σ̂² = (1/n) Σ_{i=1}^n (y_i - ȳ)²
(Note the lack of a correction for degrees of freedom)
Multivariate normal sample:
μ̂ = ȳ,  Σ̂ = (1/n) Σ_{i=1}^n (y_i - ȳ)(y_i - ȳ)^T
Normal linear regression (possibly weighted):
(y_i | x_i1, ..., x_ip) ~ N( β_0 + Σ_{j=1}^p β_j x_ij, σ²/u_i )
β̂ = (β̂_0, β̂_1, ..., β̂_p) = weighted least squares estimates
σ̂² = (weighted residual sum of squares)/n
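The univariate formulas above are easy to verify numerically. A short sketch with a hypothetical sample, using Python's statistics module, where pvariance divides by n (the ML estimate) and variance divides by n - 1 (the degrees-of-freedom corrected estimate):

```python
import statistics

y = [2.0, 4.0, 6.0, 8.0]                      # hypothetical sample
n = len(y)
mu_hat = statistics.fmean(y)                  # ML estimate of mu: the sample mean
sigma2_ml = statistics.pvariance(y, mu_hat)   # divides by n: the ML estimate
s2 = statistics.variance(y, mu_hat)           # divides by n - 1: df-corrected estimate
print(mu_hat, sigma2_ml, s2)                  # 5.0 5.0 6.666...
```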
10 Multinomial Example
Y = (y_1, ..., y_n); y_i ~ MNOM(θ_1, ..., θ_K)
n_j = number of y_i equal to j (j = 1, ..., K)
Likelihood equations:
∂ℓ/∂θ_j = n_j/θ_j - n_K/θ_K = 0,  j = 1, ..., K-1
Hence the ML estimate is θ̂_j = n_j / n, j = 1, ..., K
11 Logistic regression
Pr(y_i = 1 | x_i1, ..., x_ip) = π_i(β) = exp(f_i(β)) / (1 + exp(f_i(β))),  f_i(β) = β_0 + Σ_{j=1}^p β_j x_ij
ℓ(β) = Σ_{i=1}^n [ y_i log π_i(β) + (1 - y_i) log(1 - π_i(β)) ]
ML estimation requires iterative methods like the method of scoring
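A minimal sketch of the method of scoring for this model, with one covariate; for the logistic (canonical-link) likelihood, Fisher scoring coincides with Newton's method. The data and the zero starting values are hypothetical:

```python
import math

def fit_logistic(x, y, iters=25):
    """ML fit of Pr(y=1|x) = exp(b0 + b1*x)/(1 + exp(b0 + b1*x)) by Fisher scoring."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0   # score vector and information matrix
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # scoring update: b += H^{-1} g
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# hypothetical, non-separated data: Pr(y = 1) increases with x
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 1, 0, 1, 1]
b0_hat, b1_hat = fit_logistic(x, y)
print(round(b0_hat, 3), round(b1_hat, 3))
```

At convergence the score vector is (numerically) zero, which is the likelihood equation S(β | Y) = 0 of the previous slide.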
12 ML for mixed-effects models
y_i = (y_obs,i, y_mis,i): k-dimensional vector of repeated measures
(y_i | X_i, Z_i, b_i) ~ N_k( X_i β + Z_i b_i, Σ )
β are fixed effects; b_i are random effects: b_i ~ N_q(0, Γ)
Missing data mechanism: missing at random
ML requires iterative algorithms, e.g. Harville (1977), Laird and Ware (1982), SAS PROC MIXED
Very flexible mean and covariance structures
Normality is not a major assumption if N is large, and recent programs allow for non-normal outcomes
13 Properties of ML estimates
Under the assumed model, the ML estimate is:
- Consistent (not necessarily unbiased)
- Efficient for large samples; not necessarily the best for small samples
The ML estimate is transformation invariant:
if θ̂ is the ML estimate of θ, then g(θ̂) is the ML estimate of g(θ)
14 Large-sample ML Inference
Basic large-sample approximation: for regular problems,
(θ̂ - θ) ~ N(0, C)
where C is a covariance matrix estimated from the sample
The frequentist treats θ̂ as random and θ as fixed; the equation defines the sampling distribution of θ̂
The Bayesian treats θ as random and θ̂ as fixed; the equation defines the posterior distribution of θ
15 Forms of precision matrix
The precision of the ML estimate is measured by C^{-1}. Some forms for this are:
Observed information (recommended):
C^{-1} = I(θ̂ | Y),  I(θ | Y) = -∂² log L(θ | Y) / ∂θ²
Expected information (not as good, may be simpler):
C^{-1} = J(θ̂),  J(θ) = E[ I(θ | Y) ]
Some other approximation to the curvature of the loglikelihood in the neighborhood of the ML estimate
16 Interval estimation
A 95% (confidence, probability) interval for scalar θ is θ̂ ± 1.96 √C, where 1.96 is the 97.5th percentile of the normal distribution
Example: univariate normal sample, θ = (μ, σ²):
I = J = diag( n/σ², n/(2σ⁴) ),  C = diag( σ̂²/n, 2σ̂⁴/n )
Hence some 95% intervals are:
μ: ȳ ± 1.96 σ̂/√n;  σ²: σ̂² ± 1.96 σ̂² √(2/n)
17 Significance Tests
Tests are based on likelihood ratio (LR) or Wald (W) statistics
θ = (θ_(1), θ_(2)); θ_(1)0 = null value of θ_(1); θ_(2) = other parameters
θ̂ = (θ̂_(1), θ̂_(2)) = unrestricted ML estimate; θ̂_(2)0 = ML estimate of θ_(2) given θ_(1) = θ_(1)0
LR statistic: LR = 2[ ℓ(θ̂ | Y) - ℓ(θ_(1)0, θ̂_(2)0 | Y) ]
Wald statistic: W = (θ̂_(1) - θ_(1)0)^T C_(11)^{-1} (θ̂_(1) - θ_(1)0),  C_(11) = covariance matrix of θ̂_(1)
These yield P-values P = Pr( χ²_q > D ), where D = LR or Wald statistic, q = dimension of θ_(1), and χ²_q = chi-squared distribution with q degrees of freedom
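For a scalar null hypothesis (q = 1), the chi-squared P-value can be computed from the normal CDF alone, since χ²_1 = Z² for Z standard normal, so Pr(χ²_1 > d) = 2(1 - Φ(√d)). A small sketch with a hypothetical Wald statistic:

```python
import math
from statistics import NormalDist

def chi2_sf_1df(d):
    """P(chi-squared_1 > d), using the fact that chi-squared_1 = Z^2, Z ~ N(0, 1)."""
    return 2.0 * (1.0 - NormalDist().cdf(math.sqrt(d)))

# hypothetical scalar test: estimate, null value, and estimated variance C
theta_hat, theta0, C = 1.3, 1.0, 0.04
W = (theta_hat - theta0) ** 2 / C              # Wald statistic, q = 1
print(round(W, 2), round(chi2_sf_1df(W), 3))   # 2.25 0.134
```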
18 Bayesian Modeling
The Bayesian model adds a prior distribution for the parameters:
(Y, θ | Z) ~ π(θ | Z) f(Y | Z, θ),  π(θ | Z) = prior distribution
Inference about θ is based on the posterior distribution from Bayes' theorem:
p(θ | Z, Y_inc) ∝ π(θ | Z) × L(θ | Z, Y_inc),  L = likelihood
Inference about a finite population quantity Q(Y) is based on the posterior predictive distribution p(Q(Y) | Z, Y_inc) of Q given the sample values Y_inc:
p(Q(Y) | Z, Y_inc) = ∫ p(Q(Y) | Z, Y_inc, θ) p(θ | Z, Y_inc) dθ
(integrates out the nuisance parameters θ)
In the superpopulation modeling approach, parameters are considered fixed and estimated; in the Bayesian approach, parameters are random and assigned prior distributions, which leads to better small-sample inference
19 Bayes Inference
Inferences about Q are based on its posterior predictive distribution:
- the estimate is the posterior mean: q̂ = E(Q | Y_inc)
- the standard error is the posterior sd: s = √Var(Q | Y_inc)
- a 95% posterior probability (or credibility) interval plays the role of a confidence interval (but with a simpler interpretation)
In large samples, a 95% interval is q̂ ± 1.96 s
In small samples, one can use the highest posterior density (hpd) interval, or the 2.5th to 97.5th percentiles of the posterior distribution (often simulated using MCMC draws from the posterior distribution)
20 Inference about population quantities
Inferences about Q are conveniently obtained by first conditioning on θ and then averaging over the posterior of θ. In particular, the posterior mean is:
E(Q | Y_inc) = E[ E(Q | Y_inc, θ) | Y_inc ]
and the posterior variance is:
Var(Q | Y_inc) = E[ Var(Q | Y_inc, θ) | Y_inc ] + Var[ E(Q | Y_inc, θ) | Y_inc ]
The value of this technique will become clear in applications
Finite population corrections are automatically obtained as differences in the posterior variances of Q and θ
Inferences based on the full posterior distribution are useful in small samples (e.g. they provide t corrections)
21 Example: linear regression
The normal linear regression model:
(y_i | x_i1, ..., x_ip) ~ N( β_0 + Σ_{j=1}^p β_j x_ij, σ² )
with the non-informative Jeffreys prior:
p(β_0, ..., β_p, log σ²) = const
yields the posterior distribution of (β_0, ..., β_p) as multivariate t with
- mean given by the least squares estimates (β̂_0, ..., β̂_p)
- covariance matrix (X^T X)^{-1} s², where X is the design matrix
- degrees of freedom n - p - 1
The resulting posterior probability intervals are equivalent to standard t confidence intervals.
22 Simulating Draws from the Posterior Distribution
For problems with high-dimensional θ, it is often easier to draw values from the posterior distribution and base inferences on these draws
For example, if (θ_1^(d): d = 1, ..., D) is a set of draws from the posterior distribution for a scalar parameter θ_1, then
θ̄_1 = (1/D) Σ_d θ_1^(d) approximates the posterior mean
(1/(D-1)) Σ_d (θ_1^(d) - θ̄_1)² approximates the posterior variance
and the 2.5th to 97.5th percentiles of the draws approximate a 95% posterior credibility interval for θ_1
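A short simulation sketch of these approximations, drawing D values from a hypothetical N(2, 0.5²) posterior for θ_1 and summarizing them by the mean, sd, and percentile interval:

```python
import random
import statistics

random.seed(1)
# D draws from a hypothetical N(2, 0.5^2) posterior for a scalar theta_1
draws = sorted(random.gauss(2.0, 0.5) for _ in range(20000))

post_mean = statistics.fmean(draws)             # approximates the posterior mean
post_sd = statistics.stdev(draws)               # approximates the posterior sd
lo = draws[int(0.025 * len(draws))]             # 2.5th percentile of the draws
hi = draws[int(0.975 * len(draws))]             # 97.5th percentile of the draws
print(round(post_mean, 2), round(post_sd, 2))   # close to 2.0 and 0.5
```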
23 Example: Posterior Draws for Normal Linear Regression
(β̂, s²) = least squares estimates of the slopes and residual variance
σ²^(d) = (n - p - 1) s² / χ²_{n-p-1}^(d),  χ²_{n-p-1} = chi-squared deviate with n - p - 1 df
β^(d) = β̂ + σ^(d) A z,  z = (z_1, ..., z_{p+1})^T, z_i ~ N(0, 1)
A = upper triangular Cholesky factor of (X^T X)^{-1}: A A^T = (X^T X)^{-1}
Easily extends to weighted regression: see Example
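The algorithm above can be sketched for simple linear regression (p = 1) in pure Python; the data are hypothetical, the chi-squared deviate is formed as a sum of squared standard normals, and the 2×2 Cholesky factor of (X'X)^{-1} is written out by hand:

```python
import random
import statistics

random.seed(7)
# hypothetical data for y = b0 + b1 * x + error (p = 1 slope)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
n, p = len(x), 1

# least squares estimates and residual variance s^2 on n - p - 1 df
xbar, ybar = statistics.fmean(x), statistics.fmean(y)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
df = n - p - 1
s2 = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / df

# (X'X)^{-1} for the design [1, x], and its upper triangular factor A with A A^T = (X'X)^{-1}
sx, sxx = sum(x), sum(xi * xi for xi in x)
det = n * sxx - sx * sx
v00, v01, v11 = sxx / det, -sx / det, n / det
a11 = v11 ** 0.5
a01 = v01 / a11
a00 = (v00 - a01 * a01) ** 0.5

def draw():
    chi2 = sum(random.gauss(0, 1) ** 2 for _ in range(df))   # chi-squared_{n-p-1} deviate
    sig2 = df * s2 / chi2                                    # draw of sigma^2
    sig = sig2 ** 0.5
    z0, z1 = random.gauss(0, 1), random.gauss(0, 1)
    return b0 + sig * (a00 * z0 + a01 * z1), b1 + sig * a11 * z1, sig2

draws = [draw() for _ in range(5000)]
slope_mean = statistics.fmean(d[1] for d in draws)
print(round(slope_mean, 2))   # close to the LS slope b1
```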
24 Consulting Example
In India, any person possessing a radio, transistor or television has to pay a license fee. In a densely populated area with mostly makeshift houses, practically no one was paying these fees. It was determined that for enforcement to be fiscally meaningful, the proportion of households possessing one or more of these devices must exceed a certain limit.
25 Consulting example (continued)
N = population size
Y_i = 1 if household i has a device, 0 otherwise
Q = Σ_{i=1}^N Y_i / N = proportion of households with a device
Question of interest: Pr(Q > 0.3)
If the probability of Q exceeding 0.3 is very high, then enforcement might be fiscally sensible
Conduct a small-scale survey to answer the question of interest
Note that the question only makes sense under the Bayes paradigm
26 Consulting example (continued)
srs of size n: Y_inc = (Y_1, ..., Y_n), Y_exc = (Y_{n+1}, ..., Y_N)
Model for observables: Y_i | θ ~ iid Bernoulli(θ); x = Σ_{i=1}^n Y_i
f(x | θ) = C(n, x) θ^x (1 - θ)^{n-x}
Prior distribution: π(θ) = 1, θ ∈ (0, 1)
Estimand: Q = Σ_{i=1}^N Y_i / N = [ x + Σ_{i=n+1}^N Y_i ] / N
27 Binomial Example
The posterior distribution is:
p(θ | x) = f(x | θ) π(θ) / ∫ f(x | θ) π(θ) dθ = θ^x (1 - θ)^{n-x} / ∫_0^1 θ^x (1 - θ)^{n-x} dθ
θ | x ~ Beta(x + 1, n - x + 1)
28 Infinite Population
For N → ∞,
Pr( Σ_{i=1}^N Y_i / N > 0.3 | x ) → Pr( θ > 0.3 | x )
Compute using the cumulative distribution function of a beta distribution, a standard function in most software such as SAS and R
What is the maximum proportion of households in the population with devices that can be stated with great certainty? Solve Pr(θ ≤ θ* | x) = 0.9 for θ* using the inverse CDF of the beta distribution
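Since the posterior is Beta(x + 1, n - x + 1), both quantities can be approximated by Monte Carlo with the Python standard library alone (the survey counts below are hypothetical; in practice one would use the beta CDF and inverse CDF directly, e.g. pbeta/qbeta in R):

```python
import random

random.seed(3)
n, x = 50, 20   # hypothetical survey: 20 of 50 sampled households own a device
# Under the uniform prior, theta | x ~ Beta(x + 1, n - x + 1)
draws = sorted(random.betavariate(x + 1, n - x + 1) for _ in range(40000))

pr_gt_03 = sum(t > 0.3 for t in draws) / len(draws)   # approximates Pr(theta > 0.3 | x)
q90 = draws[int(0.9 * len(draws))]                    # 90th posterior percentile of theta
print(round(pr_gt_03, 2), round(q90, 2))
```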
29 Point Estimates
A point estimate is often used as a single summary "best" value for the unknown Q
Some choices are the mean, mode or median of the posterior distribution of Q
For symmetrical distributions an intuitive choice is the center of symmetry
For asymmetrical distributions the choice is not clear; it depends upon the loss function
30 Interval Estimation
A better summary is an interval estimate. Two approaches:
- Fix the coverage rate 1 - α in advance and determine the highest posterior density region C to include the most likely values of Q totaling 1 - α posterior probability
- Fix the value Q_0 in advance, determine C as the collection of values of Q more likely than Q_0, and calculate the coverage 1 - α as the posterior probability of this C
31 Interval Estimates
C is such that
(1) p(Q | Y_inc) ≥ p(Q' | Y_inc) for all Q ∈ C, Q' ∉ C
(2) Pr(Q ∈ C | Y_inc) = 1 - α
"Most likely" is usually defined by highest posterior density
For symmetric unimodal posterior distributions, the HPD interval is (q_{α/2}, q_{1-α/2}), where q_p denotes the pth quantile of the posterior distribution of Q
In the binomial example, the beta density of θ is used to determine the interval estimate of Q
32 Normal simple random sample
Y_i ~ iid N(μ, σ²), i = 1, 2, ..., N; θ = (μ, σ²)
A simple random sample results in Y_inc = (y_1, ..., y_n)
Q = Ȳ = [ n ȳ + (N - n) Ȳ_exc ] / N = f ȳ + (1 - f) Ȳ_exc,  f = n/N
Derive the posterior distribution of Q given Y_inc
33 Normal Example
Posterior distribution of θ = (μ, σ²) with the noninformative prior π(μ, log σ²) ∝ const:
p(μ, σ² | Y_inc) ∝ π(μ, σ²) (σ²)^{-n/2} exp( -(1/(2σ²)) Σ_{i∈inc} (y_i - μ)² )
∝ (σ²)^{-n/2-1} exp( -[ Σ_{i∈inc} (y_i - ȳ)² + n(ȳ - μ)² ] / (2σ²) )
The above expressions imply that
(1) σ² | Y_inc ~ Σ_{i∈inc} (y_i - ȳ)² / χ²_{n-1}
(2) μ | Y_inc, σ² ~ N(ȳ, σ²/n)
34 Posterior Distribution of Q
Ȳ_exc | μ, σ² ~ N( μ, σ²/(N - n) )
Ȳ_exc | σ², Y_inc ~ N( ȳ, σ²/(N - n) + σ²/n ) = N( ȳ, σ² / ((1 - f) n) )
Q = f ȳ + (1 - f) Ȳ_exc, so
Q | σ², Y_inc ~ N( ȳ, (1 - f) σ²/n )
Integrating over σ²:
Ȳ_exc | Y_inc ~ t_{n-1}( ȳ, s² / ((1 - f) n) )
Q | Y_inc ~ t_{n-1}( ȳ, (1 - f) s²/n )
35 HPD Interval for Q
Note that the posterior t distribution of Q is symmetric and unimodal: values in the center of the distribution are more likely than those in the tails. Thus a (1 - α)100% HPD interval is:
ȳ ± t_{n-1, 1-α/2} √( (1 - f) s²/n )
Like the frequentist confidence interval, but recovers the t correction
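A sketch of this interval in Python with hypothetical data; since the standard library has no t inverse CDF, the t percentile is itself approximated by simulation (a normal deviate divided by the square root of a scaled chi-squared deviate):

```python
import random
import statistics

random.seed(5)
# hypothetical SRS of n = 10 from a population of size N = 100
y = [9.8, 10.4, 11.1, 9.5, 10.0, 10.9, 9.2, 10.6, 10.3, 9.7]
n, N = len(y), 100
f = n / N
ybar = statistics.fmean(y)
s2 = statistics.variance(y)   # divides by n - 1

def t_draw(df):
    """One draw from the t distribution with df degrees of freedom."""
    chi2 = sum(random.gauss(0, 1) ** 2 for _ in range(df))
    return random.gauss(0, 1) / (chi2 / df) ** 0.5

# 97.5th percentile of t_{n-1} by simulation (about 2.262 for 9 df)
draws = sorted(t_draw(n - 1) for _ in range(50000))
t975 = draws[int(0.975 * len(draws))]

half = t975 * ((1 - f) * s2 / n) ** 0.5
print(round(ybar - half, 2), round(ybar + half, 2))   # 95% HPD interval for Q
```

The interval is wider than the normal-theory interval ȳ ± 1.96 √((1 - f)s²/n), which is the small-sample t correction the slide refers to.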
36 Some other Estimands
Suppose Q = the median or some other percentile; one is then better off inferring about all non-sampled values
As we will see later, simulating values of Y_exc adds enormous flexibility for drawing inferences about any finite population quantity
Modern Bayesian methods rely heavily on simulating values from the posterior distribution of the model parameters and the posterior predictive distribution of the nonsampled values
Computationally, if the population size N is too large, then choose an arbitrary value K large relative to n, the sample size
Example: a national sample of size n = 2000, US population size 306 million; for numerical approximation we can choose K = 2000/f for some small f, e.g. f = 0.01 or 0.02
37 Comparison of Two Populations
Population 1: size N_1, sample size n_1, Y_{1i} ~ ind N(μ_1, σ_1²), θ_1 = (μ_1, σ_1²)
Sample statistics: (ȳ_1, s_1²)
Posterior distributions: (n_1 - 1) s_1² / σ_1² ~ χ²_{n_1-1};  μ_1 | σ_1² ~ N(ȳ_1, σ_1²/n_1);  Y_{1i} | θ_1 ~ N(μ_1, σ_1²), i ∈ exc
Population 2: size N_2, sample size n_2, Y_{2i} ~ ind N(μ_2, σ_2²), θ_2 = (μ_2, σ_2²)
Sample statistics: (ȳ_2, s_2²)
Posterior distributions: (n_2 - 1) s_2² / σ_2² ~ χ²_{n_2-1};  μ_2 | σ_2² ~ N(ȳ_2, σ_2²/n_2);  Y_{2i} | θ_2 ~ N(μ_2, σ_2²), i ∈ exc
38 Examples: Estimands
Ȳ_1 - Ȳ_2 (finite sample version of the Behrens-Fisher problem)
Difference Pr(Y_1 > c) - Pr(Y_2 > c)
Difference in the population medians
Ratio of the means or medians
Ratio of variances
It is possible to analytically compute the posterior distribution of some of these quantities; it is a whole lot easier to simulate values of the nonsampled Y_1's in Population 1 and Y_2's in Population 2
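A sketch of the simulation approach for two hypothetical samples: each iteration draws (μ, σ²) for each population from its posterior on the previous slide, then summarizes the difference in means and the ratio of variances:

```python
import random
import statistics

random.seed(11)
y1 = [12.1, 13.4, 11.8, 12.9, 13.1, 12.5]   # hypothetical sample from population 1
y2 = [10.2, 11.0, 10.5, 11.4, 10.8]         # hypothetical sample from population 2

def posterior_draw(y):
    """One draw of (mu, sigma^2) under the usual noninformative prior."""
    n = len(y)
    ybar, s2 = statistics.fmean(y), statistics.variance(y)
    chi2 = sum(random.gauss(0, 1) ** 2 for _ in range(n - 1))
    sig2 = (n - 1) * s2 / chi2                    # sigma^2 | Y ~ (n-1) s^2 / chi^2_{n-1}
    mu = random.gauss(ybar, (sig2 / n) ** 0.5)    # mu | sigma^2, Y ~ N(ybar, sigma^2/n)
    return mu, sig2

diffs, var_ratios = [], []
for _ in range(10000):
    mu1, s2_1 = posterior_draw(y1)
    mu2, s2_2 = posterior_draw(y2)
    diffs.append(mu1 - mu2)            # draw of mu_1 - mu_2 (Behrens-Fisher)
    var_ratios.append(s2_1 / s2_2)     # draw of sigma_1^2 / sigma_2^2

pr_pos = sum(d > 0 for d in diffs) / len(diffs)
print(round(statistics.fmean(diffs), 2), round(pr_pos, 3))
```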
39 Bayesian Nonparametric Inference
Population: Y_1, Y_2, Y_3, ..., Y_N
All possible distinct values: d_1, d_2, ..., d_K
Model: Pr(Y_i = d_k) = θ_k
Prior: π(θ_1, θ_2, ..., θ_K) ∝ Π_k θ_k^{-1} if Σ_k θ_k = 1
Mean and variance:
E(Y_i | θ) = μ = Σ_k d_k θ_k
Var(Y_i | θ) = Σ_k d_k² θ_k - μ²
40 Bayesian Nonparametric Inference (continued)
SRS of size n, with n_k equal to the number of d_k in the sample
The objective is to draw inference about the population mean:
Q = f ȳ + (1 - f) Ȳ_exc
As before, we need the posterior distribution of θ and Ȳ_exc
41 Nonparametric Inference (continued)
The posterior distribution of θ is Dirichlet:
p(θ | Y_inc) ∝ Π_k θ_k^{n_k - 1} if Σ_k θ_k = 1 and 0 ≤ θ_k ≤ 1
Posterior mean, variance and covariance of θ:
E(θ_k | Y_inc) = n_k / n
Var(θ_k | Y_inc) = n_k (n - n_k) / ( n² (n + 1) )
Cov(θ_k, θ_l | Y_inc) = -n_k n_l / ( n² (n + 1) )
42 Inference for Q
E(μ | Y_inc) = Σ_k d_k n_k / n = ȳ
Var(μ | Y_inc) = s² (n - 1) / ( n (n + 1) ),  s² = (1/(n - 1)) Σ_{i∈inc} (y_i - ȳ)²
Hence the posterior mean and variance of Q are:
E(Q | Y_inc) = f ȳ + (1 - f) E(Ȳ_exc | Y_inc) = ȳ
Var(Q | Y_inc) = (1 - f) s² (n - 1) / ( n (n + 1) )
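These moment formulas can be checked numerically. The sketch below (with a hypothetical sample) computes the Dirichlet posterior moments of the θ_k and confirms that the implied posterior mean and variance of μ = Σ_k d_k θ_k reduce to ȳ and (n - 1)s²/(n(n + 1)):

```python
import statistics
from collections import Counter

y = [1.0, 1.0, 2.0, 2.0, 2.0, 4.0]   # hypothetical sample; the distinct values are the d_k
n = len(y)
counts = Counter(y)                   # n_k for each observed d_k

# Dirichlet posterior moments of the category probabilities theta_k
E = {d: nk / n for d, nk in counts.items()}
V = {d: nk * (n - nk) / (n ** 2 * (n + 1)) for d, nk in counts.items()}

# posterior mean and variance of mu = sum_k d_k theta_k
post_mean = sum(d * E[d] for d in counts)
post_var = (sum(d * d * V[d] for d in counts)
            - sum(d1 * d2 * counts[d1] * counts[d2] / (n ** 2 * (n + 1))
                  for d1 in counts for d2 in counts if d1 != d2))

s2 = statistics.variance(y)
print(post_mean, post_var)   # equal to ybar and (n - 1) s^2 / (n (n + 1))
```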
43 Ratio and Regression Estimates
Population: (y_i, x_i; i = 1, ..., N)
Sample: y_i observed for i ∈ inc; x_i observed for i = 1, ..., N. For now assume SRS
Objective: infer about the population mean Q = Σ_{i=1}^N y_i / N
The excluded Y's are missing values: y_1, ..., y_n are observed and y_{n+1}, ..., y_N are missing, while x_1, ..., x_n, x_{n+1}, ..., x_N are all observed
44 Model Specification
(y_i | x_i, β, σ²) ~ ind N( β x_i, σ² x_i^{2g} ),  i = 1, 2, ..., N,  g known
Prior distribution: π(β, log σ²) ∝ const
g = 1/2: classical ratio estimator; the posterior variance equals the randomization variance for large samples
g = 0: regression through the origin; the posterior variance is nearly the same as the randomization variance
g = 1: HT model; the posterior variance equals the randomization variance for large samples
Note that no asymptotic arguments have been used in deriving the Bayesian inferences, which make small-sample corrections and use t-distributions
45 Some Remarks
For large samples, the estimate and its variance under the nonparametric model assumptions are very nearly the same as those under the normal model assumptions
For large N, the population size, the finite population quantity is very nearly the same as the model parameter (Q ≈ θ)
For large samples,
[ Q - E(Q | Y_inc) ] / √Var(Q | Y_inc) ~ N(0, 1)
46 Remarks (Continued)
Bayesian interpretation: the summary of the excluded portion of the population has an approximate normal distribution conditional on the observed data; that is, Y_inc is fixed and Q is random
Frequentist interpretation: under repeated sampling, this is the distribution of estimates of Q; that is, Q is fixed and Y_inc is random
For large samples, the frequentist and Bayesian approaches give nearly the same numerical answers, but the interpretations differ
47 Remarks (Continued)
In much practical analysis the prior information is diffuse, and the likelihood dominates the prior information
Jeffreys (1961) developed noninformative priors based on the notion of very little prior information relative to the information provided by the data
Jeffreys derived the noninformative prior by requiring invariance under parameter transformation. In general,
π(θ) ∝ J(θ)^{1/2}
where J(θ) = -E[ ∂² log f(y | θ) / ∂θ² | θ ] is the expected information
48 Examples of noninformative priors
Normal: π(μ, log σ²) ∝ const
Binomial: π(θ) ∝ θ^{-1/2} (1 - θ)^{-1/2}
Poisson: π(λ) ∝ λ^{-1/2}
Normal regression with slopes β: π(β, log σ²) ∝ const
In simple cases these noninformative priors give numerically the same answers as standard frequentist procedures
49 Comments
The iid (normal) model for simple random sampling is based on the exchangeability ideas of de Finetti
Other off-the-shelf models are more appropriate for other sample designs; hence the design influences the choice of model, as we shall see
Even in this simple normal problem, Bayes is useful: t-inference is recovered for small samples by putting a prior on the unknown variance
Bayes is even more attractive for more complex problems, as discussed later.
50 Summary
Considered Bayesian predictive inference for population quantities
Focused here on the population mean, but posterior distributions of more complex finite population quantities Q can also be derived
The key is to compute the posterior distribution of Q conditional on the data and model
Summarize the posterior distribution using the posterior mean, variance, HPD interval, etc.
Modern Bayesian analysis uses simulation techniques to study the posterior distribution
Models need to incorporate complex design features like unequal selection, stratification and clustering
More informationRonald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California
Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University
More informationWU Weiterbildung. Linear Mixed Models
Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes
More informationGeneralized Linear Models 1
Generalized Linear Models 1 STA 2101/442: Fall 2012 1 See last slide for copyright information. 1 / 24 Suggested Reading: Davison s Statistical models Exponential families of distributions Sec. 5.2 Chapter
More informationMultinomial Logistic Regression Models
Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word
More informationUNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationStat 5421 Lecture Notes Proper Conjugate Priors for Exponential Families Charles J. Geyer March 28, 2016
Stat 5421 Lecture Notes Proper Conjugate Priors for Exponential Families Charles J. Geyer March 28, 2016 1 Theory This section explains the theory of conjugate priors for exponential families of distributions,
More informationContents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1
Contents Preface to Second Edition Preface to First Edition Abbreviations xv xvii xix PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 1 The Role of Statistical Methods in Modern Industry and Services
More informationParametric Techniques
Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure
More informationLikelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square
Likelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square Yuxin Chen Electrical Engineering, Princeton University Coauthors Pragya Sur Stanford Statistics Emmanuel
More informationFirst Year Examination Department of Statistics, University of Florida
First Year Examination Department of Statistics, University of Florida August 19, 010, 8:00 am - 1:00 noon Instructions: 1. You have four hours to answer questions in this examination.. You must show your
More informationObjectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters
Objectives 10.1 Simple linear regression Statistical model for linear regression Estimating the regression parameters Confidence interval for regression parameters Significance test for the slope Confidence
More informationBasics of Modern Missing Data Analysis
Basics of Modern Missing Data Analysis Kyle M. Lang Center for Research Methods and Data Analysis University of Kansas March 8, 2013 Topics to be Covered An introduction to the missing data problem Missing
More information4.2 Estimation on the boundary of the parameter space
Chapter 4 Non-standard inference As we mentioned in Chapter the the log-likelihood ratio statistic is useful in the context of statistical testing because typically it is pivotal (does not depend on any
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the
More informationConfidence Distribution
Confidence Distribution Xie and Singh (2013): Confidence distribution, the frequentist distribution estimator of a parameter: A Review Céline Cunen, 15/09/2014 Outline of Article Introduction The concept
More informationEstimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk
Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:
More informationMonte Carlo Studies. The response in a Monte Carlo study is a random variable.
Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating
More information2 Statistical Estimation: Basic Concepts
Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof. N. Shimkin 2 Statistical Estimation:
More informationStatistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach
Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score
More informationStat 542: Item Response Theory Modeling Using The Extended Rank Likelihood
Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal
More informationThe binomial model. Assume a uniform prior distribution on p(θ). Write the pdf for this distribution.
The binomial model Example. After suspicious performance in the weekly soccer match, 37 mathematical sciences students, staff, and faculty were tested for the use of performance enhancing analytics. Let
More informationStat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2
Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate
More informationMaximum-Likelihood Estimation: Basic Ideas
Sociology 740 John Fox Lecture Notes Maximum-Likelihood Estimation: Basic Ideas Copyright 2014 by John Fox Maximum-Likelihood Estimation: Basic Ideas 1 I The method of maximum likelihood provides estimators
More informationTheory of Maximum Likelihood Estimation. Konstantin Kashin
Gov 2001 Section 5: Theory of Maximum Likelihood Estimation Konstantin Kashin February 28, 2013 Outline Introduction Likelihood Examples of MLE Variance of MLE Asymptotic Properties What is Statistical
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationSTA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3
STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae
More informationFREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE
FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE Donald A. Pierce Oregon State Univ (Emeritus), RERF Hiroshima (Retired), Oregon Health Sciences Univ (Adjunct) Ruggero Bellio Univ of Udine For Perugia
More informationNormal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,
Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability
More informationQuiz 1. Name: Instructions: Closed book, notes, and no electronic devices.
Quiz 1. Name: Instructions: Closed book, notes, and no electronic devices. 1.(10) What is usually true about a parameter of a model? A. It is a known number B. It is determined by the data C. It is an
More informationSPRING 2007 EXAM C SOLUTIONS
SPRING 007 EXAM C SOLUTIONS Question #1 The data are already shifted (have had the policy limit and the deductible of 50 applied). The two 350 payments are censored. Thus the likelihood function is L =
More informationPsychology 282 Lecture #4 Outline Inferences in SLR
Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations
More informationThe Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition.
Christian P. Robert The Bayesian Choice From Decision-Theoretic Foundations to Computational Implementation Second Edition With 23 Illustrations ^Springer" Contents Preface to the Second Edition Preface
More informationTesting Linear Restrictions: cont.
Testing Linear Restrictions: cont. The F-statistic is closely connected with the R of the regression. In fact, if we are testing q linear restriction, can write the F-stastic as F = (R u R r)=q ( R u)=(n
More informationSample size calculations for logistic and Poisson regression models
Biometrika (2), 88, 4, pp. 93 99 2 Biometrika Trust Printed in Great Britain Sample size calculations for logistic and Poisson regression models BY GWOWEN SHIEH Department of Management Science, National
More informationA Discussion of the Bayesian Approach
A Discussion of the Bayesian Approach Reference: Chapter 10 of Theoretical Statistics, Cox and Hinkley, 1974 and Sujit Ghosh s lecture notes David Madigan Statistics The subject of statistics concerns
More informationStatistical Distribution Assumptions of General Linear Models
Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions
More informationSTA6938-Logistic Regression Model
Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of
More information1 Exercises for lecture 1
1 Exercises for lecture 1 Exercise 1 a) Show that if F is symmetric with respect to µ, and E( X )
More informationNoninformative Priors for the Ratio of the Scale Parameters in the Inverted Exponential Distributions
Communications for Statistical Applications and Methods 03, Vol. 0, No. 5, 387 394 DOI: http://dx.doi.org/0.535/csam.03.0.5.387 Noninformative Priors for the Ratio of the Scale Parameters in the Inverted
More informationGeneralized Models: Part 1
Generalized Models: Part 1 Topics: Introduction to generalized models Introduction to maximum likelihood estimation Models for binary outcomes Models for proportion outcomes Models for categorical outcomes
More informationSession 3 The proportional odds model and the Mann-Whitney test
Session 3 The proportional odds model and the Mann-Whitney test 3.1 A unified approach to inference 3.2 Analysis via dichotomisation 3.3 Proportional odds 3.4 Relationship with the Mann-Whitney test Session
More informationDefine characteristic function. State its properties. State and prove inversion theorem.
ASSIGNMENT - 1, MAY 013. Paper I PROBABILITY AND DISTRIBUTION THEORY (DMSTT 01) 1. (a) Give the Kolmogorov definition of probability. State and prove Borel cantelli lemma. Define : (i) distribution function
More information