Biostat 2065 Analysis of Incomplete Data

Gong Tang, Dept of Biostatistics, University of Pittsburgh
October 20, 2005

1. Large-sample inference based on ML

Let $\hat\theta$ be the MLE. Large-sample theory implies that, approximately,

$$(\hat\theta - \theta) \sim N(0, C),$$

where $C = J^{-1}(\theta)$ is the inverse of the expected information. A natural estimate of $C$ is therefore $C_1 = J^{-1}(\hat\theta)$. An alternative estimate is $C_2 = I^{-1}(\hat\theta)$, the inverse of the observed information. When misspecification is plausible, a robust estimate of the variance is

$$C_3 = I^{-1}(\hat\theta)\, K(\hat\theta)\, I^{-1}(\hat\theta), \qquad \text{where} \qquad K(\theta) = \frac{\partial l(\theta)}{\partial \theta}\, \frac{\partial l(\theta)}{\partial \theta^{T}}.$$

In the Newton-Raphson (NR) algorithm, $C_2$ is computed as part of an iteration step. The components of $C_3$ can be obtained easily from the score and $C_2$ in the NR algorithm.

When the EM algorithm or one of its variants is used for ML estimation, additional steps are needed to compute standard errors of the estimates. For example, a further calculation of $C_2$, or Louis's formula, may be necessary.
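The three estimates can be compared on a toy problem. The sketch below assumes a Poisson model with made-up counts (neither is from the slides). It also assumes the usual sandwich construction, in which $K$ is accumulated as the sum over units of squared per-unit score contributions. The names `unit_scores`, `expected_info` and `observed_info` are hypothetical.

```python
# Toy comparison of C1, C2 and C3 for a Poisson(lam) sample.
# The data are made up for illustration; they are not from the slides.
x = [2, 4, 3, 7, 4]
n = len(x)
lam_hat = sum(x) / n                            # MLE of the Poisson mean

expected_info = n / lam_hat                     # J(lam_hat)
observed_info = sum(x) / lam_hat ** 2           # I(lam_hat) = -l''(lam_hat)

# Assumption: K is the sum of squared per-unit scores (sandwich "meat").
unit_scores = [xi / lam_hat - 1.0 for xi in x]
K = sum(s * s for s in unit_scores)

C1 = 1.0 / expected_info                        # inverse expected information
C2 = 1.0 / observed_info                        # inverse observed information
C3 = K / observed_info ** 2                     # I^{-1} K I^{-1} in the scalar case

print(C1, C2, C3)
```

For the Poisson model $C_1$ and $C_2$ coincide (both equal $\bar x / n$), while $C_3$ reduces to $\sum_i (x_i - \bar x)^2 / n^2$, which does not rely on the Poisson variance assumption.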

2. Supplemented EM algorithm

Supplemented EM (SEM) (Meng & Rubin, 1991) is a way to calculate the large-sample covariance matrix of $\hat\theta$ using only

1. code for the E and M steps of EM,
2. code for the large-sample complete-data variance-covariance matrix, $V_{com}$,
3. standard matrix operations.

Recall that

$$DM = I - i_{obs}\, i_{com}^{-1},$$

where $DM$ is the derivative of the EM mapping, $i_{com}$ is the complete information and $i_{obs} = I(\theta \mid Y_{obs})$ is the observed information. Therefore,

$$I^{-1}(\theta \mid Y_{obs}) = i_{obs}^{-1} = i_{com}^{-1}(I - DM)^{-1} = V_{com}(I - DM)^{-1}.$$

Denote $V_{obs} = I^{-1}(\theta \mid Y_{obs})$. Then

$$V_{obs} = V_{com}(I - DM + DM)(I - DM)^{-1} = V_{com}\{I + DM(I - DM)^{-1}\} := V_{com} + \Delta V,$$

where $\Delta V = V_{com}\, DM\, (I - DM)^{-1}$ is the increase in variance due to missing data.

Though the map $M$ has no explicit form, its derivative $DM$ can be effectively approximated using some extra EM steps.
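As a quick check of these identities, consider a scalar case ($d = 1$) with the hypothetical values $i_{com} = 10$ and $i_{obs} = 6$. The numbers are illustrative only and do not come from the slides.

```latex
\begin{align*}
DM &= 1 - \frac{i_{obs}}{i_{com}} = 1 - \frac{6}{10} = 0.4, \\
V_{com} &= i_{com}^{-1} = \frac{1}{10}, \\
V_{obs} &= V_{com}(1 - DM)^{-1} = \frac{1/10}{0.6} = \frac{1}{6} = i_{obs}^{-1}, \\
\Delta V &= V_{com}\, DM\, (1 - DM)^{-1} = \frac{1}{10}\cdot\frac{0.4}{0.6} = \frac{1}{15}, \\
V_{com} + \Delta V &= \frac{1}{10} + \frac{1}{15} = \frac{1}{6} = V_{obs}.
\end{align*}
```

In this scalar case $DM$ is the fraction of missing information, and the variance inflation factor is $(1 - DM)^{-1} = 1/0.6$.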

3. Implementation of SEM

First obtain the MLE $\hat\theta$, and then run a sequence of SEM iterations as follows. The initial value is close to (but not equal to) $\hat\theta$, and at step $t$ the current estimate is $\theta^{(t)}$.

1. Run the usual E and M steps to obtain $\theta^{(t+1)}$.
2. Fix $i = 1$ and calculate $\theta^{(t)}(i) = (\hat\theta_1, \ldots, \hat\theta_{i-1}, \theta_i^{(t)}, \hat\theta_{i+1}, \ldots, \hat\theta_d)$.
3. Treating $\theta^{(t)}(i)$ as the current estimate of $\theta$, run one iteration of EM to obtain $\tilde\theta^{(t+1)}(i)$.
4. Obtain the ratio $r_{ij}^{(t)} = \dfrac{\tilde\theta_j^{(t+1)}(i) - \hat\theta_j}{\theta_i^{(t)} - \hat\theta_i}$, for $j = 1, \ldots, d$.
5. Repeat steps 2 to 4 for $i = 2, \ldots, d$.

The output is $\theta^{(t+1)}$ and $\{r_{ij}^{(t)} : i, j = 1, \ldots, d\}$. As $t \to \infty$, the elements $r_{ij}^{(t)}$ converge, and the limit is an approximation of $DM$.
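The SEM steps can be sketched in plain Python. The example below is an illustration under stated assumptions, not the lecture's own code. It assumes a univariate normal sample with some values missing completely at random, made-up data, and a fixed number of SEM iterations rather than a convergence check. All function names (`em_step`, `sem_dm`, `inv2`, `matmul2`) are hypothetical.

```python
def em_step(theta, y_obs, n_total):
    """One EM iteration for N(mu, s2) when n_total - len(y_obs) values are missing."""
    mu, s2 = theta
    m = n_total - len(y_obs)
    # E step: expected complete-data sufficient statistics.
    s1 = sum(y_obs) + m * mu
    ss = sum(v * v for v in y_obs) + m * (mu * mu + s2)
    # M step: complete-data ML estimates from the expected statistics.
    mu_new = s1 / n_total
    return [mu_new, ss / n_total - mu_new ** 2]

def sem_dm(step, theta_hat, theta_start, n_iter=12):
    """Approximate DM by the ratios r_ij; row i is the perturbed component."""
    d = len(theta_hat)
    theta_t = list(theta_start)
    r = [[0.0] * d for _ in range(d)]
    for _ in range(n_iter):
        theta_next = step(theta_t)                 # step 1
        for i in range(d):                         # steps 2 to 5
            probe = list(theta_hat)
            probe[i] = theta_t[i]                  # only component i differs from the MLE
            moved = step(probe)                    # one EM iteration from the probe
            denom = theta_t[i] - theta_hat[i]
            for j in range(d):
                r[i][j] = (moved[j] - theta_hat[j]) / denom
        theta_t = theta_next
    return r

def inv2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

# Made-up data: 6 observed values out of n = 10.
y_obs = [1.0, 2.0, 4.0, 5.0, 3.0, 3.0]
n_total = 10
step = lambda th: em_step(th, y_obs, n_total)

theta_hat = [0.0, 1.0]
for _ in range(500):                               # run EM to convergence for the MLE
    theta_hat = step(theta_hat)

dm = sem_dm(step, theta_hat, [theta_hat[0] + 0.5, theta_hat[1] + 0.5])

# Large-sample complete-data variance of (mu, s2) for a normal sample of size n.
v_com = [[theta_hat[1] / n_total, 0.0], [0.0, 2.0 * theta_hat[1] ** 2 / n_total]]
i_minus_dm = [[1.0 - dm[0][0], -dm[0][1]], [-dm[1][0], 1.0 - dm[1][1]]]
v_obs = matmul2(v_com, inv2(i_minus_dm))           # V_obs = V_com (I - DM)^{-1}
print(dm, v_obs)
```

In this toy model the diagonal of `dm` approaches the fraction of missing values (4/10), so `v_obs` matches the variance based on the 6 observed values alone. Running too many SEM iterations makes the ratios numerically unstable, because numerator and denominator both shrink toward zero. A practical version would stop once the $r_{ij}^{(t)}$ stabilize.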

4. Other methods

Bootstrapping the observed data.
Bayesian methods: using the posterior variance under a flat prior.

5. The ECM algorithm

There are situations where the M step does not have an explicit solution even when the complete data are from an exponential family. Usually an iterative procedure is then required for each M step. Sometimes the M step can be replaced by several conditional maximization (CM) steps in order to avoid iterative M steps. This modification is called the ECM algorithm.

Suppose the parameter is $\theta = \{\theta_1, \theta_2, \ldots, \theta_S\}$ and the current estimate is $\theta^{(t)} = \{\theta_1^{(t)}, \ldots, \theta_S^{(t)}\}$. The M step then consists of $S$ CM steps:

1. At the $(t + 1/S)$th CM step, let $\theta = \{\theta_1, \theta_2^{(t)}, \ldots, \theta_S^{(t)}\}$ and maximize $Q(\theta; \theta^{(t)})$ with respect to $\theta_1$. Denote the maximizer by $\theta_1^{(t+1/S)}$.
2. Similarly, at the $(t + 2/S)$th CM step, let $\theta = \{\theta_1^{(t+1/S)}, \theta_2, \theta_3^{(t)}, \ldots, \theta_S^{(t)}\}$ and maximize $Q(\theta; \theta^{(t)})$ with respect to $\theta_2$. Denote the maximizer by $\theta_2^{(t+2/S)}$.
3. Repeat sequentially, maximizing the Q function with respect to $\theta_s$ with all other parameters fixed at their most recent values, for $s = 1, 2, \ldots, S$.

After all $S$ CM steps, $\theta^{(t+1)} = \{\theta_1^{(t+1/S)}, \ldots, \theta_S^{(t+S/S)}\}$ is the updated estimate of $\theta$ for the subsequent E step.

Since each CM step increases $Q$, ECM is a GEM algorithm and monotonically increases the likelihood of $\theta$. Under the same conditions that guarantee the convergence of EM, ECM converges to a stationary point of the likelihood, i.e., a solution to the score equation of $\theta$.

Example 8.6. A multivariate normal regression model with incomplete data.

Model: $y_i \sim N_K(X_i\beta, \Sigma)$, $i = 1, 2, \ldots, n$.

6. Univariate t with unknown degrees of freedom

Suppose that the observed data consist of a random sample $X = (x_1, x_2, \ldots, x_n)$ from a Student's t distribution with center $\mu$, scale parameter $\sigma$, and unknown degrees of freedom $\nu$, with density

$$f(x_i; \theta) = \frac{\Gamma(\nu/2 + 1/2)}{(\pi\nu\sigma^2)^{1/2}\, \Gamma(\nu/2)\, \{1 + (x_i - \mu)^2/(\nu\sigma^2)\}^{(\nu+1)/2}}.$$

An augmented complete dataset can be defined as $Y = (Y_{obs}, Y_{mis})$, where $Y_{obs} = X$ and $Y_{mis} = W = (w_1, w_2, \ldots, w_n)$ is a vector of unobserved positive quantities, such that the pairs $(w_i, x_i)$ are independent across units $i$, with distribution

$$(x_i \mid w_i; \theta) \sim N(\mu, \sigma^2/w_i), \qquad (w_i; \theta) \sim \chi^2_\nu/\nu.$$

The M step is complicated by the estimation of $\nu$. It can be replaced by two CM steps:

1. CM1: For current parameters $\theta^{(t)} = (\mu^{(t)}, \sigma^{(t)}, \nu^{(t)})$, maximize $Q$ with respect to $(\mu, \sigma)$ with $\nu$ fixed at $\nu^{(t)}$. This yields $(\mu, \sigma) = (\mu^{(t+1)}, \sigma^{(t+1)})$.
2. CM2: Maximize $Q$ with respect to $\nu$ with $(\mu, \sigma) = (\mu^{(t+1)}, \sigma^{(t+1)})$.

The complete-data loglikelihood is

$$l(\mu, \sigma^2, \nu; Y) = -\frac{n}{2}\log\sigma^2 - \frac{1}{2}\sum_{i=1}^{n}\frac{w_i(x_i - \mu)^2}{\sigma^2} + \frac{n\nu}{2}\log\frac{\nu}{2} - n\log\Gamma\!\left(\frac{\nu}{2}\right) + \left(\frac{\nu}{2} - 1\right)\sum_{i=1}^{n}\log w_i - \frac{\nu}{2}\sum_{i=1}^{n} w_i.$$

Sufficient statistics are... Since $\nu$ is a scalar, the maximizer $\nu^{(t+1)}$ can be found by an iterative one-dimensional search.
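The two CM steps for the t model can be sketched as follows. This is a minimal illustration under stated assumptions, not the lecture's implementation. The E step uses the standard conditional moments $E(w_i \mid x_i) = (\nu + 1)/(\nu + d_i^2)$ and $E(\log w_i \mid x_i) = \psi((\nu+1)/2) - \log((\nu + d_i^2)/2)$, with $d_i^2 = (x_i - \mu)^2/\sigma^2$. The one-dimensional search is a golden-section search over an assumed range $[0.1, 200]$ for $\nu$, the digamma function is a hand-written approximation, and the data are made up. All function names are hypothetical.

```python
import math

def digamma(x):
    """Digamma via the recurrence psi(x) = psi(x + 1) - 1/x and an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))

def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                      # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
        else:                            # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
    return 0.5 * (a + b)

def obs_loglik(x, mu, s2, nu):
    """Observed-data loglikelihood of the t model."""
    const = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2) - 0.5 * math.log(math.pi * nu * s2)
    return len(x) * const - (nu + 1) / 2 * sum(math.log(1 + (v - mu) ** 2 / (nu * s2)) for v in x)

def ecm_t(x, mu, s2, nu, n_iter=50):
    n = len(x)
    path = [obs_loglik(x, mu, s2, nu)]
    for _ in range(n_iter):
        # E step at the current parameters.
        d2 = [(v - mu) ** 2 / s2 for v in x]
        w = [(nu + 1) / (nu + d) for d in d2]
        logw = [digamma((nu + 1) / 2) - math.log((nu + d) / 2) for d in d2]
        # CM1: weighted mean and scale with nu held fixed.
        mu = sum(wi * v for wi, v in zip(w, x)) / sum(w)
        s2 = sum(wi * (v - mu) ** 2 for wi, v in zip(w, x)) / n
        # CM2: one-dimensional search over nu on the part of Q that depends on nu.
        c = sum(logw) - sum(w)
        q_nu = lambda a: n * (a / 2 * math.log(a / 2) - math.lgamma(a / 2)) + a / 2 * c
        nu = golden_max(q_nu, 0.1, 200.0)
        path.append(obs_loglik(x, mu, s2, nu))
    return mu, s2, nu, path

# Made-up sample with two outlying values.
x = [-1.2, 0.3, 0.5, -0.4, 1.1, 0.9, -0.7, 0.2, 6.5, -5.8, 0.0, 0.8, -0.3, 1.4, -1.0, 0.6]
mu_hat, s2_hat, nu_hat, loglik_path = ecm_t(x, 0.0, 1.0, 5.0)
print(mu_hat, s2_hat, nu_hat)
```

The part of $Q$ that depends on $\nu$ is concave, so a bracketing search such as golden section is adequate for CM2. The recorded `loglik_path` should be non-decreasing, which is a convenient check on an implementation.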

7. ECME algorithm

The ECME (Expectation/Conditional Maximization Either) algorithm replaces some of the CM steps of ECM, which maximize the constrained expected complete-data loglikelihood function (the Q function), with steps that maximize the correspondingly constrained actual likelihood function.

1. ECME shares with EM and ECM their stable monotone convergence and simplicity of implementation.
2. ECME can have a substantially faster convergence rate than EM or ECM:
   (a) In some of ECME's maximization steps, the actual likelihood (rather than an approximation of it) is being conditionally maximized.
   (b) ECME allows faster computation with constrained maximization.

Example 8.9. Univariate t with unknown degrees of freedom. An ECME algorithm is obtained by retaining the E and CM1 steps of the previous example, but replacing the CM2 step by maximizing the observed loglikelihood with respect to $\nu$.
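A minimal ECME sketch for Example 8.9 follows. As assumptions of this illustration: the data are made up, $\nu$ is searched over $[0.1, 200]$ by golden section, and a candidate $\nu$ is accepted only if it does not lower the observed loglikelihood. That guard is added here because the observed loglikelihood is not guaranteed to be unimodal in $\nu$. The function names are hypothetical.

```python
import math

def obs_loglik(x, mu, s2, nu):
    """Observed-data loglikelihood of the t model."""
    const = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2) - 0.5 * math.log(math.pi * nu * s2)
    return len(x) * const - (nu + 1) / 2 * sum(math.log(1 + (v - mu) ** 2 / (nu * s2)) for v in x)

def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search for a (local) maximizer of f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                      # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
        else:                            # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
    return 0.5 * (a + b)

def ecme_t(x, mu, s2, nu, n_iter=50, nu_lo=0.1, nu_hi=200.0):
    n = len(x)
    path = [obs_loglik(x, mu, s2, nu)]
    for _ in range(n_iter):
        # E step: only the conditional means of the weights are needed here.
        w = [(nu + 1.0) / (nu + (v - mu) ** 2 / s2) for v in x]
        # CM1 (unchanged from ECM): update center and scale with nu fixed.
        mu = sum(wi * v for wi, v in zip(w, x)) / sum(w)
        s2 = sum(wi * (v - mu) ** 2 for wi, v in zip(w, x)) / n
        # CM2 (ECME): maximize the observed loglikelihood over nu directly.
        cand = golden_max(lambda a: obs_loglik(x, mu, s2, a), nu_lo, nu_hi)
        if obs_loglik(x, mu, s2, cand) >= obs_loglik(x, mu, s2, nu):
            nu = cand                    # accept only if the likelihood does not drop
        path.append(obs_loglik(x, mu, s2, nu))
    return mu, s2, nu, path

# Made-up sample with two outlying values.
x = [-1.2, 0.3, 0.5, -0.4, 1.1, 0.9, -0.7, 0.2, 6.5, -5.8, 0.0, 0.8, -0.3, 1.4, -1.0, 0.6]
mu_hat, s2_hat, nu_hat, loglik_path = ecme_t(x, 0.0, 1.0, 5.0)
print(mu_hat, s2_hat, nu_hat)
```

Because the $\nu$ update works on the observed loglikelihood, this version needs no $E(\log w_i \mid x_i)$ term and hence no digamma evaluation in the E step.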


More information

Maximum Likelihood Estimation; Robust Maximum Likelihood; Missing Data with Maximum Likelihood

Maximum Likelihood Estimation; Robust Maximum Likelihood; Missing Data with Maximum Likelihood Maximum Likelihood Estimation; Robust Maximum Likelihood; Missing Data with Maximum Likelihood PRE 906: Structural Equation Modeling Lecture #3 February 4, 2015 PRE 906, SEM: Estimation Today s Class An

More information

Greene, Econometric Analysis (6th ed, 2008)

Greene, Econometric Analysis (6th ed, 2008) EC771: Econometrics, Spring 2010 Greene, Econometric Analysis (6th ed, 2008) Chapter 17: Maximum Likelihood Estimation The preferred estimator in a wide variety of econometric settings is that derived

More information

analysis of incomplete data in statistical surveys

analysis of incomplete data in statistical surveys analysis of incomplete data in statistical surveys Ugo Guarnera 1 1 Italian National Institute of Statistics, Italy guarnera@istat.it Jordan Twinning: Imputation - Amman, 6-13 Dec 2014 outline 1 origin

More information

CS Lecture 18. Expectation Maximization

CS Lecture 18. Expectation Maximization CS 6347 Lecture 18 Expectation Maximization Unobserved Variables Latent or hidden variables in the model are never observed We may or may not be interested in their values, but their existence is crucial

More information

1 Mixed effect models and longitudinal data analysis

1 Mixed effect models and longitudinal data analysis 1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between

More information

Parameter Estimation in the Spatio-Temporal Mixed Effects Model Analysis of Massive Spatio-Temporal Data Sets

Parameter Estimation in the Spatio-Temporal Mixed Effects Model Analysis of Massive Spatio-Temporal Data Sets Parameter Estimation in the Spatio-Temporal Mixed Effects Model Analysis of Massive Spatio-Temporal Data Sets Matthias Katzfuß Advisor: Dr. Noel Cressie Department of Statistics The Ohio State University

More information

Unsupervised Learning

Unsupervised Learning 2018 EE448, Big Data Mining, Lecture 7 Unsupervised Learning Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html ML Problem Setting First build and

More information

A Bayesian Treatment of Linear Gaussian Regression

A Bayesian Treatment of Linear Gaussian Regression A Bayesian Treatment of Linear Gaussian Regression Frank Wood December 3, 2009 Bayesian Approach to Classical Linear Regression In classical linear regression we have the following model y β, σ 2, X N(Xβ,

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

ECE 275B Homework #2 Due Thursday MIDTERM is Scheduled for Tuesday, February 21, 2012

ECE 275B Homework #2 Due Thursday MIDTERM is Scheduled for Tuesday, February 21, 2012 Reading ECE 275B Homework #2 Due Thursday 2-16-12 MIDTERM is Scheduled for Tuesday, February 21, 2012 Read and understand the Newton-Raphson and Method of Scores MLE procedures given in Kay, Example 7.11,

More information

Computer Vision Group Prof. Daniel Cremers. 6. Mixture Models and Expectation-Maximization

Computer Vision Group Prof. Daniel Cremers. 6. Mixture Models and Expectation-Maximization Prof. Daniel Cremers 6. Mixture Models and Expectation-Maximization Motivation Often the introduction of latent (unobserved) random variables into a model can help to express complex (marginal) distributions

More information

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models

A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models Jeff A. Bilmes (bilmes@cs.berkeley.edu) International Computer Science Institute

More information

Yaming Yu Department of Statistics, University of California, Irvine Xiao-Li Meng Department of Statistics, Harvard University.

Yaming Yu Department of Statistics, University of California, Irvine Xiao-Li Meng Department of Statistics, Harvard University. Appendices to To Center or Not to Center: That is Not the Question An Ancillarity-Sufficiency Interweaving Strategy (ASIS) for Boosting MCMC Efficiency Yaming Yu Department of Statistics, University of

More information

Last updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

Last updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition Last updated: Oct 22, 2012 LINEAR CLASSIFIERS Problems 2 Please do Problem 8.3 in the textbook. We will discuss this in class. Classification: Problem Statement 3 In regression, we are modeling the relationship

More information

Lecture 4: Types of errors. Bayesian regression models. Logistic regression

Lecture 4: Types of errors. Bayesian regression models. Logistic regression Lecture 4: Types of errors. Bayesian regression models. Logistic regression A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting more generally COMP-652 and ECSE-68, Lecture

More information

Bayesian inference for factor scores

Bayesian inference for factor scores Bayesian inference for factor scores Murray Aitkin and Irit Aitkin School of Mathematics and Statistics University of Newcastle UK October, 3 Abstract Bayesian inference for the parameters of the factor

More information

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1. Problem 1 (21 points) An economist runs the regression y i = β 0 + x 1i β 1 + x 2i β 2 + x 3i β 3 + ε i (1) The results are summarized in the following table: Equation 1. Variable Coefficient Std. Error

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

A STRATEGY FOR STEPWISE REGRESSION PROCEDURES IN SURVIVAL ANALYSIS WITH MISSING COVARIATES. by Jia Li B.S., Beijing Normal University, 1998

A STRATEGY FOR STEPWISE REGRESSION PROCEDURES IN SURVIVAL ANALYSIS WITH MISSING COVARIATES. by Jia Li B.S., Beijing Normal University, 1998 A STRATEGY FOR STEPWISE REGRESSION PROCEDURES IN SURVIVAL ANALYSIS WITH MISSING COVARIATES by Jia Li B.S., Beijing Normal University, 1998 Submitted to the Graduate Faculty of the Graduate School of Public

More information

GARCH Models Estimation and Inference

GARCH Models Estimation and Inference GARCH Models Estimation and Inference Eduardo Rossi University of Pavia December 013 Rossi GARCH Financial Econometrics - 013 1 / 1 Likelihood function The procedure most often used in estimating θ 0 in

More information