Quasi-likelihood Scan Statistics for Detection of

Size: px
Start display at page:

Download "Quasi-likelihood Scan Statistics for Detection of"

Transcription

1 for Quasi-likelihood for Division of Biostatistics and Bioinformatics, National Health Research Institutes & Department of Mathematics, National Chung Cheng University 17 December / 25

2 Outline for / 25

3 for Motivation: A study of enterovirus pattern in Taiwan. Collected by CDC of Taiwan from 1999 to In 2003, cases reported in 89 administrative regions of the North Taiwan. Would temperature/humidity affect the disease rate? How to identify geographic clusters incorporated with spatial correlation and environmental covariates? 3 / 25

4 Enterovirus Data for (a) Disease Rate Missing Figur: (a) Grey-scale map of the observed enterovirus rates. A region with a slash mark means missing value. (b) Locations of estimated clusters by the QL scan statistic. 4 / 25

5 Literature Review for Two main methodologies to find geographic clusters: Likelihood ratio tests for localized clusters (e.g. Chan, 2009, Annals of ). Disease mapping on a hierarchical Bayesian model (e.g. Gangnon and Clayton, 2003, in Medicine). However, current methods could neither straight incorporate ecological covariates nor account for spatial correlation (Zhang and Lin, 2009, Biometrics). We explore a scan method based on quasi-likelihood functions for localized clusters. 5 / 25

6 for Basic concept: moving windows across space or time. Each distinct window induces a statistical model: difference in risk inside v.s. outside window Many potential choices for windows in space: circles, ellipses, arbitrary shapes We use circular windows for power consideration. 6 / 25

7 Potential Clusters for We use centroid s k to denote region k. For each s k, we sort the distances from s k to other centroids, say r k,(1) < r k,(2) < < r k,(mk ). Then B{s k, r k,(j) } is a potential cluster, j = 1,..., m k. Each potential cluster contains at most 35% of the total population. 7 / 25

8 Latent Process for Y i : responses correlated through a latent Gaussian process ɛ. Y i ɛ i Poisson(θi ), where { θ i = E i exp x iβ + } K ξ k δ Bk (s i ) + ɛ i. (1) k=1 B k and ξ k denote the kth cluster and associated coefficient, and x i and β correspond to covariates. {( ) ( 0 σ 2 σ (ɛ i, ɛ j ) N, 2 )} ρ i,j 0 σ 2 ρ i,j σ 2. 8 / 25

9 Linear Mixed Model for By moment generating functions, (1) becomes θ i = E(Y i ) = E i exp { 0.5σ 2 + x iβ + with a covariance structure given by } K ξ k δ Bk (s i ) k=1 var(y i ) = θ i + θ 2 i {exp(σ 2 ) 1}, (2) cov(y i, Y j ) = θ i θ j {exp(σ 2 ρ i,j ) 1}. (3) In (2), ξ i could be positive or negative. So we can detect hot-spot and cool-spot clusters. 9 / 25

10 Two-stage Estimation for To estimate parameters, the first step is to find a working correlation model: 1. Obtain an initial estimate { ˆβ (0), ˆξ (0) } by the traditional GLM method. 2. Let ˆɛ (0) i = log(y i ) log(ˆθ (0) i ), and obtain ˆσ (1) and ˆρ (1) i,j by variogram. 3. Plug ˆσ and ˆρ (1) i,j to (3) to get a working covariance matrix. 10 / 25

11 Quasi-likelihood Estimating Equation for The second estimation step is to use a QL estimating equation Q( ˆβ, ˆξ; ˆσ, ˆρ i,j ) = D ( ˆβ, ˆξ; ˆσ)V 1 ( ˆβ, ˆξ; ˆσ, ˆρ i,j ){Y θ( ˆβ, ˆξ; ˆσ)} = 0. We derive a new estimate by a Newton-Raphson iteration { ˆβ(k+1), ˆξ (k+1)} { = ˆβ(k), ˆξ (k)} + [ (D V 1 D) 1 D V 1 {Y θ(β, ξ; ˆσ)} ] {, ˆβ(k), ˆξ (k),ˆσ,ˆρ i,j } 11 / 25

12 Asymptotic Properties for Assumption: (a) Number of misspecified regions to a cluster is o(n). (b) Working correlation models ρ i,j = o(h d ). Theorem: Under the assumptions, the QL estimates ( ˆβ, ˆξ) are consistent. Furthermore, if n 1/2 θ θ 1 0 uniformly, then n 1/2 {( ˆβ, ˆξ) (β, ξ)} N{0, I 1 1 (β )I 0 (θ )I 1 1 (θ )} 12 / 25

13 Multiple Testing Problem for Let H denote the null hypothesis of no geographic cluster. When using the QL scan statistic, we would face a multiple testing problem. We adapt a parametric bootstrapping approach to find a correction p-value p / 25

14 Parametric Bootstrapping for (a) We compute E i = n i ˆp, where ˆp = Y i / n i, and fit a null marginal model θ i = E i exp(β 0 + β 1 x i ). Let ( β 0, β 1 ) and ( σ, ρ i,j ) be the two-stage estimates. (b) We simulate Y (j) from models (2) and (3) based on ( β 0, β 1 ) and ( σ, ρ i,j ), j = 1,..., n 0. (c) For the jth simulated data, we use the QL scan statistic get p-values of all potential clusters. 14 / 25

15 Estimated Clusters for (d) Let ˆp (j) be the smallest p-value for the jth simulated data, j = 1,..., n 0. (e) We sort P = {ˆp (1),..., ˆp (n 0) } and set p 0 to be the 95% quantile of P. (f) For the real data, we compare ˆp i,(j) and p 0. A potential cluster B{s i, r i,(j) } enters the final screening if ˆp i,(j) is less than p / 25

16 Quasi-deviance Criteria for Let B = { ˆB 1,..., ˆB m} be a set of estimated clusters from bootstrapping. Some of them would have overlapping regions. For the overlapping clusters, we screen them with a quasi-deviance (Lin, 2010) D(ˆξ i, ˆξ j ) = 0.5{θ(ˆξ i ; ) θ(ˆξ j ; )} T [ ] V 1 (ˆξ i ; ){Y θ(ˆξ i )} + V 1 (ˆξ j ; ){Y θ(ˆξ j )}. 16 / 25

17 Cluster Reformation for Using the quasi-deviance criteria to overlapping clusters, we discard ˆB j if D(ˆξ i, ˆξ j ) > χ2 1,0.95, or, combine ˆB i and ˆB j if D(ˆξ i, ˆξ j ) χ2 1,0.95, and sequentially screen clusters from the one with a smallest p-value to the one with a largest p-value. For the final K disjoint clusters ˆB 1,..., ˆB K, the two-stage estimation procedure gives { } ˆθ i = E i exp 0.5ˆσ 2 + x ˆβ K i + ˆξ k δ ˆBk (s i ). k=1 17 / 25

18 for Scenario: ɛ: GRF with mean 0, σ 2 = 0.25 and ρ = 0.7. Y i ɛ i : Poisson with intensity θ i = exp{β 0 + ξ 1 δ B1 (s i ) + ɛ i }. parameters: β 0 = 0.6, ξ = 0, 0.5, 0.75, 1.0, or 1.5. cluster: B 1 is a cluster centered at Zhong-li city with 15 adjacent regions. 18 / 25

19 Kulldorff s LR Test for When no covariates, Kulldorff (1997) suggested to take a maximum value of likelihood ratio over all possible clusters. That is, where LR max = max k,j LR k,rk,(j), LR k,rk,(j) = sup L(Θ 1 y)/ sup L(Θ 0 y). We compare the QL scan statistic with LR. 19 / 25

20 Comparison of Power Functions for Figur: A comparison of power based on 1000 simulations. (a) The pre-specified cluster. (b) The estimated power functions. (a) (b) Latitude % Pop 15 Regions power QL LR Longitude coefficient 20 / 25

21 for Tabel: Conditional sample means and standard errors for QL and GLM scan statistics. QL GLM ˆβ 1 ˆβ2 ˆξ1 ˆξ2 ˆβ1 ˆβ2 ˆξ1 ˆξ2 (β1, β 2, ξ 1, ξ 2 ) = ( 0.475, 0.5, 0.5, 0.5) Estimate Asym SE (0.189) (0.266) (0.365) (0.360) (0.132) (0.284) (0.441) (0.432) Sample SE (0.158) (0.254) (0.330) (0.328) (0.115) (0.253) (0.416) (0.377) (β1, β 2, ξ 1, ξ 2 ) = ( 0.475, 0.5, 1.0, 1.0) Estimate Asym SE (0.195) (0.248) (0.348) (0.336) (0.156) (0.291) (0.379) (0.348) Sample SE (0.156) (0.231) (0.332) (0.321) (0.110) (0.262) (0.295) (0.318) (β1, β 2, ξ 1, ξ 2 ) = ( 0.475, 0.5, 1.5, 1.5) Estimate Asym SE (0.206) (0.232) (0.357) (0.332) (0.306) (0.337) (0.323) (0.385) Sample SE (0.196) (0.245) (0.322) (0.310) (0.293) (0.298) (0.272) (0.390) 21 / 25

22 Data for 2003 North Taiwan Data: 21 regions without observations. Environmental covariates temperature and humidity were standardized to be in (-1,1). We use the QL scan statistic for the 2003 data to get Parameter Estimate Std. Err p-value Intercept Temp Humidity Cluster Cluster / 25

23 Data for The estimated model for the disease rate of the ith region is ˆθ i =0.0032n i exp{14.24(temp) i (humidity i ) 0.75δ B1 (s i ) δ B2 (s i )}, with a correlation estimate ˆρ i,j = 0.94 exp( s i s j /3.32) and ˆσ 2 = / 25

24 Result for (a) Disease Rate Missing Figur: (a) Grey-scale map of the observed enterovirus rates. A region with a slash mark means missing value. (b) Locations of estimated clusters by the QL scan statistic. 24 / 25

25 Future Work for Variogram may not work for some potential cluster models. Non-stationary correlation models for latent processes. Add temporal random effects for a Bayesian dynamic linear model. 25 / 25

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

Cluster Detection Based on Spatial Associations and Iterated Residuals in Generalized Linear Mixed Models

Cluster Detection Based on Spatial Associations and Iterated Residuals in Generalized Linear Mixed Models Biometrics 65, 353 360 June 2009 DOI: 10.1111/j.1541-0420.2008.01069.x Cluster Detection Based on Spatial Associations and Iterated Residuals in Generalized Linear Mixed Models Tonglin Zhang 1, and Ge

More information

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III)

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III) Title: Spatial Statistics for Point Processes and Lattice Data (Part III) Lattice Data Tonglin Zhang Outline Description Research Problems Global Clustering and Local Clusters Permutation Test Spatial

More information

Model comparison and selection

Model comparison and selection BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)

More information

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Put your solution to each problem on a separate sheet of paper. Problem 1. (5166) Assume that two random samples {x i } and {y i } are independently

More information

Statistical Inference of Covariate-Adjusted Randomized Experiments

Statistical Inference of Covariate-Adjusted Randomized Experiments 1 Statistical Inference of Covariate-Adjusted Randomized Experiments Feifang Hu Department of Statistics George Washington University Joint research with Wei Ma, Yichen Qin and Yang Li Email: feifang@gwu.edu

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle Statistical Analysis of Spatio-temporal Point Process Data Peter J Diggle Department of Medicine, Lancaster University and Department of Biostatistics, Johns Hopkins University School of Public Health

More information

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown. Weighting We have seen that if E(Y) = Xβ and V (Y) = σ 2 G, where G is known, the model can be rewritten as a linear model. This is known as generalized least squares or, if G is diagonal, with trace(g)

More information

1 Mixed effect models and longitudinal data analysis

1 Mixed effect models and longitudinal data analysis 1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between

More information

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P. Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk

More information

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP The IsoMAP uses the multiple linear regression and geostatistical methods to analyze isotope data Suppose the response variable

More information

Poisson Regression. Gelman & Hill Chapter 6. February 6, 2017

Poisson Regression. Gelman & Hill Chapter 6. February 6, 2017 Poisson Regression Gelman & Hill Chapter 6 February 6, 2017 Military Coups Background: Sub-Sahara Africa has experienced a high proportion of regime changes due to military takeover of governments for

More information

Gauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA

Gauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA JAPANESE BEETLE DATA 6 MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA Gauge Plots TuscaroraLisa Central Madsen Fairways, 996 January 9, 7 Grubs Adult Activity Grub Counts 6 8 Organic Matter

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

Administration. Homework 1 on web page, due Feb 11 NSERC summer undergraduate award applications due Feb 5 Some helpful books

Administration. Homework 1 on web page, due Feb 11 NSERC summer undergraduate award applications due Feb 5 Some helpful books STA 44/04 Jan 6, 00 / 5 Administration Homework on web page, due Feb NSERC summer undergraduate award applications due Feb 5 Some helpful books STA 44/04 Jan 6, 00... administration / 5 STA 44/04 Jan 6,

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data 1 Part III. Hypothesis Testing III.1. Log-rank Test for Right-censored Failure Time Data Consider a survival study consisting of n independent subjects from p different populations with survival functions

More information

One-stage dose-response meta-analysis

One-stage dose-response meta-analysis One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and

More information

Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Count Data

Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Count Data Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Count Data Eduardo Elias Ribeiro Junior 1 2 Walmes Marques Zeviani 1 Wagner Hugo Bonat 1 Clarice Garcia

More information

Bayesian Hierarchical Models

Bayesian Hierarchical Models Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology

More information

Model Selection for Semiparametric Bayesian Models with Application to Overdispersion

Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS020) p.3863 Model Selection for Semiparametric Bayesian Models with Application to Overdispersion Jinfang Wang and

More information

STAT 526 Spring Midterm 1. Wednesday February 2, 2011

STAT 526 Spring Midterm 1. Wednesday February 2, 2011 STAT 526 Spring 2011 Midterm 1 Wednesday February 2, 2011 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points

More information

Scalable Bayesian Event Detection and Visualization

Scalable Bayesian Event Detection and Visualization Scalable Bayesian Event Detection and Visualization Daniel B. Neill Carnegie Mellon University H.J. Heinz III College E-mail: neill@cs.cmu.edu This work was partially supported by NSF grants IIS-0916345,

More information

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,

More information

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)

More information

Chapter 7. Hypothesis Testing

Chapter 7. Hypothesis Testing Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function

More information

Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation

Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation Dimitris Rizopoulos Department of Biostatistics, Erasmus University Medical Center, the Netherlands d.rizopoulos@erasmusmc.nl

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized Linear Mixed Models for Count Data

Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized Linear Mixed Models for Count Data Sankhyā : The Indian Journal of Statistics 2009, Volume 71-B, Part 1, pp. 55-78 c 2009, Indian Statistical Institute Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized

More information

Regularization in Cox Frailty Models

Regularization in Cox Frailty Models Regularization in Cox Frailty Models Andreas Groll 1, Trevor Hastie 2, Gerhard Tutz 3 1 Ludwig-Maximilians-Universität Munich, Department of Mathematics, Theresienstraße 39, 80333 Munich, Germany 2 University

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models John M. Neuhaus Charles E. McCulloch Division of Biostatistics University of California, San

More information

Likelihood-Based Methods

Likelihood-Based Methods Likelihood-Based Methods Handbook of Spatial Statistics, Chapter 4 Susheela Singh September 22, 2016 OVERVIEW INTRODUCTION MAXIMUM LIKELIHOOD ESTIMATION (ML) RESTRICTED MAXIMUM LIKELIHOOD ESTIMATION (REML)

More information

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data Hierarchical models for spatial data Based on the book by Banerjee, Carlin and Gelfand Hierarchical Modeling and Analysis for Spatial Data, 2004. We focus on Chapters 1, 2 and 5. Geo-referenced data arise

More information

Statistics Ph.D. Qualifying Exam: Part II November 9, 2002

Statistics Ph.D. Qualifying Exam: Part II November 9, 2002 Statistics Ph.D. Qualifying Exam: Part II November 9, 2002 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. 1 2 3 4 5 6 7 8 9 10 11 12 2. Write your

More information

Logistic regression model for survival time analysis using time-varying coefficients

Logistic regression model for survival time analysis using time-varying coefficients Logistic regression model for survival time analysis using time-varying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshima-u.ac.jp Research

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

A stationarity test on Markov chain models based on marginal distribution

A stationarity test on Markov chain models based on marginal distribution Universiti Tunku Abdul Rahman, Kuala Lumpur, Malaysia 646 A stationarity test on Markov chain models based on marginal distribution Mahboobeh Zangeneh Sirdari 1, M. Ataharul Islam 2, and Norhashidah Awang

More information

Generalized Estimating Equations

Generalized Estimating Equations Outline Review of Generalized Linear Models (GLM) Generalized Linear Model Exponential Family Components of GLM MLE for GLM, Iterative Weighted Least Squares Measuring Goodness of Fit - Deviance and Pearson

More information

MIT Spring 2016

MIT Spring 2016 Generalized Linear Models MIT 18.655 Dr. Kempthorne Spring 2016 1 Outline Generalized Linear Models 1 Generalized Linear Models 2 Generalized Linear Model Data: (y i, x i ), i = 1,..., n where y i : response

More information

Score test for random changepoint in a mixed model

Score test for random changepoint in a mixed model Score test for random changepoint in a mixed model Corentin Segalas and Hélène Jacqmin-Gadda INSERM U1219, Biostatistics team, Bordeaux GDR Statistiques et Santé October 6, 2017 Biostatistics 1 / 27 Introduction

More information

Statistical Data Mining and Machine Learning Hilary Term 2016

Statistical Data Mining and Machine Learning Hilary Term 2016 Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes

More information

Bayesian Linear Regression

Bayesian Linear Regression Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective

More information

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Modification and Improvement of Empirical Likelihood for Missing Response Problem UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu

More information

An application of the GAM-PCA-VAR model to respiratory disease and air pollution data

An application of the GAM-PCA-VAR model to respiratory disease and air pollution data An application of the GAM-PCA-VAR model to respiratory disease and air pollution data Márton Ispány 1 Faculty of Informatics, University of Debrecen Hungary Joint work with Juliana Bottoni de Souza, Valdério

More information

Optimal exact tests for complex alternative hypotheses on cross tabulated data

Optimal exact tests for complex alternative hypotheses on cross tabulated data Optimal exact tests for complex alternative hypotheses on cross tabulated data Daniel Yekutieli Statistics and OR Tel Aviv University CDA course 29 July 2017 Yekutieli (TAU) Optimal exact tests for complex

More information

Classification. Chapter Introduction. 6.2 The Bayes classifier

Classification. Chapter Introduction. 6.2 The Bayes classifier Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject

More information

A Bivariate Weibull Regression Model

A Bivariate Weibull Regression Model c Heldermann Verlag Economic Quality Control ISSN 0940-5151 Vol 20 (2005), No. 1, 1 A Bivariate Weibull Regression Model David D. Hanagal Abstract: In this paper, we propose a new bivariate Weibull regression

More information

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University REGRESSION ITH SPATIALL MISALIGNED DATA Lisa Madsen Oregon State University David Ruppert Cornell University SPATIALL MISALIGNED DATA 10 X X X X X X X X 5 X X X X X 0 X 0 5 10 OUTLINE 1. Introduction 2.

More information

BIOS 312: Precision of Statistical Inference

BIOS 312: Precision of Statistical Inference and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample

More information

Two Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 26 May :00 16:00

Two Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 26 May :00 16:00 Two Hours MATH38052 Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER GENERALISED LINEAR MODELS 26 May 2016 14:00 16:00 Answer ALL TWO questions in Section

More information

Bayesian spatial quantile regression

Bayesian spatial quantile regression Brian J. Reich and Montserrat Fuentes North Carolina State University and David B. Dunson Duke University E-mail:reich@stat.ncsu.edu Tropospheric ozone Tropospheric ozone has been linked with several adverse

More information

Spatial Modeling and Prediction of County-Level Employment Growth Data

Spatial Modeling and Prediction of County-Level Employment Growth Data Spatial Modeling and Prediction of County-Level Employment Growth Data N. Ganesh Abstract For correlated sample survey estimates, a linear model with covariance matrix in which small areas are grouped

More information

STAT 512 sp 2018 Summary Sheet

STAT 512 sp 2018 Summary Sheet STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}

More information

Point process models for earthquakes with applications to Groningen and Kashmir data

Point process models for earthquakes with applications to Groningen and Kashmir data Point process models for earthquakes with applications to Groningen and Kashmir data Marie-Colette van Lieshout colette@cwi.nl CWI & Twente The Netherlands Point process models for earthquakes with applications

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Sample Size and Power Considerations for Longitudinal Studies

Sample Size and Power Considerations for Longitudinal Studies Sample Size and Power Considerations for Longitudinal Studies Outline Quantities required to determine the sample size in longitudinal studies Review of type I error, type II error, and power For continuous

More information

Aggregated cancer incidence data: spatial models

Aggregated cancer incidence data: spatial models Aggregated cancer incidence data: spatial models 5 ième Forum du Cancéropôle Grand-est - November 2, 2011 Erik A. Sauleau Department of Biostatistics - Faculty of Medicine University of Strasbourg ea.sauleau@unistra.fr

More information

Composite likelihood and two-stage estimation in family studies

Composite likelihood and two-stage estimation in family studies Biostatistics (2004), 5, 1,pp. 15 30 Printed in Great Britain Composite likelihood and two-stage estimation in family studies ELISABETH WREFORD ANDERSEN The Danish Epidemiology Science Centre, Statens

More information

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore

More information

A brief introduction to mixed models

A brief introduction to mixed models A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.

More information

Estimating AR/MA models

Estimating AR/MA models September 17, 2009 Goals The likelihood estimation of AR/MA models AR(1) MA(1) Inference Model specification for a given dataset Why MLE? Traditional linear statistics is one methodology of estimating

More information

Linear Methods for Prediction

Linear Methods for Prediction This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Various types of likelihood

Various types of likelihood Various types of likelihood 1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood 2. semi-parametric likelihood, partial likelihood 3. empirical likelihood,

More information

Some Curiosities Arising in Objective Bayesian Analysis

Some Curiosities Arising in Objective Bayesian Analysis . Some Curiosities Arising in Objective Bayesian Analysis Jim Berger Duke University Statistical and Applied Mathematical Institute Yale University May 15, 2009 1 Three vignettes related to John s work

More information

Answer Key for STAT 200B HW No. 8

Answer Key for STAT 200B HW No. 8 Answer Key for STAT 200B HW No. 8 May 8, 2007 Problem 3.42 p. 708 The values of Ȳ for x 00, 0, 20, 30 are 5/40, 0, 20/50, and, respectively. From Corollary 3.5 it follows that MLE exists i G is identiable

More information

large number of i.i.d. observations from P. For concreteness, suppose

large number of i.i.d. observations from P. For concreteness, suppose 1 Subsampling Suppose X i, i = 1,..., n is an i.i.d. sequence of random variables with distribution P. Let θ(p ) be some real-valued parameter of interest, and let ˆθ n = ˆθ n (X 1,..., X n ) be some estimate

More information

Statistical Prediction Based on Censored Life Data. Luis A. Escobar Department of Experimental Statistics Louisiana State University.

Statistical Prediction Based on Censored Life Data. Luis A. Escobar Department of Experimental Statistics Louisiana State University. Statistical Prediction Based on Censored Life Data Overview Luis A. Escobar Department of Experimental Statistics Louisiana State University and William Q. Meeker Department of Statistics Iowa State University

More information

A Latent Model To Detect Multiple Clusters of Varying Sizes. Minge Xie, Qiankun Sun and Joseph Naus. Department of Statistics

A Latent Model To Detect Multiple Clusters of Varying Sizes. Minge Xie, Qiankun Sun and Joseph Naus. Department of Statistics A Latent Model To Detect Multiple Clusters of Varying Sizes Minge Xie, Qiankun Sun and Joseph Naus Department of Statistics Rutgers, the State University of New Jersey Piscataway, NJ 08854 Summary This

More information

A spatial scan statistic for multinomial data

A spatial scan statistic for multinomial data A spatial scan statistic for multinomial data Inkyung Jung 1,, Martin Kulldorff 2 and Otukei John Richard 3 1 Department of Epidemiology and Biostatistics University of Texas Health Science Center at San

More information

Introduction to Spatial Data and Models

Introduction to Spatial Data and Models Introduction to Spatial Data and Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics,

More information

Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May

Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May 5-7 2008 Peter Schlattmann Institut für Biometrie und Klinische Epidemiologie

More information

PQL Estimation Biases in Generalized Linear Mixed Models

PQL Estimation Biases in Generalized Linear Mixed Models PQL Estimation Biases in Generalized Linear Mixed Models Woncheol Jang Johan Lim March 18, 2006 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized

More information

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )

More information

Obtaining Critical Values for Test of Markov Regime Switching

Obtaining Critical Values for Test of Markov Regime Switching University of California, Santa Barbara From the SelectedWorks of Douglas G. Steigerwald November 1, 01 Obtaining Critical Values for Test of Markov Regime Switching Douglas G Steigerwald, University of

More information

Model Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial

Model Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial Model Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial Jin Kyung Park International Vaccine Institute Min Woo Chae Seoul National University R. Leon Ochiai International

More information

Package HGLMMM for Hierarchical Generalized Linear Models

Package HGLMMM for Hierarchical Generalized Linear Models Package HGLMMM for Hierarchical Generalized Linear Models Marek Molas Emmanuel Lesaffre Erasmus MC Erasmus Universiteit - Rotterdam The Netherlands ERASMUSMC - Biostatistics 20-04-2010 1 / 52 Outline General

More information

Monitoring Wafer Geometric Quality using Additive Gaussian Process

Monitoring Wafer Geometric Quality using Additive Gaussian Process Monitoring Wafer Geometric Quality using Additive Gaussian Process Linmiao Zhang 1 Kaibo Wang 2 Nan Chen 1 1 Department of Industrial and Systems Engineering, National University of Singapore 2 Department

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Yan Lu Jan, 2018, week 3 1 / 67 Hypothesis tests Likelihood ratio tests Wald tests Score tests 2 / 67 Generalized Likelihood ratio tests Let Y = (Y 1,

More information

Lecture 5: LDA and Logistic Regression

Lecture 5: LDA and Logistic Regression Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant

More information

MODEL SELECTION BASED ON QUASI-LIKELIHOOD WITH APPLICATION TO OVERDISPERSED DATA

MODEL SELECTION BASED ON QUASI-LIKELIHOOD WITH APPLICATION TO OVERDISPERSED DATA J. Jpn. Soc. Comp. Statist., 26(2013), 53 69 DOI:10.5183/jjscs.1212002 204 MODEL SELECTION BASED ON QUASI-LIKELIHOOD WITH APPLICATION TO OVERDISPERSED DATA Yiping Tang ABSTRACT Overdispersion is a common

More information

Hypothesis testing: theory and methods

Hypothesis testing: theory and methods Statistical Methods Warsaw School of Economics November 3, 2017 Statistical hypothesis is the name of any conjecture about unknown parameters of a population distribution. The hypothesis should be verifiable

More information

Problem Selected Scores

Problem Selected Scores Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010 1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

More information

Robust Backtesting Tests for Value-at-Risk Models

Robust Backtesting Tests for Value-at-Risk Models Robust Backtesting Tests for Value-at-Risk Models Jose Olmo City University London (joint work with Juan Carlos Escanciano, Indiana University) Far East and South Asia Meeting of the Econometric Society

More information

Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects

Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects School of Industrial and Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive NW, Atlanta, GA 30332-0205,

More information

A Bayesian multi-dimensional couple-based latent risk model for infertility

A Bayesian multi-dimensional couple-based latent risk model for infertility A Bayesian multi-dimensional couple-based latent risk model for infertility Zhen Chen, Ph.D. Eunice Kennedy Shriver National Institute of Child Health and Human Development National Institutes of Health

More information

Econometrics II - EXAM Answer each question in separate sheets in three hours

Econometrics II - EXAM Answer each question in separate sheets in three hours Econometrics II - EXAM Answer each question in separate sheets in three hours. Let u and u be jointly Gaussian and independent of z in all the equations. a Investigate the identification of the following

More information

Kriging models with Gaussian processes - covariance function estimation and impact of spatial sampling

Kriging models with Gaussian processes - covariance function estimation and impact of spatial sampling Kriging models with Gaussian processes - covariance function estimation and impact of spatial sampling François Bachoc former PhD advisor: Josselin Garnier former CEA advisor: Jean-Marc Martinez Department

More information

Generalized Linear Models. Kurt Hornik

Generalized Linear Models. Kurt Hornik Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general

More information

STA 450/4000 S: January

STA 450/4000 S: January STA 450/4000 S: January 6 005 Notes Friday tutorial on R programming reminder office hours on - F; -4 R The book Modern Applied Statistics with S by Venables and Ripley is very useful. Make sure you have

More information

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal

More information