Composite Likelihood


1 Composite Likelihood
Nancy Reid, January 30, 2012
with Cristiano Varin, and thanks to Don Fraser, Grace Yi, Ximing Xu

2 Background
parametric model f(y; θ), y ∈ R^m, θ ∈ R^d
likelihood function L(θ; y) ∝ f(y; θ)
why likelihood? the maximum likelihood estimator is consistent and asymptotically efficient:
ˆθ approximately N{θ, j^{-1}(ˆθ)}, where j(θ) = −ℓ″(θ) and ℓ(θ) = log L(θ)
the likelihood ratio, or log-likelihood difference, captures asymmetry in the model:
w(θ) = 2{ℓ(ˆθ) − ℓ(θ)} approximately χ²_d
combine with a prior for Bayesian inference
EM algorithm for computing the maximum likelihood estimate
Composite Likelihood, Brown, January 2012

3 ... background
warning: likelihood methods need regularity conditions on the model
can have poor finite-sample behaviour with large numbers of parameters; priors are also very tricky with large numbers of parameters; finite-sample corrections may be advisable
difficulty with construction of the likelihood function: inversion of large covariance matrices; intractable integrals; awkward normalization constants; combinatorial explosion; nuisance components difficult to specify
lower-dimensional marginal or conditional distributions may be tractable: combine these to form a composite likelihood

4 Terminology
model f(y; θ), y ∈ R^m, θ ∈ R^p
events A_1, ..., A_K; sub-densities f(y ∈ A_k; θ)
composite log-likelihood
cl(θ; y) = Σ_{k=1}^K w_k log f(y ∈ A_k; θ) = Σ_{k=1}^K w_k ℓ(θ; y ∈ A_k)
w_k: weights to be determined
composite likelihood is a type of: pseudo-likelihood (spatial modelling); quasi-likelihood (econometrics); limited-information method (psychometrics); ...
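The weighted-sum structure of cl(θ; y) is easy to compute directly. Below is a minimal Python sketch, not from the slides: the components are the bivariate margins of an exchangeable standard normal vector, the weights default to 1, and the data vector and grid are arbitrary illustrative choices.

```python
import math

def bvn_logpdf(x, y, rho):
    # log density of a standard bivariate normal with correlation rho
    q = (x * x - 2.0 * rho * x * y + y * y) / (1.0 - rho * rho)
    return -math.log(2.0 * math.pi) - 0.5 * math.log(1.0 - rho * rho) - 0.5 * q

def pairwise_cl(rho, ys, weights=None):
    # composite log-likelihood cl(rho; y) = sum_k w_k * l_k(rho; y),
    # with one component per pair (r, s), r < s
    m = len(ys)
    pairs = [(r, s) for r in range(m) for s in range(r + 1, m)]
    if weights is None:
        weights = [1.0] * len(pairs)
    return sum(w * bvn_logpdf(ys[r], ys[s], rho)
               for w, (r, s) in zip(weights, pairs))

y = [0.3, -1.2, 0.8, 0.1]                   # one observed vector, m = 4
grid = [i / 100.0 for i in range(-95, 96)]  # crude grid over (-1, 1)
rho_hat = max(grid, key=lambda r: pairwise_cl(r, y))
```

Maximizing over a grid stands in for a proper optimizer; any one-dimensional root-finder applied to the composite score would do the same job.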

5 Example: spatial generalized linear models
generalized linear geostatistical models (Diggle & Ribeiro, 2007):
E{Y(s) | u(s)} = g{x(s)^T β + u(s)}, s ∈ S ⊂ R^d, d ≥ 2
random intercept u(·) is a realization of a stationary Gaussian random field, mean 0, covariance cov{u(s), u(s′)} = σ² ρ(s − s′; α)
m observed locations: y = (y_1, ..., y_m), with y_i = y(s_i)
likelihood function
L(θ; y) = ∫_{R^m} Π_{i=1}^m f(y_i | u_i; θ) f(u; θ) du_1 ... du_m, with f(u; θ) the MVN(0, Σ) density
no factorization into lower-dimensional integrals, as with independent observations from the usual GLMMs

6 ... spatial generalized linear models
simulation methods for L(θ; y) (MCMC, MCEM, etc.) are costly: O(m³)
pairwise likelihood (Heagerty & Lele 1998; Varin 2008):
L_pair(θ; y) = Π_{(i,j) ∈ S_δ} ∫_{R²} f(y_i | u_i; θ) f(y_j | u_j; θ) f_2(u_i, u_j; θ) du_i du_j
comments: a product of bivariate integrals; accurate quadrature approximations available
use only close pairs: S_δ = {(i, j) : ||s_i − s_j|| < δ}; computational cost O(m)
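The set S_δ of close pairs is straightforward to construct; a small sketch with illustrative 2-D coordinates, using a brute-force O(m²) scan that is fine for moderate m:

```python
import math

def close_pairs(locations, delta):
    # S_delta = {(i, j), i < j : ||s_i - s_j|| < delta}
    pairs = []
    m = len(locations)
    for i in range(m):
        for j in range(i + 1, m):
            dx = locations[i][0] - locations[j][0]
            dy = locations[i][1] - locations[j][1]
            if math.hypot(dx, dy) < delta:
                pairs.append((i, j))
    return pairs

sites = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0), (3.2, 4.1)]
pairs = close_pairs(sites, delta=1.0)
# → [(0, 1), (2, 3)]
```

With δ chosen so that each site has a bounded number of neighbours, the number of retained pairs, and hence the cost of the pairwise likelihood, grows linearly in m, which is the point of the truncation.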

7 Composite conditional likelihoods
Besag (1974) pseudo-likelihood:
L_C(θ; y) = Π_{r=1}^m f(y_r | {y_s : y_s a neighbour of y_r}; θ)
pairwise conditional (Molenberghs & Verbeke 2005; Mardia et al. 2009):
L_C(θ; y) = Π_{r=1}^m Π_{s=1}^m f(y_r | y_s; θ)
full conditional:
L_C(θ; y) = Π_{r=1}^m f(y_r | y_{(−r)}; θ)
time series:
L_C(θ; y) = Π_{r=1}^m f(y_r | y_{r−1}; θ)

8 Composite marginal likelihoods
independence likelihood:
L_ind(θ; y) = Π_{r=1}^m f(y_r; θ)
pairwise likelihood (Cox & Reid 2004; Varin 2008):
L_pair(θ; y) = Π_{r=1}^{m−1} Π_{s=r+1}^m f(y_r, y_s; θ)
pairwise differences (Curriero & Lele 1999):
L_diff(θ; y) = Π_{r=1}^{m−1} Π_{s=r+1}^m f(y_r − y_s; θ)
optimal combination of L_ind and L_pair (Chandler & Bate 2007)

9 Derived quantities
composite log-likelihood: cl(θ; y) = log L_C(θ; y) = Σ_{k=1}^K w_k ℓ_k(θ; y)
composite score: u(θ; y) = ∇_θ cl(θ; y) = Σ_{k=1}^K w_k ∇_θ ℓ_k(θ; y), with E{u(θ; Y)} = 0
sensitivity matrix: H(θ) = E_θ{−∇_θ u(θ; Y)}
variability matrix: J(θ) = var_θ{u(θ; Y)}
Godambe information: G(θ) = H(θ) J(θ)^{-1} H(θ); in general H(θ) ≠ J(θ)
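When H and J are awkward analytically, both can be estimated by Monte Carlo at a given θ. A sketch for the simplest possible case, with all choices illustrative: Y has m independent N(θ, 1) coordinates and the composite likelihood is the independence likelihood, so H = J = m exactly and the simulation just recovers that.

```python
import random

random.seed(1)
theta, m = 0.0, 3

def score(theta, y):
    # independence-likelihood score u(theta; y) = sum_r (y_r - theta)
    return sum(yr - theta for yr in y)

N = 20000  # replicate vector observations
scores = [score(theta, [random.gauss(theta, 1.0) for _ in range(m)])
          for _ in range(N)]

J_hat = sum(s * s for s in scores) / N   # variability: var of the score
H_hat = m                                # sensitivity: -du/dtheta = m exactly here
G_hat = H_hat * (1.0 / J_hat) * H_hat    # Godambe information H J^{-1} H
```

Here J_hat ≈ H_hat ≈ 3, as the second Bartlett identity predicts for a genuine likelihood; for a composite likelihood of a dependent model the two estimates would separate.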

10 Inference
sample y_1, ..., y_n independent, from f(y; θ) or f_i(y; θ)
composite log-likelihood: cl(θ; y) = Σ_{i=1}^n cl(θ; y_i)
maximum composite likelihood estimator: ˆθ_CL = arg max_θ cl(θ; y), solving u(ˆθ_CL; y) = 0
asymptotic consistency and normality:
√n(ˆθ_CL − θ) → N_p{0, G^{-1}(θ)} in distribution, as n → ∞ with m fixed, where G(θ) = H(θ) J(θ)^{-1} H(θ)
if n is fixed and m → ∞, need assumptions on replication; examples include time series and spatial data with decaying correlations

11 ... inference
θ = (ψ, λ), ψ ∈ R^q, λ ∈ R^{p−q}
inference based on the maximum composite likelihood estimator:
ˆθ_CL approximately N_p{θ, G^{-1}(ˆθ_CL)}, so ˆψ_CL approximately N_q{ψ, G^{ψψ}(ˆθ_CL)}
CL log-likelihood ratio statistic (Kent 1982):
w_CL(ψ) = 2{cl(ˆθ_CL) − cl(˜θ_CL)} → Σ_{j=1}^q λ_j Z_j² in distribution
˜θ_CL = {ψ, ˆλ_CL(ψ)}, the constrained maximum composite likelihood estimate; Z_j iid N(0, 1); λ_j the eigenvalues of (H^{ψψ})^{-1} G^{ψψ}
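The limiting law Σ_j λ_j Z_j² has no convenient closed form, but once the eigenvalues are in hand its quantiles are easy to get by simulation. A sketch with made-up eigenvalues standing in for those of (H^{ψψ})^{-1} G^{ψψ}:

```python
import random

random.seed(0)

def wcl_null_sample(eigvals, n_draws=50000):
    # draws from the limiting law sum_j lambda_j Z_j^2, Z_j iid N(0, 1)
    return [sum(lam * random.gauss(0.0, 1.0) ** 2 for lam in eigvals)
            for _ in range(n_draws)]

lams = [1.4, 0.9, 0.6]                    # hypothetical eigenvalues, q = 3
draws = sorted(wcl_null_sample(lams))
mean_draw = sum(draws) / len(draws)       # should be near sum(lams) = 2.9
crit_95 = draws[int(0.95 * len(draws))]   # Monte Carlo 95% critical value
```

Comparing w_CL(ψ) to crit_95 rather than to the χ²_q quantile is what keeps the size of the test correct when the eigenvalues are unequal.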

12 ... inference
ˆθ_CL is not fully efficient unless G(θ) = H(θ) J(θ)^{-1} H(θ) = i(θ), the expected Fisher information
cl(θ) is not a log-likelihood function
(figure: normalized log-likelihoods plotted against ρ)

13 ... inference
efficiency of ˆθ_CL can be quite high in many applications
careful choice of weights can improve the efficiency of ˆθ_CL in special cases
weights can be used to incorporate sampling information, including missing data (Yi '12; Molenberghs '12; Briollais & Choi '12)
w_CL(ψ) can be re-scaled to be approximately χ²_q (Chandler & Bate 2007; Salvan et al. '11, '12, work in progress)
(figure: log-likelihoods as functions of ρ, for ρ = 0.5 and ρ = 0.8, with n = 10, q = 5)

14

15 (figure) y ∼ N(µ1, σ²R), with R_rs = ρ for r ≠ s

16 Model selection
Akaike information criterion (Varin & Vidoni, 2005):
AIC_CL = −2 cl(ˆθ_CL; y) + 2 tr{J(ˆθ) H(ˆθ)^{-1}}
Bayesian information criterion (Gao & Song, 2010):
BIC_CL = −2 cl(ˆθ_CL; y) + log(n) tr{J(ˆθ) H(ˆθ)^{-1}}
effective number of parameters: tr{H(θ) G(θ)^{-1}} = tr{J(θ) H(θ)^{-1}}
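Given estimates of J and H, the criteria above are one-liners. The sketch below hard-codes hypothetical 2×2 fitted matrices and a maximized composite log-likelihood value purely for illustration:

```python
import math

def trace_JHinv(J, H):
    # tr{J H^{-1}} for 2x2 matrices, via the explicit inverse of H
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    Hinv = [[H[1][1] / det, -H[0][1] / det],
            [-H[1][0] / det, H[0][0] / det]]
    return sum(J[i][k] * Hinv[k][i] for i in range(2) for k in range(2))

def aic_cl(cl_max, J, H):
    return -2.0 * cl_max + 2.0 * trace_JHinv(J, H)

def bic_cl(cl_max, J, H, n):
    return -2.0 * cl_max + math.log(n) * trace_JHinv(J, H)

J = [[4.0, 1.0], [1.0, 3.0]]   # hypothetical variability matrix
H = [[2.0, 0.0], [0.0, 2.0]]   # hypothetical sensitivity matrix
p_eff = trace_JHinv(J, H)      # effective number of parameters: 3.5 here
a = aic_cl(-10.0, J, H)
b = bic_cl(-10.0, J, H, n=50)
```

For a genuine likelihood J = H, so tr{J H^{-1}} collapses to the usual parameter count and AIC_CL reduces to the ordinary AIC.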

17 Some surprises
Y ∼ N(µ, Σ): ˆµ_CL = ˆµ, ˆΣ_CL = ˆΣ
Y ∼ N(µ1, σ²R), R equicorrelated (R_rs = ρ for r ≠ s): ˆθ_CL = ˆθ and G(θ) = i(θ), even though H(θ) ≠ J(θ)
recall H(θ) = E{−∇_θ u(θ; Y)}, J(θ) = var{u(θ; Y)}, G(θ) = H(θ) J(θ)^{-1} H(θ)
Y ∼ N(0, R): ˆρ_CL ≠ ˆρ, and a.var(ˆρ_CL) > a.var(ˆρ); efficiency improves when the nuisance parameter is unknown
CL can be fully efficient even if H(θ) ≠ J(θ) (Mardia et al. 2008; Xu '12)

18 ... some surprises
Godambe information G(θ) can decrease as more component CLs are added: pairwise CL can be less efficient than independence CL, and this can't always be fixed by weighting (Xu '12)
parameter constraints can be important
example: binary vector Y, with
P(Y_j = y_j, Y_k = y_k) = exp(β y_j + β y_k + θ_{jk} y_j y_k) / {1 + exp(β y_j + β y_k + θ_{jk} y_j y_k)}
this model is inconsistent
parameters may not be identifiable in the CL, even if they are in the full likelihood (Yi '12)

19 Applications: spatial and space-time data
conditional approaches seem more natural: condition on neighbours in space; condition on a small number of lags in time
some form of blockwise components often proposed (Stein et al. '04; Caragea & Smith '07)
fMRI time series (Kang et al. '12); air pollution and health effects (Bai et al. '12); computer experiments with Gaussian process models (Xi '12)
spatially correlated extremes: the joint tail probability is known, but the joint density requires combinatorial effort (partial derivatives); composite likelihood based on the joint distributions of pairs or triples seems to work well (Davison et al. 2012; Genton et al. '12; Ribatet '12)

20 Spatial extremes
vector observations (X_{1i}, ..., X_{di}), i = 1, ..., n; for example, wind speed at each of d locations
component-wise maxima Z_1, ..., Z_d, with Z_j = max(X_{j1}, ..., X_{jn})
the Z_j are transformed (centered and scaled)
general theory says Pr(Z_1 ≤ z_1, ..., Z_d ≤ z_d) = exp{−V(z_1, ..., z_d)}
the function V(·) can be parameterized via Gaussian process models; for example,
V(z_1, z_2) = z_1^{-1} Φ{a(h)/2 + a(h)^{-1} log(z_2/z_1)} + z_2^{-1} Φ{a(h)/2 + a(h)^{-1} log(z_1/z_2)}
where z_1, z_2 are the values at locations 0 and h, and a(h)² = h^T Ω^{-1} h
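For a fixed pair of locations the bivariate V above is elementary to evaluate. A sketch, with the standard normal CDF built from math.erf; the scalar a stands for a(h) and is passed in directly rather than computed from h and Ω:

```python
import math

def Phi(x):
    # standard normal CDF
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def V_pair(z1, z2, a):
    # bivariate exponent function V(z1, z2) from the slide; unit Frechet scale
    w = 0.5 * a + math.log(z2 / z1) / a
    v = 0.5 * a + math.log(z1 / z2) / a
    return Phi(w) / z1 + Phi(v) / z2

def pair_cdf(z1, z2, a):
    # Pr(Z1 <= z1, Z2 <= z2) = exp{-V(z1, z2)}
    return math.exp(-V_pair(z1, z2, a))
```

The two limits behave as they should: for large a (distant sites) V(z_1, z_2) → 1/z_1 + 1/z_2, the independence exponent, while for a → 0 (coincident sites) V → max(1/z_1, 1/z_2), complete dependence.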

21 ... spatial extremes
Pr(Z_1 ≤ z_1, ..., Z_d ≤ z_d) = exp{−V(z_1, ..., z_d)}
to compute the log-likelihood function we need the density: combinatorial explosion in computing the joint derivatives of V(·)
Davison et al. (2012, Statistical Science) used pairwise composite likelihood, compared the fits of several competing models using the AIC analogue described above, and applied this to annual maximum rainfall at several stations near Zurich

22 Davison et al., 2012 (figures)

23 ... Davison et al., 2012 (figures)

24 Network tomography (Liang & Yu, 2003)
X = (X_1, ..., X_m): network dynamics, e.g. traffic flow counts, node delays
Y = (Y_1, ..., Y_m): measurement vector
Y = AX, with A a known routing matrix with entries 0 or 1
the components X_j are independent, with densities f_j(·; θ_j)

25 ... applications
time series: a case of large m and fixed n; need new arguments for consistency and asymptotic normality
consecutive pairs: consistent, not asymptotically normal
AR(1): consecutive pairs fully efficient; all pairs terrible (consistent, but highly variable)
MA(1): consecutive pairs terrible (Davis & Yau, 2011)
genetics: estimation of the recombination rate; somewhat similar to time series, but correlation may not decrease with increasing length, suggesting all possible pairs may be inconsistent; joint blocks of short sequences seem preferable
linkage disequilibrium; family-based sampling (Larribe & Fearnhead 2011; Choi & Briollais '12)

26 ... applications
Gaussian graphical models (Gao & Massam '12): symmetry constraints have a natural formulation in terms of elements of the concentration matrix; conditional distribution of y_j | y_{(−j)}
multivariate binary data for multi-neuron spike trains (Amari, IMS '12)
CL as a working likelihood in maximization by parts (Bellio '12)
latent variable models in psychometrics (Moustaki '12; Maydeu-Olivares '12)
many linear and generalized linear models with random effects; multivariate survival data; ...

27 What don't we know?
Design: marginal vs. conditional; choice of weights; down-weighting distant observations; choosing blocks and block sizes
Uncertainty estimation: ˆJ(ˆθ_CL) = var{∂cl(θ)/∂θ} needs replication, and lots of it; perhaps estimate G(ˆθ_CL) or var(ˆθ_CL) directly: bootstrap, jackknife
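A direct variance estimate via the leave-one-out jackknife needs nothing beyond the estimator itself. A generic sketch for a scalar estimator; the sample mean stands in for ˆθ_CL, and with real data the function passed in would refit the composite likelihood on each reduced sample:

```python
def jackknife_var(estimator, samples):
    # leave-one-out jackknife variance of a scalar estimator
    n = len(samples)
    loo = [estimator(samples[:i] + samples[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    return (n - 1) / n * sum((t - mean_loo) ** 2 for t in loo)

data = [2.0, 4.0, 6.0, 8.0]
v = jackknife_var(lambda xs: sum(xs) / len(xs), data)
# for the mean this equals s^2 / n, i.e. 5/3 here
```

For a vector ˆθ_CL the same recipe gives a matrix estimate of var(ˆθ_CL), avoiding explicit computation of H and J.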

28 ... what don't we know?
Identifiability (1): does there exist a model compatible with a given set of marginal or conditional densities?
Identifiability (2): what if different components are estimating different parameters?
Robustness: CL uses only low-dimensional information: is this a type of robustness? find a class of models with the same low-dimensional margins (Xu '12); classical perturbation of the starting model, perhaps using copulas (Joe '12); random effects models might be amenable to theoretical analysis (Jordan '12)
asymptotic theory for large m (long vectors of responses) and small n
relationship to generalized estimating equations

29 Aspects of robustness
model robustness: univariate and bivariate margins only, e.g. means, variances, association parameters; similar in flavour to generalized estimating equations (GEE: mean structure primary)
computational robustness: composite log-likelihood functions are smoother than log-likelihood functions, so easier to maximize and easier to work with, especially in high-dimensional cases (Liang & Yu, 2003)
robust to missing-data mechanisms (Yi, Zeng & Cook, 2010)
access to multivariate distributions, e.g. multivariate extremes (Davison et al., 2012)

30 Robustness of consistency (Ximing Xu, U Toronto)
working model {f(y; θ) : θ ∈ Θ}; true model g(y)

Model                  Full likelihood                Composite likelihood
Correctly specified    f(y; θ_0) = g(y)               f_k(y; θ_0) = g_k(y) for all k
                       ˆθ_ML → θ_0                    ˆθ_CL → θ_0
Misspecified           f(y; θ) ≠ g(y)                 f_k(y; θ) ≠ g_k(y) for some k
                       ˆθ_ML → θ*_ML                  ˆθ_CL → θ*

in the top row the comparison is one of efficiency


32 ... robustness of consistency
example (Andrei & Kendziorski, 2009): Y_1 ∼ N(µ_1, σ_1²), Y_2 ∼ N(µ_2, σ_2²), ε ∼ N(0, 1), with Y_3 = Y_1 + Y_2 + b Y_1 Y_2 + ε
the full likelihood for the multivariate normal is a misspecified model, with ˆb = 0
the composite conditional likelihood based on a normal distribution for f(y_3 | y_2, y_1) gives a consistent estimate of b
example (Arnold & Xu): f(y) = φ_p(y; µ, Σ) + g(µ, Σ) Π_{i=1}^p y_i 1{|y_i| ≤ t}
all sub-distributions of dimension k < p are multivariate normal
the pairwise likelihood estimator is nearly identical to the MLE under the incorrect N_p(µ, Σ) model

33 Some dichotomies
conditional vs. marginal
pairwise vs. everything else
unstructured vs. time series/spatial
weighted vs. unweighted
"it works" vs. "why does it work?" vs. "when will it not work?" ...

34

35



Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Chapter 7 Maximum Likelihood Estimation 7. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Graduate Econometrics I: Maximum Likelihood II

Graduate Econometrics I: Maximum Likelihood II Graduate Econometrics I: Maximum Likelihood II Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood

More information

Likelihood based Statistical Inference. Dottorato in Economia e Finanza Dipartimento di Scienze Economiche Univ. di Verona

Likelihood based Statistical Inference. Dottorato in Economia e Finanza Dipartimento di Scienze Economiche Univ. di Verona Likelihood based Statistical Inference Dottorato in Economia e Finanza Dipartimento di Scienze Economiche Univ. di Verona L. Pace, A. Salvan, N. Sartori Udine, April 2008 Likelihood: observed quantities,

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

PQL Estimation Biases in Generalized Linear Mixed Models

PQL Estimation Biases in Generalized Linear Mixed Models PQL Estimation Biases in Generalized Linear Mixed Models Woncheol Jang Johan Lim March 18, 2006 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized

More information

ECE 275A Homework 7 Solutions

ECE 275A Homework 7 Solutions ECE 275A Homework 7 Solutions Solutions 1. For the same specification as in Homework Problem 6.11 we want to determine an estimator for θ using the Method of Moments (MOM). In general, the MOM estimator

More information

Approximating models. Nancy Reid, University of Toronto. Oxford, February 6.

Approximating models. Nancy Reid, University of Toronto. Oxford, February 6. Approximating models Nancy Reid, University of Toronto Oxford, February 6 www.utstat.utoronto.reid/research 1 1. Context Likelihood based inference model f(y; θ), log likelihood function l(θ; y) y = (y

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

For iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions.

For iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions. Large Sample Theory Study approximate behaviour of ˆθ by studying the function U. Notice U is sum of independent random variables. Theorem: If Y 1, Y 2,... are iid with mean µ then Yi n µ Called law of

More information

Bootstrap and Parametric Inference: Successes and Challenges

Bootstrap and Parametric Inference: Successes and Challenges Bootstrap and Parametric Inference: Successes and Challenges G. Alastair Young Department of Mathematics Imperial College London Newton Institute, January 2008 Overview Overview Review key aspects of frequentist

More information

Advanced Econometrics

Advanced Econometrics Advanced Econometrics Dr. Andrea Beccarini Center for Quantitative Economics Winter 2013/2014 Andrea Beccarini (CQE) Econometrics Winter 2013/2014 1 / 156 General information Aims and prerequisites Objective:

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Estimation and Model Selection in Mixed Effects Models Part I. Adeline Samson 1

Estimation and Model Selection in Mixed Effects Models Part I. Adeline Samson 1 Estimation and Model Selection in Mixed Effects Models Part I Adeline Samson 1 1 University Paris Descartes Summer school 2009 - Lipari, Italy These slides are based on Marc Lavielle s slides Outline 1

More information

Bayesian and frequentist inference

Bayesian and frequentist inference Bayesian and frequentist inference Nancy Reid March 26, 2007 Don Fraser, Ana-Maria Staicu Overview Methods of inference Asymptotic theory Approximate posteriors matching priors Examples Logistic regression

More information

STAT Financial Time Series

STAT Financial Time Series STAT 6104 - Financial Time Series Chapter 4 - Estimation in the time Domain Chun Yip Yau (CUHK) STAT 6104:Financial Time Series 1 / 46 Agenda 1 Introduction 2 Moment Estimates 3 Autoregressive Models (AR

More information

Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning

Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning Anton Rodomanov Higher School of Economics, Russia Bayesian methods research group (http://bayesgroup.ru) 14 March

More information

Lecture 6: Discrete Choice: Qualitative Response

Lecture 6: Discrete Choice: Qualitative Response Lecture 6: Instructor: Department of Economics Stanford University 2011 Types of Discrete Choice Models Univariate Models Binary: Linear; Probit; Logit; Arctan, etc. Multinomial: Logit; Nested Logit; GEV;

More information

Lecture 14 More on structural estimation

Lecture 14 More on structural estimation Lecture 14 More on structural estimation Economics 8379 George Washington University Instructor: Prof. Ben Williams traditional MLE and GMM MLE requires a full specification of a model for the distribution

More information

Web-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros

Web-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros Web-based Supplementary Material for A Two-Part Joint Model for the Analysis of Survival and Longitudinal Binary Data with excess Zeros Dimitris Rizopoulos, 1 Geert Verbeke, 1 Emmanuel Lesaffre 1 and Yves

More information

GARCH Models Estimation and Inference

GARCH Models Estimation and Inference GARCH Models Estimation and Inference Eduardo Rossi University of Pavia December 013 Rossi GARCH Financial Econometrics - 013 1 / 1 Likelihood function The procedure most often used in estimating θ 0 in

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

Last week. posterior marginal density. exact conditional density. LTCC Likelihood Theory Week 3 November 19, /36

Last week. posterior marginal density. exact conditional density. LTCC Likelihood Theory Week 3 November 19, /36 Last week Nuisance parameters f (y; ψ, λ), l(ψ, λ) posterior marginal density π m (ψ) =. c (2π) q el P(ψ) l P ( ˆψ) j P ( ˆψ) 1/2 π(ψ, ˆλ ψ ) j λλ ( ˆψ, ˆλ) 1/2 π( ˆψ, ˆλ) j λλ (ψ, ˆλ ψ ) 1/2 l p (ψ) =

More information

Pairwise Likelihood Estimation for factor analysis models with ordinal data

Pairwise Likelihood Estimation for factor analysis models with ordinal data Working Paper 2011:4 Department of Statistics Pairwise Likelihood Estimation for factor analysis models with ordinal data Myrsini Katsikatsou Irini Moustaki Fan Yang-Wallentin Karl G. Jöreskog Working

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

Discussion of Maximization by Parts in Likelihood Inference

Discussion of Maximization by Parts in Likelihood Inference Discussion of Maximization by Parts in Likelihood Inference David Ruppert School of Operations Research & Industrial Engineering, 225 Rhodes Hall, Cornell University, Ithaca, NY 4853 email: dr24@cornell.edu

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation Guy Lebanon February 19, 2011 Maximum likelihood estimation is the most popular general purpose method for obtaining estimating a distribution from a finite sample. It was

More information

Modelling using ARMA processes

Modelling using ARMA processes Modelling using ARMA processes Step 1. ARMA model identification; Step 2. ARMA parameter estimation Step 3. ARMA model selection ; Step 4. ARMA model checking; Step 5. forecasting from ARMA models. 33

More information

Testing Restrictions and Comparing Models

Testing Restrictions and Comparing Models Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

Simulating Longer Vectors of Correlated Binary Random Variables via Multinomial Sampling

Simulating Longer Vectors of Correlated Binary Random Variables via Multinomial Sampling Simulating Longer Vectors of Correlated Binary Random Variables via Multinomial Sampling J. Shults a a Department of Biostatistics, University of Pennsylvania, PA 19104, USA (v4.0 released January 2015)

More information

Default priors and model parametrization

Default priors and model parametrization 1 / 16 Default priors and model parametrization Nancy Reid O-Bayes09, June 6, 2009 Don Fraser, Elisabeta Marras, Grace Yun-Yi 2 / 16 Well-calibrated priors model f (y; θ), F(y; θ); log-likelihood l(θ)

More information

Generalized Linear Models. Kurt Hornik

Generalized Linear Models. Kurt Hornik Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information