Various types of likelihood
1  Various types of likelihood

1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood
2. semi-parametric likelihood, partial likelihood
3. empirical likelihood, penalized likelihood
4. quasi-likelihood, composite likelihood
5. simulated likelihood, indirect inference
6. bootstrap likelihood, h-likelihood, weighted likelihood, pseudo-likelihood, local likelihood, sieve likelihood

STA 4508 November
2  November 6

- HW 2 comments & K-L divergence
- presentations
- semi-parametric likelihood as profile
- empirical likelihood
- composite likelihood
3  Exercises October 23, STA 4508S (Fall, 2018)

1. The Kullback-Leibler divergence from the distribution G to the distribution F is given by

   KL(F : G) = ∫ log{f(y)/g(y)} f(y) dy,   (1)

   where f and g are density functions with respect to Lebesgue measure. Note that the divergence is not symmetric in its arguments. This is called the directed information distance in Barndorff-Nielsen and Cox (1994), where the more general definition KL(F : G) = ∫ log(dF/dG) dF is used, assuming F and G are mutually absolutely continuous.

   (a) In the canonical exponential family model with density f(s; ϕ) = exp{ϕᵀs − k(ϕ)}h(s), s ∈ ℝᵖ, find an expression for the KL divergence between the model with parameter ϕ₁ and that with parameter ϕ₂.

   (b) Show that for a sample of observations from a model with density f(y; θ), the maximum likelihood estimator minimizes the KL divergence from F(·; θ) to G_n(·), where G_n(·) is the empirical distribution function putting mass 1/n at each observation y_i.

2. Suppose y_i ∼ N(µ_i, 1/n), i = 1, ..., k, and ψ² = Σ_{i=1}^k µ_i² is the parameter of interest.¹

   (a) Show that the marginal posterior density for nψ², assuming a flat prior π(µ) ∝ 1, is a non-central χ²_k distribution, with non-centrality parameter nΣy_i².

   (b) Show that the maximum likelihood estimate of ψ² is ψ̂² = Σy_i², and that nψ̂² has a non-central χ²_k distribution with non-centrality parameter nψ².

   (c) Compare the normal approximations to r_u(ψ), r_e(ψ) and r(ψ) with the exact distribution of the maximum likelihood estimate.

   (d) Compare the 95% Bayesian posterior probability interval for ψ², based on (a), to the 95% confidence interval for ψ², based on (b).

   ¹ It will be convenient to use λ_i = µ_i/(Σµ_i²)^{1/2} for the nuisance parameters.
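Exercise 1(b) can be checked numerically: for a N(θ, 1) model, minimizing the K-L divergence from F(·; θ) to the empirical distribution G_n is the same as maximizing the average log-likelihood, since the entropy of G_n does not involve θ. A small simulation sketch (not part of the exercises; data and grid are illustrative):

```python
import math
import random

random.seed(1)
y = [random.gauss(2.0, 1.0) for _ in range(500)]

def avg_loglik(theta):
    # average log-likelihood under N(theta, 1); maximizing this over theta
    # is equivalent to minimizing KL(G_n : F(.; theta)) because the entropy
    # term of G_n does not depend on theta
    return sum(-0.5 * math.log(2.0 * math.pi) - 0.5 * (yi - theta) ** 2
               for yi in y) / len(y)

# crude grid search; the maximizer should coincide (to grid resolution)
# with the sample mean, which is the MLE in this model
grid = [i / 100.0 for i in range(100, 301)]
theta_hat = max(grid, key=avg_loglik)
ybar = sum(y) / len(y)
```

The grid maximizer agrees with the sample mean to the grid spacing, illustrating that maximum likelihood is the K-L projection of G_n onto the model.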
4  Proportional hazards regression

partial log-likelihood function, for distinct failure times y₁ < y₂ < ... < y_n:

   l_part(β; y, d) = Σ_{i=1}^n d_i {x_iᵀβ − log Σ_{j∈R_i} exp(x_jᵀβ)}

R_i = {j : y_j ≥ y_i} is the set of individuals that could be observed to fail at time y_i; see SM 10.8 for treatment of ties

can be motivated as:
1. the marginal log-likelihood of the ranks of the failure times
2. Σ_{i=1}^n log Pr(unit i fails at y_i | R_i, there is one failure at y_i)
3. the profile log-likelihood function, if λ(·) is represented by a vector of values (λ₁, ..., λ_n) = {λ(y₁), ..., λ(y_n)}
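The partial log-likelihood can be computed directly from its definition; a toy sketch with hypothetical data (single covariate, distinct times, no ties — not code from the course):

```python
import math

# hypothetical triples (time, status d_i, covariate x_i)
data = [(1.0, 1, 0.5), (2.0, 0, -0.3), (3.0, 1, 1.2), (4.0, 1, -0.8), (5.0, 0, 0.1)]

def partial_loglik(beta):
    ll = 0.0
    for (yi, di, xi) in data:
        if di == 1:  # censored observations contribute only through risk sets
            # risk set R_i = {j : y_j >= y_i}
            risk = [xj for (yj, dj, xj) in data if yj >= yi]
            ll += beta * xi - math.log(sum(math.exp(beta * xj) for xj in risk))
    return ll

# at beta = 0 each failure contributes -log|R_i|; risk sets have sizes 5, 3, 2
# crude grid maximization of the partial likelihood
grid = [i / 100.0 for i in range(-300, 301)]
beta_hat = max(grid, key=partial_loglik)
```

Note the baseline hazard never appears: only the covariates of subjects in each risk set enter.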
5  Inference

for any θ̃_n →p θ₀,

   l_p(θ̃_n) = l_p(θ₀) + (θ̃_n − θ₀)ᵀ Σ_{j=1}^n Ũ_j(θ₀) − ½ n (θ̃_n − θ₀)ᵀ ĩ(θ₀)(θ̃_n − θ₀) + o_p(√n‖θ̃_n − θ₀‖ + 1)²

Ũ is the efficient score: the projection of ∂l/∂θ orthogonal to the space spanned by the nuisance scores

as in parametric models, this leads to √n(θ̂ − θ₀) →d N{0, ĩ⁻¹(θ₀)} and to the likelihood ratio test 2{l_p(θ̂) − l_p(θ₀)} →d χ²_d

the proof uses least favourable sub-models through the true model; this effectively reduces the infinite-dimensional parameter to a finite-dimensional one
6  Infinite-dimensional models

recall that L(θ; y) ∝ f(y; θ), with f(y; θ) a density w.r.t. a dominating measure

more abstract definition: if a probability measure Q is absolutely continuous w.r.t. a probability measure P, and both possess densities w.r.t. a measure µ, then the likelihood of Q w.r.t. P is the Radon-Nikodym derivative

   dQ/dP = q/p,   a.e. P

some semi-parametric models have a dominating measure and a family of densities; some can be handled by the notion of empirical likelihood; some may use mixtures of these
7  ... infinite-dimensional models

Definition: Given a measure P and a sample (y₁, ..., y_n), the empirical likelihood function is

   EL(P; y) = Π_{i=1}^n P({y_i}),

where P({y}) is the measure of the one-point set {y}

Definition: Given a model 𝒫, a maximum likelihood estimator is the distribution P̂ ∈ 𝒫 that maximizes the empirical likelihood over 𝒫; it may or may not exist
8  Example: the empirical distribution (vdW)

𝒫 is the set of all probability distributions on a measurable space (𝒴, 𝒜); suppose the observed values y₁, ..., y_n are distinct

   {(P({y₁}), ..., P({y_n})): P ∈ 𝒫} ⊇ {(p₁, ..., p_n): p_i ≥ 0, Σp_i = 1}

1-point sets are measurable; the empirical likelihood is maximized at (1/n, ..., 1/n)

the empirical distribution function is the nonparametric MLE: F_n(t) = n⁻¹ Σ 1(Y_i ≤ t)

EL is not the same as Π f(y_i), even if P has a density f
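A quick simulation illustrating the maximization: for n distinct points, any probability vector other than (1/n, ..., 1/n) gives a smaller empirical likelihood (essentially the AM-GM inequality). A sketch, not course code:

```python
import math
import random

random.seed(0)
n = 6

def log_EL(p):
    # log empirical likelihood: log of prod_i P({y_i}) for n distinct points
    return sum(math.log(pi) for pi in p)

best = log_EL([1.0 / n] * n)  # the empirical distribution F_n

# randomly drawn probability vectors never beat the uniform weights
rivals = []
for _ in range(2000):
    w = [random.random() for _ in range(n)]
    s = sum(w)
    rivals.append(log_EL([wi / s for wi in w]))
```

The gap between `best` and every rival is strictly positive unless a rival happens to be exactly uniform.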
9  Compare (Owen, Ch. 2)

for y ∈ ℝ, define F(y) = Pr(Y ≤ y) and F(y−) = Pr(Y < y); for y₁, ..., y_n the nonparametric likelihood function is

   L(F) = Π_{i=1}^n {F(y_i) − F(y_i−)},

hence 0 if F is continuous

Theorem 2.1 of Owen: L(F) < L(F_n) for F ≠ F_n, where F_n(y) = (1/n) Σ 1{y_i ≤ y}

there is a likelihood function on the space of distribution functions for which the empirical c.d.f. is the maximum likelihood estimator; why does this fail for densities?
10  Aside: empirical likelihood (Owen, Ch. 2)

profile version of empirical likelihood:

   R(θ) = sup{ L(F)/L(F_n) : F ∈ 𝓕, T(F) = θ }

R is a relative likelihood, hence a product of terms np_i

example: T(F) = ∫ x dF(x):

   R(θ) = max{ Π_{i=1}^n np_i : Σ_{i=1}^n p_i y_i = θ, p_i ≥ 0, Σp_i = 1 }

for y₁, ..., y_n i.i.d. F₀ with E(y_i) = θ₀ and var(y_i) < ∞,

   −2 log R(θ₀) →d χ²₁   (Theorem 2.2, Owen)

with p̂_i = (1/n){1 + α(y_i − θ₀)}⁻¹, where α solves (1/n) Σ_{i=1}^n (y_i − θ₀)/{1 + α(y_i − θ₀)} = 0
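The profile empirical likelihood for a mean can be computed by solving the estimating equation for α with a one-dimensional search. A sketch under the setup above (simulated N(0, 1) data, θ₀ = 0, not course code); the bracket for α keeps every weight 1 + α(y_i − θ₀) strictly positive:

```python
import math
import random

random.seed(2)
theta0 = 0.0
y = [random.gauss(0.0, 1.0) for _ in range(200)]
d = [yi - theta0 for yi in y]

def g(alpha):
    # estimating equation (1/n) sum d_i / (1 + alpha d_i) = 0
    return sum(di / (1.0 + alpha * di) for di in d) / len(d)

# g is decreasing on the interval where all weights stay positive,
# with g -> +inf and -inf at the endpoints, so bisection finds the root
lo = -1.0 / max(d) + 1e-8
hi = -1.0 / min(d) - 1e-8
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if g(lo) * g(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
alpha = 0.5 * (lo + hi)

n = len(y)
p = [1.0 / (n * (1.0 + alpha * di)) for di in d]      # hat p_i
minus2logR = -2.0 * sum(math.log(n * pi) for pi in p)  # ~ chi^2_1 at theta0
```

The weights automatically sum to one and reweight the sample so its mean is exactly θ₀; under the true mean, `minus2logR` is approximately χ²₁.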
11  Semi-parametric logistic regression (vdW example)

   Pr(Y = 1 | V, W) = e^{θV+η(W)} / {1 + e^{θV+η(W)}}

sample (Y_i, V_i, W_i), i = 1, ..., n independent:

   L(θ, η; Y) = Π_{i=1}^n { e^{θv_i+η(w_i)} / (1 + e^{θv_i+η(w_i)}) }^{y_i} { 1 / (1 + e^{θv_i+η(w_i)}) }^{1−y_i}

taking η(w_i) → +∞ when y_i = 1 and η(w_i) → −∞ when y_i = 0 gives L(θ, η) → 1, so unrestricted maximization degenerates

suggestion: penalized log-likelihood

   log L(θ, η; Y) − α̂_n² ∫ {η^{(k)}(w)}² dw

we can't simply maximize; this needs a separate analysis of its properties
12  Example: missing covariate (Murphy and vdW, 2000)

observation (D, W, Z); D and W are independent, given Z

   Pr(D = 0 | Z = z) = {1 + exp(γ + βe^z)}⁻¹
   W | Z = z ∼ N(α₀ + α₁z, σ²)
   Z ∼ g(·), non-parametric

(d_C, w_C, z_C) a complete observation; (d_R, w_R) has a missing covariate; Z is the gold-standard covariate, e.g. LDL cholesterol, and W is a surrogate for Z

   f(x; θ, g) = f(d_C, w_C | z_C; θ) g(z_C) ∫ f(d_R, w_R | z; θ) g(z) dz,   x = (d_C, w_C, z_C, d_R, w_R)

   EL(θ, g) = f(d_C, w_C | z_C; θ) g{z_C} ∫ f(d_R, w_R | z; θ) g(z) dz

θ = (γ, β, α₀, α₁, σ²); g can be profiled out, according to M&vdW
13  Various types of likelihood

1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood
2. semi-parametric likelihood, partial likelihood
3. empirical likelihood, penalized likelihood
4. quasi-likelihood, composite likelihood
5. simulated likelihood, indirect inference
6. bootstrap likelihood, h-likelihood, weighted likelihood, pseudo-likelihood, local likelihood, sieve likelihood
14  Composite likelihood (Lindsay, 1988)

vector observation: Y ∼ f(y; θ), Y ∈ 𝒴 ⊆ ℝᵐ, θ ∈ ℝᵈ; set of events {A_k, k ∈ K}

composite log-likelihood:

   cl(θ; y) = Σ_{k∈K} w_k l_k(θ; y),   l_k(θ; y) = log f({y ∈ A_k}; θ)

each l_k is the log-likelihood for an event; {w_k, k ∈ K} is a set of weights

also called: pseudo-likelihood (spatial modelling), quasi-likelihood (econometrics), limited information method (psychometrics)
15  Examples of composite log-likelihood

- independence: Σ_{r=1}^m w_r log f₁(y_r; θ)
- pairwise: Σ_{r=1}^m Σ_{s>r} w_rs log f₂(y_r, y_s; θ)
- conditional: Σ_{r=1}^m w_r log f(y_r | y_{(−r)}; θ)
- all pairs conditional: Σ_{r=1}^m Σ_{s>r} w_rs log f(y_r | y_s; θ)
- time series: Σ_{r=1}^m w_r log f(y_r | y_{r−1}; θ)
- spatial: Σ_{r=1}^m w_r log f(y_r | neighbours of y_r; θ)
- small blocks of observations; pairwise differences; ... your favourite combination ...
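As a concrete instance, the pairwise construction for an exchangeable multivariate normal only needs the bivariate margins. A simulation sketch (hypothetical data: equicorrelation ρ = 0.5 generated through a shared factor; the grid maximization is purely illustrative):

```python
import math
import random

random.seed(3)

def log_f2(yr, ys, rho):
    # log density of a standard bivariate normal with correlation rho
    q = (yr * yr - 2.0 * rho * yr * ys + ys * ys) / (1.0 - rho * rho)
    return -math.log(2.0 * math.pi) - 0.5 * math.log(1.0 - rho * rho) - 0.5 * q

def pairwise_cl(rho, sample):
    # pairwise composite log-likelihood: sum over all pairs r < s, all vectors
    total = 0.0
    for yv in sample:
        m = len(yv)
        for r in range(m):
            for s in range(r + 1, m):
                total += log_f2(yv[r], yv[s], rho)
    return total

# equicorrelated N(0, R) vectors via a shared factor (valid for rho >= 0):
# Y_r = sqrt(rho) Z_0 + sqrt(1 - rho) Z_r has unit variance, correlation rho
rho_true, m, n = 0.5, 5, 200
ys = []
for _ in range(n):
    z0 = random.gauss(0.0, 1.0)
    ys.append([math.sqrt(rho_true) * z0
               + math.sqrt(1.0 - rho_true) * random.gauss(0.0, 1.0)
               for _ in range(m)])

# maximize the pairwise log-likelihood over a grid of rho values
grid = [i / 50.0 for i in range(-45, 46)]
rho_hat = max(grid, key=lambda r: pairwise_cl(r, ys))
```

The maximum pairwise likelihood estimate lands close to the true ρ; only bivariate densities are ever evaluated.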
16  Derived quantities

single response y with density f(y; θ), y ∈ ℝᵐ, θ ∈ ℝᵈ

- composite log-likelihood: cl(θ; y) = Σ_k w_k l_k(θ; y)
- composite score function: U_CL(θ) = ∂cl(θ; y)/∂θ
- sensitivity: H(θ) = E_θ{−∂²cl(θ; y)/∂θ∂θᵀ}
- variability: J(θ) = var_θ{U_CL(θ)} = E_θ{U_CL(θ)U_CL(θ)ᵀ}
- Godambe information: G(θ) = H(θ)J⁻¹(θ)H(θ)
17  ... derived quantities

sample y = (y₁, ..., y_n) with joint density f(y; θ), y_i ∈ ℝᵐ, θ ∈ ℝᵈ

- score function: U_CL(θ) = ∂_θ cl(θ; y) = Σ_{i=1}^n ∂_θ cl(θ; y_i)
- maximum composite likelihood estimate: θ̂_CL = θ̂_CL(y) = arg sup_θ cl(θ; y)
- score equation: U_CL(θ̂_CL) = cl′(θ̂_CL) = 0
- composite LRT: w_CL(θ) = 2{cl(θ̂_CL) − cl(θ)}
- Godambe information: G(θ) = G_n(θ) = H_n(θ)J_n⁻¹(θ)H_n(θ) = O(n)
18  Inference

sample: Y₁, ..., Y_n i.i.d., cl(θ; y) = Σ_{i=1}^n cl(θ; y_i)

   √n(θ̂_CL − θ) →d N{0, G⁻¹(θ)},   G(θ) = H(θ)J⁻¹(θ)H(θ)

sketch, with U = U_CL:

   0 = U(θ̂_CL) ≈ U(θ) + (θ̂_CL − θ) ∂_θ U(θ)
   θ̂_CL − θ ≈ −{∂_θ U(θ)}⁻¹ U(θ) ≈ H⁻¹(θ) U(θ)
   U(θ) ≈ N{0, J(θ)}, so H⁻¹(θ)U(θ) ≈ N{0, H⁻¹(θ)J(θ)H⁻ᵀ(θ)}

conclude √n(θ̂_CL − θ) →d N{0, G⁻¹(θ)}
19  ... inference

   w(θ) = 2{cl(θ̂_CL) − cl(θ)} →d Σ_{a=1}^d µ_a Z_a²,   Z_a ∼ N(0, 1) independent,

with µ₁, ..., µ_d the eigenvalues of J(θ)H(θ)⁻¹; here J(θ) = var U(θ) and H(θ) = E_θ{−∂U(θ)/∂θᵀ}

   cl(θ̂_CL) − cl(θ) ≈ ½ (θ̂_CL − θ)ᵀ {−cl″(θ̂_CL)} (θ̂_CL − θ)

the limit is a weighted sum of independent χ²₁ variables, not χ²_d in general

- if J(θ) = H(θ): w(θ) →d χ²_d
- if d = 1: w(θ) →d µ₁χ²₁, with µ₁ = J(θ)H⁻¹(θ), H and J both scalars
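The d = 1 case is easy to simulate. As an assumed illustrative setup (not from the course), take bivariate normal pairs with known correlation ρ and unknown common mean, and use the independence composite likelihood; then H = 2n and J = 2n(1 + ρ), so w(θ₀) should behave like (1 + ρ)χ²₁ rather than χ²₁:

```python
import math
import random

random.seed(5)
rho, n, reps = 0.6, 200, 2000

def simulate_w():
    # bivariate normal pairs with mean 0 and correlation rho; the
    # independence composite likelihood ignores the within-pair correlation
    pairs = []
    for _ in range(n):
        z1 = random.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * random.gauss(0.0, 1.0)
        pairs.append((z1, z2))

    def cl(t):  # independence composite log-likelihood, constants dropped
        return -0.5 * sum((a - t) ** 2 + (b - t) ** 2 for a, b in pairs)

    theta_hat = sum(a + b for a, b in pairs) / (2 * n)  # maximizes cl
    return 2.0 * (cl(theta_hat) - cl(0.0))              # w(theta_0)

ws = [simulate_w() for _ in range(reps)]
mean_w = sum(ws) / reps  # theory: E w = mu_1 = J/H = 1 + rho
```

With ρ = 0.6 the sample mean of w sits near 1.6, visibly above the χ²₁ mean of 1, so treating w as χ²_d would give anti-conservative tests.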
20  Example: symmetric normal

Y_i ∼ N_m(0, R), var(Y_ir) = 1, corr(Y_ir, Y_is) = ρ; compound the bivariate normal densities to form the pairwise likelihood:

   cl(ρ; y₁, ..., y_n) = −{nm(m−1)/4} log(1−ρ²) − (m−1+ρ)/{2(1−ρ²)} SS_w − (m−1)(1−ρ)/{2(1−ρ²)} SS_b

   SS_w = Σ_{i=1}^n Σ_{s=1}^m (y_is − ȳ_i.)²,   SS_b = m Σ_{i=1}^n ȳ_i.²

compare the full log-likelihood:

   l(ρ; y₁, ..., y_n) = −{n(m−1)/2} log(1−ρ) − (n/2) log{1 + (m−1)ρ} − SS_w/{2(1−ρ)} − SS_b/[2{1 + (m−1)ρ}]
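The SS_w/SS_b form of the pairwise log-likelihood can be checked against a brute-force sum of bivariate normal log-densities over all pairs: the two should differ only by the additive constant −log(2π) per pair, which does not involve ρ. A numerical sketch with arbitrary simulated data (the closed-form expression coded here is a reconstruction, not course code):

```python
import math
import random

random.seed(4)
n, m = 4, 5
ys = [[random.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]

ybar = [sum(yv) / m for yv in ys]
SSw = sum((yv[s] - ybar[i]) ** 2 for i, yv in enumerate(ys) for s in range(m))
SSb = m * sum(b * b for b in ybar)

def cl_closed(rho):
    # closed-form pairwise log-likelihood (additive constant dropped)
    return (-(n * m * (m - 1) / 4.0) * math.log(1.0 - rho ** 2)
            - (m - 1 + rho) / (2.0 * (1.0 - rho ** 2)) * SSw
            - (m - 1) * (1.0 - rho) / (2.0 * (1.0 - rho ** 2)) * SSb)

def cl_pairs(rho):
    # brute force: sum log bivariate normal densities over all pairs r < s
    total = 0.0
    for yv in ys:
        for r in range(m):
            for s in range(r + 1, m):
                q = (yv[r] ** 2 - 2.0 * rho * yv[r] * yv[s] + yv[s] ** 2) / (1.0 - rho ** 2)
                total += -math.log(2.0 * math.pi) - 0.5 * math.log(1.0 - rho ** 2) - 0.5 * q
    return total

const = cl_pairs(0.3) - cl_closed(0.3)  # should be -log(2 pi) per pair, free of rho
```

If the algebra is right, `const` equals −{nm(m−1)/2} log(2π) and is the same at every ρ.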
21  ... symmetric normal

   a.var(ρ̂) = 2{1 + (m−1)ρ}²(1−ρ)² / [nm(m−1){1 + (m−1)ρ²}]

   a.var(ρ̂_CL) = 2(1−ρ)² c(m, ρ) / [nm(m−1)(1+ρ²)²]

   c(m, ρ) = (1−ρ)²(3ρ² + 1) + mρ(−3ρ³ + 8ρ² − 3ρ + 2) + m²ρ²(1−ρ)²

both are O(1/n) as n → ∞ with m fixed; as m → ∞, a.var(ρ̂) is O{1/(nm)} while a.var(ρ̂_CL) remains O(1/n)
22  ... symmetric normal

[Figure: asymptotic efficiency a.var(ρ̂)/a.var(ρ̂_CL) as a function of ρ, for m = 3, 5, 8, 10 (Cox & Reid, 2004)]
23  Likelihood ratio test

[Figure: log-likelihood functions of ρ, four panels: ρ = 0.5, n = 10, q = 5; ρ = 0.8, n = 10; ρ = 0.2, n = 10, q = 5; ρ = 0.2, n = 7]
24  Example: longitudinal count data (Henderson & Shimakura, 2003)

subjects i = 1, ..., n; observed counts y_ir, r = 1, ..., m_i

model: y_ir | u_ir ∼ Poisson{u_ir exp(x_irᵀβ)}, with u_i1, ..., u_im_i gamma-distributed random effects, correlated: corr(u_ir, u_is) = ρ^{|r−s|}

the joint density has a combinatorial number of terms in m_i; impractical

weighted pairwise composite likelihood:

   L_pair(β) = Π_{i=1}^n Π_{r=1}^{m_i} Π_{s=r+1}^{m_i} f(y_ir, y_is; β)^{1/(m_i−1)}

weights chosen so that L_pair = the full likelihood if ρ = 0
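The weight 1/(m_i − 1) works because each y_ir appears in exactly m_i − 1 pairs, so under independence the weighted pairwise products reproduce the full log-likelihood exactly. A small check with hypothetical Poisson data (known means, no covariates or random effects — a deliberately simplified sketch):

```python
import math
import random

random.seed(6)

def logpois(y, mu):
    # Poisson log pmf
    return y * math.log(mu) - mu - math.lgamma(y + 1)

# three subjects with m_i = 3, 4, 6 observations and fixed means
counts = [[random.randint(0, 5) for _ in range(mi)] for mi in (3, 4, 6)]
mus = [1.3, 0.7, 2.1]

full = sum(logpois(y, mu) for ys, mu in zip(counts, mus) for y in ys)

weighted_pair = 0.0
for ys, mu in zip(counts, mus):
    mi = len(ys)
    for r in range(mi):
        for s in range(r + 1, mi):
            # under independence the pair density factorizes, and each
            # observation occurs in (m_i - 1) pairs
            weighted_pair += (logpois(ys[r], mu) + logpois(ys[s], mu)) / (mi - 1)
```

The two quantities agree to machine precision, confirming the weighting argument; with ρ ≠ 0 they differ, and the pairwise version is the tractable surrogate.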
Recap

- vector observation: Y ∼ f(y; θ), Y ∈ 𝒴 ⊆ ℝᵐ, θ ∈ ℝᵈ
- sample of independent vectors y₁, ..., y_n
- pairwise log-likelihood: Σ_{i=1}^n Σ_{r=1}^m Σ_{s>r} w_rs log f₂(y_ir, y_is; θ)
- weights are often 1; more generally, ...
Lecture 1: Bayesian Framework Basics Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de April 21, 2014 What is this course about? Building Bayesian machine learning models Performing the inference of
More information1. Fisher Information
1. Fisher Information Let f(x θ) be a density function with the property that log f(x θ) is differentiable in θ throughout the open p-dimensional parameter set Θ R p ; then the score statistic (or score
More informationi=1 h n (ˆθ n ) = 0. (2)
Stat 8112 Lecture Notes Unbiased Estimating Equations Charles J. Geyer April 29, 2012 1 Introduction In this handout we generalize the notion of maximum likelihood estimation to solution of unbiased estimating
More informationMinimax lower bounds I
Minimax lower bounds I Kyoung Hee Kim Sungshin University 1 Preliminaries 2 General strategy 3 Le Cam, 1973 4 Assouad, 1983 5 Appendix Setting Family of probability measures {P θ : θ Θ} on a sigma field
More informationParametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory
Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory IP, José Bioucas Dias, IST, 2007
More informationLecture Notes 15 Prediction Chapters 13, 22, 20.4.
Lecture Notes 15 Prediction Chapters 13, 22, 20.4. 1 Introduction Prediction is covered in detail in 36-707, 36-701, 36-715, 10/36-702. Here, we will just give an introduction. We observe training data
More informationANCILLARY STATISTICS: A REVIEW
Statistica Sinica 20 (2010), 1309-1332 ANCILLARY STATISTICS: A REVIEW M. Ghosh 1, N. Reid 2 and D. A. S. Fraser 2 1 University of Florida and 2 University of Toronto Abstract: In a parametric statistical
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More informationPrinciples of Statistics
Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 81 Paper 4, Section II 28K Let g : R R be an unknown function, twice continuously differentiable with g (x) M for
More informationChapter 4: Imputation
Chapter 4: Imputation Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Basic Theory for imputation 3 Variance estimation after imputation 4 Replication variance estimation
More informationTest of Association between Two Ordinal Variables while Adjusting for Covariates
Test of Association between Two Ordinal Variables while Adjusting for Covariates Chun Li, Bryan Shepherd Department of Biostatistics Vanderbilt University May 13, 2009 Examples Amblyopia http://www.medindia.net/
More informationMachine learning - HT Maximum Likelihood
Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce
More informationIntroduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Lior Wolf
1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Lior Wolf 2014-15 We know that X ~ B(n,p), but we do not know p. We get a random sample from X, a
More informationClustering K-means. Clustering images. Machine Learning CSE546 Carlos Guestrin University of Washington. November 4, 2014.
Clustering K-means Machine Learning CSE546 Carlos Guestrin University of Washington November 4, 2014 1 Clustering images Set of Images [Goldberger et al.] 2 1 K-means Randomly initialize k centers µ (0)
More informationA union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling
A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling Min-ge Xie Department of Statistics, Rutgers University Workshop on Higher-Order Asymptotics
More informationIntroduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf
1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 2013-14 We know that X ~ B(n,p), but we do not know p. We get a random sample
More informationDefault priors and model parametrization
1 / 16 Default priors and model parametrization Nancy Reid O-Bayes09, June 6, 2009 Don Fraser, Elisabeta Marras, Grace Yun-Yi 2 / 16 Well-calibrated priors model f (y; θ), F(y; θ); log-likelihood l(θ)
More information