Approximate Inference for the Multinomial Logit Model
M. Rekkas

Abstract: Higher order asymptotic theory is used to derive p-values that achieve superior accuracy compared to the p-values obtained from traditional tests for inference about parameters of the multinomial logit model. Simulations are provided to assess the finite sample behavior of the test statistics considered and to demonstrate the superiority of the higher order method. Stata code that outputs these p-values is available to facilitate the implementation of these methods for the end-user.

Department of Economics, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, phone: (778) , fax: (778) . I would like to thank Nancy Reid and two anonymous referees for helpful comments and suggestions. The support of the Natural Sciences and Engineering Research Council of Canada is gratefully appreciated.
1 Introduction

The multinomial logit specification is the most popular discrete choice model in applied statistical disciplines such as economics. Recent developments in higher order likelihood asymptotic methods are applied to obtain highly accurate tail probabilities for testing parameters of interest in these models. This involves using an adjusted version of the standard log-likelihood ratio statistic. Simulations are provided to demonstrate the significant improvements in accuracy that can be achieved over conventional first-order methods, that is, methods that achieve distributional accuracy of order O(n^{-1/2}), where n is the sample size. The resulting p-value expressions for assessing scalar interest parameters are remarkably simple and can easily be programmed into conventional statistical packages. The results have particular appeal for applied statisticians dealing with discrete choice models where the number of observations may be limited. More generally, these higher-order methods can be applied regardless of the sample size in order to determine the extent to which first-order methods can be relied upon. The two main contributions are as follows. First, higher order likelihood theory is used to obtain highly accurate p-values for testing parameters of the multinomial logit model. Second, Stata code for this model is made available to the end-user.^1 While the past two decades have seen significant advances in likelihood asymptotic methods, empirical work employing these techniques has severely lagged behind. This schism is undoubtedly due to the lack of user-friendly computer code. The Stata programs are provided as a means to bridge this gap.

2 Model

For a given parametric model and observed data y = (y_1, y_2, ..., y_n), denote the log-likelihood function by l(\theta), where \theta is the full parameter vector of the model, expressed as \theta = (\psi, \lambda^T)^T, with scalar interest parameter \psi and nuisance parameter vector \lambda.
Denote the overall maximum likelihood estimator as \hat{\theta} = (\hat{\psi}, \hat{\lambda}^T)^T = \mathrm{argmax}_\theta \, l(\theta) and the constrained maximum likelihood estimator as \hat{\theta}_\psi = (\psi, \hat{\lambda}_\psi^T)^T = \mathrm{argmax}_\lambda \, l(\theta) for fixed values of \psi. Let j_{\theta\theta^T}(\hat{\theta}) = -\partial^2 l(\theta)/\partial\theta\partial\theta^T |_{\hat{\theta}} denote the observed information matrix and j_{\lambda\lambda^T}(\hat{\theta}_\psi) = -\partial^2 l(\theta)/\partial\lambda\partial\lambda^T |_{\hat{\theta}_\psi} denote the observed nuisance information matrix. Inference about \psi is typically based on two departure measures,

^1 Brazzale (1999) provides R code for approximate conditional inference for logistic and loglinear models but does not consider the multinomial logit model.
known as the Wald departure (q) and the signed log-likelihood ratio departure (r):

q = (\hat{\psi} - \psi) \left\{ \frac{|j_{\theta\theta^T}(\hat{\theta})|}{|j_{\lambda\lambda^T}(\hat{\theta}_\psi)|} \right\}^{1/2},  (1)

r = \mathrm{sgn}(\hat{\psi} - \psi) \left[ 2\{l(\hat{\theta}) - l(\hat{\theta}_\psi)\} \right]^{1/2}.  (2)

Note that the expression in (1) is not the usual Wald statistic, for which the estimated standard error of \hat{\psi} is used for standardization.^2 Approximate p-values are given by \Phi(q) and \Phi(r), where \Phi(\cdot) denotes the standard normal cumulative distribution function. These are referred to as first-order methods, as q and r are asymptotically standard normal with first-order accuracy (i.e., the relative error of the approximation is O(n^{-1/2})). In small and even moderate samples, these methods can be highly inaccurate. Barndorff-Nielsen (1986) derived the modified signed log-likelihood ratio statistic for higher order inference:

r^* = r - \frac{1}{r} \log\left( \frac{r}{Q} \right),  (3)

where r is the signed log-likelihood ratio departure in (2) and Q is a standardized maximum likelihood departure. The statistic r^* is also asymptotically standard normal, but when the distribution of y is continuous it achieves third-order accuracy. Tail area approximations can be obtained using \Phi(r^*). For exponential family models, several definitions of Q exist; see, for example, Barndorff-Nielsen (1991), Pierce and Peters (1992), Fraser and Reid (1995), and Jensen (1995). The derivation of Q given by Fraser and Reid (1995) is used in this paper. While the Fraser and Reid version applies only to continuous data, saddlepoint arguments can be invoked to argue that the method remains valid for exponential family models in the discrete setting. Because the maximum likelihood estimates take values on a lattice, however, technical issues surrounding the exact order of the error produce p-values with distributional accuracy of order O(n^{-1}). For more general models, Davison et al. (2006) provide a framework for handling discrete data that also achieves second-order accuracy.
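As a minimal numerical sketch of how (2) and (3) combine to give a p-value, the following Python function takes the two maximized log-likelihoods and a value of Q as inputs. The function and argument names are illustrative assumptions, not the author's Stata code.

```python
from math import erf, log, sqrt, copysign

def std_normal_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def rstar_pvalue(psi_hat, psi0, l_hat, l_psi0, Q):
    """p-value based on the modified signed log-likelihood ratio r* in (3).

    psi_hat : unconstrained MLE of the interest parameter
    psi0    : hypothesized value of the interest parameter
    l_hat   : log-likelihood at the full MLE
    l_psi0  : log-likelihood at the constrained MLE (psi fixed at psi0)
    Q       : standardized maximum likelihood departure

    Note: r must be bounded away from zero for the log correction in (3).
    """
    # Signed log-likelihood ratio departure, expression (2)
    r = copysign(sqrt(2.0 * (l_hat - l_psi0)), psi_hat - psi0)
    # Modified statistic, expression (3)
    rstar = r - (1.0 / r) * log(r / Q)
    return std_normal_cdf(rstar)
```

When Q equals r the correction term vanishes and the p-value reduces to \Phi(r), which provides a quick sanity check on an implementation.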
Given that the present context involves an exponential family model, the Fraser and Reid (1995) methodology is directly applicable. Fraser and Reid (1995) used tangent exponential models to derive a highly accurate approximation to the p-value for testing a scalar interest parameter. The theory for obtaining Q involves two main components. The first component requires a reduction of dimension by approximate ancillarity.^3 This step reduces the dimension of the variable to the dimension of the full parameter. The second component requires a further reduction of dimension, from the dimension of the parameter to the dimension of the scalar interest parameter. These two components are achieved through two key reparameterizations: from the parameter \theta to a new parameter \varphi, and from the parameter \varphi to a new parameter \chi. The parameter \varphi represents the local canonical parameter of an approximating exponential model, and the parameter \chi is a scaled version of \varphi. The canonical reparameterization is given by:

\varphi^T(\theta) = \left. \frac{\partial l(\theta; y)}{\partial y^T} \right|_{y^0} V,  (4)

where V = (v_1, ..., v_p) is an ancillary direction array that can be obtained as

V = \left. \frac{\partial y}{\partial \theta^T} \right|_{\hat{\theta}} = -\left\{ \frac{\partial k(y, \theta)}{\partial y^T} \right\}^{-1} \left\{ \frac{\partial k(y, \theta)}{\partial \theta^T} \right\} \Bigg|_{\hat{\theta}},  (5)

where k = k(y, \theta) = (k_1, ..., k_n)^T is a full-dimensional pivotal quantity. Fraser and Reid (1995) obtain this conditionality reduction without the computation of an explicit ancillary statistic. The second reparameterization is to \chi(\theta), which is constructed to act as a scalar canonical parameter in the new parameterization:

\chi(\theta) = \frac{\psi_{\varphi^T}(\hat{\theta}_\psi)}{\|\psi_{\varphi^T}(\hat{\theta}_\psi)\|} \varphi(\theta),  (6)

where \psi_{\varphi^T}(\theta) = \partial\psi(\theta)/\partial\varphi^T = (\partial\psi(\theta)/\partial\theta^T)(\partial\varphi(\theta)/\partial\theta^T)^{-1}. The recalibrated observed information and observed nuisance information determinants are defined as:

|j_{\varphi\varphi^T}(\hat{\theta})| = |j_{\theta\theta^T}(\hat{\theta})| \, |\varphi_{\theta^T}(\hat{\theta})|^{-2},  (7)

|j_{(\lambda\lambda^T)}(\hat{\theta}_\psi)| = |j_{\lambda\lambda^T}(\hat{\theta}_\psi)| \, |\varphi_{\lambda^T}(\hat{\theta}_\psi)^T \varphi_{\lambda^T}(\hat{\theta}_\psi)|^{-1}.  (8)

The standardized maximum likelihood departure is then given by

Q = \mathrm{sgn}(\hat{\psi} - \psi) \, |\chi(\hat{\theta}) - \chi(\hat{\theta}_\psi)| \left\{ \frac{|j_{\varphi\varphi^T}(\hat{\theta})|}{|j_{(\lambda\lambda^T)}(\hat{\theta}_\psi)|} \right\}^{1/2}.  (9)

Notice that for canonical parameter \varphi(\theta) = (\psi, \lambda^T)^T, the expression in (9) simplifies to the Wald departure given in (1). In this case, more accurate inference about \psi is simply based on (3), with the conventional first-order quantities given in (1) and (2) as inputs.

^2 The standard Wald statistic will be considered in the examples and simulations as this is the statistic typically reported in conventional statistical packages.
^3 Fraser and Reid show that an exact ancillary statistic is not required for this reduction.
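When the interest parameter is itself a component of the canonical parameter, Q collapses to the determinant-ratio Wald form in (1). A minimal sketch of that computation follows; the function name and inputs are illustrative assumptions, not the paper's code.

```python
import numpy as np

def canonical_Q(psi_hat, psi0, j_full, j_nuis):
    """Wald-type departure (1), which equals Q when the scalar interest
    parameter is a component of the canonical parameter.

    psi_hat : unconstrained MLE of the interest parameter
    psi0    : hypothesized value
    j_full  : observed information matrix at the full MLE (p x p)
    j_nuis  : observed nuisance information at the constrained MLE
              ((p-1) x (p-1))
    """
    # Ratio of determinants standardizes the departure psi_hat - psi0
    ratio = np.linalg.det(j_full) / np.linalg.det(j_nuis)
    return (psi_hat - psi0) * np.sqrt(ratio)
```

The returned value can be passed directly as Q to an r* routine based on (3).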
Now, consider the multinomial logit model. Suppose there are J+1 response categories, y_i = (y_{i0}, ..., y_{iJ}), with corresponding probabilities (\pi_{i0}, ..., \pi_{iJ}), and K explanatory variables x_i with associated parameter vectors \beta_j, where each \beta_j is a K \times 1 vector. The probabilities are given by:

\pi_{ij} = \Lambda(\beta_j^T x_i) = \frac{\exp(\beta_j^T x_i)}{1 + \sum_{m=1}^{J} \exp(\beta_m^T x_i)}, \quad j = 0, 1, ..., J,

with the normalization \beta_0 = 0. For data y = (y_1, ..., y_n) the likelihood function is given by

L(\beta) = \prod_{i=1}^{n} \pi_{i1}^{y_{i1}} \pi_{i2}^{y_{i2}} \cdots \pi_{iJ}^{y_{iJ}} (1 - \pi_{i1} - \cdots - \pi_{iJ})^{1 - y_{i1} - \cdots - y_{iJ}}
        = \exp\left\{ \beta_1^T \sum_{i=1}^{n} y_{i1} x_i + \cdots + \beta_J^T \sum_{i=1}^{n} y_{iJ} x_i - \sum_{i=1}^{n} \log\left[ 1 + \sum_{m=1}^{J} \exp(\beta_m^T x_i) \right] \right\}.

The corresponding log-likelihood is given by

l(\beta) = \beta_1^T \sum_{i=1}^{n} y_{i1} x_i + \cdots + \beta_J^T \sum_{i=1}^{n} y_{iJ} x_i - \sum_{i=1}^{n} \log\left[ 1 + \sum_{m=1}^{J} \exp(\beta_m^T x_i) \right].  (10)

The exponential family form in (10) gives the canonical parameter \varphi(\theta) = (\beta_1^T, ..., \beta_J^T)^T. Thus, if interest is in a scalar component of some \beta_j, the standardized maximum likelihood departure Q is given by expression (1). To calculate this expression, the first and second derivatives of the log-likelihood function and related quantities are required. For this model they are easily calculated:

l_{\beta_{jk}} = \sum_{i=1}^{n} (y_{ij} - \pi_{ij}) x_{ik}, \quad j = 1, ..., J, \; k = 1, ..., K,

l_{\beta_{jk}\beta_{jl}} = -\sum_{i=1}^{n} \pi_{ij}(1 - \pi_{ij}) x_{ik} x_{il}, \quad j = 1, ..., J, \; k, l = 1, ..., K,

l_{\beta_{jk}\beta_{j'l}} = \sum_{i=1}^{n} \pi_{ij} \pi_{ij'} x_{ik} x_{il}, \quad j \ne j', \; k, l = 1, ..., K.

To examine the higher-order adjustment, two simple examples are considered.^4 For the first example, data from a real economic field experiment are used to estimate the parameters of the model. In this example, there are five independent variables and a dependent variable that can take one of two values, 0 or 1; that is, the model is the standard logit model.^5 The dataset for this example is provided in Table 1. The estimation results (with the constant suppressed), with 0 as the comparison group, are provided in Table 2. Odds ratios can easily be calculated by exponentiating the coefficients.
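The log-likelihood (10) and the score derivatives above can be sketched numerically as follows. This is a minimal illustration, not the author's Stata code; the function names and the one-hot encoding of the outcomes are assumptions.

```python
import numpy as np

def mnl_probs(B, X):
    """Choice probabilities for the multinomial logit with J+1 categories.

    B : (J, K) coefficients for categories 1..J (category 0 normalized to 0)
    X : (n, K) covariates
    Returns an (n, J+1) array of probabilities."""
    eta = X @ B.T                                # (n, J) linear indices
    expeta = np.exp(eta)
    denom = 1.0 + expeta.sum(axis=1, keepdims=True)
    pi0 = 1.0 / denom                            # baseline category 0
    return np.hstack([pi0, expeta / denom])

def mnl_loglik_score(B, X, Y):
    """Log-likelihood (10) and score l_{beta_jk} = sum_i (y_ij - pi_ij) x_ik.

    Y : (n, J+1) one-hot outcome indicators."""
    P = mnl_probs(B, X)
    ll = np.sum(Y * np.log(P))
    score = (Y[:, 1:] - P[:, 1:]).T @ X          # (J, K) score matrix
    return ll, score
```

At the MLE the score matrix is zero, which gives a convenient convergence check for any maximizer built on these quantities.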
The conventional p-values associated with the maximum likelihood estimates are reported along with those produced from the signed log-likelihood ratio departure given in (2) and from the modified signed log-likelihood ratio statistic given in (3). These p-values are denoted MLE, LR, and RSTAR, respectively. The p-values associated with the maximum likelihood estimates are provided for comparison, as these are the p-values output by most conventional statistical packages. It should be noted that using the r^* formula in (3) along with \Phi(r^*) produces a p-value interpreted as the probability to the left of the data point. However, for consistency with output reported by statistical packages, the p-values associated with r^* in the tables are always reported as tail probabilities. As can be discerned from Table 2, even with 40 observations, the p-values produced by the three methods are quite different and, depending on the method chosen, would lead to different inferences about the parameters. Next, the second example considers a dependent variable that can take one of three values: 1, 2, or 3. The dataset is provided in Table 3.^6 Results from this estimation (with the constants suppressed), with group 1 as the comparison group, are provided in Table 4. Relative risk ratios can be obtained by exponentiating the coefficients. Once again the table reveals a wide range of p-values. For instance, the coefficient for variable X2 in the Y=3 equation would be deemed insignificant at the 5% level using the conventional MLE test, while it would be deemed significant at this level using the LR or RSTAR methods. To investigate the properties of the higher-order method in small and large samples, two simulations are conducted, and accuracy is assessed by computing the observed p-values for each method (MLE, LR, RSTAR) and recording several criteria. The recorded criteria for each method are as follows: coverage probability, coverage error, upper and lower error probabilities, and coverage bias.

^4 All computations were done in Stata 8. Code for the two examples is accessible from mrekkas. Code for the first example is also provided in R.
^5 The special case where the dependent variable can take only one of two values has previously been considered. For more on the logit model see Brazzale (1999).
The coverage probability records the percentage of intervals that contain the true parameter value. The coverage error records the absolute difference between the nominal level and the coverage probability. The upper (lower) error probability records the percentage of samples in which the true parameter value falls above (below) the interval. And the coverage bias represents the sum of the absolute differences between the upper and lower error probabilities and their nominal levels.

The first simulation generates 10,000 random samples, each of size 50, from a dataset of brand choice with two independent variables representing gender and age.^7 The simulated dependent variable can take one of three values, representing one of three different brands. The data are provided in Table 5, where X0 represents the constant, X1 represents the gender of the consumer (coded 1 if the consumer is female), and X2 represents the age of the consumer. The dependent variable is simulated under the following conditions: the first brand was chosen as the base category, and the true values for the parameters were set as and for the constants of brands 2 and 3, respectively, and for the parameter associated with the gender variable for brands 2 and 3, respectively, and and for the parameter associated with age for brands 2 and 3, respectively. The results from this simulation are recorded in Table 6 for nominal 90%, 95%, and 99% confidence intervals covering the true gender parameter associated with brand 2. The superiority of the higher-order method in terms of coverage error and coverage bias is evident. Notice the skewed tail probabilities produced by both first-order methods.

The second simulation generates 10,000 random samples using the full dataset of 735 observations under conditions similar to those of the first simulation. The dataset is not listed but is available at the website provided earlier. The results from this simulation are provided in Table 7, again for nominal 90%, 95%, and 99% confidence intervals covering the true gender parameter associated with brand 2. With this larger sample size the first-order methods perform predictably better; however, the asymmetry in the tails, while diminished, still persists.

3 Conclusion

In this paper, higher order likelihood asymptotic theory was applied to testing parameters of the multinomial logit model; improvements over first-order methods were demonstrated using two simulations. Stata code has been made available to facilitate the implementation of these higher order adjustments.

^6 This dataset consists of a sample of size 30 from the data available at for car choice.
^7 The dataset consists of a sample of size 50 from the data available at
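The coverage criteria recorded in Tables 6 and 7 can be sketched as follows. This is a minimal illustration with assumed inputs (arrays of simulated interval endpoints), not the author's Stata code.

```python
import numpy as np

def coverage_criteria(lower, upper, true_value, nominal):
    """Coverage diagnostics for a set of simulated confidence intervals.

    lower, upper : arrays of interval endpoints, one pair per simulated sample
    true_value   : the true parameter value used to generate the data
    nominal      : nominal coverage level, e.g. 0.95
    Returns (coverage probability, coverage error,
             lower error probability, upper error probability, coverage bias).
    """
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    # Fraction of intervals containing the true value
    covered = np.mean((lower <= true_value) & (true_value <= upper))
    # True value falls below (above) the interval
    below = np.mean(true_value < lower)
    above = np.mean(true_value > upper)
    error = abs(nominal - covered)
    # Equal-tailed nominal error probability in each tail
    tail_nominal = (1.0 - nominal) / 2.0
    bias = abs(below - tail_nominal) + abs(above - tail_nominal)
    return covered, error, below, above, bias
```

Skewed tail behavior of a method shows up here as one of `below` or `above` being far from its nominal tail level even when the overall coverage looks acceptable.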
References

[1] Barndorff-Nielsen, O., 1991, Modified Signed Log-Likelihood Ratio, Biometrika 78.
[2] Brazzale, A., 1999, Approximate Conditional Inference in Logistic and Loglinear Models, Journal of Computational and Graphical Statistics 8(3).
[3] Davison, A., Fraser, D., Reid, N., 2006, Improved Likelihood Inference for Discrete Data, Journal of the Royal Statistical Society Series B 68.
[4] Fraser, D., Reid, N., 1995, Ancillaries and Third-Order Significance, Utilitas Mathematica 7.
[5] Jensen, J., 1995, Saddlepoint Approximation, Oxford University Press, New York.
[6] Lugannani, R., Rice, S., 1980, Saddlepoint Approximation for the Distribution of the Sums of Independent Random Variables, Advances in Applied Probability 12.
[7] Pierce, D., Peters, D., 1992, Practical Use of Higher Order Asymptotics for Multiparameter Exponential Families (with discussion), Journal of the Royal Statistical Society Series B 54.
Table 1: Data from Field Experiment. Columns: Y, X0, X1, X2, X3, X4, X5.

Table 2: Estimation Results. Columns: Coefficient, SE, and p-values (MLE, LR, RSTAR), one row per variable X1 through X5.

Table 3: Data. Columns: Y, X0, X1, X2.

Table 4: Estimation Results. Columns: Y, Coefficient, SE, and p-values (MLE, LR, RSTAR).

Table 5: Simulation Data. Columns: X0, X1, X2.

Table 6: Simulation Results for n = 50. Columns: CI, Method (MLE, LR, RSTAR), Coverage Probability, Coverage Error, Lower Error Probability, Upper Error Probability, Coverage Bias; rows for nominal 90%, 95%, and 99% intervals.

Table 7: Simulation Results for n = 735. Same layout as Table 6.
More informationTheory and Methods of Statistical Inference. PART I Frequentist theory and methods
PhD School in Statistics cycle XXVI, 2011 Theory and Methods of Statistical Inference PART I Frequentist theory and methods (A. Salvan, N. Sartori, L. Pace) Syllabus Some prerequisites: Empirical distribution
More informationON THE FAILURE RATE ESTIMATION OF THE INVERSE GAUSSIAN DISTRIBUTION
ON THE FAILURE RATE ESTIMATION OF THE INVERSE GAUSSIAN DISTRIBUTION ZHENLINYANGandRONNIET.C.LEE Department of Statistics and Applied Probability, National University of Singapore, 3 Science Drive 2, Singapore
More informationTheory and Methods of Statistical Inference
PhD School in Statistics cycle XXIX, 2014 Theory and Methods of Statistical Inference Instructors: B. Liseo, L. Pace, A. Salvan (course coordinator), N. Sartori, A. Tancredi, L. Ventura Syllabus Some prerequisites:
More informationA NOTE ON LIKELIHOOD ASYMPTOTICS IN NORMAL LINEAR REGRESSION
Ann. Inst. Statist. Math. Vol. 55, No. 1, 187-195 (2003) Q2003 The Institute of Statistical Mathematics A NOTE ON LIKELIHOOD ASYMPTOTICS IN NORMAL LINEAR REGRESSION N. SARTORI Department of Statistics,
More informationLikelihood and Asymptotic Theory for Statistical Inference
Likelihood and Asymptotic Theory for Statistical Inference Nancy Reid 020 7679 1863 reid@utstat.utoronto.ca n.reid@ucl.ac.uk http://www.utstat.toronto.edu/reid/ltccf12.html LTCC Likelihood Theory Week
More informationLikelihood and p-value functions in the composite likelihood context
Likelihood and p-value functions in the composite likelihood context D.A.S. Fraser and N. Reid Department of Statistical Sciences University of Toronto November 19, 2016 Abstract The need for combining
More informationComparison between conditional and marginal maximum likelihood for a class of item response models
(1/24) Comparison between conditional and marginal maximum likelihood for a class of item response models Francesco Bartolucci, University of Perugia (IT) Silvia Bacci, University of Perugia (IT) Claudia
More informationIntegrated likelihoods in models with stratum nuisance parameters
Riccardo De Bin, Nicola Sartori, Thomas A. Severini Integrated likelihoods in models with stratum nuisance parameters Technical Report Number 157, 2014 Department of Statistics University of Munich http://www.stat.uni-muenchen.de
More informationDiscrete Multivariate Statistics
Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are
More informationSecond order ancillary: A differential view from continuity
Second order ancillary: A differential view from continuity BY AILANA M. FRASER Department of Mathematics, University of British Columbia, Vancouver, Canada V6T 1Z2 afraser@math.ubc.ca DONALD A.S. FRASER*
More informationTesting an Autoregressive Structure in Binary Time Series Models
ömmföäflsäafaäsflassflassflas ffffffffffffffffffffffffffffffffffff Discussion Papers Testing an Autoregressive Structure in Binary Time Series Models Henri Nyberg University of Helsinki and HECER Discussion
More informationGeneralized logit models for nominal multinomial responses. Local odds ratios
Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π
More informationPredicting a Future Median Life through a Power Transformation
Predicting a Future Median Life through a Power Transformation ZHENLIN YANG 1 Department of Statistics and Applied Probability, National University of Singapore, 3 Science Drive 2, Singapore 117543 Abstract.
More informationStatistics 3858 : Maximum Likelihood Estimators
Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,
More informationModule 22: Bayesian Methods Lecture 9 A: Default prior selection
Module 22: Bayesian Methods Lecture 9 A: Default prior selection Peter Hoff Departments of Statistics and Biostatistics University of Washington Outline Jeffreys prior Unit information priors Empirical
More informationIntroduction to mtm: An R Package for Marginalized Transition Models
Introduction to mtm: An R Package for Marginalized Transition Models Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington 1 Introduction Marginalized transition
More informationLecture 26: Likelihood ratio tests
Lecture 26: Likelihood ratio tests Likelihood ratio When both H 0 and H 1 are simple (i.e., Θ 0 = {θ 0 } and Θ 1 = {θ 1 }), Theorem 6.1 applies and a UMP test rejects H 0 when f θ1 (X) f θ0 (X) > c 0 for
More informationLikelihood-based Inference for Linear-by-Linear Association in Contingency Tables
Università degli Studi di Padova Dipartimento di Scienze Statistiche Corso di Laurea Triennale in Statistica per le Tecnologie e le Scienze Relazione Finale Likelihood-based Inference for Linear-by-Linear
More informationStatistical Data Mining and Machine Learning Hilary Term 2016
Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes
More information8 Nominal and Ordinal Logistic Regression
8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on
More informationAN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY
Econometrics Working Paper EWP0401 ISSN 1485-6441 Department of Economics AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Lauren Bin Dong & David E. A. Giles Department of Economics, University of Victoria
More informationEmpirical Likelihood Methods for Sample Survey Data: An Overview
AUSTRIAN JOURNAL OF STATISTICS Volume 35 (2006), Number 2&3, 191 196 Empirical Likelihood Methods for Sample Survey Data: An Overview J. N. K. Rao Carleton University, Ottawa, Canada Abstract: The use
More informationMISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30
MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD Copyright c 2012 (Iowa State University) Statistics 511 1 / 30 INFORMATION CRITERIA Akaike s Information criterion is given by AIC = 2l(ˆθ) + 2k, where l(ˆθ)
More informationStatistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach
Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score
More informationAccurate Directional Inference for Vector Parameters in Linear Exponential Families
Accurate Directional Inference for Vector Parameters in Linear Exponential Families A. C. DAVISON,D.A.S.FRASER, N.REID, and N. SARTORI Q1 5 10 We consider inference on a vector-valued parameter of interest
More informationNew Bayesian methods for model comparison
Back to the future New Bayesian methods for model comparison Murray Aitkin murray.aitkin@unimelb.edu.au Department of Mathematics and Statistics The University of Melbourne Australia Bayesian Model Comparison
More informationGeneralized Linear Modeling - Logistic Regression
1 Generalized Linear Modeling - Logistic Regression Binary outcomes The logit and inverse logit interpreting coefficients and odds ratios Maximum likelihood estimation Problem of separation Evaluating
More informationLogistic regression: Miscellaneous topics
Logistic regression: Miscellaneous topics April 11 Introduction We have covered two approaches to inference for GLMs: the Wald approach and the likelihood ratio approach I claimed that the likelihood ratio
More informationSample size determination for logistic regression: A simulation study
Sample size determination for logistic regression: A simulation study Stephen Bush School of Mathematical Sciences, University of Technology Sydney, PO Box 123 Broadway NSW 2007, Australia Abstract This
More informationFULL LIKELIHOOD INFERENCES IN THE COX MODEL
October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach
More information1 Procedures robust to weak instruments
Comment on Weak instrument robust tests in GMM and the new Keynesian Phillips curve By Anna Mikusheva We are witnessing a growing awareness among applied researchers about the possibility of having weak
More informationADJUSTED PROFILE LIKELIHOODS FOR THE WEIBULL SHAPE PARAMETER
ADJUSTED PROFILE LIKELIHOODS FOR THE WEIBULL SHAPE PARAMETER SILVIA L.P. FERRARI Departamento de Estatística, IME, Universidade de São Paulo Caixa Postal 66281, São Paulo/SP, 05311 970, Brazil email: sferrari@ime.usp.br
More informationLikelihood-based inference with missing data under missing-at-random
Likelihood-based inference with missing data under missing-at-random Jae-kwang Kim Joint work with Shu Yang Department of Statistics, Iowa State University May 4, 014 Outline 1. Introduction. Parametric
More informationMultistate Modeling and Applications
Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationLikelihood Inference in Exponential Families and Generic Directions of Recession
Likelihood Inference in Exponential Families and Generic Directions of Recession Charles J. Geyer School of Statistics University of Minnesota Elizabeth A. Thompson Department of Statistics University
More informationFREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE
FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE Donald A. Pierce Oregon State Univ (Emeritus), RERF Hiroshima (Retired), Oregon Health Sciences Univ (Adjunct) Ruggero Bellio Univ of Udine For Perugia
More informationDEPARTMENT OF ECONOMICS
ISSN 0819-64 ISBN 0 7340 616 1 THE UNIVERSITY OF MELBOURNE DEPARTMENT OF ECONOMICS RESEARCH PAPER NUMBER 959 FEBRUARY 006 TESTING FOR RATE-DEPENDENCE AND ASYMMETRY IN INFLATION UNCERTAINTY: EVIDENCE FROM
More informationST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples
ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will
More information