Approximate Median Regression via the Box-Cox Transformation


Garrett M. Fitzmaurice, Stuart R. Lipsitz, and Michael Parzen

Median regression is used increasingly in many different areas of application. The usual median regression estimating equations are derived from minimizing the least absolute deviations (LAD). Because these equations are not a smooth function of the regression parameters, a solution is best obtained using a linear programming algorithm. As an alternative, we propose estimating the median regression parameters via Gaussian estimating equations after applying a Box-Cox transformation to both the outcome and the linear predictor. The proposed estimator is notably more efficient than the standard LAD estimator, albeit with an acknowledged loss of robustness.

KEY WORDS: Least absolute deviations; Maximum likelihood; Median regression.

1. INTRODUCTION

Median regression, in which the median of the outcome is assumed to be linear in the regression parameters, is used increasingly for analyzing non-normal continuous outcome data. For example, median regression has been applied to positively skewed data, such as length of hospital stay data (Lee, Fung, and Fu 2003) or survival data (Koenker and Geling 2001). Median regression models are appealing because they are robust to outliers in the outcome; their regression parameters also have simple interpretations. The median regression parameters can be consistently estimated by minimizing the least absolute deviations (LAD) via a linear programming algorithm (Bassett and Koenker 1982).

In this article, we propose estimating the median regression parameters via Gaussian estimating equations after applying a Box-Cox transformation (Box and Cox 1964) to both the outcome and the linear predictor. We relate the mean of the Box-Cox transformed outcome to the same transformation of the linear predictor; the transformation parameter is assumed to be unknown.
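As a rough illustration of the LAD criterion referred to above, the following Python sketch evaluates the sum of absolute deviations for a simple linear predictor. The function name `lad_objective` and the toy data are assumptions made for this example, not part of the article; the article's estimates were obtained by linear programming rather than by direct evaluation of this kind.

```python
from statistics import median


def lad_objective(beta, xs, ys):
    """Sum of absolute deviations of y from the linear predictor b0 + b1 * x."""
    b0, b1 = beta
    return sum(abs(y - (b0 + b1 * x)) for x, y in zip(xs, ys))


# Intercept-only illustration (slope fixed at 0): the sample median gives a
# smaller criterion value than the outlier-sensitive sample mean.
ys = [1.0, 2.0, 10.0]
xs = [0.0, 0.0, 0.0]
at_median = lad_objective((median(ys), 0.0), xs, ys)
at_mean = lad_objective((sum(ys) / len(ys), 0.0), xs, ys)
```

The criterion is piecewise linear in the parameters, which is why it is not smooth and why linear programming is the usual way to minimize it.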
When the distribution of the transformed outcome is symmetric, the median of the untransformed outcome equals the linear predictor. The median regression parameters, and the unknown transformation parameter, are estimated by maximizing the Gaussian likelihood function. Certainly, not all data can be transformed to a normal distribution. However, Draper and Cox (1969) reported that even in cases where no Box-Cox transformation can transform the outcome exactly to a normal random variable, the transformed variable satisfies certain restrictions on the first four moments, and its distribution will usually be close to symmetric. When its distribution is almost symmetric, we would expect little bias for estimating the median regression. We explore the robustness of the proposed method in more detail via simulations in Section 3.

Garrett M. Fitzmaurice is Associate Professor, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA (e-mail: fitzmaur@hsph.harvard.edu). Stuart R. Lipsitz is Associate Professor, Harvard Medical School, Boston, MA. Michael Parzen is Associate Professor, Emory University, Atlanta, GA. The authors are grateful for the support provided by grants AI and GM from the U.S. National Institutes of Health.

2. MEDIAN REGRESSION VIA BOX-COX TRANSFORMATION

Let $Y_i$ be a continuous response variable for the $i$th independent observation and $x_i$ be a $J \times 1$ covariate vector, $i = 1, \ldots, n$. We assume the median of $Y_i$ (conditional on $x_i$) is $m_i = x_i'\beta$ or, equivalently, $\Pr(Y_i > x_i'\beta) = 0.5$. Next, consider the Box-Cox transformation of $Y_i$, denoted by $U_i = u(Y_i, \lambda, c)$, where

$$
U_i = u(Y_i, \lambda, c) =
\begin{cases}
\dfrac{(Y_i + c)^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0, \\[2ex]
\log(Y_i + c) & \text{if } \lambda = 0;
\end{cases}
\tag{1}
$$

$c$ is a fixed constant chosen to ensure $Y_i + c > 0$; and $\lambda$ is an unknown parameter to be estimated. Note that because $u(Y_i, \lambda, c)$ is a monotone transformation of $Y_i$ and $\Pr(Y_i > x_i'\beta) = 0.5$, it follows that $\Pr\{u(Y_i, \lambda, c) > u(x_i'\beta, \lambda, c)\} = 0.5$.
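The transformation in Equation (1), and the fact that a monotone transformation carries the median along with it, can be sketched in Python as follows. The function name `box_cox`, the default `c = 0`, and the toy data are assumptions of this example.

```python
import math
from statistics import median


def box_cox(y, lam, c=0.0):
    """Box-Cox transform u(y, lam, c) of Equation (1); requires y + c > 0."""
    if y + c <= 0:
        raise ValueError("y + c must be positive")
    if lam == 0:
        return math.log(y + c)
    return ((y + c) ** lam - 1.0) / lam


# Because the transform is monotone increasing, transforming the sample
# median gives the sample median of the transformed values (odd n here).
ys = [0.5, 1.2, 3.0, 7.5, 40.0]
lam = 0.25
transformed = [box_cox(y, lam) for y in ys]
median_commutes = math.isclose(median(transformed), box_cox(median(ys), lam))
```

The same property does not hold for the mean, which is why the method needs the transformed outcome to be close to symmetric before the mean can stand in for the median.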
Furthermore, if the distribution of $U_i$ is symmetric, then

$$
\mu_i = E(U_i \mid x_i) = u(x_i'\beta, \lambda, c) =
\begin{cases}
\dfrac{(x_i'\beta + c)^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0, \\[2ex]
\log(x_i'\beta + c) & \text{if } \lambda = 0.
\end{cases}
\tag{2}
$$

In this article we assume that the distribution of $Y_i$ after the optimal Box-Cox transformation is almost symmetric, and thus the mean $\mu_i$ is approximately equal to the median of $U_i$, that is, $\Pr(U_i > \mu_i) \approx \Pr(Y_i > x_i'\beta) = 0.5$. The accuracy of this approximation depends on the symmetry of the distribution of $U_i$. Thus, the proposed estimator requires that the distribution of $U_i$ be approximately symmetric rather than normal. Taylor (1985) showed that the Box-Cox transformation is generally the most suitable method for transforming to symmetry. Although the focus is on the Box-Cox transformation, we note that the proposed method can be applied to any monotonic transformation of $Y_i$ (and the linear predictor).

Finally, assuming the variance of $U_i$ is $\sigma^2$, and that $U_i$ has an approximate normal distribution, $U_i \sim N(\mu_i, \sigma^2)$, the log-likelihood contribution from subject $i$ is

$$
l_i(\beta, \sigma^2, \lambda) = -\frac{1}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(U_i - \mu_i)^2 + (\lambda - 1)\log(Y_i + c).
\tag{3}
$$

© 2007 American Statistical Association. The American Statistician, August 2007, Vol. 61.
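A minimal Python sketch of the log-likelihood contribution in Equation (3) is given below. The names `loglik_contribution` and `neg_loglik`, the two-parameter linear predictor `b0 + b1 * x`, and the log-variance parameterization are assumptions made for this illustration; the authors' own implementation used SAS and is not reproduced here.

```python
import math


def box_cox(v, lam, c=0.0):
    """Box-Cox transform of Equation (1); requires v + c > 0."""
    return math.log(v + c) if lam == 0 else ((v + c) ** lam - 1.0) / lam


def loglik_contribution(y, x_beta, sigma2, lam, c=0.0):
    """Equation (3): normal log density of U_i plus the log Jacobian term."""
    u = box_cox(y, lam, c)        # transformed outcome U_i
    mu = box_cox(x_beta, lam, c)  # transformed linear predictor, Equation (2)
    return (-0.5 * math.log(2.0 * math.pi * sigma2)
            - (u - mu) ** 2 / (2.0 * sigma2)
            + (lam - 1.0) * math.log(y + c))


def neg_loglik(params, xs, ys, c=0.0):
    """Negative log-likelihood for a single covariate.

    params = (b0, b1, log_sigma2, lam); the log-variance parameterization is
    a choice made here so that an unconstrained optimizer keeps sigma2 > 0.
    """
    b0, b1, log_sigma2, lam = params
    sigma2 = math.exp(log_sigma2)
    return -sum(loglik_contribution(y, b0 + b1 * x, sigma2, lam, c)
                for x, y in zip(xs, ys))
```

With `lam = 0` the contribution reduces to a log-normal log density whose median is the linear predictor, and with `lam = 1` it reduces to an ordinary normal log density centered at the linear predictor. A function such as `neg_loglik` could be passed to any general-purpose numerical minimizer, provided the linear predictor plus `c` stays positive at every trial value.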

Note that this contribution is the log of the normal density of $U_i$ plus the log of the Jacobian of the transformation (the derivative of $U_i$ with respect to $Y_i$). In cases where the distribution of $U_i$ is approximately symmetric rather than normal, this method yields a quasi-likelihood estimator of $\beta$, and the variance of $\hat{\beta}$ can be estimated using the so-called sandwich variance estimator (e.g., White 1982). The Gaussian assumption is used only to derive an estimator of $\beta$; the proposed method is motivated by, but does not require, a Gaussian distribution for $U_i$. The validity of the proposed estimator requires only the weaker assumption that the distribution of $U_i$ is approximately symmetric.

3. MONTE CARLO SIMULATION STUDY

To explore the properties of the proposed estimator, and its robustness to varying degrees of asymmetry in the distribution of $U_i$, we performed a simulation study with four different specifications of the distribution of $Y_i$ given $x_i$: log-normal, exponential, gamma, and Pareto. Note that when $Y_i$ is specified as log-normal, this is a special case of the Box-Cox transformation with $\lambda = 0$ and the proposed method is exact. For the remaining distributions, the Box-Cox transformation does not yield an exact normal random variable. In all specifications except for the gamma distribution, we let the median of $Y_i$ given $x_i$ be $M_i = \beta_0 + \beta_1 x_i = x_i$; for the gamma distribution, we let $M_i = x_i$. We considered total sample sizes $n = 80$, 160, and 320; we let $x_i$ take on the values 1.0, 1.5, 2.0, and 2.5, with each value represented by 25% of the sample. We performed 2,500 simulation replications for each of the four possible distributions of $Y_i$.

Table 1. Results of simulation study for estimated medians at each value of x with $Y_i$ exponential. Columns: values of x within each of n = 80, n = 160, and n = 320. Rows: relative bias (%), simulation variance, and coverage probability of 95% confidence intervals, each reported for the Box-Cox, LAD, and MLE approaches. (The numerical entries did not survive transcription.)
We estimated the median regression parameters via the proposed Box-Cox transformation, in addition to minimizing the LAD via the linear programming algorithm of Bassett and Koenker (1982). For the log-normal, exponential, and Pareto distributions of $Y_i$, we also estimated $(\beta_0, \beta_1)$ via maximum likelihood under the true distribution for $Y_i$. Finally, for the case where $Y_i$ has a Pareto distribution, we also estimated $(\beta_0, \beta_1)$ via the Box-Cox transformation applied to $\log(Y_i)$ (and to the log of the linear predictor); as discussed earlier, the proposed method can be applied to any monotonic transformation of $Y_i$. The Box-Cox Gaussian likelihood was maximized using the SAS procedure NLMIXED (SAS Institute Inc. 2000); sample SAS code for implementing the proposed method is available upon request from the authors.

The results of the simulations when $Y_i$ is log-normal (not shown, but available upon request from the authors), with $\log(Y_i) \sim N(\log(6.5 + x_i), 1)$, indicate that the estimates of the medians obtained via the optimal Box-Cox transformation are unbiased for $n = 80$, 160, and 320. This result is to be expected because the log-normal distribution is a special case for which the Box-Cox transformation (with $\lambda = 0$) yields exact normality. The relative bias of the Box-Cox and LAD estimates of the medians (when averaged over the distribution of the covariates and the three sample sizes) is approximately 1%, while for the MLE it is less than 0.25%. Of note, the Box-Cox estimates are almost as efficient as the MLEs (which can be considered a Box-Cox estimator with $\lambda = 0$ assumed known) and are notably more efficient than the standard LAD estimates. Specifically, the simulation variances of the LAD estimates are 50% to 75% larger than those for the estimates from the Box-Cox transformation. Finally, all three methods show good coverage probabilities for 95% confidence intervals.
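The three summaries reported for the simulations (relative bias, simulation variance, and coverage of 95% confidence intervals) can be computed from a set of replications along the following lines. This is a sketch under assumptions: the function name `summarize_replications`, the Wald-type interval `estimate ± 1.96 × SE`, and the toy numbers are all illustrative and are not taken from the article.

```python
from statistics import mean, variance


def summarize_replications(estimates, std_errors, true_value, z=1.96):
    """Relative bias (%), simulation variance, and interval coverage."""
    rel_bias = 100.0 * (mean(estimates) - true_value) / true_value
    sim_var = variance(estimates)  # sample variance across replications
    covered = [abs(est - true_value) <= z * se
               for est, se in zip(estimates, std_errors)]
    coverage = sum(covered) / len(covered)
    return rel_bias, sim_var, coverage


# Four made-up replications of an estimated median whose true value is 7.5.
rel_bias, sim_var, coverage = summarize_replications(
    estimates=[7.4, 7.6, 7.5, 7.9],
    std_errors=[0.1, 0.1, 0.1, 0.1],
    true_value=7.5,
)
```

Comparing two estimators then amounts to taking the ratio of their simulation variances, which is how statements such as "50% to 75% larger" are obtained.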
To explore robustness of the proposed method to varying degrees of asymmetry in the distribution of $U_i$, we let $Y_i$ have an exponential distribution, with $p(y_i \mid x_i, \beta) = \theta_i e^{-\theta_i y_i}$, where $\theta_i = \log(2)/(\beta_0 + \beta_1 x_i)$; note that no transformation of $Y_i$ will yield a normal random variable. However, the average skewness of the residuals from the regression of the Box-Cox transformed $Y_i$ over all simulation replications was small (skewness = 0.05), indicating that the optimal Box-Cox transformation achieved a high degree of symmetry. The simulation results reported in Table 1 indicate that all three estimators show very little bias for $n = 80$, 160, and 320. The simulation variances of the LAD estimates are approximately 80% larger than those for the estimates from the Box-Cox transformation.

Next, we increased the degree of asymmetry in the distribution of $U_i$ by letting $Y_i$ have a gamma distribution with shape parameter equal to 0.5. This is an example of a skewed, heavy-tailed distribution where even the Box-Cox transformed $Y_i$ is likely to show skewness. The average skewness of the residuals from the regression of $U_i$ over all simulation replications was [value missing in source]. The results indicate that the proposed estimator yields biased estimates of $\beta_0$ and $\beta_1$. As expected, skewness in the residuals is likely to have a somewhat greater impact on the intercept than on the slope. This bias results in biased estimates of the medians (see Table 2), with relative bias between 2% and 6%. Note, however, that the absolute bias is almost negligible when compared to the variance, even when $n = 320$. The LAD estimator also has relative bias in the range 1% to 8% for $n = 80$ and $n = 160$. On average, the relative bias of the Box-Cox and LAD median estimates is 4.9% and 2.8%, respectively.

Table 2. Results of simulation study for estimated medians at each value of x with $Y_i$ gamma. Columns: values of x within each of n = 80, n = 160, and n = 320. Rows: relative bias (%), simulation variance, and coverage probability of 95% confidence intervals, each reported for the Box-Cox and LAD approaches. (The numerical entries did not survive transcription.)

Finally, we considered the extreme case where $Y_i$ has a Pareto distribution with scale parameter equal to 1. The Pareto is an example of an extremely skewed, heavy-tailed distribution where $U_i$ is likely to show very discernible skewness. The average skewness of the residuals from the regression of the transformed $Y_i$ was 0.29, indicating that the transformation did not achieve a high degree of symmetry. As a result, the proposed estimator yields badly biased estimates of $\beta_0$ and, consequently, of the medians (not shown). As would be expected, the coverage probabilities are poor in this setting.
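The exponential specification used in the simulations ties the rate to the target median through $\theta_i = \log(2)/(\beta_0 + \beta_1 x_i)$. The Python sketch below checks that relationship through the exponential quantile function and draws one sample by inverse-transform sampling. The function names, the seed, and the coefficient values `beta0 = 6.5` and `beta1 = 1.0` are illustrative assumptions, not values confirmed for this part of the study.

```python
import math
import random


def exponential_rate(target_median):
    """Rate theta for which an exponential variable has the given median."""
    return math.log(2.0) / target_median


def exponential_quantile(p, rate):
    """Inverse CDF of the exponential distribution, for 0 <= p < 1."""
    return -math.log(1.0 - p) / rate


def simulate_outcomes(xs, beta0, beta1, rng):
    """Inverse-transform draws of Y_i with median beta0 + beta1 * x_i."""
    return [exponential_quantile(rng.random(), exponential_rate(beta0 + beta1 * x))
            for x in xs]


# Design from the text: x takes the values 1.0, 1.5, 2.0, 2.5, each 25% of n = 80.
xs = [1.0, 1.5, 2.0, 2.5] * 20
ys = simulate_outcomes(xs, beta0=6.5, beta1=1.0, rng=random.Random(0))
```

Evaluating `exponential_quantile(0.5, exponential_rate(m))` returns `m` up to rounding, which confirms that the linear predictor is the median of the simulated outcome.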
In contrast, the LAD estimator of $(\beta_0, \beta_1)$ is far more robust and almost unbiased when the sample sizes are large (e.g., $n = 320$). Finally, when the Box-Cox transformation is applied to $\log(Y_i)$ (and to $\log(x_i'\beta)$), the bias of the proposed estimator is substantially reduced and the coverage probabilities meet the target coverage; in addition, the simulation variances of the estimates from the Box-Cox transformation of $\log(Y_i)$ are approximately half the size of those for the LAD estimates. This improvement can be explained by the latter transformation of $Y_i$ achieving a far greater degree of symmetry (the average skewness was 0.05 instead of 0.29). This highlights an important point: when the distribution of $Y_i$ after a Box-Cox transformation is asymmetric, there may be alternative monotonic transformations of $Y_i$ that can achieve a higher degree of symmetry.

In summary, the results of the simulation study suggest that the proposed estimator is relatively robust to modest degrees of asymmetry in the distribution of $Y_i$ after a Box-Cox transformation. In these cases the relative bias is less than 5% and of comparable magnitude to the finite-sample bias of the LAD estimator. In addition, the proposed method provides discernibly more efficient estimates than the standard LAD estimator. Of note, when the degree of asymmetry is modest, the bias is almost negligible compared to the variance. However, when there is strong asymmetry in the distribution of $U_i$, the proposed estimator can yield badly biased estimates.

4. APPLICATION: DEVELOPMENTAL TOXICITY STUDY OF DYME

Developmental toxicity studies of laboratory animals play a crucial role in the testing and regulation of chemicals and pharmaceutical compounds. Exposure to developmental toxicants during pregnancy typically causes adverse effects, such as fetal malformations and reduced fetal weight. In a typical experiment, laboratory animals are assigned to increasing doses of a test substance.
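The skewness diagnostic used throughout the simulation study can be sketched as a moment-based sample skewness computed on residuals from the transformed-scale fit. The function name `sample_skewness`, the choice of the simple moment estimator (third central moment over the 1.5 power of the second), and the toy data are assumptions of this example; the article does not state which skewness estimator was used.

```python
import math


def sample_skewness(values):
    """Moment-based skewness: m3 / m2**1.5 with divisor n in both moments."""
    n = len(values)
    center = sum(values) / n
    m2 = sum((v - center) ** 2 for v in values) / n
    m3 = sum((v - center) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed = [1.0, 1.0, 2.0, 3.0, 20.0]
# A further monotone transformation (here the log) can reduce skewness.
log_right_skewed = [math.log(v) for v in right_skewed]
```

Values near zero suggest that the transformed scale is close to symmetric, which is the condition the proposed estimator relies on; a clearly nonzero value is a signal to try an alternative monotone transformation.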
In this section we describe an analysis of fetal weight data from a study of diethylene glycol dimethyl ether (DYME), a widely used organic solvent. In this study of laboratory mice, conducted through the National Toxicology Program, DYME was administered at five doses (0, 62.5, 125, 250, and 500 mg/kg/day) to 111 pregnant mice (dams) just after implantation. Following sacrifice, fetal weight was recorded for each live fetus with the goal of determining the effect of dose on weight.

We considered a linear model in dose ($\times 10^{-3}$) for the median fetal weight, estimated via the Box-Cox transformation and the standard LAD estimator. The two methods produced very similar estimates of the intercept and slope. The two estimates of the slope for dose are $-0.90$ (Box-Cox) and $-0.88$ (LAD), indicating that the decrease in median fetal weight, comparing the highest dose group to the control group, is approximately 0.45 gm. The two fitted median lines, presented in Figure 1, are virtually indistinguishable and fit the empirical median at each dose well. However, the standard error for the slope estimated via the Box-Cox transformation (SE = 0.039) is discernibly smaller than the corresponding standard error for the LAD estimator (SE = 0.045); the ratio of the two estimates of variance is 1.33 and highlights the potential gains in efficiency.

Figure 1. Scatterplot of fetal weight (gm) versus dose (mg/kg/day), with the empirical medians (solid circles) and fitted medians for fetal weight for the estimators based on the Box-Cox transformation (solid line) and LAD (dashed line).

Finally, to highlight settings where the proposed estimator can and cannot be validly applied, we generated two simulated datasets from the exponential and Pareto distributions, respectively. Specifically, we generated 1,000 observations from exponential and Pareto (scale parameter = 1) distributions with median $M_i = X_i$, where $X_i$ has a uniform distribution over the interval [0, 4]. For the data from the exponential distribution, both methods produced similar estimates of the intercept and slope; the corresponding fitted lines (dashed lines) are presented in Figure 2(a) and are completely indistinguishable from the true median regression line (solid line). Note that the Y-axis in Figure 2(a) is on the log scale. In contrast, for the data from the Pareto distribution, the Box-Cox and LAD methods produced discernibly different estimates of the intercept (6.43 and 5.21, respectively) and slope (1.46 and 1.03, respectively). Figure 2(b) presents the fitted lines for the Box-Cox (long-dashed line) and standard LAD (short-dashed line) estimators, which are now clearly distinguishable. Although the Box-Cox estimator of the intercept is almost unbiased, the estimator of the slope has a relative bias of almost 50%.
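The two numerical statements in the DYME analysis can be checked with a few lines of arithmetic. The sketch below assumes that dose enters the model divided by 1,000 and that the slope is negative; both are inferred from the stated decrease of about 0.45 gm at the highest dose rather than read directly from the text, and the variable names are illustrative.

```python
# Reported standard errors of the slope estimate.
se_box_cox = 0.039
se_lad = 0.045

# Ratio of the two variance estimates (LAD relative to Box-Cox).
variance_ratio = (se_lad / se_box_cox) ** 2

# Change in median fetal weight between the 500 mg/kg/day group and control,
# assuming the slope is per unit of dose / 1000 and is negative.
slope_box_cox = -0.90
highest_dose_scaled = 500 / 1000
median_change = slope_box_cox * highest_dose_scaled
```

The ratio rounds to 1.33, matching the reported figure, and the implied change in median fetal weight is a decrease of 0.45 gm.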
In this setting, the Box-Cox transformation is unable to symmetrize the data; moreover, even after transformation, there are numerous extreme observations that have an inordinate influence on the estimated slope. For example, although the median ranges from [range missing in source], over 10% of the observations (on the original scale) were larger than 1,000 (these extreme observations have been omitted from the scatterplot in Figure 2(b)), with 5% greater than 8,000, and the largest equal to [value missing in source]. The estimator based on the Box-Cox transformation is not robust to such extreme observations.

Figure 2(a). Scatterplot of 1,000 observations from an exponential distribution, with comparison of the true median regression line (solid line) and the fitted median lines for the estimators based on the Box-Cox transformation (long-dashed line) and LAD (short-dashed line).

Figure 2(b). Scatterplot of 1,000 observations from a Pareto distribution, with comparison of the true median regression line (solid line) and the fitted median lines for the estimators based on the Box-Cox transformation (long-dashed line) and LAD (short-dashed line).

5. CONCLUDING REMARKS

In this article we propose estimating the median regression parameters via Gaussian estimating equations after applying the optimal Box-Cox transformation to both the outcome and the linear predictor. The proposed estimator is notably more efficient than the standard LAD estimator, albeit with an acknowledged loss of robustness. Although the proposed estimator is relatively robust to modest degrees of asymmetry in the distribution of the transformed response, $U_i$, we caution that it can yield badly biased estimates when there is strong asymmetry in the distribution of $U_i$. Fortunately, the degree of skewness can be assessed easily from the data at hand. In cases where there is discernible skewness, the use of a more general class of transformations, for example, folded-power transformations (Mosteller and Tukey 1977) or the Zellner-Revankar transformation (Zellner and Revankar 1969), may yield a higher degree of symmetry.

An appealing property of the standard LAD estimator is that it is robust to outliers. Although the Box-Cox transformation is likely to reduce the number and influence of outliers on the transformed scale, the proposed method relies on regressing the mean of the transformed response and will therefore likely be sensitive to extreme or outlying observations. When outliers are apparent on the transformed scale, the proposed approach can be made more robust by simply downweighting or trimming the most extreme residuals. Any potential loss of information from trimming the most extreme residuals should be more than compensated by the increased precision over LAD estimation.

[Received November Revised April 2007.]

REFERENCES

Bassett, G., and Koenker, R. (1982), "An Empirical Quantile Function for Linear Models With iid Errors," Journal of the American Statistical Association, 77.

Box, G. E. P., and Cox, D. R. (1964), "An Analysis of Transformations," Journal of the Royal Statistical Society, Series B, 26.

Draper, N. R., and Cox, D. R. (1969), "On Distributions and Their Transformation to Normality," Journal of the Royal Statistical Society, Series B, 31.

Koenker, R., and Geling, O. (2001), "Reappraising Medfly Longevity: A Quantile Regression Survival Analysis," Journal of the American Statistical Association, 96.

Lee, A. H., Fung, W. K., and Fu, B. (2003), "Analyzing Hospital Length of Stay: Mean or Median Regression?" Medical Care, 41.

Mosteller, F., and Tukey, J. W. (1977), Data Analysis and Linear Regression, Reading, MA: Addison-Wesley.

SAS Institute Inc. (2000), SAS/STAT User's Guide (Version 8 ed.), Cary, NC: SAS Institute.

Taylor, J. M. G. (1985), "Power Transformations to Symmetry," Biometrika, 72.

Wedderburn, R. W. M. (1974), "Quasi-Likelihood Functions, Generalized Linear Models, and the Gauss-Newton Method," Biometrika, 61.

White, H. (1982), "Maximum Likelihood Estimation Under Misspecified Models," Econometrica, 50.

Zellner, A., and Revankar, N. (1969), "Generalized Production Functions," Review of Economic Studies, 36.


More information

Identify Relative importance of covariates in Bayesian lasso quantile regression via new algorithm in statistical program R

Identify Relative importance of covariates in Bayesian lasso quantile regression via new algorithm in statistical program R Identify Relative importance of covariates in Bayesian lasso quantile regression via new algorithm in statistical program R Fadel Hamid Hadi Alhusseini Department of Statistics and Informatics, University

More information

Comparing MLE, MUE and Firth Estimates for Logistic Regression

Comparing MLE, MUE and Firth Estimates for Logistic Regression Comparing MLE, MUE and Firth Estimates for Logistic Regression Nitin R Patel, Chairman & Co-founder, Cytel Inc. Research Affiliate, MIT nitin@cytel.com Acknowledgements This presentation is based on joint

More information

Bayesian spatial quantile regression

Bayesian spatial quantile regression Brian J. Reich and Montserrat Fuentes North Carolina State University and David B. Dunson Duke University E-mail:reich@stat.ncsu.edu Tropospheric ozone Tropospheric ozone has been linked with several adverse

More information

Finite Population Sampling and Inference

Finite Population Sampling and Inference Finite Population Sampling and Inference A Prediction Approach RICHARD VALLIANT ALAN H. DORFMAN RICHARD M. ROYALL A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane

More information

A Simple Plot for Model Assessment

A Simple Plot for Model Assessment A Simple Plot for Model Assessment David J. Olive Southern Illinois University September 16, 2005 Abstract Regression is the study of the conditional distribution y x of the response y given the predictors

More information

ACTEX CAS EXAM 3 STUDY GUIDE FOR MATHEMATICAL STATISTICS

ACTEX CAS EXAM 3 STUDY GUIDE FOR MATHEMATICAL STATISTICS ACTEX CAS EXAM 3 STUDY GUIDE FOR MATHEMATICAL STATISTICS TABLE OF CONTENTS INTRODUCTORY NOTE NOTES AND PROBLEM SETS Section 1 - Point Estimation 1 Problem Set 1 15 Section 2 - Confidence Intervals and

More information

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

S The Over-Reliance on the Central Limit Theorem

S The Over-Reliance on the Central Limit Theorem S04-2008 The Over-Reliance on the Central Limit Theorem Abstract The objective is to demonstrate the theoretical and practical implication of the central limit theorem. The theorem states that as n approaches

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Modification and Improvement of Empirical Likelihood for Missing Response Problem UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu

More information

XVI. Transformations. by: David Scott and David M. Lane

XVI. Transformations. by: David Scott and David M. Lane XVI. Transformations by: David Scott and David M. Lane A. Log B. Tukey's Ladder of Powers C. Box-Cox Transformations D. Exercises The focus of statistics courses is the exposition of appropriate methodology

More information

A. Motivation To motivate the analysis of variance framework, we consider the following example.

A. Motivation To motivate the analysis of variance framework, we consider the following example. 9.07 ntroduction to Statistics for Brain and Cognitive Sciences Emery N. Brown Lecture 14: Analysis of Variance. Objectives Understand analysis of variance as a special case of the linear model. Understand

More information

Bayesian Regression Linear and Logistic Regression

Bayesian Regression Linear and Logistic Regression When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we

More information

p y (1 p) 1 y, y = 0, 1 p Y (y p) = 0, otherwise.

p y (1 p) 1 y, y = 0, 1 p Y (y p) = 0, otherwise. 1. Suppose Y 1, Y 2,..., Y n is an iid sample from a Bernoulli(p) population distribution, where 0 < p < 1 is unknown. The population pmf is p y (1 p) 1 y, y = 0, 1 p Y (y p) = (a) Prove that Y is the

More information

Introduction to Linear regression analysis. Part 2. Model comparisons

Introduction to Linear regression analysis. Part 2. Model comparisons Introduction to Linear regression analysis Part Model comparisons 1 ANOVA for regression Total variation in Y SS Total = Variation explained by regression with X SS Regression + Residual variation SS Residual

More information

The Glejser Test and the Median Regression

The Glejser Test and the Median Regression Sankhyā : The Indian Journal of Statistics Special Issue on Quantile Regression and Related Methods 2005, Volume 67, Part 2, pp 335-358 c 2005, Indian Statistical Institute The Glejser Test and the Median

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Investigation of Possible Biases in Tau Neutrino Mass Limits

Investigation of Possible Biases in Tau Neutrino Mass Limits Investigation of Possible Biases in Tau Neutrino Mass Limits Kyle Armour Departments of Physics and Mathematics, University of California, San Diego, La Jolla, CA 92093 (Dated: August 8, 2003) We study

More information

A Note on Visualizing Response Transformations in Regression

A Note on Visualizing Response Transformations in Regression Southern Illinois University Carbondale OpenSIUC Articles and Preprints Department of Mathematics 11-2001 A Note on Visualizing Response Transformations in Regression R. Dennis Cook University of Minnesota

More information

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari MS&E 226: Small Data Lecture 11: Maximum likelihood (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 18 The likelihood function 2 / 18 Estimating the parameter This lecture develops the methodology behind

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Swarthmore Honors Exam 2012: Statistics

Swarthmore Honors Exam 2012: Statistics Swarthmore Honors Exam 2012: Statistics 1 Swarthmore Honors Exam 2012: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having six questions. You may

More information

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University Statistica Sinica 27 (2017), 000-000 doi:https://doi.org/10.5705/ss.202016.0155 DISCUSSION: DISSECTING MULTIPLE IMPUTATION FROM A MULTI-PHASE INFERENCE PERSPECTIVE: WHAT HAPPENS WHEN GOD S, IMPUTER S AND

More information

Luke B Smith and Brian J Reich North Carolina State University May 21, 2013

Luke B Smith and Brian J Reich North Carolina State University May 21, 2013 BSquare: An R package for Bayesian simultaneous quantile regression Luke B Smith and Brian J Reich North Carolina State University May 21, 2013 BSquare in an R package to conduct Bayesian quantile regression

More information

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

V. Properties of estimators {Parts C, D & E in this file}

V. Properties of estimators {Parts C, D & E in this file} A. Definitions & Desiderata. model. estimator V. Properties of estimators {Parts C, D & E in this file}. sampling errors and sampling distribution 4. unbiasedness 5. low sampling variance 6. low mean squared

More information

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017 Econometrics with Observational Data Introduction and Identification Todd Wagner February 1, 2017 Goals for Course To enable researchers to conduct careful quantitative analyses with existing VA (and non-va)

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 1 Bootstrapped Bias and CIs Given a multiple regression model with mean and

More information

More on Specification and Data Issues

More on Specification and Data Issues More on Specification and Data Issues Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Specification and Data Issues 1 / 35 Functional Form Misspecification Functional

More information

A measure of partial association for generalized estimating equations

A measure of partial association for generalized estimating equations A measure of partial association for generalized estimating equations Sundar Natarajan, 1 Stuart Lipsitz, 2 Michael Parzen 3 and Stephen Lipshultz 4 1 Department of Medicine, New York University School

More information

Generalized linear models with a coarsened covariate

Generalized linear models with a coarsened covariate Appl. Statist. (2004) 53, Part 2, pp. 279 292 Generalized linear models with a coarsened covariate Stuart Lipsitz, Medical University of South Carolina, Charleston, USA Michael Parzen, University of Chicago,

More information

What to do if Assumptions are Violated?

What to do if Assumptions are Violated? What to do if Assumptions are Violated? Abandon simple linear regression for something else (usually more complicated). Some examples of alternative models: weighted least square appropriate model if the

More information

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes Maximum Likelihood Estimation Econometrics II Department of Economics Universidad Carlos III de Madrid Máster Universitario en Desarrollo y Crecimiento Económico Outline 1 3 4 General Approaches to Parameter

More information

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census

More information

Estimation for Mean and Standard Deviation of Normal Distribution under Type II Censoring

Estimation for Mean and Standard Deviation of Normal Distribution under Type II Censoring Communications for Statistical Applications and Methods 2014, Vol. 21, No. 6, 529 538 DOI: http://dx.doi.org/10.5351/csam.2014.21.6.529 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation for Mean

More information

Quantile Regression for Recurrent Gap Time Data

Quantile Regression for Recurrent Gap Time Data Biometrics 000, 1 21 DOI: 000 000 0000 Quantile Regression for Recurrent Gap Time Data Xianghua Luo 1,, Chiung-Yu Huang 2, and Lan Wang 3 1 Division of Biostatistics, School of Public Health, University

More information

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models Supporting Information for Estimating restricted mean treatment effects with stacked survival models Andrew Wey, David Vock, John Connett, and Kyle Rudser Section 1 presents several extensions to the simulation

More information

Simple Linear Regression for the Climate Data

Simple Linear Regression for the Climate Data Prediction Prediction Interval Temperature 0.2 0.0 0.2 0.4 0.6 0.8 320 340 360 380 CO 2 Simple Linear Regression for the Climate Data What do we do with the data? y i = Temperature of i th Year x i =CO

More information

QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS

QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS Statistica Sinica 15(05), 1015-1032 QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS Scarlett L. Bellamy 1, Yi Li 2, Xihong

More information

References. Regression standard errors in clustered samples. diagonal matrix with elements

References. Regression standard errors in clustered samples. diagonal matrix with elements Stata Technical Bulletin 19 diagonal matrix with elements ( q=ferrors (0) if r > 0 W ii = (1 q)=f errors (0) if r < 0 0 otherwise and R 2 is the design matrix X 0 X. This is derived from formula 3.11 in

More information

Stat 5102 Final Exam May 14, 2015

Stat 5102 Final Exam May 14, 2015 Stat 5102 Final Exam May 14, 2015 Name Student ID The exam is closed book and closed notes. You may use three 8 1 11 2 sheets of paper with formulas, etc. You may also use the handouts on brand name distributions

More information

Predicting a Future Median Life through a Power Transformation

Predicting a Future Median Life through a Power Transformation Predicting a Future Median Life through a Power Transformation ZHENLIN YANG 1 Department of Statistics and Applied Probability, National University of Singapore, 3 Science Drive 2, Singapore 117543 Abstract.

More information

Estimation de mesures de risques à partir des L p -quantiles

Estimation de mesures de risques à partir des L p -quantiles 1/ 42 Estimation de mesures de risques à partir des L p -quantiles extrêmes Stéphane GIRARD (Inria Grenoble Rhône-Alpes) collaboration avec Abdelaati DAOUIA (Toulouse School of Economics), & Gilles STUPFLER

More information

An Empirical Characteristic Function Approach to Selecting a Transformation to Normality

An Empirical Characteristic Function Approach to Selecting a Transformation to Normality Communications for Statistical Applications and Methods 014, Vol. 1, No. 3, 13 4 DOI: http://dx.doi.org/10.5351/csam.014.1.3.13 ISSN 87-7843 An Empirical Characteristic Function Approach to Selecting a

More information

Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression:

Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression: Biost 518 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture utline Choice of Model Alternative Models Effect of data driven selection of

More information

parameter space Θ, depending only on X, such that Note: it is not θ that is random, but the set C(X).

parameter space Θ, depending only on X, such that Note: it is not θ that is random, but the set C(X). 4. Interval estimation The goal for interval estimation is to specify the accurary of an estimate. A 1 α confidence set for a parameter θ is a set C(X) in the parameter space Θ, depending only on X, such

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

The equivalence of the Maximum Likelihood and a modified Least Squares for a case of Generalized Linear Model

The equivalence of the Maximum Likelihood and a modified Least Squares for a case of Generalized Linear Model Applied and Computational Mathematics 2014; 3(5): 268-272 Published online November 10, 2014 (http://www.sciencepublishinggroup.com/j/acm) doi: 10.11648/j.acm.20140305.22 ISSN: 2328-5605 (Print); ISSN:

More information

Generalized Estimating Equations

Generalized Estimating Equations Outline Review of Generalized Linear Models (GLM) Generalized Linear Model Exponential Family Components of GLM MLE for GLM, Iterative Weighted Least Squares Measuring Goodness of Fit - Deviance and Pearson

More information

Generalized Linear Mixed-Effects Models. Copyright c 2015 Dan Nettleton (Iowa State University) Statistics / 58

Generalized Linear Mixed-Effects Models. Copyright c 2015 Dan Nettleton (Iowa State University) Statistics / 58 Generalized Linear Mixed-Effects Models Copyright c 2015 Dan Nettleton (Iowa State University) Statistics 510 1 / 58 Reconsideration of the Plant Fungus Example Consider again the experiment designed to

More information

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1. Problem 1 (21 points) An economist runs the regression y i = β 0 + x 1i β 1 + x 2i β 2 + x 3i β 3 + ε i (1) The results are summarized in the following table: Equation 1. Variable Coefficient Std. Error

More information

SIMULTANEOUS CONFIDENCE BANDS FOR THE PTH PERCENTILE AND THE MEAN LIFETIME IN EXPONENTIAL AND WEIBULL REGRESSION MODELS. Ping Sa and S.J.

SIMULTANEOUS CONFIDENCE BANDS FOR THE PTH PERCENTILE AND THE MEAN LIFETIME IN EXPONENTIAL AND WEIBULL REGRESSION MODELS. Ping Sa and S.J. SIMULTANEOUS CONFIDENCE BANDS FOR THE PTH PERCENTILE AND THE MEAN LIFETIME IN EXPONENTIAL AND WEIBULL REGRESSION MODELS " # Ping Sa and S.J. Lee " Dept. of Mathematics and Statistics, U. of North Florida,

More information

Obtaining Critical Values for Test of Markov Regime Switching

Obtaining Critical Values for Test of Markov Regime Switching University of California, Santa Barbara From the SelectedWorks of Douglas G. Steigerwald November 1, 01 Obtaining Critical Values for Test of Markov Regime Switching Douglas G Steigerwald, University of

More information

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection Model Selection in GLMs Last class: estimability/identifiability, analysis of deviance, standard errors & confidence intervals (should be able to implement frequentist GLM analyses!) Today: standard frequentist

More information

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP The IsoMAP uses the multiple linear regression and geostatistical methods to analyze isotope data Suppose the response variable

More information

Better Bootstrap Confidence Intervals

Better Bootstrap Confidence Intervals by Bradley Efron University of Washington, Department of Statistics April 12, 2012 An example Suppose we wish to make inference on some parameter θ T (F ) (e.g. θ = E F X ), based on data We might suppose

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at Biometrika Trust Robust Regression via Discriminant Analysis Author(s): A. C. Atkinson and D. R. Cox Source: Biometrika, Vol. 64, No. 1 (Apr., 1977), pp. 15-19 Published by: Oxford University Press on

More information

Spatial Regression. 3. Review - OLS and 2SLS. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Spatial Regression. 3. Review - OLS and 2SLS. Luc Anselin.   Copyright 2017 by Luc Anselin, All Rights Reserved Spatial Regression 3. Review - OLS and 2SLS Luc Anselin http://spatial.uchicago.edu OLS estimation (recap) non-spatial regression diagnostics endogeneity - IV and 2SLS OLS Estimation (recap) Linear Regression

More information