Approximate Median Regression via the Box-Cox Transformation

Similar documents
WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract

A comparison of inverse transform and composition methods of data simulation from the Lindley distribution

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

A TWO-STAGE LINEAR MIXED-EFFECTS/COX MODEL FOR LONGITUDINAL DATA WITH MEASUREMENT ERROR AND SURVIVAL

General Regression Model

Chapter 11: Robust & Quantile regression

Practical Bayesian Quantile Regression. Keming Yu University of Plymouth, UK

SAS macro to obtain reference values based on estimation of the lower and upper percentiles via quantile regression.

Institute of Actuaries of India

Regression diagnostics

Accounting for Complex Sample Designs via Mixture Models

Model Fitting. Jean Yves Le Boudec

For right censored data with Y i = T i C i and censoring indicator, δ i = I(T i < C i ), arising from such a parametric model we have the likelihood,

Semiparametric Generalized Linear Models

COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION

The Nonparametric Bootstrap

Confidence Intervals, Testing and ANOVA Summary

For more information about how to cite these materials visit

CALCULATION METHOD FOR NONLINEAR DYNAMIC LEAST-ABSOLUTE DEVIATIONS ESTIMATOR

What s New in Econometrics? Lecture 14 Quantile Methods

Estimation of AUC from 0 to Infinity in Serial Sacrifice Designs

Bootstrap Simulation Procedure Applied to the Selection of the Multiple Linear Regressions

Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Count Data

STA 2201/442 Assignment 2

Remedial Measures for Multiple Linear Regression Models

Statistical Practice

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models

WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION

Small Sample Corrections for LTS and MCD

Ph.D. course: Regression models. Introduction. 19 April 2012

3. QUANTILE-REGRESSION MODEL AND ESTIMATION

A Note on Bayesian Inference After Multiple Imputation

Ph.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status

Identify Relative importance of covariates in Bayesian lasso quantile regression via new algorithm in statistical program R

Comparing MLE, MUE and Firth Estimates for Logistic Regression

Bayesian spatial quantile regression

Finite Population Sampling and Inference

A Simple Plot for Model Assessment

ACTEX CAS EXAM 3 STUDY GUIDE FOR MATHEMATICAL STATISTICS

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

multilevel modeling: concepts, applications and interpretations

S The Over-Reliance on the Central Limit Theorem

ECON The Simple Regression Model

Modification and Improvement of Empirical Likelihood for Missing Response Problem

XVI. Transformations. by: David Scott and David M. Lane

A. Motivation To motivate the analysis of variance framework, we consider the following example.

Bayesian Regression Linear and Logistic Regression

p y (1 p) 1 y, y = 0, 1 p Y (y p) = 0, otherwise.

Introduction to Linear regression analysis. Part 2. Model comparisons

The Glejser Test and the Median Regression

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Investigation of Possible Biases in Tau Neutrino Mass Limits

A Note on Visualizing Response Transformations in Regression

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari

Stat 5101 Lecture Notes

Swarthmore Honors Exam 2012: Statistics

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Luke B Smith and Brian J Reich North Carolina State University May 21, 2013

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL

11. Bootstrap Methods

V. Properties of estimators {Parts C, D & E in this file}

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2

More on Specification and Data Issues

A measure of partial association for generalized estimating equations

Generalized linear models with a coarsened covariate

What to do if Assumptions are Violated?

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Estimation for Mean and Standard Deviation of Normal Distribution under Type II Censoring

Quantile Regression for Recurrent Gap Time Data

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models

Simple Linear Regression for the Climate Data

QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS

References. Regression standard errors in clustered samples. diagonal matrix with elements

Stat 5102 Final Exam May 14, 2015

Predicting a Future Median Life through a Power Transformation

Estimation de mesures de risques à partir des L p -quantiles

An Empirical Characteristic Function Approach to Selecting a Transformation to Normality

Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression:

parameter space Θ, depending only on X, such that Note: it is not θ that is random, but the set C(X).

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Lecture 22 Survival Analysis: An Introduction

The equivalence of the Maximum Likelihood and a modified Least Squares for a case of Generalized Linear Model

Generalized Estimating Equations

Generalized Linear Mixed-Effects Models. Copyright c 2015 Dan Nettleton (Iowa State University) Statistics / 58

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.

SIMULTANEOUS CONFIDENCE BANDS FOR THE PTH PERCENTILE AND THE MEAN LIFETIME IN EXPONENTIAL AND WEIBULL REGRESSION MODELS. Ping Sa and S.J.

Obtaining Critical Values for Test of Markov Regime Switching

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP

Better Bootstrap Confidence Intervals

BTRY 4090: Spring 2009 Theory of Statistics

Subject CS1 Actuarial Statistics 1 Core Principles

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Spatial Regression. 3. Review - OLS and 2SLS. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Transcription:

Approximate Median Regression via the Box-Cox Transformation Garrett M. Fitzmaurice,StuartR.Lipsitz, and Michael Parzen Median regression is used increasingly in many different areas of applications. The usual median regression estimating equations are derived from minimizing the least absolute deviations (LAD). Because they are not a smooth function of the regression parameters, a solution is best obtained using a linear programming algorithm. As an alternative, we propose estimating the median regression parameters via Gaussian estimating equations after applying a Box-Cox transformation to both the outcome and the linear predictor. The proposed estimator is notably more efficient than the standard LAD estimator, albeit with an acknowledged loss of robustness. KEY WORDS: Least absolute deviations; Maximum likelihood; Median regression. 1. INTRODUCTION Median regression, in which the median of the outcome is assumed to be linear in the regression parameters, is used increasingly for analyzing non-normal continuous outcome data. For example, median regression has been applied to positively skewed data, such as length of hospital stay data (Lee, Fung, and Fu 2003) or survival data (Koenker and Geling 2001). Median regression models are appealing because they are robust to outliers in the outcome; their regression parameters also have simple interpretations. The median regression parameters can be consistently estimated by minimizing the least absolute deviations (LAD) via a linear programming algorithm (Basset and Koenker 1982). In this article, we propose estimating the median regression parameters via Gaussian estimating equations after applying a Box-Cox transformation (Box and Cox 1964) to both the outcome and linear predictor. We relate the mean of the Box-Cox transformed outcome to the same transformation of the linear predictor; the transformation parameter is assumed to be unknown. When the distribution of the transformed outcome is symmetric, the median of the untransformed outcome equals the linear predictor. The median regression parameters, and the unknown transformation parameter, are estimated by maximizing the Gaussian likelihood function. Certainly, not all data can be transformed to a normal distribution. However, Draper and Cox (1969) reported that even Garrett M. Fitzmaurice is Associate Professor, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115 (E-mail: fitzmaur@hsph.harvard.edu). Stuart R. Lipsitz is Associate Professor, Harvard Medical School, Boston, MA. Michael Parzen is Associate Professor, Emory University, Atlanta, GA. The authors are grateful for the support provided by grants AI 60373 and GM 29745 from the U.S. National Institutes of Health. in cases where no Box-Cox transformation can transform the outcome exactly to a normal random variable, the transformed variable satisfies certain restrictions on the first four moments, and its distribution will usually be close to symmetric. When its distribution is almost symmetric, we would expect little bias for estimating the median regression. We explore the robustness of the proposed method in more detail via simulations in Section 3. 2. MEDIAN REGRESSION VIA BOX-COX TRANSFORMATION Let Y i be a continuous response variable for the ith independent observation and x i be a J 1 covariate vector, i = 1,...,n. We assume the median of Y i (conditional on x i )ism i = x i β or, equivalently, Pr(Y i >x i β) = 0.5. Next, consider the Box-Cox transformation of Y i, denoted by U i = u(y i,λ,c),where (Y i +c) λ 1 λ if λ = 0, U i = u(y i,λ,c)= (1) log(y i + c) if λ = 0; c is a fixed constant to ensure Y i + c > 0; and λ is an unknown parameter to be estimated. Note that because u(y i,λ,c) is a monotone transformation of Y i and Pr(Y i >x i β) = 0.5, then Pr{u(Y i,λ,c) > u(x i β,λ,c)} =0.5. Furthermore, if the distribution of U i is symmetric, then µ i = E(U i x i ) = u(x i β,λ,c) (x i β + c)λ 1 if λ = 0, = λ (2) log(x i β + c) if λ = 0. In this article we assume that the distribution of Y i after the optimal Box-Cox transformation is almost symmetric, and thus the mean µ i is approximately equal to the median of U i, that is, Pr(U i >µ i ) Pr(Y i >x i β) = 0.5. The accuracy of this approximation depends on the symmetry of the distribution of U i. Thus, the proposed estimator requires that the distribution of U i is approximately symmetrical rather than normal. Taylor (1985) showed that the Box-Cox transformation is generally the most suitable method for transforming to symmetry. Although the focus is on the Box-Cox transformation, we note that the proposed method can be applied to any monotonic transformation of Y i (and the linear predictor). Finally, assuming the variance of U i is σ 2, and that U i has an approximate normal distribution, U i N(µ i,σ 2 ), the loglikelihood contribution from subject i is l i (β, σ 2,λ)= 1 2 log(2πσ2 ) 1 2σ 2 (U i µ i ) 2 + (λ 1) log(y i + c). (3) c 2007 American Statistical Association DOI: 10.1198/000313007X220534 The American Statistician, August 2007, Vol. 61, No. 3 233

Table 1. Results of simulation study for estimated medians at each value of x with Y i exponential. n = 80 n = 160 n = 320 x x x Approach 1 1.5 2 2.5 1 1.5 2 2.5 1 1.5 2 2.5 Relative Box-Cox 0.74 0.59 0.45 0.34 1.06 1.33 1.56 1.77 1.43 1.48 1.52 1.56 Bias (%) LAD 3.73 2.68 1.74 0.92 1.15 0.88 0.65 0.44 0.28 0.44 0.59 0.72 MLE 0.17 0.23 0.58 0.90 0.12 0.01 0.12 0.22 0.07 0.01 0.08 0.14 Simulation Box-Cox 2.68 1.29 1.56 3.48 1.32 0.67 0.81 1.74 0.68 0.33 0.39 0.86 Variance LAD 4.69 1.93 2.31 5.82 2.22 0.99 1.23 2.96 1.07 0.49 0.59 1.39 MLE 2.13 0.95 1.17 2.77 1.05 0.47 0.58 1.40 0.52 0.24 0.29 0.68 Coverage Box-Cox 96.0 96.4 95.9 95.8 95.3 94.5 95.2 95.3 95.4 94.4 94.7 95.2 Probability a LAD 95.2 96.5 96.2 95.0 94.0 95.0 94.4 94.4 95.0 95.6 95.0 95.6 MLE 93.2 93.4 93.9 93.3 94.3 95.1 94.6 94.1 94.4 94.5 94.6 94.6 a Coverage Probability of 95% Confidence Intervals Note, this contribution is the log of the normal density of U i plus the log of the Jacobian of the transformation (derivative of U i with respect to y i ). In cases where the distribution of U i is approximately symmetrical rather than normal, this method yields a quasi-likelihood estimator of β and the variance of β can be estimated using the so-called sandwich variance estimator (e.g., White 1982). The Gaussian assumption is used only to derive an estimator of β; the proposed method is based on, but does not require, that U i has a Gaussian distribution. The validity of the proposed estimator requires the weaker assumption that the distribution of U i is approximately symmetrical rather than normal. 3. MONTE CARLO SIMULATION STUDY To explore the properties of the proposed estimator, and its robustness to varying degrees of asymmetry in the distribution of U i, we performed a simulation study with four different specifications of the distribution of Y i given x i : log-normal, exponential, gamma, and Pareto. Note, when Y i is specified as lognormal, this is a special case of the Box-Cox transformation with λ = 0 and the proposed method is exact. For the remaining distributions, the Box-Cox transformation does not yield an exact normal random variable. In all specifications except for the gamma distribution, we let the median of Y i given x i be M i = β 0 +β 1 x i = 6.5+1.0 x i ; for the gamma distribution, we let M i = 6.0+1.0 x i. We considered total sample sizes n = 80, 160, and 320; we let x i take on values 1.0, 1.5, 2.0, and 2.5, with each value represented by 25% of the sample. We performed 2,500 simulation replications for each of the four possible distributions of Y i. We estimated the median regression parameters via the proposed Box-Cox transformation, in addition to minimizing the LAD via the linear programming algorithm of Basset and Koenker (1982). For the log-normal, exponential, and Pareto distributions of Y i, we also estimated (β 0,β 1 ) via maximum likelihood under the true distribution for Y i. Finally, for the case where Y i has a Pareto distribution, we also estimated (β 0,β 1 ) via the Box-Cox transformation applied to log(y i ) (and to the log of the linear predictor); as discussed earlier, the proposed method can be applied to any monotonic transformation of Y i. The Box-Cox Gaussian likelihood was maximized using the SAS procedure NLMIXED (SAS Institute Inc. 2000); sample SAS code for implementing the proposed method is available upon request from the authors. The results of the simulations when Y i is log-normal (not shown, but available upon request from the authors), with log(y i ) N(log(6.5 + x i ), 1), indicate that the estimates of the medians obtained via the optimal Box-Cox transformation are unbiased for n = 80, 160, and 320. This result is to be expected because the log-normal distribution is a special case where the Box-Cox transformation can be accurately applied. The relative bias of the Box-Cox and LAD estimates of the medians (when averaged over the distribution of the covariates and the three sample sizes) is approximately 1%, while for the MLE it is less than 0.25%. Of note, the Box-Cox estimates are almost as efficient as the MLEs (which can be considered a Box-Cox estimator with λ = 0 assumed known) and are notably more efficient than the standard LAD estimates. Specifically, the simulation variances of the LAD estimates are 50 75% larger than those for the estimates from the Box-Cox transformation. Finally, all three methods show good coverage probabilities for 95% confidence intervals. To explore robustness of the proposed method to varying degrees of asymmetry in the distribution of U i,welety i have an exponential distribution, with p(y i x i,β) = θ i e θ iy i, where θ i ={log(2)}/(β 0 + β 1 x i ); note, no transformation of Y i will yield a normal random variable. However, the average skewness of the residuals from the regression of the Box-Cox transformed Y i over all simulation replications was small (skewness = 0.05), indicating that the optimal Box-Cox transformation achieved a high degree of symmetry. The simulation results reported in Table 1 indicate that all three estimators show very little bias for n = 80, 160, and 320. The simulation variances of the LAD estimates are approximately 80% larger than those 234 Statistical Practice

Table 2. Results of simulation study for estimated medians at each value of x with Y i Gamma. n = 80 n = 160 n = 320 x x x Approach 1 1.5 2 2.5 1 1.5 2 2.5 1 1.5 2 2.5 Relative Box-Cox 2.24 3.38 4.38 5.27 5.02 5.24 5.43 5.60 5.47 5.48 5.49 5.50 Bias (%) LAD 7.54 5.57 3.86 2.34 3.28 2.83 2.45 2.11 0.43 0.87 1.25 1.58 Simulation Box-Cox 5.72 2.94 3.48 7.34 2.79 1.36 1.63 3.59 1.36 0.70 0.84 1.79 Variance LAD 10.37 4.57 5.60 13.47 5.07 2.29 2.83 6.69 2.34 1.05 1.35 3.25 Coverage Box-Cox 96.0 95.5 95.4 96.1 95.6 94.5 94.1 95.1 94.3 92.9 92.6 94.1 Probability a LAD 95.1 96.8 96.8 95.7 94.6 95.4 96.1 94.6 95.6 96.2 95.4 95.1 a Coverage Probability of 95% Confidence Intervals for the estimates from the Box-Cox transformation. Next, we increased the degree of asymmetry in the distribution of U i by letting Y i have a gamma distribution with shape parameter = 0.5. This is an example of a skewed, heavy-tailed distribution where even the Box-Cox transformed Y i is likely to show skewness. The average skewness of the residuals from the regression of U i over all simulation replications was 0.10. The results indicate that the proposed estimator yields biased estimates of β 0 and β 1. As expected, skewness in the residuals are likely to have somewhat greater impact on the intercept than on the slope. This bias results in biased estimates of the medians (see Table 2), with relative bias between 2 6%. Note, however, that the absolute bias is almost negligible when compared to the variance, even when n = 320. The LAD estimator also has relative bias in the range 1 8% for n = 80 and n = 160. On average, the relative bias of the Box-Cox and LAD median estimates is 4.9% and 2.8%, respectively. Finally, we considered the extreme case where Y i has a Pareto distribution with scale parameter equal to 1. The Pareto is an example of an extremely skewed, heavy-tailed distribution where U i is likely to show very discernible skewness. The average skewness of the residuals from the regression of the transformed Y i was 0.29, indicating that the transformation did not achieve a high degree of symmetry. As a result, the proposed estimator yields badly biased estimates of β 0, and consequently, of the medians (not shown). As would be expected, the coverage probabilities are poor in this setting. In contrast, the LAD estimator of (β 0,β 1 ) is far more robust and almost unbiased when the sample sizes are large (e.g., n = 320). Finally, when the Box- Cox transformation is applied to log(y i ) (and to log(x i β)),the bias of the proposed estimator is substantially reduced and the coverage probabilities meet the target coverage; in addition, the simulation variances of the estimates from the Box-Cox transformation of log(y i ) are approximately half the size of those for the LAD estimates. This improvement can be explained by the latter transformation of Y i achieving a far greater degree of symmetry (the average skewness was 0.05 instead of 0.29). This highlights an important point: when the distribution of Y i after a Box-Cox transformation is asymmetrical, there may be alternative monotonic transformations of Y i that can achieve a higher degree of symmetry. In summary, the results of the simulation study suggest that the proposed estimator is relatively robust to modest degrees of asymmetry in the distribution of Y i after a Box-Cox transformation. In these cases the relative bias is less than 5% and of comparable magnitude to the finite sample bias of the LAD estimator. In addition, the proposed method provides discernibly more efficient estimates than the standard LAD estimator. Of note, when the degree of asymmetry is modest, the bias is almost negligible compared to the variance. However, when there is strong asymmetry in the distribution of U i, the proposed estimator can yield badly biased estimates. 4. APPLICATION: DEVELOPMENTAL TOXICITY STUDY OF DYME Developmental toxicity studies of laboratory animals play a crucial role in the testing and regulation of chemicals and pharmaceutical compounds. Exposure to developmental toxicants during pregnancy typically causes adverse effects, such as fetal malformations and reduced fetal weight. In a typical experiment, laboratory animals are assigned to increasing doses of a test substance. In this section we describe an analysis of fetal weight data from a study of diethylene glycol dimethyl ether (DYME), a widely used organic solvent. In this study of laboratory mice, conducted through the National Toxicology Program, DYME was administered at five doses (0, 62.5, 125, 250, and 500 mg/kg/day) to 111 pregnant mice (dams) just after implantation. Following sacrifice, fetal weight was recorded for each live fetus with the goal of determining the effect of dose on weight. We considered a linear model in dose ( 10 3 ) for the median fetal weight, estimated via the Box-Cox transformation and standard LAD estimator. Results of these analyses produced very similar estimates of the intercept and slope for both methods. The two estimates of the slope for dose are 0.90 (Box-Cox) and 0.88 (LAD), indicating that the decrease in median fetal weight, comparing the highest dose group to the control group, is approximately 0.45 gm (or 0.5 0.9). The two fitted median lines, presented in Figure 1, are virtually indistinguishable and fit the The American Statistician, August 2007, Vol. 61, No. 3 235

Figure 1. Scatterplot of fetal weight (gm) versus dose (mg/kg/day), with the empirical medians (solid circles) and fitted medians for fetal weight for the estimators based on the Box-Cox transformation (solid line) and LAD (dashed line). empirical median at each dose well. However, the standard error for the slope estimated via the Box-Cox transformation (SE = 0.039) is discernibly smaller than the corresponding standard error for the LAD estimator (SE = 0.045); the ratio of the two estimates of variance is 1.33 and highlights the potential gains in efficiency. Finally, to highlight settings where the proposed estimator can and cannot be validly applied, we generated two simulated datasets from the exponential and Pareto distributions, respectively. Specifically, we generated 1,000 observations from exponential and Pareto (scale parameter = 1) distributions with median, M i = 6.5 + 1.0X i, where X i has a uniform distribution over the interval [0,4]. For the data from the exponential distribution, both methods produced similar estimates of the intercept and slope; the corresponding fitted lines (dashed lines) are presented in Figure 2(a) and are completely indistinguishable from the true median regression line (solid line). Note, the Y -axis in Figure 2(a) is on the log scale. In contrast, for the date from the Pareto distribution, the Box-Cox and LAD methods produced discernibly different estimates of the intercept (6.43 and 5.21, respectively) and slope (1.46 and 1.03, respectively). Figure 2(b) presents the fitted lines for the Box Cox (long-dashed line) and standard LAD (short-dashed line) estimators which are now clearly distinguishable. Although the Box-Cox estimator of the intercept is almost unbiased, the estimator of the slope has a relative bias of almost 50%. In this setting, the Box-Cox transformation is unable to symmetrize the data; moreover, even after transformation, there are numerous extreme observations that have an inordinate influence on the estimated slope. For example, although the median ranges from 6.5 10.5, over 10% of the observations (on the original scale) were larger than 1,000 (these extreme observation have been omitted from the scatterplot in Figure 2(b)), with 5% greater than 8,000, and the largest equal to 9.31 10 11. The estimator based on the Box-Cox transformation is not robust to such extreme observations. 5. CONCLUDING REMARKS In this article we propose estimating the median regression parameters via Gaussian estimating equations after applying the optimal Box-Cox transformation to both the outcome and the linear predictor. The proposed estimator is notably more efficient than the standard LAD estimator, albeit with an acknowledged loss of robustness. Although the proposed estimator is relatively robust to modest degrees of asymmetry in the distribution of the transformed response, U i, we caution that it can yield badly biased estimates when there is strong asymmetry in the distribution of U i. Fortunately, the degree of skewness can be assessed easily from the data at hand. In cases where there is discernible skewness, the use of a more general class of transformations, for example, folded-power transformations (Mosteller and Tukey 1977) or the Zellner-Revankar transformation (Zellner and Revankar 1969), may yield a higher degree of symmetry. An appealing property of the standard LAD estimator is that it is robust to outliers. Although the Box-Cox transformation is likely to reduce the number of and influence of outliers on the transformed scale, because the proposed method relies on regressing the mean of the transformed response, it will likely be sensitive to extreme or outlying observations. When outliers are apparent on the transformed scale, the proposed approach 236 Statistical Practice

Figure 2(a). Scatterplot of 1,000 observations from an exponential distribution, with comparison of the true median regression line (solid line) and the fitted median lines for the estimators based on the Box-Cox transformation (long-dashed line) and LAD (short-dashed line). Figure 2(b). Scatterplot of 1,000 observations from a Pareto distribution, with comparison of the true median regression line (solid line) and the fitted median lines for the estimators based on the Box-Cox transformation (long-dashed line) and LAD (short-dashed line). The American Statistician, August 2007, Vol. 61, No. 3 237

can be made more robust by simply downweighting or trimming the most extreme residuals. Any potential loss of information from trimming the most extreme residuals should be more than compensated by the increased precision over LAD estimation. [Received November 2005. Revised April 2007.] REFERENCES Bassett, G., and Koenker, R. (1982), An Empirical Quantile Function for Linear Models With iid Errors, Journal of the American Statistical Association, 77, 407 415. Box, G.E.P., and Cox, D. R. (1964), An Analysis of Transformations, Journal of the Royal Statistical Society, Series B, 26, 211 243. Draper, N.R., and Cox, D.R. (1969), On Distributions and Their Transformation to Normality, Journal of the Royal Statistical Society, Series B, 31, 472 476. Koenker, R., and Geling, O. (2001), Reappraising Medfly Longevity: A Quantile Regression Survival Analysis, Journal of the American Statistical Association, 96, 458 468. Lee, A.H., Fung, W.K., and Fu, B. (2003), Analyzing Hospital Length of Stay: Mean or Median Regression? Medical Care, 41, 681 686. Mosteller, F., and Tukey, J.W. (1977), Data Analysis and Linear Regression, Reading, MA: Addison-Wesley. SAS Institute Inc. (2000), SAS/STAT User s Guide (Version 8 ed.), Cary, NC: SAS Institute. Taylor, J.M.G. (1985), Power Transformations to Symmetry, Biometrika, 72, 145 152. Wedderburn, R.W.M. (1974), Quasi-Likelihood Functions, Generalized Linear Models, and the Gauss-Newton Method, Biometrika, 61, 439 447. White, H. (1982), Maximum Likelihood Estimation Under Misspecified Models, Econometrica, 50, 1 26. Zellner, A., and Revankar, N. (1969), Generalized Production Functions, Review of Economic Studies, 36, 241 250. 238 Statistical Practice