Robust Nonparametric Methods for Regression to the Mean Model

Western Michigan University
ScholarWorks at WMU
Dissertations, Graduate College

Robust Nonparametric Methods for Regression to the Mean Model

Therawat Wisadrattanapong, Western Michigan University
Part of the Statistics and Probability Commons.

Recommended Citation: Wisadrattanapong, Therawat, "Robust Nonparametric Methods for Regression to the Mean Model" (2011). Dissertations.

This Dissertation-Open Access is brought to you for free and open access by the Graduate College at ScholarWorks at WMU. It has been accepted for inclusion in Dissertations by an authorized administrator of ScholarWorks at WMU.

ROBUST NONPARAMETRIC METHODS FOR REGRESSION TO THE MEAN MODEL

by Therawat Wisadrattanapong

A Dissertation Submitted to the Faculty of The Graduate College in partial fulfillment of the requirements for the Degree of Doctor of Philosophy, Department of Statistics

Advisor: Joseph W. McKean, Ph.D.

Western Michigan University
Kalamazoo, Michigan
August 2011

ROBUST NONPARAMETRIC METHODS FOR REGRESSION TO THE MEAN MODEL

Therawat Wisadrattanapong, Ph.D.
Western Michigan University, 2011

Regression to the mean is a statistical phenomenon that often confounds treatment effects in experiments. Consider an experiment involving a treatment, in which a response (the baseline) is measured on a subject, a treatment is applied, and a second measurement is taken. Under many bivariate models for the pair of responses (including the bivariate normal), the predicted response of the second measurement will regress to the mean. In experiments where the second response is taken only for a select sample, say above a cutoff value, this regression to the mean effect may mistakenly be thought of as a treatment effect. In this investigation, we consider a model of the treatment effect which also takes into account this regression to the mean effect. In particular, we consider the multiplicative model of Naranjo and McKean (2001). Naranjo and McKean developed a bootstrap test for treatment effect based on least squares methods for bivariate normal distributions. We develop robust procedures to assess treatment effects for this multiplicative model. Our procedures are based on rank-based methods, for general score functions. Our preliminary Monte Carlo investigations show that our procedures are robust. We extend this robust and traditional development to models other than the bivariate normal, including the multivariate t distributions. We investigate the finite sample properties of these methods and compare their empirical behavior over a variety of models and situations.

INFORMATION TO ALL USERS: The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion.

UMI Dissertation Publishing. Copyright 2011 by ProQuest LLC. All rights reserved. This edition of the work is protected against unauthorized copying under Title 17, United States Code.

ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI

Copyright by Therawat Wisadrattanapong 2011

ACKNOWLEDGMENTS

I would like to express my sincere gratitude to Dr. Joseph W. McKean, advisor and committee chair, for his patience and guidance. I also thank Dr. Joshua D. Naranjo, Dr. Bradley E. Huitema, and Dr. Jeffrey T. Terpstra, committee members, for their suggestions. Special thanks to my mother, Tarn Wisadrattanapong, my sisters, Marin Suksawat and Sineenad Vieuxtemps, and their families, and my brothers. I am deeply grateful to my father-in-law and mother-in-law, Dr. Sirichai and Suchin Watcharotone. I would also like to thank my wife, Dr. Kuanwong Watcharotone, and my sons, Sukda and Han Wisadrattanapong, for their constant encouragement and support.

Therawat Wisadrattanapong

TABLE OF CONTENTS

ACKNOWLEDGMENTS ... ii
LIST OF TABLES ... v
LIST OF FIGURES ... ix

CHAPTER

I. INTRODUCTION
   The Practical Problem of Interest
   Simple Linear Model
   Traditional Least Squares Fit
   Robust Fit
   High Breakdown Rank-Based (HBR) Estimates ... 8

II. THE MODEL AND ESTIMATES
   The Dual Effects Model
   Parameter Estimates
   R and HBR Estimations for Linear Regression Model
   Scale Functionals
   Scale Functions and Estimators
   Examples of Scale Functionals and Estimators
   Dispersion Function Parameter $\tau_\varphi$
   Median Absolute Deviation (MAD)
   Example: Final and Midterm Exam Scores ... 35

Table of Contents, continued

CHAPTER

III. MONTE CARLO STUDY FOR DUAL TREATMENT EFFECTS AND MULTIVARIATE NORMAL MODEL
   Bootstrap Confidence Intervals (BCI)
   Tests of Significance
   Simulation Study
   Dual Treatment Effects Model of Different Scenarios with 20% and 30% Regression to the Mean
   Multivariate Normal Model with 20% Regression toward the Mean
   Multivariate Normal Model with Different Scenarios ... 49

IV. MONTE CARLO STUDY FOR MULTIVARIATE T MODEL
   Elliptical Distribution
   Simulation Study
   Multivariate T Model Comparing between the LS and R Procedures
   Multivariate T Model Comparing between the R and HBR Procedures
   Multivariate T Model Using 20% Regression toward the Mean Comparing among the LS, R, and HBR Procedures ... 82

V. CONCLUSION ... 87

REFERENCES ... 91

LIST OF TABLES

1. Final and Midterm Exam Scores
2. Mean Parameters
3. % BCI
4. Empirical MSE when ρ =
5. Empirical MSE when ρ =
6. Empirical ARE when ρ =
7. Empirical ARE when ρ =
8. % Empirical CI when ρ =
9. % Empirical CI when ρ =
10. % Empirical CI when ρ =
11. % Empirical CI when ρ =
12. Empirical Level of Full Test when ρ =
13. Empirical Level of Full Test when ρ =
14. Empirical Level of Marginal Test when ρ =
15. Empirical Level of Marginal Test when ρ =
16. Empirical Means when σ = 1 and ρ =
17. Empirical ARE when σ = 1 and ρ =
18. Empirical MSE
19. Empirical ARE
20. Empirical MSE when df = 3 and ρ =
21. Empirical MSE when df = 3 and ρ =

22. Empirical ARE when df = 3 and ρ =
23. Empirical ARE when df = 3 and ρ =
24. % Empirical CI when df = 3 and ρ =
25. % Empirical CI when df = 3 and ρ =
26. % Empirical CI when df = 3 and ρ =
27. % Empirical CI when df = 3 and ρ =
28. Empirical Level of Full Test when df = 3 and ρ =
29. Empirical Level of Full Test when df = 3 and ρ =
30. Empirical Level of Marginal Test when df = 3 and ρ =
31. Empirical Level of Marginal Test when df = 3 and ρ =
32. Empirical MSE when df = 5 and ρ =
33. Empirical MSE when df = 5 and ρ =
34. Empirical MSE when df = 5 and ρ =
35. Empirical MSE when df = 5 and ρ =
36. % Empirical CI when df = 5 and ρ =
37. % Empirical CI when df = 5 and ρ =
38. % Empirical CI when df = 5 and ρ =
39. % Empirical CI when df = 5 and ρ =
40. Empirical Level of Full Test when df = 5 and ρ =
41. Empirical Level of Full Test when df = 5 and ρ =
42. Empirical Level of Marginal Test when df = 5 and ρ =
43. Empirical Level of Marginal Test when df = 5 and ρ =

44. Empirical MSE when df = 10 and ρ =
45. Empirical MSE when df = 10 and ρ =
46. Empirical MSE when df = 10 and ρ =
47. Empirical MSE when df = 10 and ρ =
48. % Empirical CI when df = 10 and ρ =
49. % Empirical CI when df = 10 and ρ =
50. % Empirical CI when df = 10 and ρ =
51. % Empirical CI when df = 10 and ρ =
52. Empirical Level of Full Test when df = 10 and ρ =
53. Empirical Level of Full Test when df = 10 and ρ =
54. Empirical Level of Marginal Test when df = 10 and ρ =
55. Empirical Level of Marginal Test when df = 10 and ρ =
56. Empirical MSE when df = 5 and ρ =
57. Empirical MSE when df = 5 and ρ =
58. Empirical MSE when df = 5 and ρ =
59. Empirical MSE when df = 5 and ρ =
60. % Empirical CI when df = 5 and ρ =
61. % Empirical CI when df = 5 and ρ =
62. % Empirical CI when df = 5 and ρ =
63. % Empirical CI when df = 5 and ρ =
64. Empirical Level of Full Test when df = 5 and ρ =
65. Empirical Level of Full Test when df = 5 and ρ =

66. Empirical Level of Marginal Test when df = 5 and ρ =
67. Empirical Level of Marginal Test when df = 5 and ρ =
68. Empirical Means
69. Empirical MSE
70. Empirical ARE ... 84

LIST OF FIGURES

1. Regression toward the Mean Plot
2. Regression toward the Mean Plot
3. ARE of Rho
4. ARE of Gamma
5. ARE of Rho
6. ARE of Gamma ... 86

CHAPTER I

INTRODUCTION

Regression comes from a Latin root meaning "going back." Francis Galton was the first to describe "regression toward mediocrity" in hereditary stature. In his famous study of the heights of fathers and first sons, Galton found that the heights of the sons regressed toward the mean; that is, taller fathers (above the mean height) tended to have shorter sons, and shorter fathers tended to have taller sons. We now call this regression toward the mean or, briefly, the regression effect.

The regression effect is a statistical phenomenon. It can make the variation in repeated measurements look like real change. McDonald, Mazzuca, and McCabe (1983) note that very sick patients should feel better at the second measurement, without treatment, due to the effect of regression toward the mean. In sexual selection, males that remain unmated in the first year are less attractive than mated males. Here, the regression effect is that unmated males in the first year increase their attractiveness in the second year more than mated males (Kelly and Price, 2005). James (1973) states that even if the correlation between the pre- and posttreatment measurements is small, the regression effect may impact statistical significance. A set of informative examples on the effect is provided by Cutter (1976). The regression toward the mean (RTM) effect tends to move initial values closer to the mean on the second measurement.

The experimental design for this study is the following situation. Patients or subjects for the experiment are selected on an initial (pretest) response. Only patients whose response is above a specified cutoff value, however, are selected for the experiment. Those that are selected are then treated. It is thought that this treatment reduces the response; this is the hypothesis of interest. After a specified period

of time, a second (posttest) response is taken on the patient. An estimate of the treatment effect is the difference in the responses; i.e., pretest − posttest. Are these differences, however, due to the treatment or to the regression toward the mean effect? Unlike many experiments, there is no control group for comparisons. There is extra information, though, in the initial responses that were smaller than the cutoff point. The models discussed next utilize this information.

Mee and Chua (1991), George, Johnson, and Nick (1997), and Ostermann, Willich, and Ludtke (2008) presented methods for estimating the treatment effect by using an additive model. The additive model, though, does not take into account possible interaction between the treatment and the initial response. James (1973), Senn, Brown, and James (1985), and Chen and Cox (1992) presented a multiplicative model which models this interaction, and they developed inference based on this model. Finally, Naranjo and McKean (2001) considered a dual model that models both the additive and interaction effects. They further proposed methods for estimating the treatment effect based on this dual model. All these models, though, assume a bivariate normal distribution on the measurements before and after treatment, (X, Y) respectively. The resulting inference is likelihood based, least squares (LS), which can be seriously impaired by only one outlier; i.e., the methods are not robust.

In this study we propose robust methods that are not tied to the assumption of bivariate normality. We develop robust procedures to assess treatment effects for the dual model of Naranjo and McKean (2001). Our procedures are based on rank-based methods for general score functions. In Chapter 2, we present these methods along with their corresponding theory. This includes a robust analog of the bootstrap test for treatment effect proposed by Naranjo and McKean (2001).
This theory is asymptotic, however, so in Chapter 3 we present the results of an extensive Monte Carlo study. This includes comparisons of empirical mean square errors and relative efficiencies between the robust

and LS methods. For the tests of treatment effect, empirical levels and power are obtained and compared. Our discussion in Chapters 2 and 3 is based on the bivariate normal distribution. Our robust theory, though, is for general bivariate distributions. In Chapter 4, we consider extensions to other bivariate distributions. In particular, we consider the popular elliptical bivariate distributions; see Muirhead (1982) for an informative discussion of these models. In Chapter 4, we present the results of a Monte Carlo investigation of our robust and LS methods for several of these elliptical distributions.

1.1 The Practical Problem of Interest

As discussed above, in this study we are concerned with the following practical problem. We are interested in how a subject with an infirmity responds to a certain treatment. As an illustration, we offer this simple example concerning cholesterol. The infirmity to treat is high cholesterol. Subjects are screened for high cholesterol; that is, only subjects who have high cholesterol (over a prespecified threshold) are admitted to the study, and only these are treated. Their cholesterol at admission is their baseline or pretreatment response. After a specified time of treatment, a second, posttreatment response is taken for those in the study. The difference in these responses, pretreatment − posttreatment, is the treatment effect for a subject. In saying this, however, we are ignoring the regression effect. In the absence of treatment, a subject with high cholesterol is expected to have a decline in cholesterol at the second reading. So for our study, are we observing a treatment effect or a regression effect? We somehow have to tease out of the observed effect the portion due to treatment and the portion due to regression.

In many studies a control is used. In this case, for the cholesterol study, subjects with high cholesterol on the first reading would be randomly assigned to a treatment group

and a control (placebo) group. Then, if least squares procedures are used for inference, the mean difference of their differences (first and second responses) would be obtained. This would essentially remove the regression effect. In many studies, however, this may be impossible and it may even be unethical (knowingly withholding treatment).

For the practical problem of this study, there is no control group. Our data consist of the before and after responses of those admitted to the study. In addition, we have history on subjects who were below the threshold. This could be either a sample of responses of subjects who were not admitted to the study (i.e., their responses were below the threshold) or well-known information on the distribution of the pretreatment variable. As we show, this additional information is needed in order to separate the treatment effect and the regression effect.

1.2 Simple Linear Model

A basic part of the models we discuss is the simple linear regression model. We write this model as

$$Y_i = \alpha + x_i\beta + e_i, \qquad (1.1)$$

where $Y_i$ is the $i$th response variable; $x_i$ is the $i$th row of a known $n \times p$ matrix of independent variables ($p \geq 1$); $\beta$ is a $p$-vector of unknown slope parameters; $\alpha$ is the unknown intercept parameter; and $e_i$ is the $i$th error term. We discuss assumptions on the errors $e_i$ in Chapter 2. For now, we assume that $e_1, \ldots, e_n$ are independent and identically distributed (iid) with cumulative distribution function (cdf) $F(t)$ and probability density function (pdf) $f(t)$. For the traditional LS fit, we assume that $\sigma^2 = \operatorname{var}(e)$ exists.

The models discussed above are based on simple linear models. The methodology that we develop is based on robust fits of this model. For reference, we next discuss the traditional least squares (LS) and robust fits for the simple linear model.

1.3 Traditional Least Squares Fit

Let $Y = (y_1, \ldots, y_n)^T$ and $X = (x_1, \ldots, x_n)^T$. The LS estimates of slope ($\hat{\beta}_{LS}$) and intercept ($\hat{\alpha}_{LS}$) can be obtained in this way. First, center $X$; i.e., let $X_c = X - \bar{X}$. Then the LS estimate of slope is

$$\hat{\beta}_{LS} = \operatorname{Argmin} \|Y - X_c\beta\|^2,$$

where $\|\cdot\|$ is the Euclidean norm, and $\hat{\alpha}_{LS}$ is the average of the residuals $y_i - (x_i - \bar{x})\hat{\beta}_{LS}$, $i = 1, \ldots, n$. Then the estimates are

$$\hat{\beta}_{LS} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}, \qquad \hat{\alpha}_{LS} = \bar{y} - \hat{\beta}_{LS}\bar{x}.$$

The fitted values are then $\hat{y}_i = \hat{\alpha}_{LS} + \hat{\beta}_{LS} x_i$. Denote the residuals by $\hat{e}_{LS,i} = y_i - \hat{y}_i$. The estimate of the variance is

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^n \hat{e}_{LS,i}^2.$$

1.4 Robust Fit

Consider the simple linear model (1.1). Assume that the random errors $e_1, \ldots, e_n$ are iid with cdf $F(t)$ and pdf $f(t)$. In particular, our robust methods do not require finite variance of the random errors. Consider the general R pseudo-norm

$$\|v\|_\varphi = \sum_{i=1}^n a[R(v_i)]\, v_i,$$

where $R(v_i)$ denotes the rank of $v_i$ among $v_1, \ldots, v_n$ and $a(i) = \varphi[i/(n+1)]$, $i = 1, 2, \ldots, n$. We assume that $\varphi(u)$ is an increasing, square-integrable function defined on $(0,1)$. Without loss of generality, assume $\varphi$ is standardized;

i.e., $\int \varphi(u)\,du = 0$ and $\int \varphi^2(u)\,du = 1$. The dispersion function can be written as

$$D_\varphi(\beta) = \|Y - X\beta\|_\varphi = \sum_{i=1}^n a[R(y_i - x_i\beta)]\,(y_i - x_i\beta).$$

It is easy to see that $D_\varphi(\beta)$ is piecewise linear; thus $D_\varphi(\beta)$ is a continuous and convex function of $\beta$ (Hettmansperger and McKean, 2011, p. 168). Define the R-estimator as

$$\hat{\beta}_R = \operatorname{Argmin} \|Y - X\beta\|_\varphi. \qquad (1.2)$$

The negative of the gradient is given by

$$S(\beta) = -\frac{\partial D_\varphi(\beta)}{\partial \beta} = \sum_{i=1}^n a[R(y_i - x_i\beta)]\, x_i.$$

Thus the R estimate of slope $\hat{\beta}_R$ also solves the equations

$$\sum_{i=1}^n x_i\, a[R(y_i - x_i\beta)] = 0.$$

The R estimate of the intercept $\alpha$ is the median of the residuals (Hettmansperger and McKean, 2011, p. 168); i.e.,

$$\hat{\alpha}_R = \operatorname{med}\{Y - X\hat{\beta}_R\}.$$

Under regularity conditions, Hettmansperger and McKean (2011, p. 189) show that

$$\begin{pmatrix} \hat{\alpha}_R \\ \hat{\beta}_R \end{pmatrix} \approx N\left( \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \begin{pmatrix} \kappa_n & -\tau_\varphi^2\, \bar{x}'(X'X)^{-1} \\ -\tau_\varphi^2\, (X'X)^{-1}\bar{x} & \tau_\varphi^2\, (X'X)^{-1} \end{pmatrix} \right),$$

where $\kappa_n = n^{-1}\tau_S^2 + \tau_\varphi^2\, \bar{x}'(X'X)^{-1}\bar{x}$, and $\tau_S$ and $\tau_\varphi$ are the scale parameters defined as

$$\tau_S = (2f(0))^{-1}, \qquad \tau_\varphi = \left( \sqrt{12} \int f^2(t)\,dt \right)^{-1}.$$

The Wilcoxon scores discussed in Chapter 3 of Hettmansperger and McKean (2011) are generated by $\varphi(u) = \sqrt{12}\,(u - \tfrac{1}{2})$. For Wilcoxon scores, if the errors have a normal pdf with variance $\sigma^2$, then Hettmansperger and McKean (2011) show that $\tau_\varphi = \sigma\sqrt{\pi/3}$ and $\tau_S = \sigma\sqrt{\pi/2}$. Another popular set of scores are the sign scores, generated by $\varphi(u) = \operatorname{sgn}(u - \tfrac{1}{2})$. The ARE of the R estimate relative to LS is the ratio $\sigma^2/\tau_\varphi^2$. For Wilcoxon scores, this is the familiar value $12\sigma^2\left(\int f^2(t)\,dt\right)^2$, which for normal errors is 0.955 (McKean and Vidmar, 1994). If the true distribution has tails heavier than the normal, then this efficiency is usually much larger than one (McKean, 2004).
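For concreteness, the LS and Wilcoxon R fits can be sketched side by side. The following is a minimal Python illustration on simulated data with heavy-tailed $t_3$ errors; it is a toy one-predictor version written for this discussion, not the production algorithms used later in the study.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import rankdata

def ls_fit(x, y):
    """Closed-form LS intercept and slope for the simple linear model."""
    xc = x - x.mean()
    beta = (xc @ (y - y.mean())) / (xc @ xc)
    return y.mean() - beta * x.mean(), beta

def wilcoxon_dispersion(beta, x, y):
    """D(beta) = sum a(R(e_i)) e_i with Wilcoxon scores
    a(i) = sqrt(12) * (i/(n+1) - 1/2)."""
    e = y - beta * x
    a = np.sqrt(12.0) * (rankdata(e) / (len(e) + 1) - 0.5)
    return np.sum(a * e)

def r_fit(x, y):
    """R fit: minimize the convex dispersion in beta, then take the
    median of the residuals as the intercept estimate."""
    res = minimize_scalar(wilcoxon_dispersion, bounds=(-50.0, 50.0),
                          args=(x, y), method="bounded")
    beta = res.x
    return np.median(y - beta * x), beta

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 200)
y = 2.0 + 1.5 * x + rng.standard_t(df=3, size=200)  # heavy-tailed errors

alpha_ls, beta_ls = ls_fit(x, y)
alpha_r, beta_r = r_fit(x, y)
print(round(beta_ls, 2), round(beta_r, 2))
```

Both fits recover a slope near the true value 1.5 here; the point of the R fit is that its behavior degrades far less when the error distribution is heavy tailed or contaminated.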

1.4.1 High Breakdown Rank-Based (HBR) Estimates

Consider the simple linear model (1.1) and the dispersion function

$$\|Y - X\beta\|_{HBR} = \sum_{i=1}^n \sum_{j=1}^n b_{ij}\, |e_i - e_j|,$$

where $b_{ij}$ denotes the weight function and $e_i = y_i - x_i'\beta$. Note that when the weights are $b_{ij} \equiv 1$, the weights yield the well-known rank-based Wilcoxon estimate (Terpstra and McKean, 2005). Writing the dispersion function as a function of $\beta$, we have

$$D(\beta) = \sum_{i=1}^n \sum_{j=1}^n b_{ij}\, |(y_i - y_j) - (x_i - x_j)'\beta|.$$

It follows that $D(\beta)$ is a nonnegative, continuous, and convex function of $\beta$. Define the HBR estimator as

$$\hat{\beta}_{HBR} = \operatorname{Argmin} \|Y - X\beta\|_{HBR}. \qquad (1.3)$$

The negative of the gradient is given by

$$S_{HBR}(\beta) = \sum_{i=1}^n \sum_{j=1}^n b_{ij}\,(x_i - x_j)\, \operatorname{sgn}[(y_i - y_j) - (x_i - x_j)'\beta].$$

Thus, the HBR estimate of slope $\hat{\beta}_{HBR}$ also solves the equations $S_{HBR}(\beta) = 0$. The intercept is estimated by the median of the residuals (Hettmansperger and McKean, 2011, p. 265); i.e.,

$$\hat{\alpha}_{HBR} = \operatorname{med}\{y - x'\hat{\beta}_{HBR}\}.$$

Under regularity conditions, Hettmansperger and McKean (2011, p. 265) show that $(\hat{\alpha}_{HBR}, \hat{\beta}_{HBR})'$ is approximately normal with mean $(\alpha, \beta)'$, where the asymptotic variance of the intercept involves the scale parameter

$$\tau_S = (2f(0))^{-1}.$$

As discussed in Hettmansperger and McKean (2011), there is a loss of efficiency when HBR estimates are used in place of the Wilcoxon estimates. However, the estimate $\hat{\beta}_R$ has breakdown $1/n$ due to its sensitivity to outliers in the X space. With a proper set of weights, though, the influence function of the HBR estimate is bounded in both the X and Y spaces, and the estimate has 50% breakdown.
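The pairwise form of the HBR objective can be sketched directly. In this toy version the weights $b_{ij}$ are a simple leverage-based choice invented for illustration, standing in for the data-driven weights of the HBR literature; the one gross outlier is placed in both the X and Y spaces.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def hbr_objective(beta, x, y, b):
    """Weighted pairwise dispersion sum_{i<j} b_ij |(y_i-y_j)-(x_i-x_j)beta|."""
    dy = y[:, None] - y[None, :]
    dx = x[:, None] - x[None, :]
    return np.sum(b * np.abs(dy - dx * beta)) / 2.0

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, 60)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, 60)
x[0], y[0] = 12.0, -20.0   # one gross outlier in both spaces

# Illustrative leverage-based weights: downweight points far from the
# median of x, measured in robust (MAD-based) standard units.
mad = np.median(np.abs(x - np.median(x))) / 0.6745
z = np.abs(x - np.median(x)) / mad
w = np.minimum(1.0, 2.0 / np.maximum(z, 1e-12))
b = w[:, None] * w[None, :]

res = minimize_scalar(hbr_objective, bounds=(-10.0, 10.0), args=(x, y, b),
                      method="bounded")
beta_hbr = res.x
print(round(beta_hbr, 2))
```

Because the pairs involving the high-leverage point carry small weight, the fitted slope stays near the true value 2 despite the bad point; with unit weights (the Wilcoxon estimate) the X-space outlier would exert far more pull.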

CHAPTER II

THE MODEL AND ESTIMATES

The model of interest for the regression effect is a bivariate model for a random vector of responses (X, Y). Let X denote the initial response or measurement on a subject. We often refer to X as the baseline measurement, or the premeasurement value of the response before treatment or therapy. Let Y denote the second value of the response on the subject. The variable Y is realized after the treatment and, hence, is often called the posttreatment measurement or response. We are interested in the effect of the treatment over and above the regression effect.

The null hypothesis is that there is no treatment effect. Hence, for our models, under the null hypothesis the marginal distributions of X and Y must be the same. The second requirement for our models is that the expected value of Y should be a linear function of X. With this requirement, we can model both the regression effect and the treatment effect. There are several families of distributions which satisfy these requirements. For example, as we show later, the elliptical family of bivariate distributions is one such family. In practice, though, the most important member of this family is the bivariate normal distribution. Essentially all the literature on the regression effect is in terms of this model. For this and the next chapter, most of our discussion concerns the bivariate normal model. For the normal null model, (X, Y) can be written as bivariate normal:

$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu \\ \mu \end{pmatrix}, \begin{pmatrix} \sigma^2 & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 \end{pmatrix} \right), \qquad (2.1)$$

where $\rho$ denotes the correlation. With the regression towards the mean effect in mind, assume,

as usual, that $0 < \rho < 1$. Note that for this null model, both X and Y are distributed as $N(\mu, \sigma^2)$. We can write model (2.1) from a regression point of view. Consider the model

$$Y = \mu + \rho(X - \mu) + e, \qquad (2.2)$$

where $e \sim N(0, \sigma^2(1-\rho^2))$, $X \sim N(\mu, \sigma^2)$, and $e$ and $X$ are independent. Hence

$$\begin{pmatrix} e \\ X \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ \mu \end{pmatrix}, \begin{pmatrix} \sigma^2(1-\rho^2) & 0 \\ 0 & \sigma^2 \end{pmatrix} \right).$$

It follows that $(X, Y)'$ has the distribution given in (2.1), so models (2.1) and (2.2) are equivalent. It also follows that

$$E(Y \mid X = x) = \mu + \rho(x - \mu).$$

Also note that $\operatorname{var}(e)$ is free of $x$.

To illustrate the regression towards the mean effect, suppose that $x > \mu$ and $\rho > 0$. Then we have

$$E(Y \mid X = x) = \mu + \rho(x - \mu) < \mu + (x - \mu) = x.$$

So $E(Y \mid X = x) < x$. Also, since $\rho(x - \mu) > 0$, we have

$$E(Y \mid X = x) = \mu + \rho(x - \mu) > \mu.$$

Thus

$$\mu < E(Y \mid X = x) < x.$$

This is the regression towards the mean effect, and it is illustrated in Figure 1.

[Figure 1: Regression toward the Mean Plot, showing the line $y = \mu + (x - \mu)$ and, below it, the regression line $y = \mu + \rho(x - \mu)$.]
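The chain of inequalities above, combined with the screening design of Section 1.1, can be illustrated by simulation. The cholesterol-like numbers below (mean 200, standard deviation 30, $\rho = 0.6$, cutoff 240) are invented for illustration, and there is no treatment at all in the simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Null bivariate normal model (2.1): pre/post responses with common mean,
# common sd, correlation rho, and NO treatment effect.
mu, sigma, rho, n = 200.0, 30.0, 0.6, 100_000
x = rng.normal(mu, sigma, n)
y = mu + rho * (x - mu) + rng.normal(0.0, sigma * np.sqrt(1 - rho**2), n)

# Screen: only subjects above the cutoff are "admitted to the study".
cutoff = 240.0
sel = x > cutoff

# mu < E(Y | X = x) < x holds on average over the selected group ...
post_mean, pre_mean = y[sel].mean(), x[sel].mean()

# ... so the screened group shows a mean decline with no treatment:
# pure regression toward the mean.
decline = pre_mean - post_mean
print(round(pre_mean, 1), round(post_mean, 1), round(decline, 1))
```

The posttest mean of the selected group falls strictly between $\mu$ and the pretest mean, exactly the ordering $\mu < E(Y \mid X = x) < x$ derived above.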

2.1 The Dual Effects Model

The dual effects model of Naranjo and McKean (2001) can be written as

$$Y = \mu - \delta + \gamma\rho(X - \mu) + e, \qquad (2.3)$$

where $X \sim N(\mu, \sigma^2)$, $e \sim N(0, \sigma^2(1-\rho^2))$, and $X$ and $e$ are independent. If the additive component $\delta = 0$, then model (2.3) reduces to the multiplicative model of Chen and Cox (1992), where the treatment effect is represented by $\gamma$. If $\gamma = 1$, then model (2.3) reduces to the additive model of Mee and Chua (1991), where the treatment effect is represented by $\delta > 0$. We then refer to $\delta$ and $\gamma$ as the additive and multiplicative components, respectively, of the treatment effect (Naranjo and McKean, 2001).

For the dual effects model, the bivariate normal distribution (2.3) is equivalent to the conditional model

$$Y \mid X = x \sim N\!\left(\mu - \delta + \gamma\rho(x - \mu),\; \sigma^2(1-\rho^2)\right), \qquad (2.4)$$

where $X \sim N(\mu, \sigma^2)$. The hypothesis of interest is

$$H_0: \delta = 0 \text{ and } \gamma = 1 \quad \text{versus} \quad H_A: \delta \neq 0 \text{ or } \gamma \neq 1. \qquad (2.5)$$

We next briefly derive the bivariate distribution of $(X, Y)$. This will help when we discuss distributions other than the normal in Chapter 4. A derivation similar to that showing the equivalence of models (2.1) and (2.2) can be used to show that model (2.3) is equivalent to saying that

$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu \\ \mu - \delta \end{pmatrix}, \begin{pmatrix} \sigma^2 & \gamma\rho\sigma^2 \\ \gamma\rho\sigma^2 & \gamma^2\rho^2\sigma^2 + \sigma^2(1-\rho^2) \end{pmatrix} \right), \qquad (2.6)$$

where $X \sim N(\mu, \sigma^2)$ and $Y \sim N(\mu - \delta,\; \gamma^2\rho^2\sigma^2 + \sigma^2(1-\rho^2))$. Note that under $H_0$, $Y$ is $N(\mu, \sigma^2)$; that is, $Y$ and $X$ have the same distribution.

For inference, we need estimates of the parameters of the model along with the standard errors of the estimates. We also need to discuss tests of $H_0$. We briefly review the inference developed in Naranjo and McKean (2001) and then develop our robust analogs.

2.2 Parameter Estimates

Recall from Section 1.1 the practical problem with which this study is concerned. Suppose we have $n$ pre- and posttreatment responses. In the notation of Model (2.1), these responses would be realizations of the random vectors

$$(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n),$$

where $X_i$ and $Y_i$ denote the pre- and posttreatment responses on the $i$th subject. We assume that these $n$ random vectors are iid with distribution (2.6). As discussed in Section 1.1, we also have at our disposal a sample $X_{n+1}, X_{n+2}, \ldots, X_{n+m}$ of pretreatment responses of subjects who were not treated. Note that $X_1, X_2, \ldots, X_n, X_{n+1}, X_{n+2}, \ldots, X_{n+m}$ are iid with the marginal distribution of $X$ as their distribution. In practice, instead of this large sample from the marginal distribution of $X$, we may know characteristics of the distribution of $X$, such as its mean and variance.

Consider the case of selected sampling where the $X$ measurements are a random sample of size $n + m$. The $Y$ measurements are obtained for only a subset of size $n$ of the $n + m$. Therefore, the data we have are

$$(x_1, y_1), \ldots, (x_n, y_n), x_{n+1}, \ldots, x_{n+m}.$$
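The equivalence of the conditional form (2.3) and the joint distribution (2.6) can be checked numerically by simulating the conditional form and comparing sample moments with the stated mean vector and covariance matrix; the parameter values below are invented for illustration.

```python
import numpy as np

# Simulate the dual effects model Y = mu - delta + gamma*rho*(X - mu) + e
# and check the joint moments implied by (2.6).
rng = np.random.default_rng(3)
mu, sigma, rho, gamma, delta, n = 50.0, 10.0, 0.7, 0.8, 5.0, 500_000

x = rng.normal(mu, sigma, n)
e = rng.normal(0.0, sigma * np.sqrt(1 - rho**2), n)
y = mu - delta + gamma * rho * (x - mu) + e

mean_y = y.mean()            # should be near mu - delta
var_y = y.var()              # near gamma^2 rho^2 sigma^2 + sigma^2 (1 - rho^2)
cov_xy = np.cov(x, y)[0, 1]  # near gamma rho sigma^2
print(round(mean_y, 1), round(var_y, 1), round(cov_xy, 1))
```

With these values the theoretical moments are $E(Y) = 45$, $\operatorname{var}(Y) = 82.36$, and $\operatorname{cov}(X, Y) = 56$, and the sample moments land on top of them.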

As in Naranjo and McKean (2001), consider the likelihood of the dual effects model,

$$L(\rho, \gamma, \delta) = \prod_{i=1}^n f_{X,Y}(x_i, y_i) \prod_{j=n+1}^{n+m} h(x_j) = \prod_{i=1}^n f(y_i \mid x_i)\, h(x_i) \prod_{j=n+1}^{n+m} h(x_j),$$

where $h$ denotes the marginal pdf of $X$ and $f(y \mid x)$ is the conditional pdf corresponding to (2.4). Taking the natural log of the likelihood and collecting the factors that do not involve $(\rho, \gamma, \delta)$ into a constant $k$, we have

$$\ell(\rho, \gamma, \delta) = -\frac{n}{2}\ln(1 - \rho^2) - \frac{1}{2\sigma^2(1-\rho^2)} \sum_{i=1}^n [y_i - \mu + \delta - \gamma\rho(x_i - \mu)]^2 + k.$$

Taking partial derivatives and solving the resulting equations gives the MLEs. First,

$$\frac{\partial \ell(\rho, \gamma, \delta)}{\partial \delta} = -\frac{1}{\sigma^2(1-\rho^2)} \sum_{i=1}^n [y_i - \mu + \delta - \gamma\rho(x_i - \mu)].$$

Setting $\partial\ell(\rho,\gamma,\delta)/\partial\delta = 0$ and solving the equation:

$$-\frac{1}{\sigma^2(1-\rho^2)} \sum_{i=1}^n [y_i - \mu + \delta - \gamma\rho(x_i - \mu)] = 0$$

$$\sum_{i=1}^n y_i - n\mu + n\delta - \gamma\rho \sum_{i=1}^n x_i + n\gamma\rho\mu = 0.$$

Therefore, we have

$$\delta = -\bar{y} + \mu + \gamma\rho\,\bar{x} - \gamma\rho\mu = \gamma\rho(\bar{x} - \mu) - (\bar{y} - \mu).$$

Next, taking $\partial\ell(\rho,\gamma,\delta)/\partial(\gamma\rho) = 0$ and solving the equation:

$$\frac{\partial\ell(\rho,\gamma,\delta)}{\partial(\gamma\rho)} = \frac{1}{\sigma^2(1-\rho^2)} \sum_{i=1}^n [y_i - \mu + \delta - \gamma\rho(x_i - \mu)]\,(x_i - \mu) = 0,$$

so that

$$\sum_{i=1}^n (y_i - \mu)(x_i - \mu) + \delta \sum_{i=1}^n (x_i - \mu) - \gamma\rho \sum_{i=1}^n (x_i - \mu)^2 = 0.$$

We then have, substituting $\delta = \gamma\rho(\bar{x} - \mu) - (\bar{y} - \mu)$ and simplifying,

$$\widehat{\gamma\rho} = \frac{\sum_{i=1}^n \left[(y_i - \bar{y}) + (\bar{y} - \mu)\right]\left[(x_i - \bar{x}) + (\bar{x} - \mu)\right] - n(\bar{y}-\mu)(\bar{x}-\mu)}{\sum_{i=1}^n (x_i - \bar{x})^2} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}.$$

Next, setting $\partial\ell(\rho,\gamma,\delta)/\partial\gamma = 0$ and solving the equation:

$$-\frac{1}{\sigma^2(1-\rho^2)} \sum_{i=1}^n [y_i - \mu + \delta - \gamma\rho(x_i - \mu)]\,(-\rho(x_i - \mu)) = 0$$

$$\rho \sum_{i=1}^n (y_i - \mu)(x_i - \mu) + \delta\rho \sum_{i=1}^n (x_i - \mu) - \gamma\rho^2 \sum_{i=1}^n (x_i - \mu)^2 = 0,$$

which is $\rho$ times the previous estimating equation. Hence,

$$\hat{\gamma} = \frac{\widehat{\gamma\rho}}{\hat{\rho}},$$

where $\hat{\rho}$ is the MLE of $\rho$, obtained next.

Setting $\partial\ell(\rho,\gamma,\delta)/\partial\rho = 0$ and solving the equation: writing $e_i = y_i - \mu + \delta - \gamma\rho(x_i - \mu)$, differentiating $\ell$ with respect to $\rho$ gives

$$\frac{\partial\ell(\rho,\gamma,\delta)}{\partial\rho} = \frac{n\rho}{1-\rho^2} + \frac{\gamma}{\sigma^2(1-\rho^2)}\sum_{i=1}^n e_i(x_i - \mu) - \frac{\rho}{\sigma^2(1-\rho^2)^2}\sum_{i=1}^n e_i^2.$$

At the solutions $\hat{\delta}$ and $\widehat{\gamma\rho}$ of the previous estimating equations, the middle sum vanishes, with $\hat{e}_i = y_i - \bar{y} - \widehat{\gamma\rho}(x_i - \bar{x})$. Thus $\partial\ell/\partial\rho = 0$ reduces to

$$\frac{n\rho}{1-\rho^2} = \frac{\rho \sum_{i=1}^n \hat{e}_i^2}{\sigma^2(1-\rho^2)^2}, \qquad \text{i.e.,} \qquad 1 - \rho^2 = \frac{\sum_{i=1}^n \hat{e}_i^2}{n\sigma^2}.$$

Hence, the MLE of $\rho$ is

$$\hat{\rho} = \sqrt{1 - \frac{\sum_{i=1}^n \hat{e}_i^2}{n\sigma^2}} = \sqrt{1 - \frac{\sum_{i=1}^n [y_i - \bar{y} - \widehat{\gamma\rho}(x_i - \bar{x})]^2}{n\sigma^2}}$$

and the results of the MLE derivation for the dual effects model are

$$\hat{\delta} = \widehat{\gamma\rho}(\bar{x} - \mu) - (\bar{y} - \mu), \qquad \widehat{\gamma\rho} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2},$$

$$\hat{\gamma} = \frac{\widehat{\gamma\rho}}{\hat{\rho}}, \qquad \hat{\rho} = \sqrt{1 - \frac{\sum_{i=1}^n [y_i - \bar{y} - \widehat{\gamma\rho}(x_i - \bar{x})]^2}{n\sigma^2}}.$$

To compute the inverse of the empirical information matrix, we can rewrite model (2.4) as

$$f(y \mid x) = \frac{1}{(2\pi)^{1/2}\left(\sigma^2(1-\rho^2)\right)^{1/2}} \exp\left\{ -\frac{[y - \mu + \delta - \gamma\rho(x - \mu)]^2}{2\sigma^2(1-\rho^2)} \right\}. \qquad (2.7)$$

Taking the natural log of (2.7),

$$\ln f(y \mid x) = -\frac{1}{2}\ln(2\pi) - \frac{1}{2}\ln\left(\sigma^2(1-\rho^2)\right) - \frac{[y - \mu + \delta - \gamma\rho(x - \mu)]^2}{2\sigma^2(1-\rho^2)}. \qquad (2.8)$$

Taking the first partial derivative of (2.8) with respect to $\delta$,

$$J_3 = \frac{\partial \ln f(y \mid x)}{\partial \delta} = -\frac{y - \mu + \delta - \gamma\rho(x - \mu)}{\sigma^2(1-\rho^2)}. \qquad (2.9)$$

Taking the second partial derivative of (2.9) with respect to $\delta$,

$$J_{33} = \frac{\partial^2 \ln f(y \mid x)}{\partial \delta^2} = -\frac{1}{\sigma^2(1-\rho^2)}. \qquad (2.10)$$

Taking the expectation of both sides of (2.10), we get

$$I_{33} = -E(J_{33}) = \frac{1}{\sigma^2(1-\rho^2)}. \qquad (2.11)$$

Taking the second partial derivative of (2.9) with respect to $\gamma$,

$$J_{32} = \frac{\partial^2 \ln f(y \mid x)}{\partial \delta\,\partial \gamma} = \frac{\rho(x-\mu)}{\sigma^2(1-\rho^2)}. \qquad (2.12)$$

Taking the expectation of both sides of (2.12), since $E(x) = \mu$,

$$I_{32} = -E(J_{32}) = -\frac{\rho\,E(x-\mu)}{\sigma^2(1-\rho^2)} = 0, \qquad (2.13)$$

and by the symmetry of the information matrix,

$$I_{32} = I_{23}. \qquad (2.14)$$

Taking the second partial derivative of (2.9) with respect to $\rho$,

$$J_{31} = \frac{\partial^2 \ln f(y \mid x)}{\partial \delta\,\partial \rho} = -\frac{[y - \mu + \delta - \gamma\rho(x-\mu)](2\sigma^2\rho) - \sigma^2(1-\rho^2)\,\gamma(x-\mu)}{\left(\sigma^2(1-\rho^2)\right)^2}. \qquad (2.15)$$

Taking the expectation of both sides of (2.15), since $E(e) = 0$ and $E(x) = \mu$,

$$I_{31} = -E(J_{31}) = -\frac{E(e)(2\sigma^2\rho) - \sigma^2(1-\rho^2)\,\gamma(E(x)-\mu)}{\left(\sigma^2(1-\rho^2)\right)^2} = 0, \qquad \text{so} \qquad I_{31} = I_{13} = 0. \qquad (2.16)$$

Taking the first partial derivative of (2.8) with respect to $\gamma$,

$$J_2 = \frac{\partial \ln f(y \mid x)}{\partial \gamma} = \frac{\rho(x-\mu)\,[y - \mu + \delta - \gamma\rho(x-\mu)]}{\sigma^2(1-\rho^2)}. \qquad (2.17)$$

Taking the second partial derivative of (2.17) with respect to $\gamma$,

$$J_{22} = \frac{\partial^2 \ln f(y \mid x)}{\partial \gamma^2} = -\frac{\rho^2(x-\mu)^2}{\sigma^2(1-\rho^2)}. \qquad (2.18)$$

Taking the expectation of both sides of (2.18), since $E(x-\mu)^2 = \sigma^2$,

$$I_{22} = -E(J_{22}) = \frac{\rho^2}{1-\rho^2}. \qquad (2.19)$$

Taking the second partial derivative of (2.17) with respect to $\rho$, and writing $e = y - \mu + \delta - \gamma\rho(x-\mu)$,

$$J_{21} = \frac{\partial^2 \ln f(y \mid x)}{\partial \gamma\,\partial \rho} = \frac{(x-\mu)\,e}{\sigma^2(1-\rho^2)} - \frac{\gamma\rho(x-\mu)^2}{\sigma^2(1-\rho^2)} + \frac{2\rho^2(x-\mu)\,e}{\sigma^2(1-\rho^2)^2}. \qquad (2.20)$$

Taking the expectation of both sides of (2.20), using $E[(x-\mu)e] = 0$ and $E(x-\mu)^2 = \sigma^2$,

$$I_{21} = -E(J_{21}) = \frac{\gamma\rho}{1-\rho^2} = I_{12}. \qquad (2.21)$$

Taking the first partial derivative of (2.8) with respect to $\rho$, again with $e = y - \mu + \delta - \gamma\rho(x-\mu)$,

$$J_1 = \frac{\partial \ln f(y \mid x)}{\partial \rho} = \frac{\rho}{1-\rho^2} + \frac{\gamma(x-\mu)\,e}{\sigma^2(1-\rho^2)} - \frac{\rho\, e^2}{\sigma^2(1-\rho^2)^2}. \qquad (2.22)$$

Taking the second partial derivative of (2.22) with respect to $\rho$,

$$J_{11} = \frac{\partial^2 \ln f(y \mid x)}{\partial \rho^2} = \frac{1+\rho^2}{(1-\rho^2)^2} - \frac{\gamma^2(x-\mu)^2}{\sigma^2(1-\rho^2)} + \frac{4\rho\gamma(x-\mu)\,e}{\sigma^2(1-\rho^2)^2} - \frac{e^2}{\sigma^2(1-\rho^2)^2} - \frac{4\rho^2 e^2}{\sigma^2(1-\rho^2)^3}. \qquad (2.23)$$

Taking the expectation of both sides of (2.23), using $E(e) = 0$, $E(e^2) = \sigma^2(1-\rho^2)$, $E(x-\mu)^2 = \sigma^2$, and $E[(x-\mu)e] = 0$,

$$I_{11} = -E(J_{11}) = -\frac{1+\rho^2}{(1-\rho^2)^2} + \frac{\gamma^2}{1-\rho^2} + \frac{1}{1-\rho^2} + \frac{4\rho^2}{(1-\rho^2)^2} = \frac{2\rho^2}{(1-\rho^2)^2} + \frac{\gamma^2}{1-\rho^2}. \qquad (2.24)$$

Model (2.25) shows the second derivatives in matrix form:

$$J = \begin{pmatrix} J_{11} & J_{12} & J_{13} \\ J_{21} & J_{22} & J_{23} \\ J_{31} & J_{32} & J_{33} \end{pmatrix}. \qquad (2.25)$$

Taking the expectation of (2.25) gives the information matrix,

$$I = -E[J]. \qquad (2.26)$$

Plugging (2.11), (2.14), (2.16), (2.19), (2.21), and (2.24) into (2.26),

$$I = \begin{pmatrix} \dfrac{2\rho^2}{(1-\rho^2)^2} + \dfrac{\gamma^2}{1-\rho^2} & \dfrac{\gamma\rho}{1-\rho^2} & 0 \\[1ex] \dfrac{\gamma\rho}{1-\rho^2} & \dfrac{\rho^2}{1-\rho^2} & 0 \\[1ex] 0 & 0 & \dfrac{1}{\sigma^2(1-\rho^2)} \end{pmatrix}. \qquad (2.27)$$

Multiplying (2.27) by $n$ gives the information matrix for the sample, $I_n = nI$. Inverting $I_n$, the asymptotic variances of the MLEs are

$$\operatorname{Var}(\hat{\rho}) = \frac{(1-\rho^2)^2}{2n\rho^2}, \qquad \operatorname{Var}(\hat{\gamma}) = \frac{(1-\rho^2)\left[2\rho^2 + \gamma^2(1-\rho^2)\right]}{2n\rho^4}, \qquad \operatorname{Var}(\hat{\delta}) = \frac{\sigma^2(1-\rho^2)}{n}.$$

Substituting the estimates and taking square roots gives the standard error estimates,

$$SE(\hat{\rho}) = \frac{1-\hat{\rho}^2}{\hat{\rho}\sqrt{2n}}, \qquad SE(\hat{\gamma}) = \sqrt{\frac{(1-\hat{\rho}^2)\left[2\hat{\rho}^2 + \hat{\gamma}^2(1-\hat{\rho}^2)\right]}{2n\hat{\rho}^4}}, \qquad SE(\hat{\delta}) = \hat{\sigma}\sqrt{\frac{1-\hat{\rho}^2}{n}}.$$
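The closed-form MLEs and standard errors translate directly into code. This sketch treats $\mu$ and $\sigma$ as known, as in the derivation; the SE expressions follow the simplified forms given in this section, which are a reconstruction and should be treated as illustrative.

```python
import numpy as np

def dual_model_mle(x, y, mu, sigma):
    """Closed-form MLEs (delta, gamma*rho, rho, gamma) for the dual effects
    model, with mu and sigma treated as known."""
    xbar, ybar = x.mean(), y.mean()
    gr = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    delta = gr * (xbar - mu) - (ybar - mu)
    resid = y - ybar - gr * (x - xbar)
    rho = np.sqrt(max(1.0 - np.sum(resid ** 2) / (len(y) * sigma ** 2), 0.0))
    return delta, gr, rho, gr / rho

def standard_errors(rho, sigma, n):
    """SE(rho_hat) and SE(delta_hat) from the inverse information matrix."""
    q = 1.0 - rho ** 2
    return q / (rho * np.sqrt(2.0 * n)), sigma * np.sqrt(q / n)

rng = np.random.default_rng(4)
mu, sigma, rho0, gamma0, delta0, n = 0.0, 1.0, 0.6, 1.0, 0.5, 200_000
x = rng.normal(mu, sigma, n)
y = (mu - delta0 + gamma0 * rho0 * (x - mu)
     + rng.normal(0.0, sigma * np.sqrt(1.0 - rho0 ** 2), n))

delta, gr, rho, gamma = dual_model_mle(x, y, mu, sigma)
se_rho, se_delta = standard_errors(rho, sigma, n)
print(round(delta, 2), round(rho, 2), round(gamma, 2))
```

On this large simulated sample the estimates land essentially on the true values $(\delta, \rho, \gamma) = (0.5, 0.6, 1)$, and the standard errors shrink at the expected $1/\sqrt{n}$ rate.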

2.3 R and HBR Estimation for the Linear Regression Model

The dual effects model can be written as Y = μ + δ + γρ(X − μ) + e. We can rewrite the dual effects model as a linear regression model Y = α + βX + e, where α = μ + δ − γρμ and β = γρ; hence δ = α − μ + βμ. The efficient estimators of α and β depend on the underlying distribution. The R estimators (1.2) are better than the LS estimators when the distribution has thicker tails or allows outliers in the Y space. However, the R estimator is not robust to outliers in the X space. The HBR estimator (1.3), on the other hand, is robust to outliers in both the X and Y spaces and, further, has positive breakdown.

The estimates of γρ and δ based on the LS estimates of slope and intercept are

 γρ̂ = β̂   and   δ̂ = α̂ − μ + β̂μ.

Under normality, δ̂ is unbiased, consistent, and has minimum variance. However, δ̂ is sensitive to outliers and to deviations from the normality assumption.

The estimates of γρ and δ based on the R estimates (1.2) of slope and intercept are

 γρ̂_R = β̂_R

and

 δ̂_R = α̂_R − μ + β̂_R μ,

while the estimates of γρ and δ based on the HBR estimates (1.3) of slope and intercept are

 γρ̂_HBR = β̂_HBR   and   δ̂_HBR = α̂_HBR − μ + β̂_HBR μ.

Recall that the MLE of ρ is

 ρ̂ = √(1 − σ̂ₑ²/σ²),   where σ̂ₑ² = (1/n) Σᵢ₌₁ⁿ êᵢ².

In practice we often do not know σ². We then estimate it by

 σ̂² = (1/(n + m − 1)) Σᵢ₌₁ⁿ⁺ᵐ (xᵢ − x̄)²,

the nonparametric estimate of σ² based on the combined sample. Thus, in practice,

 ρ̂ = √(1 − σ̂ₑ²/σ̂²).

For a robust estimate of ρ, it seems natural to use a ratio of robust scale estimators in place of σ̂ₑ²/σ̂². We first look at this generally, obtaining a general consistency result. This is followed by a discussion of explicit estimators.
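To make the estimation pipeline of this section concrete, the following Python sketch simulates data from the dual effects model, fits the linear regression by least squares, and recovers γρ̂, δ̂, and the plug-in ρ̂. All numerical settings are illustrative assumptions, and the select-sample truncation is ignored here for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumptions, not from the dissertation)
mu, sigma, rho = 200.0, 15.0, 0.8
gamma, delta = 1.0, 0.0          # no treatment effect beyond regression to the mean
n, m = 50, 150

# Generate Y = mu + delta + gamma*rho*(X - mu) + e, with e ~ N(0, sigma^2 (1 - rho^2)).
# (Truncation of x above a cutoff is ignored in this sketch.)
x = mu + sigma * rng.standard_normal(n)
e = sigma * np.sqrt(1 - rho**2) * rng.standard_normal(n)
y = mu + delta + gamma * rho * (x - mu) + e
x_extra = mu + sigma * rng.standard_normal(m)   # extra x's, used only for sigma-hat

# LS fit of the linear regression Y = alpha + beta*X + e
beta_hat, alpha_hat = np.polyfit(x, y, 1)       # returns (slope, intercept)

# Recover the dual-effects parameters: beta = gamma*rho and delta = alpha - mu + beta*mu
gamma_rho_hat = beta_hat
delta_hat = alpha_hat - mu + beta_hat * mu

# Plug-in estimate of rho from the ratio of error to x variability
resid = y - (alpha_hat + beta_hat * x)
sigma_e2 = np.mean(resid**2)
sigma2 = np.var(np.concatenate([x, x_extra]), ddof=1)
rho_hat = np.sqrt(max(0.0, 1 - sigma_e2 / sigma2))
```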

2.4 Scale Functionals

For this discussion, functional notation is convenient. Let W be a continuous random variable with cumulative distribution function (cdf) F(w) and probability density function (pdf) f(w). Then θ is a functional if it maps F into R, where R denotes the set of real numbers. We use the notation θ(F), θ(f), and θ(W) interchangeably. This is an abuse of notation, because θ is a function of a cdf; i.e., θ(W) means θ(F_W).

Definition: Let W be a random variable with cdf F. θ is a scale functional if for all a ∈ R and b > 0,

 θ(W + a) = θ(W)   and   θ(bW) = bθ(W).

Consider our regression dual effects normal model (2.3), which we rewrite as Y = μ + δ + γρ(x − μ) + e. Note that x is a continuous random variable.

Theorem: Let θ be a scale functional. Then θ(e) = √(1 − ρ²) θ(x).

Proof: For the normal model we have, in distribution, e = √(1 − ρ²) σZ, where Z is N(0, 1). Since θ is a scale functional, θ(e) = √(1 − ρ²) σθ(Z).

Also, we can write X − μ = σZ in distribution, where Z is N(0, 1). Hence, by location invariance, θ(X) = σθ(Z). Thus

 θ(e) = √(1 − ρ²) σ[θ(X)/σ] = √(1 − ρ²) θ(X).

Therefore θ(e)/θ(X) = √(1 − ρ²); that is,

 ρ² = 1 − [θ(e)/θ(X)]².

2.5 Scale Functionals and Estimators

Returning to the robust estimation of Model (2.3), suppose we select θ as our scale functional.

(1) The estimates of γρ and δ based on the R and HBR estimates of slope and intercept are

 γρ̂_R = β̂_R,   δ̂_R = α̂_R − μ + β̂_R μ,

and

 γρ̂_HBR = β̂_HBR,   δ̂_HBR = α̂_HBR − μ + β̂_HBR μ.

(2) Form the residuals

 êᵢ = yᵢ − (μ + δ̂) − γρ̂(xᵢ − μ).

Denote our estimate of θ by θ̂_{n,e} = θ̂(ê₁, …, ê_n). Assume that θ̂_{n,e} is a consistent estimate of θ(e); i.e., θ̂_{n,e} → θ(e) in probability as n → ∞.

In our problem, we sometimes have a value of σ² obtained from subject matter considerations. We can then use the normality assumption of the model to determine θ(X). Otherwise we have to estimate θ(X) based on the large sample; i.e., θ̂_{n+m,x} = θ̂(x₁, …, x_n, x_{n+1}, …, x_{n+m}). Assume that θ̂_{n+m,x} → θ(X) in probability as (n + m) → ∞. Finally, let

 ρ̂ = √(1 − θ̂²(ê₁, …, ê_n)/θ̂²(x₁, …, x_{n+m})).

Since both scale estimates are consistent, ρ̂ is a consistent estimator of ρ.

Theorem: Using the notation above, ρ̂ → ρ in probability.

Proof:

 1 − θ̂²(ê₁, …, ê_n)/θ̂²(x₁, …, x_{n+m}) → 1 − θ²(e)/θ²(X) = 1 − (1 − ρ²) = ρ².
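The identity ρ² = 1 − θ²(e)/θ²(X) behind this consistency result can be illustrated numerically. The sketch below uses the MAD functional of Section 2.6.3 as θ, applied to large simulated samples of e and X; the parameter values and sample size are assumptions chosen for illustration.

```python
import numpy as np

def mad(w):
    """Scale functional theta_MAD: 1.4826 * med|w - med(w)|,
    consistent for the standard deviation at the normal."""
    w = np.asarray(w)
    return 1.4826 * np.median(np.abs(w - np.median(w)))

rng = np.random.default_rng(1)
sigma, rho, N = 15.0, 0.7, 100_000   # illustrative assumptions

# e and X as in the normal model: e = sqrt(1 - rho^2)*sigma*Z, X - mu = sigma*Z
e = np.sqrt(1 - rho**2) * sigma * rng.standard_normal(N)
x = sigma * rng.standard_normal(N)

# Theorem of Section 2.4: theta(e) = sqrt(1 - rho^2) * theta(X),
# so rho^2 = 1 - theta^2(e)/theta^2(X); the plug-in estimate recovers rho.
rho_hat = np.sqrt(1 - mad(e)**2 / mad(x)**2)
```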

2.6 Examples of Scale Functionals and Estimators

In this section we discuss several scale functionals and estimators that we use in this study. As we note, all of these are consistent and, hence, lead to consistent estimators of ρ. In Chapter 3, we investigate how well they do in practice.

2.6.1 The Dispersion Function D_φ

Recall that our robust regression estimator of γρ minimizes the dispersion function D_φ; i.e.,

 γρ̂_φ = Argmin D_φ(γρ),  (2.30)

where D_φ(γρ) = Σᵢ₌₁ⁿ a_φ[R(yᵢ − γρxᵢ)](yᵢ − γρxᵢ). Note that D_φ is invariant to location and equivariant to scale. As discussed in Hettmansperger and McKean (2011, p. 201), the functional corresponding to D_φ is

 D_φ(F) = ∫ t φ(F(t)) f(t) dt.

Theorem: Let X be a random variable. Then D_φ(F) is a scale functional.

Proof: We want to show that, for all b and all a > 0, D(aX) = aD(X) and D(X + b) = D(X). First consider e = aX, a > 0. Then

we have

 D(e) = ∫ x φ(F_e(x)) f_e(x) dx.  (2.31)

Assume that X has cumulative distribution function (cdf) F_x and probability density function (pdf) f_x. Then

 F_e(t) = P(e ≤ t) = P(aX ≤ t) = P(X ≤ t/a) = F_x(t/a),   f_e(t) = (1/a) f_x(t/a).

From Model (2.31), with the substitution s = t/a,

 D(e) = ∫ t φ(F_e(t)) f_e(t) dt = ∫ t φ(F_x(t/a)) (1/a) f_x(t/a) dt = a ∫ s φ(F_x(s)) f_x(s) ds = aD(X).

Next consider e = X + b, −∞ < b < ∞. Again let X have cdf F_x and pdf f_x. Then

 F_e(t) = P(e ≤ t) = P(X + b ≤ t) = P(X ≤ t − b) = F_x(t − b).

From Model (2.31), with the substitution s = t − b and using the fact that the scores are standardized so that ∫₀¹ φ(u) du = 0,

 D(e) = ∫ t φ(F_e(t)) f_e(t) dt = ∫ t φ(F_x(t − b)) f_x(t − b) dt = ∫ (s + b) φ(F_x(s)) f_x(s) ds = D(X).

As discussed in Hettmansperger and McKean (2011, p. 201), the residual dispersion function converges in probability to D_φ(F):

 D_φ(ê_n) → D_φ(F).

From Model (2.5), we can rewrite the estimator of ρ in terms of the dispersion function as

 ρ̂_{D_φ} = √(1 − D²_φ(ê₁, …, ê_n)/D²_φ(x₁, …, x_{n+m})).

Theorem: Using the notation above, ρ̂_{D_φ} → ρ in probability.

Proof: By the theorem of Section 2.4, D(e) = √(1 − ρ²) D(x). Plugging D(e) into the expression above,

 1 − D²(e)/D²(x) = 1 − (√(1 − ρ²))² D²(x)/D²(x) = 1 − (1 − ρ²) = ρ².

Therefore, we conclude that ρ̂_{D_φ} → ρ.

2.6.2 The Parameter τ_φ

Define the parameter τ by

 τ⁻¹ = ∫₀¹ φ(u) φ_f(u) du,  (2.32)

where φ_f(u) = −f′(F⁻¹(u))/f(F⁻¹(u)). Hettmansperger and McKean (2011, p. 178) showed that τ_φ is a scale parameter. Let u = F(x), so du = f(x) dx. We then have

 τ⁻¹ = −∫ φ[F(x)] f′(x) dx.

Let y = a + bx with b > 0. Hence

 F_y(y) = P(X ≤ (y − a)/b) = F((y − a)/b).

Let f_y(y) = (1/b) f((y − a)/b), so f_y′(y) = (1/b²) f′((y − a)/b). We then have

 τ_y⁻¹ = −∫ φ[F_y(y)] f_y′(y) dy.  (2.33)

Let z = (y − a)/b, so dz = (1/b) dy. Plugging z and dz into Model (2.33), we get

 τ_y⁻¹ = −∫ φ[F(z)] (1/b²) f′(z) b dz = (1/b) τ⁻¹.

We then have τ_y = bτ; this means that τ is a scale functional.

As an estimate of τ, we use the estimator developed by Koul, Sievers, and McKean (1987). This estimator is a consistent estimator of τ under both symmetric and asymmetric errors; see also Hettmansperger, McKean, and Sheather (2003). An informative discussion of this estimator can be found in Hettmansperger and McKean (2011).

2.6.3 Median Absolute Deviation (MAD)

The median absolute deviation from the median, called the MAD, is a common resistant measure of scale (Mosteller and Tukey, 1977). Let w₁, …, w_n be a sample. The MAD is defined as the median of the absolute deviations of the wᵢ from the median of the wⱼ (Lax, 1985); that is,

 MAD = medᵢ |wᵢ − medⱼ wⱼ|.  (2.34)

The functional corresponding to the MAD is

 θ_MAD = 1.4826 med|W − med(W)|.  (2.35)

It is easy to see that θ_MAD is a scale functional; that is, for all a, θ_MAD(W + a) = θ_MAD(W), and for b > 0, θ_MAD(bW) = bθ_MAD(W). The estimator θ̂_MAD is consistent (Mosteller and Tukey, 1977).

2.6.4 Example: Final and Midterm Exam Scores

The data consist of final and midterm exam scores from 20 students. Let X and Y denote the midterm and final exam scores of a student, respectively. Choose the cutoff point 70 for X. Then the select sample is (xᵢ, yᵢ), where xᵢ > 70; here n = 10 and m = 13. For μ we use the sample mean 70.1 of all the x's, and for σ their sample standard deviation. The midterm and final scores are shown in Table 1. The regression toward the mean plot (with X centered) is displayed in Figure 2; it contains a dotted line with slope equal to 1 and a solid line whose slope is less than 1, exhibiting the regression toward the mean.

We investigate the LS and Wilcoxon dispersion procedures. Terpstra and McKean (2005) have written a weighted Wilcoxon routine for the R statistical software package (R Development Core Team, 2005). For this study, we use this R program to perform our computations and refer to the resulting procedure as WD.
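A minimal stand-in for the weighted Wilcoxon computation (not the Terpstra–McKean routine itself) can be sketched in Python: it evaluates the Wilcoxon dispersion of the residuals and minimizes it over the slope by a grid search. The data-generating settings, including the planted outlier, are illustrative assumptions.

```python
import numpy as np

def wilcoxon_dispersion(resid):
    """D_phi(resid) = sum a[R(e_i)] * e_i with Wilcoxon scores
    a(i) = sqrt(12) * (i/(n+1) - 1/2)."""
    resid = np.asarray(resid)
    n = len(resid)
    ranks = np.argsort(np.argsort(resid)) + 1      # ranks R(e_i) of the residuals
    scores = np.sqrt(12) * (ranks / (n + 1) - 0.5)
    return np.sum(scores * resid)

# Toy data (assumed): y roughly 0.8*x plus heavy-tailed noise and one gross outlier
rng = np.random.default_rng(2)
x = rng.normal(0.0, 15.0, 60)
y = 0.8 * x + rng.standard_t(df=3, size=60)
y[0] += 100.0                                      # gross outlier in the Y space

# Rank-based slope: minimize the (convex) dispersion of y - b*x over b
grid = np.linspace(-2.0, 3.0, 2001)
disp = np.array([wilcoxon_dispersion(y - b * x) for b in grid])
slope_w = grid[np.argmin(disp)]
```

Despite the outlier, the rank-based slope stays close to the true value 0.8, illustrating the robustness in the Y space discussed in Section 2.3.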

Table 1: Final and Midterm Exam Scores

 Student   Midterm Score   Final Score

Table 2: Mean Parameters

 Procedure   γρ̂   ρ̂   γ̂   δ̂

Figure 2: Regression toward the Mean Plot (x-axis: Midterm Scores)

The mean estimate of γ is higher for one procedure, while the mean estimates of ρ and δ are higher for the other (Table 2). For either the LS or the Wilcoxon procedure, based on the results in Table 3, we reject the overall null hypothesis, since 0 ∉ I*_δ or 1 ∉ I*_γ. Based on the confidence intervals, there are both additive and multiplicative effects for this data set; hence, there is no simple interpretation.

Table 3: 95% Bootstrap Confidence Intervals (BCI)

 Procedure   γρ                 ρ                  γ                  δ
             (1.0761, 2.2708)   (0.6522, 0.9532)   (1.2827, 2.9679)   ( , )
             (0.9375, 2.3214)   (0.6857, 0.9632)   (1.0596, 2.8563)   ( , )

CHAPTER III

MONTE CARLO STUDY FOR DUAL TREATMENT EFFECTS AND THE MULTIVARIATE NORMAL MODEL

In this chapter, we consider a simulation study to investigate the traditional and R procedures for the dual effects model. Our summary of this study includes the empirical means, the empirical MSEs, the empirical AREs, the empirical confidence coefficients, and the empirical levels for the four parameters γρ, ρ, γ, and δ. Both pretreatment and posttreatment variables are generated from dual treatment effects models using multivariate normal variates. We specify two settings for ρ, 0.7 and 0.8, so the regression effect is on in all situations. We also consider different scenarios for the additive (δ) and the multiplicative (γ) effects.

For our Monte Carlo study, we simulate a clinical study in which only a patient whose response exceeds a prespecified threshold is treated. In our case, we set the threshold at the third quartile of the initial response. Hence, for a given bivariate distribution of a random vector (X, Y), a simulated sample is obtained as follows. Generate a realization (x, y) from the bivariate distribution. If x > q_{X,3}, where q_{X,3} is the third quartile of X, then retain (x, y); otherwise, retain only x. In our simulation studies, we set the size of the bivariate sample at n = 50. Hence, for each simulation we have the bivariate sample (x₁, y₁), …, (x_n, y_n) and, in addition, the sample of x's below q_{X,3}, which we label x_{n+1}, …, x_{n+m} (as in Chapter 2). We expect m to be 150, but it may not be. This adds some variation into the process, but it is similar to what occurs in clinical practice. Further, in practice the mean (μ) and standard deviation (σ) of X may be known; i.e., estimated from a very large sample. If not, they are often estimated from the combined sample of

x's; i.e., all n + m realizations of X. In our simulation studies, we use the latter approach; that is, the combined sample of the x's is used to estimate μ and σ.

In this chapter, the simulated model is the bivariate normal under the dual treatment effects model. The bivariate pdf is given in expression (2.2). The parameters δ, γ, and ρ are set at different values depending on the situation. For all simulations, we set μ = 200, with σ held fixed as well.

3.1 Bootstrap Confidence Intervals (BCI)

Our model for the study is the dual effects model, which was discussed in Chapter 2. For convenience, we rewrite it as Y = μ + δ + γρ(x − μ) + e. In Chapter 2 we discussed traditional and robust estimation of the parameters of the model. In practice, though, we are often interested in the hypotheses of no treatment effect beyond the regression toward the mean effect; that is,

 H₀: δ = 0 and γ = 1   versus   H_a: δ ≠ 0 or γ ≠ 1.

We consider testing these hypotheses using the following bootstrap confidence intervals.

The bootstrap confidence intervals are calculated as follows. A sample of size n is drawn from (x₁, y₁), …, (x_n, y_n) with replacement. This gives new estimates (δ̂*, γ̂*). The bootstrap joint distribution of (δ̂, γ̂) is obtained by repeating the procedure B times to get (δ̂₁*, γ̂₁*), …, (δ̂_B*, γ̂_B*). A 95% confidence interval for, say, δ is given by the 2.5th and 97.5th percentiles of the bootstrap distribution of δ̂. These bootstrap-percentile confidence intervals are denoted 95% BCI (Naranjo and McKean, 2001).
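The percentile bootstrap just described can be sketched as follows. For simplicity, the sketch bootstraps the LS versions of γρ̂ and δ̂ from Section 2.3; the simulated data settings are assumptions for illustration.

```python
import numpy as np

def ls_estimates(x, y, mu):
    """LS estimates of (gamma*rho, delta) from the select sample,
    via the linear-regression reparameterization of Section 2.3."""
    beta_hat, alpha_hat = np.polyfit(x, y, 1)
    return beta_hat, alpha_hat - mu + beta_hat * mu

def percentile_bci(x, y, mu, B=2000, level=0.95, seed=0):
    """Bootstrap-percentile CIs for gamma*rho and delta: resample the pairs
    (x_i, y_i) with replacement B times and take the (2.5, 97.5) percentiles
    of the bootstrap estimates (for level = 0.95)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    boot = np.empty((B, 2))
    for b in range(B):
        idx = rng.integers(0, n, n)               # sample of size n with replacement
        boot[b] = ls_estimates(x[idx], y[idx], mu)
    lo, hi = 100 * (1 - level) / 2, 100 * (1 + level) / 2
    return (np.percentile(boot[:, 0], [lo, hi]),  # BCI for gamma*rho
            np.percentile(boot[:, 1], [lo, hi]))  # BCI for delta

# Illustrative use on simulated data (mu, sigma, rho are assumptions)
rng = np.random.default_rng(3)
mu, sigma, rho = 200.0, 15.0, 0.8
x = mu + sigma * rng.standard_normal(50)
y = mu + rho * (x - mu) + sigma * np.sqrt(1 - rho**2) * rng.standard_normal(50)
ci_gr, ci_delta = percentile_bci(x, y, mu)
```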

3.2 Tests of Significance

The treatment effect components δ and γ may be tested by using the bootstrap confidence intervals. However, as in estimation, Naranjo and McKean (2001) recommend the use of bootstrap-based inference over the likelihood-based standard errors. A marginal test with a 5% level of significance against the alternative hypothesis δ ≠ 0 can be conducted by constructing a 95% bootstrap-percentile confidence interval I*_δ for δ and checking whether it contains the value 0. Similarly, the alternative hypothesis γ ≠ 1 is detected if the 95% confidence interval I*_γ does not contain 1.

An overall test against the general alternative (δ ≠ 0 or γ ≠ 1) is conducted as follows. Let I*_δ and I*_γ be bootstrap-percentile 97.5% confidence intervals for δ and γ, respectively. A Bonferroni test with at most a 5% level of significance for testing H₀: δ = 0, γ = 1 versus H_a: δ ≠ 0 or γ ≠ 1 is: Reject H₀ if 0 ∉ I*_δ or 1 ∉ I*_γ.

3.3 Simulation Study

Dual Treatment Effects Model of Different Scenarios with 20% and 30% Regression to the Mean

Our model for this simulation study is the dual effects model (2.3). As noted, in this chapter we use the bivariate normal distribution to generate the variates. We investigate the procedures in the following four situations:

(H1) δ = 0, γ = 1
(H2) δ = 0, γ = 0.6
(H3) δ = 0.7, γ = 1

(H4) δ = 0.7, γ = 0.6

Note that H1 is the overall null hypothesis, while H2 and H3 are marginal null situations. As stated before, all four situations are simulated with 20% (ρ = 0.8) and 30% (ρ = 0.7) regression to the mean. We run 1000 simulations for each situation; further, within each simulation the bootstrap size is held fixed.

Our simulation study focuses on comparing the behavior of the traditional (LS-based) procedure and the robust procedures based on the Wilcoxon fit. In Chapter 2, we presented robust estimates of ρ based on the Wilcoxon dispersion functional (2.30), the functional τ (2.32), and the functional MAD. The estimate of each of these functionals, along with the Wilcoxon estimates of the other parameters, constitutes a procedure. Early investigations, though, showed that the estimates based on τ and MAD led to too-frequent negative estimates of ρ². This was not the case for the dispersion function, so for our study we consider only the procedure based on the dispersion function. We label this Wilcoxon dispersion procedure WD.

We summarize the results for the estimation of the parameters in the tables that follow. The summaries include the empirical mean squared errors (MSE), the empirical asymptotic relative efficiencies (ratios of MSEs), and the empirical confidences for nominal 95% and 97.5% confidence intervals. There are two testing situations:

1. Full test (α_F): H₀: δ = 0, γ = 1 versus H_a: δ ≠ 0 or γ ≠ 1. Reject H₀ if 0 ∉ I*_δ or 1 ∉ I*_γ. Recall that we use the Bonferroni procedure for this hypothesis, with separate nominal level 0.025 for each interval; hence, the nominal level is at most α = 0.05. We label the empirical level α̂_F.
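The full (Bonferroni) test can be sketched directly from the bootstrap replicates. The toy bootstrap distributions below are assumptions, chosen so that the multiplicative component is active while the additive one is not.

```python
import numpy as np

def bonferroni_overall_test(delta_boot, gamma_boot, level=0.05):
    """Overall test of H0: delta = 0 and gamma = 1. Each component gets a
    bootstrap-percentile 100*(1 - level/2)% = 97.5% interval; reject H0 if
    0 is outside I*_delta or 1 is outside I*_gamma (level at most 5%)."""
    a = 100 * (level / 2) / 2          # 1.25th percentile for a 97.5% interval
    d_lo, d_hi = np.percentile(delta_boot, [a, 100 - a])
    g_lo, g_hi = np.percentile(gamma_boot, [a, 100 - a])
    return not (d_lo <= 0.0 <= d_hi) or not (g_lo <= 1.0 <= g_hi)

# Toy bootstrap distributions (assumed): delta-hat* centered at 0,
# gamma-hat* centered at 1.6, so the gamma interval excludes 1
rng = np.random.default_rng(4)
delta_boot = rng.normal(0.0, 1.0, 3000)
gamma_boot = rng.normal(1.6, 0.2, 3000)

print(bonferroni_overall_test(delta_boot, gamma_boot))   # → True (reject H0)
```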

2. Marginal tests:

 H₀: δ = 0 versus H_a: δ ≠ 0   and   H₀: γ = 1 versus H_a: γ ≠ 1.

For the δ marginal test (α̂_{m,δ}): reject H₀ if 0 ∉ I*_δ. For the γ marginal test (α̂_{m,γ}): reject H₀ if 1 ∉ I*_γ. The marginal tests have nominal level 0.05.

Table 4: Empirical MSE when ρ = 0.8

 Situation          Procedure   γρ̂   ρ̂   γ̂   δ̂
 H1 when ρ = 0.8
 H2 when ρ = 0.8
 H3 when ρ = 0.8
 H4 when ρ = 0.8

The empirical MSEs over all situations for ρ = 0.8 are less than those for ρ = 0.7, for both the LS and WD methods. In the case of γρ, the MSEs of the WD estimates are greater than those of the LS estimates in the H1, H2, H3, and H4 situations, for both ρ = 0.8 and ρ = 0.7 (Table 4 and Table 5). This last result is not surprising because the LS procedure is more efficient at the normal model.
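The selection mechanism described at the start of this chapter can be sketched as follows. The parameter values are illustrative assumptions; pairs are drawn until n = 50 treated subjects are obtained, so that m is random with mean about 150.

```python
import numpy as np

rng = np.random.default_rng(5)

# Assumed illustrative parameters
mu, sigma, rho, gamma, delta = 200.0, 15.0, 0.8, 1.0, 0.0
q3 = mu + 0.6745 * sigma          # third quartile of N(mu, sigma^2)

pairs, x_only = [], []
while len(pairs) < 50:            # keep drawing until n = 50 treated subjects
    x = mu + sigma * rng.standard_normal()
    if x > q3:
        e = sigma * np.sqrt(1 - rho**2) * rng.standard_normal()
        y = mu + delta + gamma * rho * (x - mu) + e
        pairs.append((x, y))      # treated: both measurements retained
    else:
        x_only.append(x)          # untreated: only the baseline x is retained

# m is random; on average about 150 untreated x's accompany the n = 50 pairs
n, m = len(pairs), len(x_only)
```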


More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang

More information

Psychology 282 Lecture #4 Outline Inferences in SLR

Psychology 282 Lecture #4 Outline Inferences in SLR Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

MULTIVARIATE PROBABILITY DISTRIBUTIONS

MULTIVARIATE PROBABILITY DISTRIBUTIONS MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Joseph W. McKean 1. INTRODUCTION

Joseph W. McKean 1. INTRODUCTION Statistical Science 2004, Vol. 19, No. 4, 562 570 DOI 10.1214/088342304000000549 Institute of Mathematical Statistics, 2004 Robust Analysis of Linear Models Joseph W. McKean Abstract. This paper presents

More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete

More information

Lecture 3. Inference about multivariate normal distribution

Lecture 3. Inference about multivariate normal distribution Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates

More information

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Weihua Zhou 1 University of North Carolina at Charlotte and Robert Serfling 2 University of Texas at Dallas Final revision for

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University. Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

1 Probability and Random Variables

1 Probability and Random Variables 1 Probability and Random Variables The models that you have seen thus far are deterministic models. For any time t, there is a unique solution X(t). On the other hand, stochastic models will result in

More information

Non-parametric Inference and Resampling

Non-parametric Inference and Resampling Non-parametric Inference and Resampling Exercises by David Wozabal (Last update. Juni 010) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend surfing

More information

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample

More information

A Gentle Introduction to Gradient Boosting. Cheng Li College of Computer and Information Science Northeastern University

A Gentle Introduction to Gradient Boosting. Cheng Li College of Computer and Information Science Northeastern University A Gentle Introduction to Gradient Boosting Cheng Li chengli@ccs.neu.edu College of Computer and Information Science Northeastern University Gradient Boosting a powerful machine learning algorithm it can

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

Rank-Based Estimation and Associated Inferences. for Linear Models with Cluster Correlated Errors

Rank-Based Estimation and Associated Inferences. for Linear Models with Cluster Correlated Errors Rank-Based Estimation and Associated Inferences for Linear Models with Cluster Correlated Errors John D. Kloke Bucknell University Joseph W. McKean Western Michigan University M. Mushfiqur Rashid FDA Abstract

More information

A L A BA M A L A W R E V IE W

A L A BA M A L A W R E V IE W A L A BA M A L A W R E V IE W Volume 52 Fall 2000 Number 1 B E F O R E D I S A B I L I T Y C I V I L R I G HT S : C I V I L W A R P E N S I O N S A N D TH E P O L I T I C S O F D I S A B I L I T Y I N

More information

Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with an Arrhenius rate relationship

Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with an Arrhenius rate relationship Scholars' Mine Doctoral Dissertations Student Research & Creative Works Spring 01 Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with

More information

ACM 116: Lectures 3 4

ACM 116: Lectures 3 4 1 ACM 116: Lectures 3 4 Joint distributions The multivariate normal distribution Conditional distributions Independent random variables Conditional distributions and Monte Carlo: Rejection sampling Variance

More information

Multivariate Distributions

Multivariate Distributions IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Review of Statistics

Review of Statistics Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and

More information

Applied Multivariate and Longitudinal Data Analysis

Applied Multivariate and Longitudinal Data Analysis Applied Multivariate and Longitudinal Data Analysis Chapter 2: Inference about the mean vector(s) Ana-Maria Staicu SAS Hall 5220; 919-515-0644; astaicu@ncsu.edu 1 In this chapter we will discuss inference

More information

Measure-theoretic probability

Measure-theoretic probability Measure-theoretic probability Koltay L. VEGTMAM144B November 28, 2012 (VEGTMAM144B) Measure-theoretic probability November 28, 2012 1 / 27 The probability space De nition The (Ω, A, P) measure space is

More information

Chapter 2. Review of basic Statistical methods 1 Distribution, conditional distribution and moments

Chapter 2. Review of basic Statistical methods 1 Distribution, conditional distribution and moments Chapter 2. Review of basic Statistical methods 1 Distribution, conditional distribution and moments We consider two kinds of random variables: discrete and continuous random variables. For discrete random

More information

Statistics Handbook. All statistical tables were computed by the author.

Statistics Handbook. All statistical tables were computed by the author. Statistics Handbook Contents Page Wilcoxon rank-sum test (Mann-Whitney equivalent) Wilcoxon matched-pairs test 3 Normal Distribution 4 Z-test Related samples t-test 5 Unrelated samples t-test 6 Variance

More information

The extreme points of symmetric norms on R^2

The extreme points of symmetric norms on R^2 Graduate Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 2008 The extreme points of symmetric norms on R^2 Anchalee Khemphet Iowa State University Follow this and additional

More information

, 0 x < 2. a. Find the probability that the text is checked out for more than half an hour but less than an hour. = (1/2)2

, 0 x < 2. a. Find the probability that the text is checked out for more than half an hour but less than an hour. = (1/2)2 Math 205 Spring 206 Dr. Lily Yen Midterm 2 Show all your work Name: 8 Problem : The library at Capilano University has a copy of Math 205 text on two-hour reserve. Let X denote the amount of time the text

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

Math Review Sheet, Fall 2008

Math Review Sheet, Fall 2008 1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the

More information

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at Regression Analysis when there is Prior Information about Supplementary Variables Author(s): D. R. Cox Source: Journal of the Royal Statistical Society. Series B (Methodological), Vol. 22, No. 1 (1960),

More information

Independent Component (IC) Models: New Extensions of the Multinormal Model

Independent Component (IC) Models: New Extensions of the Multinormal Model Independent Component (IC) Models: New Extensions of the Multinormal Model Davy Paindaveine (joint with Klaus Nordhausen, Hannu Oja, and Sara Taskinen) School of Public Health, ULB, April 2008 My research

More information

Hypothesis Testing. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA

Hypothesis Testing. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA Hypothesis Testing Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA An Example Mardia et al. (979, p. ) reprint data from Frets (9) giving the length and breadth (in

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 1 Bootstrapped Bias and CIs Given a multiple regression model with mean and

More information

Regression analysis based on stratified samples

Regression analysis based on stratified samples Biometrika (1986), 73, 3, pp. 605-14 Printed in Great Britain Regression analysis based on stratified samples BY CHARLES P. QUESENBERRY, JR AND NICHOLAS P. JEWELL Program in Biostatistics, University of

More information

Unit 10: Simple Linear Regression and Correlation

Unit 10: Simple Linear Regression and Correlation Unit 10: Simple Linear Regression and Correlation Statistics 571: Statistical Methods Ramón V. León 6/28/2004 Unit 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regression analysis is a method for

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

What to do today (Nov 22, 2018)?

What to do today (Nov 22, 2018)? What to do today (Nov 22, 2018)? Part 1. Introduction and Review (Chp 1-5) Part 2. Basic Statistical Inference (Chp 6-9) Part 3. Important Topics in Statistics (Chp 10-13) Part 4. Further Topics (Selected

More information

2 (Statistics) Random variables

2 (Statistics) Random variables 2 (Statistics) Random variables References: DeGroot and Schervish, chapters 3, 4 and 5; Stirzaker, chapters 4, 5 and 6 We will now study the main tools use for modeling experiments with unknown outcomes

More information

Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution

Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution Pertanika J. Sci. & Technol. 18 (1): 209 221 (2010) ISSN: 0128-7680 Universiti Putra Malaysia Press Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution

More information

ON THE DISTRIBUTION OF RESIDUALS IN FITTED PARAMETRIC MODELS. C. P. Quesenberry and Charles Quesenberry, Jr.

ON THE DISTRIBUTION OF RESIDUALS IN FITTED PARAMETRIC MODELS. C. P. Quesenberry and Charles Quesenberry, Jr. .- ON THE DISTRIBUTION OF RESIDUALS IN FITTED PARAMETRIC MODELS C. P. Quesenberry and Charles Quesenberry, Jr. Results of a simulation study of the fit of data to an estimated parametric model are reported.

More information

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION FOR SAMPLE OF RAW DATA (E.G. 4, 1, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G / STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89

More information

Colby College Catalogue

Colby College Catalogue Colby College Digital Commons @ Colby Colby Catalogues College Archives: Colbiana Collection 1871 Colby College Catalogue 1871-1872 Colby College Follow this and additional works at: http://digitalcommonscolbyedu/catalogs

More information

Ch. 1: Data and Distributions

Ch. 1: Data and Distributions Ch. 1: Data and Distributions Populations vs. Samples How to graphically display data Histograms, dot plots, stem plots, etc Helps to show how samples are distributed Distributions of both continuous and

More information

Formulas for probability theory and linear models SF2941

Formulas for probability theory and linear models SF2941 Formulas for probability theory and linear models SF2941 These pages + Appendix 2 of Gut) are permitted as assistance at the exam. 11 maj 2008 Selected formulae of probability Bivariate probability Transforms

More information