Within Groups Comparisons of Least Squares Regression Lines When There Is Heteroscedasticity

Rand R. Wilcox, Dept of Psychology, University of Southern California
Florence Clark, Division of Occupational Science & Occupational Therapy, University of Southern California
WORD COUNT 3446
December 28,

ABSTRACT

Motivated by a problem that arose in the Well Elderly II study (Clark et al., 2012; Jackson et al., 2009), the paper deals with the situation where a least squares regression line is fitted to data at two different times and the goal is to test the hypothesis that the slopes and intercepts are equal in a manner that allows a heteroscedastic error term. A bootstrap estimate of the standard errors could be used to deal with heteroscedasticity, followed by a simple modification of Hotelling's test. But evidently there are no simulation results regarding the resulting control over the probability of a Type I error. Three related goals are testing the hypothesis of equal intercepts, ignoring the slopes; testing the hypothesis of equal slopes, ignoring the intercepts; and testing whether the regression lines differ at a specified design point. This last goal corresponds to the classic Johnson-Neyman method when dealing with independent groups. Another unknown is the impact on the actual Type I error probability when leverage points are removed. Here it is found that for various situations, removing leverage points has a minimal impact, but for certain patterns of heteroscedasticity, there is a substantial improvement in control over the Type I error probability when the sample size is small.

Keywords: analysis of covariance, bootstrap methods, heteroscedasticity, Hotelling's test, Johnson-Neyman method, Well Elderly II study.

1 Introduction

The paper deals with what in essence is a within groups analysis of covariance design with a single covariate. At time $j$ ($j = 1, 2$), it is assumed that

$$Y_j = \beta_{0j} + \beta_{1j} X_j + \lambda(X_j)\epsilon_j, \tag{1}$$

where $\beta_{kj}$ ($k = 0, 1$; $j = 1, 2$) are unknown parameters, $\lambda(X_j)$ is some unknown function that models heteroscedasticity, and $\epsilon_j$ is a random variable having variance $\sigma_j^2$ and $E(\epsilon_j) = 0$. The paper considers the problem of testing

$$H_0: (\beta_{01}, \beta_{11}) = (\beta_{02}, \beta_{12}) \tag{2}$$

when using the ordinary least squares estimator. Two related goals are testing

$$H_0: \beta_{01} = \beta_{02} \tag{3}$$

and

$$H_0: \beta_{11} = \beta_{12}. \tag{4}$$

Yet another goal is, for a chosen value $X$, to test

$$H_0: E(Y_1 \mid X) = E(Y_2 \mid X). \tag{5}$$

For the case of independent groups, this last goal corresponds to the classic Johnson and Neyman (1936) method when there is homoscedasticity and the error term has a normal distribution.

Two general remarks are in order. First, it is assumed that there is explicit interest in determining the mean of $Y$, given $X$, rather than some other robust measure of location such as the median. For skewed distributions, it is evident that a robust measure of location can be argued to better reflect the typical value. The presumption is that at least in some situations, the mean might still provide a useful perspective, in which case least squares regression is more appropriate than a robust regression estimator. Second, the methods considered here assume asymptotic normality for reasons that will be evident. Some robust regression estimators are known to be asymptotically normal. But for others, either it is known that this is not necessarily the case or asymptotic normality has not been established. One example is the Theil (1950) and Sen (1968) estimator. Peng et al. (2008) established that the slope estimator may or may not be asymptotically normal. The point is that some obvious extensions of the methods in section 2, when using a robust regression estimator, are not necessarily appropriate.

Let $b_{kj}$ be the least squares estimate of $\beta_{kj}$. Of course, classic methods assume $\epsilon_j$ has a normal distribution and that $\lambda(x) \equiv 1$ (homoscedasticity). It is well known, however, that classic inferential methods, which assume homoscedasticity, are based on an incorrect estimate of the standard error of $b_{kj}$ when in fact there is heteroscedasticity (e.g., Godfrey, 2006; Long & Ervin, 2000).
For completeness, it is noted that several theoretically sound methods for estimating standard errors, when there is heteroscedasticity, have been derived (e.g., White, 1980; Hinkley, 1977; Cribari-Neto, 2004; Cribari-Neto, Souza & Vasconcellos,

2007; Cribari-Neto & da Silva, 2011). However, no details are given here because these estimators are not readily extended to the situation at hand where dependent groups are being compared. That is, these estimators do not provide an estimate of the covariance between $b_{k1}$ and $b_{k2}$, which is needed for present purposes.

The general strategy here is to use a basic bootstrap estimate of the standard errors that allows heteroscedasticity, followed by some obvious test statistics for testing (3) and (4). As for (2), a simple generalization of Hotelling's $T^2$ test statistic is used. A bootstrap estimate of the standard error also is used when testing (5).

Another goal in this study is to investigate the impact of removing leverage points (outliers among the values of the independent variable) on the probability of a Type I error. When using the OLS estimator, a well-known concern is that even a single leverage point can result in a fit that poorly reflects the association among the bulk of the points (e.g., Rousseeuw & Leroy, 1987; Staudte & Sheather, 1990; Heritier et al., 2009; Wilcox, 2012). That is, there are concerns about leverage points beyond their impact on the probability of a Type I error: power can be impacted as well. Simulation results reported here indicate that in many situations, removing leverage points does not alter the control over the Type I error probability by very much. However, for some situations, control over the Type I error probability is improved substantially, as will be seen.

The paper is organized as follows. Section 2 describes the methods used to test (2), (3), (4) and (5). Section 3 reports simulation results and section 4 illustrates the methods using data from the Well Elderly II study (Clark et al., 2012; Jackson et al., 2009), which motivated this paper.

2 Description of the Methods

This section describes the details of the methods to be studied via simulation.
The method for identifying leverage points is described first, followed by the bootstrap method for estimating the standard errors and then the methods for testing (2), (3), (4) and (5).

2.1 Identifying Leverage Points

Let $(Y_{11}, X_{11}, Y_{12}, X_{12}), \ldots, (Y_{n1}, X_{n1}, Y_{n2}, X_{n2})$ be a random sample from some four-variate distribution where all four random variables are possibly correlated. For a within subjects design, $(Y_{ij}, X_{ij})$ ($i = 1, \ldots, n$; $j = 1, 2$) represents the observations at time $j$. Let $M_j$ be the usual sample median based on $X_{1j}, \ldots, X_{nj}$. The median absolute deviation statistic at time $j$, $\mathrm{MAD}_j$, is the median of $|X_{1j} - M_j|, \ldots, |X_{nj} - M_j|$. At time $j$, the MAD-median rule declares the value $X$ an outlier if, for some specified constant $K$,

$$\frac{|X - M_j|}{\mathrm{MAD}_j / 0.6745} > K.$$

A common choice for $K$ is the square root of the .975 quantile of a chi-squared distribution with one degree of freedom (e.g., Rousseeuw & Leroy, 1987; Wilcox, 2012, p. 97) and this convention was used here. So $K$ is approximately 2.24.

Note that for the situation at hand, values declared outliers among the independent variable at time 1 are not necessarily the same as the values declared outliers at time 2. Here, removing leverage points means that for either possible value for $j$, the point $(X_{ij}, Y_{ij})$ is removed if $X_{ij}$ is declared an outlier among the values $X_{1j}, \ldots, X_{nj}$ using the MAD-median rule. That is, a point is removed if at either time 1 or time 2 the value of the independent variable is declared an outlier.

2.2 Estimating Standard Errors and Covariances

There is the issue of estimating the covariance matrix associated with $(d_0, d_1) = (b_{01} - b_{02}, b_{11} - b_{12})$ in a manner that allows heteroscedasticity. Here, a standard bootstrap method is used (e.g., Efron & Tibshirani, 1993). Generate a bootstrap sample by randomly sampling with replacement $n$ vectors of observations from $(Y_{11}, X_{11}, Y_{12}, X_{12}), \ldots, (Y_{n1}, X_{n1}, Y_{n2}, X_{n2})$, yielding $(Y^*_{11}, X^*_{11}, Y^*_{12}, X^*_{12}), \ldots, (Y^*_{n1}, X^*_{n1}, Y^*_{n2}, X^*_{n2})$. Based on this bootstrap sample, compute the least squares estimates of the slopes and intercepts and take the differences, yielding $(d^*_0, d^*_1)$.
Repeat this $B$ times, yielding $(d^*_{0b}, d^*_{1b})$, $b = 1, \ldots, B$. The covariance matrix associated with $(d_0, d_1)$ is estimated with the sample covariance matrix based on $(d^*_{0b}, d^*_{1b})$, $b = 1, \ldots, B$, which is denoted by $S$.
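The leverage-point rule of section 2.1 and the bootstrap covariance estimate described above can be sketched in Python as follows. This is a minimal illustration, not the authors' R code: the function names are hypothetical, each observation is assumed to be stored as a tuple (y1, x1, y2, x2), removing a leverage point is read as dropping the whole tuple, and the handling of MAD = 0 is an added assumption.

```python
import random
import statistics

K = 2.2414  # approx. square root of the .975 quantile of chi-squared, 1 df


def mad_median_outliers(x, k=K):
    """Flag each value with |x - M| / (MAD / 0.6745) > k."""
    m = statistics.median(x)
    mad = statistics.median([abs(v - m) for v in x])
    if mad == 0:
        # Degenerate case (assumption): any value off the median is flagged.
        return [v != m for v in x]
    return [abs(v - m) / (mad / 0.6745) > k for v in x]


def remove_leverage(rows, k=K):
    """Drop a (y1, x1, y2, x2) row if x1 or x2 is flagged at its own time."""
    flag1 = mad_median_outliers([r[1] for r in rows], k)
    flag2 = mad_median_outliers([r[3] for r in rows], k)
    return [r for r, a, b in zip(rows, flag1, flag2) if not (a or b)]


def ols(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def coef_diffs(rows):
    """Return (d0, d1) = (b01 - b02, b11 - b12)."""
    b01, b11 = ols([r[1] for r in rows], [r[0] for r in rows])
    b02, b12 = ols([r[3] for r in rows], [r[2] for r in rows])
    return b01 - b02, b11 - b12


def bootstrap_cov(rows, B=200, seed=None):
    """Sample covariance matrix S of the B bootstrap values (d0*, d1*)."""
    rng = random.Random(seed)
    n = len(rows)
    # Resample whole rows so the dependence between the two times is kept.
    boot = [coef_diffs([rows[rng.randrange(n)] for _ in range(n)])
            for _ in range(B)]
    d0 = [d[0] for d in boot]
    d1 = [d[1] for d in boot]
    m0, m1 = statistics.fmean(d0), statistics.fmean(d1)
    s01 = sum((a - m0) * (b - m1) for a, b in zip(d0, d1)) / (B - 1)
    return [[statistics.variance(d0), s01], [s01, statistics.variance(d1)]]
```

Resampling entire four-variate vectors, rather than each time point separately, is what allows the off-diagonal and cross-time covariances to be reflected in $S$.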

Given some value for the covariate, $X = x$ say, the squared standard error of $\hat{D} = \hat{Y}_1 - \hat{Y}_2$ is computed in a similar manner, where $\hat{Y}_j = b_{0j} + b_{1j}x$. Now let $\hat{Y}^*_j = b^*_{0j} + b^*_{1j}x$, where $b^*_{0j}$ and $b^*_{1j}$ are the bootstrap estimates of the intercept and slope, respectively. For $B$ bootstrap samples, this yields $\hat{Y}^*_{jb}$, $b = 1, \ldots, B$. The squared standard error of $\hat{D}$ is estimated with

$$U^2 = \frac{1}{B-1} \sum_{b=1}^{B} (\hat{D}^*_b - \bar{D}^*)^2,$$

where $\hat{D}^*_b = \hat{Y}^*_{1b} - \hat{Y}^*_{2b}$ and $\bar{D}^* = \sum \hat{D}^*_b / B$.

Three choices for $B$ were considered: 100, 200 and 500. Simulations indicated that increasing $B$ from 100 to 200 offered some improvement in terms of the probability of a Type I error. Increasing $B$ to 500 was found to provide little or no improvement, so $B = 200$ is assumed henceforth.

2.3 The Test Statistics

First consider testing (2). The test statistic is based on a simple modification of Hotelling's $T^2$ statistic for testing the hypothesis that a multivariate normal distribution has a mean of zero. From basic principles, under multivariate normality, the hypothesis that $J$ dependent groups have a common mean of zero is rejected if

$$\frac{n(n-J)}{J(n-1)} \bar{X}' S_m^{-1} \bar{X}$$

exceeds the $1 - \alpha$ quantile of an $F$ distribution with degrees of freedom $J$ and $n - J$, where $\bar{X}$ is a vector of sample means and $S_m$ is the usual covariance matrix. It is well known that under general conditions, $(d_0, d_1)$ is asymptotically bivariate normal. For the situation at hand, where $J = 2$, this suggests the test statistic

$$H = \frac{n(n-2)}{2(n-1)} (d_0, d_1) S^{-1} (d_0, d_1)' \tag{6}$$

and rejecting (2) at the $\alpha$ level if $H$ exceeds the $1 - \alpha$ quantile of an $F$ distribution with $\nu_1 = 2$ and $\nu_2 = n - 2$ degrees of freedom. For (3) and (4), the test statistic is taken to be

$$T_k = \frac{d_k}{\sqrt{s_{k+1,k+1}}} \tag{7}$$

where $k = 0$ or 1. So for $k = 0$, (3) is being tested, where $s_{k+1,k+1} = s_{1,1}$ is the estimated squared standard error given by $S$. The null hypothesis is rejected if $|T_k| \ge t$, the $1 - \alpha/2$ quantile of a Student's t distribution with $n - 1$ degrees of freedom. (The degrees of freedom were taken to be $n - 1$ rather than $n - 2$ because we did not use the usual homoscedastic estimate of the standard errors based on the residuals. Rather, we are mimicking the usual Student's t test. Despite this, perhaps there is some argument for using $n - 2$, but this remains to be determined.) Finally, (5) is rejected if $|W| \ge t$, where

$$W = \frac{\hat{Y}_1 - \hat{Y}_2}{U} \tag{8}$$

and again $t$ is the $1 - \alpha/2$ quantile of a Student's t distribution with $n - 1$ degrees of freedom.

3 Simulation Results

Simulations were used to study the small-sample properties of the methods in section 2. The sample sizes considered were 20 and 40. Some additional simulations were run with $n = 200$ as a partial check on the R functions that were used to apply the methods. Estimated Type I error probabilities, $\hat{\alpha}$, were based on 4000 replications. Four types of marginal distributions were used: normal, symmetric and heavy-tailed, asymmetric and light-tailed, and asymmetric and heavy-tailed. More precisely, the marginal distributions were taken to be one of four g-and-h distributions (Hoaglin, 1985) that contain the standard normal distribution as a special case. If $Z$ has a standard normal distribution, then

$$V = \begin{cases} \dfrac{\exp(gZ) - 1}{g} \exp(hZ^2/2), & \text{if } g > 0 \\ Z \exp(hZ^2/2), & \text{if } g = 0 \end{cases}$$

has a g-and-h distribution, where $g$ and $h$ are parameters that determine the first four moments. The four distributions used here were the standard normal ($g = h = 0.0$), a symmetric heavy-tailed distribution ($h = 0.2$, $g = 0.0$), an asymmetric distribution with relatively light tails ($h = 0.0$, $g = 0.2$), and an asymmetric distribution with heavy tails ($g = h = 0.2$). Table 1 shows the skewness ($\kappa_1$) and kurtosis ($\kappa_2$) for each distribution. Additional properties

of the g-and-h distribution are summarized by Hoaglin (1985).

Table 1: Some properties of the g-and-h distribution (columns: g, h, κ1, κ2).

The correlation among the four variables was taken to be $\rho = 0$ or .5. (The R function rmul in Wilcox, 2012, was used to generate data.) Three choices for $\lambda$ were used: $\lambda(x) = 1$, $\lambda(x) = |x| + 1$ and $\lambda(x) = 1/(|x| + 1)$. For convenience, these three choices are denoted by variance patterns (VP) 1, 2, and 3. As is evident, VP 1 corresponds to the usual homoscedasticity assumption.

Table 2 summarizes the simulation results when testing (2) at the .05 level and the sample size is $n = 20$, where the columns headed by S are the results when leverage points are retained and LR indicates that leverage points are removed. Although the seriousness of a Type I error depends on the situation, Bradley (1978) has suggested that as a general guide, when testing at the .05 level, at a minimum the actual level should be between .025 and .075. In Table 2, when leverage points are retained, estimates range between .016 and .079. With leverage points removed, the range is .026 to .071.

Note that in Table 2, when leverage points are retained, the lowest estimates occur for VP 3 when $h = .2$. For $g = 0$ the estimate is .016 and for $g = .2$ the estimate is .017. With the number of bootstrap samples increased from 200 to 500, the estimates are .016 for both situations. The highest estimate in Table 2 is .079. With $B = 500$ the estimate is .080. When leverage points are removed, the lowest estimate in Table 2 is .026. With $B = 500$ the estimate is .022. The highest estimates are .071 for VP 2 when $g = h = 0$ and $(g, h) = (.2, .2)$. With $B = 500$, the estimates for these two situations are .076 and .071.

Table 3 summarizes the results when testing (3) and (4). Note that when leverage points are retained, control over the Type I error probability is fairly good when testing (3), but

Table 2: Estimated Type I error probability when testing (2), α = .05, n = 20 (columns: g, h, VP, then ρ = .0 and ρ = .5 under S and under LR). S = leverage points retained; LR = leverage points removed.

when testing (4) the estimates range between .011 and .083. It is heteroscedasticity that results in estimates well below or above the nominal level. With leverage points removed, again testing (4), the estimates range between .019 and .069.

Table 3: Estimates of α when testing (3) and (4), n = 20, α = .05 (columns: g, h, VP, then β0 and β1 under S and under LR, for ρ = 0 and for ρ = .5). S = leverage points retained; LR = leverage points removed.

In Table 3, the lowest estimates occur for VP 3 when testing (4), leverage points are retained, $\rho = 0$ and $h = .2$, the two estimates being .012 and .011. With $B$ increased to 500, the estimates are .013 for both situations. The highest estimate is .082, which occurs for VP 2. With $B = 500$, the estimate is .080.

To add perspective, note that for VP 2, $g = .2$, $h = 0$, Table 3 indicates that when testing (4), without removing leverage points, the estimated Type I error probability is .075 when $\rho = .5$. With $n$ increased to 60, the estimate is .072, and for $n = 100$ the estimate is .062. So control over the Type I error probability improves as $n$ increases, as expected, but simply removing leverage points gives an estimate of .066 with $n = 20$. All indications are that in terms of controlling

the Type I error probability, little or nothing is lost by removing leverage points, and in some situations this has practical advantages.

Finally, Table 4 reports estimated Type I error probabilities when testing (5) for three choices for $X$, namely, the estimated quartiles based on the covariate values in group 1. For brevity, only results for $\rho = 0$ are reported. With $\rho = .5$, no new insights were gained. The results for the lower quartile, the median and the upper quartile are indicated by the columns headed by q1, q2 and q3, respectively. Generally, control over the Type I error probability is reasonably good. Again, when there is heteroscedasticity, removing leverage points can provide some improvement.

Table 4: Estimates of α when testing (5), n = 20, α = .05, ρ = 0 (columns: g, h, VP, then q1, q2 and q3 under S and under LR). S = leverage points retained; LR = leverage points removed.
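The data-generating steps used in this section, the g-and-h transformation and the three variance patterns, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, only the independent case (ρ = 0) is generated, all intercepts and slopes are set to zero so that the null hypotheses hold, and VP 2 and VP 3 are taken to be |x| + 1 and 1/(|x| + 1).

```python
import math
import random


def g_and_h(z, g=0.0, h=0.0):
    """Transform a standard normal value z into a g-and-h value."""
    core = (math.exp(g * z) - 1.0) / g if g > 0 else z
    return core * math.exp(h * z * z / 2.0)


def lam(x, vp):
    """Variance patterns: VP 1 -> 1, VP 2 -> |x| + 1, VP 3 -> 1 / (|x| + 1)."""
    if vp == 1:
        return 1.0
    if vp == 2:
        return abs(x) + 1.0
    if vp == 3:
        return 1.0 / (abs(x) + 1.0)
    raise ValueError("vp must be 1, 2 or 3")


def simulate_rows(n, g, h, vp, seed=None):
    """Return n rows (y1, x1, y2, x2) generated under the null hypothesis.

    Assumptions of this sketch: the four g-and-h variables are independent
    (rho = 0) and all intercepts and slopes are zero.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, e1, x2, e2 = (g_and_h(rng.gauss(0.0, 1.0), g, h) for _ in range(4))
        # Y_j = 0 + 0 * X_j + lambda(X_j) * eps_j
        rows.append((lam(x1, vp) * e1, x1, lam(x2, vp) * e2, x2))
    return rows
```

An estimated Type I error probability would then be the proportion of replications, out of many such samples, in which a given test rejects at the .05 level.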

4 An Illustration

A general goal in the Well Elderly II study was to assess the efficacy of an intervention strategy aimed at improving the physical and emotional health of older adults. A portion of the study was aimed at understanding the impact of intervention on a measure of meaningful activities as measured by the Meaningful Activity Participation Assessment (MAPA) instrument (Eakman et al., 2010). Higher MAPA scores reflect greater activity satisfaction. (Possible MAPA scores range between 6 and 42.) A covariate of interest was the cortisol awakening response (CAR), which is defined as the change in cortisol concentration that occurs during the first hour after waking from sleep. (CAR is taken to be the cortisol level after the participants were awake for about an hour or less minus the level of cortisol upon awakening.) Extant studies (e.g., Clow et al., 2004; Chida & Steptoe, 2009) indicate that various forms of stress are associated with the CAR. An issue was whether the association between the CAR and MAPA, measured before intervention, differed from the association after intervention.

Testing (2) based on the regression line for predicting MAPA, given CAR, the p-value is .011 with leverage points retained compared to .005 when leverage points are removed. Testing (3) and (4), with leverage points retained, the corresponding p-values are .003 and .673. So the results indicate that the intercepts differ, but no significant difference between the slopes is found. With leverage points removed, again testing (3) and (4), the p-values are .021 and .061. So again the slopes are not significantly different at the .05 level, but when testing (4), the p-value is substantially different compared to when the leverage points are retained. Testing (5) indicates that the regression lines cross somewhere between CAR equal to and But CAR equal to 5.62 falls well outside the range of observed CAR values.
From a substantive point of view, among participants whose cortisol levels increase after awakening, MAPA scores tend to be higher after intervention. For those whose cortisol levels decrease, there is no indication that intervention improves MAPA scores.
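The tests applied in this illustration rest on the statistics (6), (7) and (8) of section 2.3. A minimal Python sketch of those three statistics follows; it is not the authors' R implementation, the function names are hypothetical, and the inputs (the differences d, the bootstrap covariance matrix S and the bootstrap coefficient pairs) are assumed to have been computed beforehand.

```python
import math
import statistics


def hotelling_h(d, S, n):
    """Statistic (6): n(n-2) / (2(n-1)) * d S^{-1} d' for a 2x2 matrix S."""
    d0, d1 = d
    (s00, s01), (s10, s11) = S
    det = s00 * s11 - s01 * s10
    # Quadratic form d S^{-1} d' written out with the 2x2 inverse.
    quad = (s11 * d0 * d0 - (s01 + s10) * d0 * d1 + s00 * d1 * d1) / det
    return n * (n - 2) / (2 * (n - 1)) * quad


def t_k(d, S, k):
    """Statistic (7): d_k / sqrt(s_{k+1,k+1}); S[k][k] is 0-based here."""
    return d[k] / math.sqrt(S[k][k])


def w_stat(b1, b2, boot1, boot2, x):
    """Statistic (8) at covariate value x.

    b1, b2: (intercept, slope) at times 1 and 2 from the observed data.
    boot1, boot2: lists of bootstrap (intercept, slope) pairs, where
    element b of each list comes from the same bootstrap sample.
    """
    d_hat = (b1[0] + b1[1] * x) - (b2[0] + b2[1] * x)
    d_star = [(p[0] + p[1] * x) - (q[0] + q[1] * x)
              for p, q in zip(boot1, boot2)]
    u = math.sqrt(statistics.variance(d_star))  # divisor B - 1, as in U^2
    return d_hat / u
```

The statistic from `hotelling_h` would be compared with the 1 − α quantile of an F distribution with 2 and n − 2 degrees of freedom, and the absolute values of the other two with the 1 − α/2 quantile of Student's t with n − 1 degrees of freedom. The Python standard library has no quantile functions for these distributions; one option is `scipy.stats.f.ppf` and `scipy.stats.t.ppf`.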

5 Concluding Remarks

In summary, there are some obvious speculations about how one might control the probability of a Type I error when there is heteroscedasticity and the goal is to compare the least squares regression lines at two different times. Simulations indicate that these methods perform reasonably well when there is homoscedasticity. When there is heteroscedasticity, there are some concerns when $n$ is small, but they can be reduced by removing leverage points.

As noted in the introduction, if a distribution is skewed, a robust measure of location might be preferred over the mean. In this case a robust regression estimator is more appropriate than the OLS estimator. And robust estimators offer the potential of higher power when there are outliers among the dependent variable. A method for testing the hypotheses (2), (3), (4) and (5), via a robust estimator, is being investigated that does not assume asymptotic normality.

Finally, the R functions DregGOLS, difregols and Dancols are available for applying the methods in section 2. They are stored in the file Rallfun-24 on the first author's web page. Alternatively, these functions can be accessed via the R package WRS, which can be installed using a series of commands that are also described on this web page.

REFERENCES

Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31.

Chida, Y. & Steptoe, A. (2009). Cortisol awakening response and psychosocial factors: A systematic review and meta-analysis. Biological Psychology, 80.

Clark, F., Jackson, J., Carlson, M., Chou, C.-P., Cherry, B. J., Jordan-Marsh, M., Knight, B. G., Mandel, D., Blanchard, J., Granger, D. A., Wilcox, R. R., Lai, M. Y., White, B., Hay, J., Lam, C., Marterella, A., & Azen, S. P. (2012). Effectiveness of a lifestyle intervention in promoting the well-being of independently living older people: results of the Well Elderly 2 Randomised Controlled Trial. Journal of Epidemiology and Community Health, 66. doi: /jech

Clow, A., Thorn, L., Evans, P. & Hucklebridge, F. (2004). The awakening cortisol response: Methodological issues and significance. Stress, 7.

Cribari-Neto, F. (2004). Asymptotic inference under heteroskedasticity of unknown form. Computational Statistics & Data Analysis, 45.

Cribari-Neto, F. & da Silva, W. B. (2011). A new heteroskedasticity-consistent covariance matrix estimator for the linear regression model. AStA Advances in Statistical Analysis. DOI /s

Cribari-Neto, F., Souza, T. C. & Vasconcellos, K. L. P. (2007). Inference under heteroskedasticity and leveraged data. Communications in Statistics - Theory and Methods, 36.

Eakman, A. M., Carlson, M. E. & Clark, F. A. (2010). The meaningful activity participation assessment: a measure of engagement in personally valued activities. International Journal of Aging and Human Development, 70.

Efron, B. & Tibshirani, R. J. (1993). An Introduction to the Bootstrap. New York: Chapman & Hall.

Godfrey, L. G. (2006). Tests for regression models with heteroskedasticity of unknown form. Computational Statistics & Data Analysis, 50.

Heritier, S., Cantoni, E., Copt, S. & Victoria-Feser, M.-P. (2009). Robust Methods in Biostatistics. New York: Wiley.

Hinkley, D. V. (1977). Jackknifing in unbalanced situations. Technometrics, 19.

Hoaglin, D. C. (1985). Summarizing shape numerically: The g-and-h distribution. In D. C. Hoaglin, F. Mosteller & J. W. Tukey (Eds.), Exploring Data Tables, Trends, and Shapes. New York: Wiley.

Jackson, J., Mandel, D., Blanchard, J., Carlson, M., Cherry, B., Azen, S., Chou, C.-P., Jordan-Marsh, M., Forman, T., White, B., Granger, D., Knight, B., & Clark, F. (2009). Confronting challenges in intervention research with ethnically diverse older adults: the USC Well Elderly II trial. Clinical Trials, 6.

Johnson, P. O. & Neyman, J. (1936). Tests of certain linear hypotheses and their application to some educational problems. Statistical Research Memoirs, 1.

Long, J. S. & Ervin, L. H. (2000). Using heteroscedasticity consistent standard errors in the linear regression model. American Statistician, 54.

Peng, H., Wang, S. & Wang, X. (2008). Consistency and asymptotic distribution of

the Theil-Sen estimator. Journal of Statistical Planning and Inference, 138.

Rousseeuw, P. J. & Leroy, A. M. (1987). Robust Regression & Outlier Detection. New York: Wiley.

Sen, P. K. (1968). Estimate of the regression coefficient based on Kendall's tau. Journal of the American Statistical Association, 63.

Staudte, R. G. & Sheather, S. J. (1990). Robust Estimation and Testing. New York: Wiley.

Theil, H. (1950). A rank-invariant method of linear and polynomial regression analysis. Indagationes Mathematicae, 12.

White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48.

Wilcox, R. R. (2012). Introduction to Robust Estimation and Hypothesis Testing, 3rd Ed. San Diego, CA: Academic Press.

COMPARING ROBUST REGRESSION LINES ASSOCIATED WITH TWO DEPENDENT GROUPS WHEN THERE IS HETEROSCEDASTICITY

COMPARING ROBUST REGRESSION LINES ASSOCIATED WITH TWO DEPENDENT GROUPS WHEN THERE IS HETEROSCEDASTICITY COMPARING ROBUST REGRESSION LINES ASSOCIATED WITH TWO DEPENDENT GROUPS WHEN THERE IS HETEROSCEDASTICITY Rand R. Wilcox Dept of Psychology University of Southern California Florence Clark Division of Occupational

More information

COMPARING TWO DEPENDENT GROUPS VIA QUANTILES

COMPARING TWO DEPENDENT GROUPS VIA QUANTILES COMPARING TWO DEPENDENT GROUPS VIA QUANTILES Rand R. Wilcox Dept of Psychology University of Southern California and David M. Erceg-Hurn School of Psychology University of Western Australia September 14,

More information

ANCOVA: A HETEROSCEDASTIC GLOBAL TEST WHEN THERE IS CURVATURE AND TWO COVARIATES

ANCOVA: A HETEROSCEDASTIC GLOBAL TEST WHEN THERE IS CURVATURE AND TWO COVARIATES ANCOVA: A HETEROSCEDASTIC GLOBAL TEST WHEN THERE IS CURVATURE AND TWO COVARIATES Rand R. Wilcox Dept of Psychology University of Southern California February 17, 2016 1 ABSTRACT Consider two independent

More information

COMPARISONS OF TWO QUANTILE REGRESSION SMOOTHERS

COMPARISONS OF TWO QUANTILE REGRESSION SMOOTHERS COMPARISONS OF TWO QUANTILE REGRESSION SMOOTHERS arxiv:1506.07456v1 [stat.me] 24 Jun 2015 Rand R. Wilcox Dept of Psychology University of Southern California September 17, 2017 1 ABSTRACT The paper compares

More information

ANCOVA: A GLOBAL TEST BASED ON A ROBUST MEASURE OF LOCATION OR QUANTILES WHEN THERE IS CURVATURE

ANCOVA: A GLOBAL TEST BASED ON A ROBUST MEASURE OF LOCATION OR QUANTILES WHEN THERE IS CURVATURE ANCOVA: A GLOBAL TEST BASED ON A ROBUST MEASURE OF LOCATION OR QUANTILES WHEN THERE IS CURVATURE Rand R. Wilcox Dept of Psychology University of Southern California June 24, 2015 1 ABSTRACT For two independent

More information

GLOBAL COMPARISONS OF MEDIANS AND OTHER QUANTILES IN A ONE-WAY DESIGN WHEN THERE ARE TIED VALUES

GLOBAL COMPARISONS OF MEDIANS AND OTHER QUANTILES IN A ONE-WAY DESIGN WHEN THERE ARE TIED VALUES arxiv:1506.07461v1 [stat.me] 24 Jun 2015 GLOBAL COMPARISONS OF MEDIANS AND OTHER QUANTILES IN A ONE-WAY DESIGN WHEN THERE ARE TIED VALUES Rand R. Wilcox Dept of Psychology University of Southern California

More information

THE RUNNING INTERVAL SMOOTHER: A CONFIDENCE BAND HAVING SOME SPECIFIED SIMULTANEOUS PROBABILITY COVERAGE

THE RUNNING INTERVAL SMOOTHER: A CONFIDENCE BAND HAVING SOME SPECIFIED SIMULTANEOUS PROBABILITY COVERAGE International Journal of Statistics: Advances in Theory and Applications Vol. 1, Issue 1, 2017, Pages 21-43 Published Online on April 12, 2017 2017 Jyoti Academic Press http://jyotiacademicpress.org THE

More information

Global comparisons of medians and other quantiles in a one-way design when there are tied values

Global comparisons of medians and other quantiles in a one-way design when there are tied values Communications in Statistics - Simulation and Computation ISSN: 0361-0918 (Print) 1532-4141 (Online) Journal homepage: http://www.tandfonline.com/loi/lssp20 Global comparisons of medians and other quantiles

More information

Improved Methods for Making Inferences About Multiple Skipped Correlations

Improved Methods for Making Inferences About Multiple Skipped Correlations Improved Methods for Making Inferences About Multiple Skipped Correlations arxiv:1807.05048v1 [stat.co] 13 Jul 2018 Rand R. Wilcox Dept of Psychology University of Southern California Guillaume A. Rousselet

More information

Two-by-two ANOVA: Global and Graphical Comparisons Based on an Extension of the Shift Function

Two-by-two ANOVA: Global and Graphical Comparisons Based on an Extension of the Shift Function Journal of Data Science 7(2009), 459-468 Two-by-two ANOVA: Global and Graphical Comparisons Based on an Extension of the Shift Function Rand R. Wilcox University of Southern California Abstract: When comparing

More information

Comparing Two Dependent Groups: Dealing with Missing Values

Comparing Two Dependent Groups: Dealing with Missing Values Journal of Data Science 9(2011), 1-13 Comparing Two Dependent Groups: Dealing with Missing Values Rand R. Wilcox University of Southern California Abstract: The paper considers the problem of comparing

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

INFLUENCE OF USING ALTERNATIVE MEANS ON TYPE-I ERROR RATE IN THE COMPARISON OF INDEPENDENT GROUPS ABSTRACT

INFLUENCE OF USING ALTERNATIVE MEANS ON TYPE-I ERROR RATE IN THE COMPARISON OF INDEPENDENT GROUPS ABSTRACT Mirtagioğlu et al., The Journal of Animal & Plant Sciences, 4(): 04, Page: J. 344-349 Anim. Plant Sci. 4():04 ISSN: 08-708 INFLUENCE OF USING ALTERNATIVE MEANS ON TYPE-I ERROR RATE IN THE COMPARISON OF

More information

Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap

Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap Dale J. Poirier University of California, Irvine September 1, 2008 Abstract This paper

More information

Methods for Detection of Word Usage over Time

Methods for Detection of Word Usage over Time Methods for Detection of Word Usage over Time Ondřej Herman and Vojtěch Kovář Natural Language Processing Centre Faculty of Informatics, Masaryk University Botanická 68a, 6 Brno, Czech Republic {xherman,xkovar}@fi.muni.cz

More information

Scatter plot of data from the study. Linear Regression
