Change-point Estimation via Empirical Likelihood for a Segmented Linear Regression

Zhihua Liu and Lianfen Qian
Department of Mathematical Science, Florida Atlantic University, Boca Raton, FL 33431, USA
Corresponding author: lqian@fau.edu

Abstract

For a segmented regression system with an unknown change-point over two domains of a predictor, a new empirical likelihood ratio statistic is proposed to test the null hypothesis of no change. Under the null hypothesis, simulations indicate that the proposed test statistic is asymptotically Gumbel distributed, with location and scale parameters that are robust across parameter settings and error distributions. Under the alternative hypothesis with a change-point, the test statistic is used to estimate the change point between the two domains. The power analysis shows that the proposed test is tractable. An empirical example analyzing plasma osmolality data is given.

Keywords: Empirical likelihood ratio, Gumbel extreme value distribution, segmented linear regression, change-point.

1 Introduction

In the classical regression setting, the regression model is usually assumed to be of a single parametric form on the whole domain of the predictors. However, a piecewise regression model

is used to show that the parameters of the model can differ across domains of the predictors. In the last thirty years, a considerable body of techniques has been developed for hypothesis testing, parameter estimation and related computing programs for detecting change points in piecewise regression models.

One special and commonly used piecewise regression model is the two-phase linear regression model, whose regression function is piecewise linear. More precisely, let Y be the response variable and X a univariate predictor, so that (X, Y) is a bivariate random vector with $E|Y| < \infty$. Suppose that $\{(X_i, Y_i)\}_{i=1}^{n}$ is a sequence of independent observations of (X, Y) satisfying the model

$$Y_i = (\alpha_0 + \alpha_1 X_i)\, I(X_i \le \tau) + (\beta_0 + \beta_1 X_i)\, I(X_i > \tau) + e_i, \qquad (1)$$

where $\{e_i\}_{i=1}^{n}$ are independent error terms with mean zero. Let $\{X_{(i)}\}_{i=1}^{n}$ be the order statistics of $\{X_i\}_{i=1}^{n}$. If there is an unknown time $k^*$ such that $X_{(k^*)} \le \tau < X_{(k^*+1)}$, then we call $k^*$ the time of the change and $\tau$ the change point.

Two-phase linear regression models have been applied in diverse research areas. In environmental sciences, Piegorsch and Bailer [20], in their Section 2.2, illustrate the usefulness of two-phase linear regression models with a series of examples. Lund and Reeves [14] use the model to detect undocumented change points. Qian and Ryu [24] fit the model with termite survival as Y and tree resin dosage as X. In the biological sciences, Vieth [31] applies the model to determine the osmotic threshold by fitting arginine vasopressin (AVP) concentration against plasma osmolality in plasma of conscious dogs. In medical science, Smith and Cook [28] use a piecewise linear regression model to fit renal transplant data.
Other applications can be found in epidemiology (Ulm [30], Pastor and Guallar [19]), software engineering (Qian and Yao [23]), econometrics (Chow [3], Koul and Qian [13], Fiteni [8], Zeileis [32]) and so on.

Hawkins [10] classifies the two-phase linear regression model (1) into two types: the continuous and the discontinuous. By continuous, it means that the regression function is

continuous at the change point $\tau$; that is, $\tau$ satisfies the equation

$$\alpha_0 + \alpha_1 \tau = \beta_0 + \beta_1 \tau. \qquad (2)$$

If equation (2) is not satisfied, the model is discontinuous. The continuous model is also called the segmented linear regression model (Feder [6, 7]).

Before applying model (1), it is usual to test for the existence of a change point. There are two existing types of likelihood-based approaches: the Schwarz Information Criterion (SIC) method (Chen [2]) and the classical parametric likelihood approach (Quandt [25, 26]). The SIC, proposed by Schwarz [27], is a model selection criterion defined as

$$\mathrm{SIC} = -2 \log L(\hat\theta) + k \log n,$$

where $\hat\theta$ is the maximum likelihood estimator of the parameter vector, $L(\hat\theta)$ is the maximized likelihood function, $k$ is the number of free parameters in the model, and $n$ is the sample size. Chen [2] turns the hypothesis testing task into a model selection process by applying the SIC method; see Section 3 for more details.

The classical parametric likelihood approach was first proposed by Quandt [25, 26] to detect the presence of a change point in a simple linear regression model. Quandt assumes that the error terms $\{e_i\}_{i=1}^{n}$ are normally and independently distributed with mean zero and standard deviation $\sigma_1$ if $i \le k$ and $\sigma_2$ if $i > k$. The likelihood ratio test statistic is

$$\Lambda = \max_{3 \le k \le n-3} \lambda(k), \qquad \text{with} \quad \lambda(k) = -2 \log\left( \frac{\hat\sigma_1^{\,k}(k)\, \hat\sigma_2^{\,n-k}(k)}{\hat\sigma^{\,n}} \right),$$

where $\hat\sigma$ is the estimator of the standard deviation of the errors for simple linear regression based on all observations, and $\hat\sigma_1(k)$ and $\hat\sigma_2(k)$ are the estimators of $\sigma_1$ and $\sigma_2$ for fixed $k$, respectively. Large values of $\Lambda$ suggest the existence of a change point. Quandt [25] conjectures that the asymptotic distribution of $\lambda(k)$ is $\chi^2_4$ under the null hypothesis of no change ($H_0$) for all $k$ between 2 and $n - 2$.
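To make the continuity constraint (2) and the SIC formula concrete, here is a minimal Python sketch. It assumes Gaussian errors for the SIC, so that the maximized log-likelihood is $-\tfrac{n}{2}[\log(2\pi\,\mathrm{RSS}/n) + 1]$. The function names and the numerical inputs are hypothetical and are not taken from the paper.

```python
import math

def change_point_from_continuity(alpha, beta):
    """Solve alpha0 + alpha1*tau = beta0 + beta1*tau for tau (constraint (2))."""
    a0, a1 = alpha
    b0, b1 = beta
    if a1 == b1:
        raise ValueError("equal slopes: constraint (2) does not determine tau")
    return (a0 - b0) / (b1 - a1)

def gaussian_sic(rss, n, n_params):
    """SIC = -2 log L(theta_hat) + k log n for a Gaussian regression.

    Assumes the variance is estimated by rss / n, which gives the maximized
    log-likelihood -n/2 * (log(2*pi*rss/n) + 1).
    """
    sigma2_hat = rss / n
    neg2_loglik = n * (math.log(2.0 * math.pi * sigma2_hat) + 1.0)
    return neg2_loglik + n_params * math.log(n)

# Hypothetical coefficients: left line y = x, right line y = 2.
tau = change_point_from_continuity(alpha=(0.0, 1.0), beta=(2.0, 0.0))
print(tau)  # 2.0

# Hypothetical residual sum of squares for a 3-parameter model.
print(gaussian_sic(rss=10.0, n=10, n_params=3))
```

A smaller SIC indicates the preferred model, so comparing `gaussian_sic` across candidate models mirrors the model selection view of the test.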
Under $H_0$ and assuming $\sigma_1 = \sigma_2$ for the segmented (continuous two-phase) linear regression model, Hinkley [11] and [12] claim that the asymptotic distribution of $\Lambda$ is $\chi^2_1$ and $\chi^2_3$, respectively. Feder [6, 7] comments that the distribution of $\Lambda$ is not asymptotically $\chi^2$. However, Feder indicates that there is

evidence for the existence of a limiting distribution of the likelihood ratio. Lund and Reeves [14] point out that the components of $\{\lambda(k)\}$ are not independent. In fact, $\lambda(k)$ and $\lambda(k-1)$ are correlated for a fixed $k$. This correlation makes the proof of the asymptotic distribution difficult. Lund and Reeves conjecture that the asymptotic distribution of $\Lambda$ is related to the Gumbel extreme value distribution.

In this paper, we address the aforementioned asymptotic distribution problem using the nonparametric empirical likelihood approach. Empirical likelihood (EL), a nonparametric data-driven technique, was first proposed by Owen [17]. EL employs the likelihood function without assuming a specific distribution for the data. It incorporates side information, through constraints or a prior distribution, which maximizes the efficiency of the method (Owen [18]). First, we propose an EL based test statistic for testing the null hypothesis of no change. Our simulation studies support Lund and Reeves's conjecture for the empirical likelihood based method under the null hypothesis; the original conjecture, stated for the classical likelihood method, remains unsolved. Then, if the null hypothesis is rejected, we construct an EL based estimator of the change point for model (1) under the continuity constraint (2).

The rest of the paper is organized as follows. Section 2 proposes the empirical likelihood ratio test statistic for the segmented linear regression model and defines the estimator of the time of the change if it exists. Section 3 shows empirically that the limiting null distribution of the proposed test statistic is the Gumbel extreme value distribution. It is observed that the location and scale parameters of this asymptotic distribution are insensitive to the settings of the parameter vector and to changes in the error distribution.
The critical values for various significance levels, as well as the size and the power of the test, are reported. Furthermore, the proposed EL based method is compared with Chen's [2] Schwarz information criterion (SIC) method. Section 4 presents an empirical example analyzing the plasma osmolality data using the proposed ELR method.
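Because the critical values mentioned above are read from a Gumbel distribution, they have a closed form once a location $\mu$ and a scale $\sigma$ are available. The sketch below assumes the standard Gumbel cumulative distribution function $\exp\{-\exp[-(x-\mu)/\sigma]\}$. The values of `mu` and `sigma` are hypothetical placeholders, not estimates from the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_cdf(x, mu, sigma):
    """Gumbel CDF: exp(-exp(-(x - mu) / sigma))."""
    return math.exp(-math.exp(-(x - mu) / sigma))

def gumbel_critical_value(alpha, mu, sigma):
    """Upper-alpha critical value: the x with 1 - F(x) = alpha."""
    return mu - sigma * math.log(-math.log(1.0 - alpha))

def gumbel_mean_var(mu, sigma):
    """Mean mu + sigma*gamma and variance sigma^2 * pi^2 / 6."""
    return mu + sigma * EULER_GAMMA, sigma ** 2 * math.pi ** 2 / 6.0

mu, sigma = 3.0, 0.5  # hypothetical location and scale
for alpha in (0.10, 0.05, 0.01):
    print(alpha, gumbel_critical_value(alpha, mu, sigma))
```

Smaller nominal levels give larger critical values, since the quantile function is increasing in $1 - \alpha$.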

2 EL Ratio Test and its Computing Algorithm

Assuming a known change point, Dong [5] derives an empirical likelihood type Wald (ELw) statistic to test the equality of two coefficient vectors from two linear regression models. To be more precise, let $\hat\alpha$ and $\hat\beta$ be the least squares estimators of the two regression coefficient vectors. Under the normality assumption on the errors, Dong's ELw test statistic has the form

$$\mathrm{ELw} = (\hat\alpha - \hat\beta)' \left[ \tilde\sigma_1^2 (X_1' X_1)^{-1} + \tilde\sigma_2^2 (X_2' X_2)^{-1} \right]^{-1} (\hat\alpha - \hat\beta),$$

where $X_i$ is the design matrix and $\tilde\sigma_i^2$ is the EL estimator of $\sigma_i^2$, the variance of the errors, for the $i$th regression model ($i = 1, 2$). Dong concludes that the ELw test is asymptotically $\chi^2_p$ distributed under the null hypothesis $H_0: \alpha = \beta \in R^p$.

Instead of assuming a known change point, we first derive an empirical likelihood based test statistic for testing the null hypothesis of no change. If a change point does exist, we construct the EL based estimator for the time of the change ($k^*$), and hence for the change point $\tau$. Unlike the method used by Dong, we require neither normality of the errors nor knowledge of the time of the change between the two phases. However, we do require that the two phases be continuous at $\tau \in [X_{(k^*)}, X_{(k^*+1)})$.

We address two research issues for model (1) with continuity constraint (2): (a) testing simple linear regression against two-phase linear regression with a single unknown change point; (b) estimating the time of the change if it exists.

Let $\alpha = (\alpha_0, \alpha_1)$ and $\beta = (\beta_0, \beta_1)$ in model (1). Then the test of no change is equivalent to the test of $H_0: \alpha = \beta$. Throughout the rest of the paper, we assume that $\{X_i\}_{i=1}^{n}$ are already ordered. For a fixed $k$, we can separate the data into two groups: $\{(X_i, Y_i)\}_{i=1}^{k}$ and $\{(X_i, Y_i)\}_{i=k+1}^{n}$. For each group, we fit a simple linear regression to the data points using the ordinary least squares (OLS) method.
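The split-and-fit step just described can be sketched as follows, using the closed-form OLS solution for a straight line. The helper names and the toy data are hypothetical. The toy data are two exact lines chosen only to make the output easy to check, and they do not satisfy the continuity constraint (2).

```python
def ols_line(xs, ys):
    """Closed-form OLS fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_two_phases(xs, ys, k):
    """Fit separate lines to the first k and the remaining n - k points.

    Assumes xs is already sorted in increasing order.
    """
    alpha_hat = ols_line(xs[:k], ys[:k])   # left-phase coefficients
    beta_hat = ols_line(xs[k:], ys[k:])    # right-phase coefficients
    return alpha_hat, beta_hat

# Toy data: y = 1 + 2x on the left group, y = 3 - x on the right group.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 0.0, -1.0, -2.0]
alpha_hat, beta_hat = fit_two_phases(xs, ys, k=3)
print(alpha_hat, beta_hat)
```

Each group needs at least two distinct predictor values, otherwise `sxx` is zero and the slope is undefined.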
Let $\hat\alpha(k) = (\hat\alpha_0(k), \hat\alpha_1(k))$ be the OLS estimator of $\alpha$ computed from $\{(X_i, Y_i)\}_{i=1}^{k}$ and $\hat\beta(k) = (\hat\beta_0(k), \hat\beta_1(k))$ be the OLS

estimator of $\beta$ computed from $\{(X_i, Y_i)\}_{i=k+1}^{n}$. Then, the estimated errors are

$$\hat e_i(k) = \begin{cases} Y_i - [\hat\alpha_0(k) + \hat\alpha_1(k) X_i], & i = 1, \ldots, k; \\ Y_i - [\hat\beta_0(k) + \hat\beta_1(k) X_i], & i = k+1, \ldots, n. \end{cases}$$

Under the null hypothesis $H_0: \alpha = \beta$, $\hat\alpha_0(k)$ and $\hat\beta_0(k)$ should be close to each other, and similarly for $\hat\alpha_1(k)$ and $\hat\beta_1(k)$. Therefore, we propose to switch the roles of the estimated regression coefficient vectors in estimating the errors for the two phases. That is, the estimated errors under $H_0$ can be represented as follows:

$$\tilde e_i(k) = \begin{cases} Y_i - [\hat\beta_0(k) + \hat\beta_1(k) X_i], & i = 1, \ldots, k; \\ Y_i - [\hat\alpha_0(k) + \hat\alpha_1(k) X_i], & i = k+1, \ldots, n. \end{cases} \qquad (3)$$

Notice that under $H_0$, $E[\tilde e_i(k)] = 0$ for all $k$. Following Owen (1991), when a change does occur, we should reject $H_0$ if the empirical likelihood ratio (ELR)

$$R(k) = \sup\left\{ \prod_{i=1}^{n} n w_i \;\Big|\; \sum_{i=1}^{n} w_i \tilde e_i(k) = 0,\; w_i \ge 0,\; \sum_{i=1}^{n} w_i = 1 \right\} \qquad (4)$$

is small. The corresponding logarithm of the ELR is

$$-2 \log R(k) = -2 \sup\left\{ \sum_{i=1}^{n} \log(n w_i) \;\Big|\; \sum_{i=1}^{n} w_i \tilde e_i(k) = 0,\; w_i \ge 0,\; \sum_{i=1}^{n} w_i = 1 \right\}. \qquad (5)$$

By the Lagrange multiplier method, we write

$$G = \sum_{i=1}^{n} \log(n w_i) - n\lambda \sum_{i=1}^{n} w_i \tilde e_i(k) - \gamma \left( \sum_{i=1}^{n} w_i - 1 \right),$$

where $\lambda$ and $\gamma$ are Lagrange multipliers. Taking the derivative of $G$ with respect to $w_i$, setting it equal to zero and solving, we obtain

$$w_i(k) = \frac{1}{n[1 + \lambda \tilde e_i(k)]}.$$

It follows that

$$-2 \log R(k, \lambda) = 2 \sum_{i=1}^{n} \log[1 + \lambda \tilde e_i(k)]. \qquad (6)$$

Define the score function

$$\phi(k, \lambda) = \frac{\partial\{-2 \log R(k, \lambda)\}}{2\, \partial \lambda} = \sum_{i=1}^{n} \frac{\tilde e_i(k)}{1 + \lambda \tilde e_i(k)}.$$
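As an illustration of how the score equation can be solved numerically, the sketch below finds the multiplier by bisection and then evaluates twice the sum of $\log(1 + \lambda \tilde e_i)$. Bisection is an assumption of this sketch, chosen because the score is strictly decreasing in $\lambda$ on the interval where all weights stay positive; it is not presented as the authors' implementation. The function name and the residual vectors are hypothetical.

```python
import math

def neg2_log_el_ratio(resid, tol=1e-12, max_iter=200):
    """-2 log empirical likelihood ratio for the constraint sum(w_i * e_i) = 0.

    Solves sum(e_i / (1 + lam * e_i)) = 0 for lam by bisection, then returns
    2 * sum(log(1 + lam * e_i)). Returns inf when zero is not strictly inside
    the range of the residuals, since no positive weights can then satisfy
    the constraint.
    """
    e_min, e_max = min(resid), max(resid)
    if not (e_min < 0.0 < e_max):
        return math.inf

    def score(lam):
        return sum(e / (1.0 + lam * e) for e in resid)

    # All 1 + lam * e_i are positive exactly when -1/e_max < lam < -1/e_min.
    lo, hi = -1.0 / e_max, -1.0 / e_min
    width = hi - lo
    lo += 1e-10 * width  # step inside the open interval
    hi -= 1e-10 * width
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:   # score is decreasing, so the root lies to the right
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * e) for e in resid)

# Residuals -1 and 2 give lam = 1/4 and weights 2/3 and 1/3.
print(neg2_log_el_ratio([-1.0, 2.0]))
# Symmetric residuals already have mean zero, so the statistic is about 0.
print(neg2_log_el_ratio([-1.0, 1.0]))
```

For the first example the exact value is $2\log(1.125)$, which follows from $\lambda = 1/4$.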

Then, we have the profile logarithm of the ELR for a fixed $k$:

$$-2 \log \hat R(k) = -2 \log R(k, \hat\lambda),$$

where $\hat\lambda$ is determined by $\phi(k, \hat\lambda) = 0$.

Notice that the true time of the change $k^*$ is unidentifiable under the null hypothesis $H_0$. Large values of $-2 \log \hat R(k)$ are evidence for the two-sided alternative hypothesis. Therefore, we propose the following test statistic:

$$M_n = \max_{3 \le k \le n-3} \{-2 \log \hat R(k)\}. \qquad (7)$$

When $-2 \log \hat R(k)$ is small for each possible $k$, $M_n$ will be small, as is the case under the null hypothesis. If a change occurs at $k^*$, then $-2 \log \hat R(k^*)$ and $M_n$ should be statistically large, and we reject $H_0$. Notice that $-2 \log \hat R(k)$ is an asymptotic $\chi^2_1$ statistic for each fixed $k$, and the components of $\{-2 \log \hat R(k)\}_{k=3}^{n-3}$ are not independent. In this paper, we show through simulation that the limiting null distribution of the test statistic based on $M_n$ is the Gumbel extreme value distribution, which is similar to the parametric likelihood ratio result; see Csörgő and Horváth [4].

If the null hypothesis is rejected, we estimate the change point by maximizing $-2 \log \hat R(k)$. Simulation studies show that $-2 \log \hat R(k)$ is sensitive to outliers when $k$ is too small or too close to the sample size $n$. This phenomenon also occurs for the parametric likelihood ratio approach. To overcome it, we adopt the trimmed test statistic

$$M_n = \max_{L \le k \le U} \{-2 \log \hat R(k)\}, \qquad (8)$$

where the choice of $L$ and $U$ is arbitrary. Values ranging from $[\ln n]$ to $n^{1/2}$ have been used in the literature for the parametric likelihood ratio approach. For the empirical likelihood method, we have tested various trimming portions. For the sample sizes used in the simulation, tail portions that are too small do not work well. Hence in this paper we choose $L = [\ln n]^2$ and $U = n - L$, where $[x]$ means the smallest integer larger than $x$. Thus the empirical likelihood estimator of $k^*$ is defined by

$$\hat k = \min\{k : M_n = -2 \log \hat R(k)\}. \qquad (9)$$
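The trimming rule and the estimator in (9) can be sketched as below, assuming the profile values $-2 \log \hat R(k)$ have already been computed and stored by $k$. The function names and the example profile are hypothetical; only the choice $L = [\ln n]^2$, $U = n - L$ comes from the text.

```python
import math

def trimming_bounds(n):
    """Return (L, U) with L = [ln n]^2 and U = n - L.

    [x] is the smallest integer larger than x, computed as floor(x) + 1.
    """
    bracket = math.floor(math.log(n)) + 1
    lower = bracket ** 2
    return lower, n - lower

def trimmed_max_and_khat(stat_by_k, n):
    """Return (M_n, k_hat) over L <= k <= U.

    stat_by_k maps each candidate k to its value of -2 log R_hat(k).
    k_hat is the smallest k attaining the maximum, as in (9).
    """
    lower, upper = trimming_bounds(n)
    if lower > upper:
        raise ValueError("trimming window is empty for this sample size")
    window = {k: v for k, v in stat_by_k.items() if lower <= k <= upper}
    m_n = max(window.values())
    k_hat = min(k for k, v in window.items() if v == m_n)
    return m_n, k_hat

print(trimming_bounds(50))  # (16, 34)

# Hypothetical profile peaking at k = 25.
profile = {k: 10.0 - 0.3 * abs(k - 25) for k in range(3, 48)}
print(trimmed_max_and_khat(profile, 50))  # (10.0, 25)
```

Taking the minimum over ties makes the estimator well defined even when several values of $k$ attain the maximum.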

and hence the empirical likelihood estimator of $\tau$ is defined as $\hat\tau = X_{\hat k}$.

To test whether $H_0$ is true, we need to compute $M_n$. Without loss of generality, let the ordered predictor values be $X_1 \le X_2 \le \ldots \le X_n$. The algorithm for computing $M_n$ contains the following steps:

1. For a fixed $k$, $k = 3, 4, \ldots, n-3$, split the data into two groups, referred to as the left-phase group $\{(X_i, Y_i)\}_{i=1}^{k}$ and the right-phase group $\{(X_i, Y_i)\}_{i=k+1}^{n}$.
2. For each $k$, fit a linear model to each group to obtain $\hat\alpha_0(k), \hat\alpha_1(k)$ from the left-phase group and $\hat\beta_0(k), \hat\beta_1(k)$ from the right-phase group.
3. Calculate $\tilde e_i(k) = Y_i - [\hat\beta_0(k) + \hat\beta_1(k) X_i]$ for $i = 1, \ldots, k$ and $\tilde e_i(k) = Y_i - [\hat\alpha_0(k) + \hat\alpha_1(k) X_i]$ for $i = k+1, \ldots, n$.
4. Use $\{\tilde e_i(k)\}_{i=1}^{n}$ as the input to the el.test function in the R package emplik to compute $-2 \log \hat R(k)$.
5. Repeat steps 1 to 4 for each possible $k$ to obtain the sequence $\{-2 \log \hat R(k)\}_{k=3}^{n-3}$. The maximum of this sequence is $M_n$.

If the resulting test statistic is larger than the level-$\alpha$ critical value $G_\alpha$, $H_0$ is rejected. In that case the smallest $k$ attaining $M_n$, namely $\hat k$, is the EL estimator of $k^*$.

Remark: The mean response in a segmented regression model is a single linear piece under the null hypothesis. Step 3 uses this single-piece property to calculate the residuals, while step 4 combines the residuals from step 3 and sets the expected value of the overall residuals equal to zero through el.test.

3 Empirical Distribution of the Test Statistic

The algorithm presented above enables us to conduct simulation studies to show that the empirical distribution of the test statistic under the null hypothesis is the Gumbel extreme value distribution,

a subfamily of the Generalized Extreme Value (GEV) distribution. The GEV distribution has the following cumulative distribution function:

$$F(x; \mu, \sigma, \zeta) = \exp\left\{ -\left[ 1 + \zeta \left( \frac{x - \mu}{\sigma} \right) \right]^{-1/\zeta} \right\},$$

for $x \in R$ and $1 + \zeta(x - \mu)/\sigma > 0$, where $\mu \in R$ is the location parameter, $\sigma > 0$ is the scale parameter and $\zeta \in R$ is the shape parameter. The shape parameter $\zeta$ governs the tail behavior of the distribution. As $\zeta \to 0$, the GEV distribution tends to the Gumbel (G) extreme value distribution, given by

$$F_G(x; \mu, \sigma) = \exp\left\{ -\exp\left[ -\frac{(x - \mu)}{\sigma} \right] \right\}. \qquad (10)$$

For a Gumbel extreme value distribution, the mean and the variance are $\mu + \sigma a$ and $\sigma^2 \pi^2 / 6$, respectively, where $a \approx 0.5772$ is the Euler-Mascheroni constant. We use the fgev function in the R package evd to estimate the location $\mu$, the scale $\sigma$ and the critical values of $F_G$. The function fgev fits the GEV distribution by maximum likelihood to estimate $\mu$, $\sigma$ and $\zeta$. Estimates of $\mu$ and $\sigma$ for the Gumbel extreme value distribution are obtained by setting $\zeta = 0$.

Four simulation studies are reported in this section. We simulate 1000 samples with the values of the random predictor X generated from N(0, 1). The sample size n ranges from 30 to 500.

Simulation I tests the robustness of the proposed EL based test statistic and computes its critical values for the most popular nominal levels 0.10, 0.05 and 0.01, under the null hypothesis of no change. Data are generated from the simple linear regression

$$Y_i = \gamma_0 + \gamma_1 X_i + e_i, \qquad i = 1, \ldots, n, \qquad (11)$$

with $\gamma = (\gamma_0, \gamma_1) = (1, 1)$, and three types of error terms are considered: (i) Normal errors: $\{e_i\}_{i=1}^{n} \sim$ N(0, ); (ii) Log-normal errors (heavy-tailed): $\{e_i\}_{i=1}^{n} \sim \log$ N(0, ); and (iii) Non-homogeneous errors: $\{e_i\}_{i=1}^{[n/2]} \sim N(0, 0.1^2)$ and $\{e_i\}_{i=[n/2]+1}^{n} \sim N(0, 1.0^2)$.

Table 1 shows that the estimated location and scale parameters of the asymptotic distribution increase as the sample size increases. More importantly, one observes that both the

location and scale parameters are robust against changes in the distribution of the errors. This is consistent with the nonparametric nature of the empirical likelihood method.

Table 1: Robustness analysis for the estimated location and scale parameters, $\mu$ and $\sigma$ respectively, of the limiting distribution of the test statistic under three types of error distributions. Columns: n; $\mu$ and $\sigma$ for each error type, (i) Normal, (ii) Log-normal, (iii) Non-homogeneous.

The histograms and Q-Q plots of the test statistic for the 1000 replicates are shown in Figures 2-4, corresponding to the three types of error distributions. The left panels show the histograms, where the solid line represents the estimated Gumbel density and the dashed line represents the kernel density estimate with a Gaussian kernel, computed with the density function in R. The right panels show the Q-Q plots against the Gumbel distribution. The critical values of the test statistic at the nominal levels $\alpha$ = 0.10, 0.05 and 0.01 are obtained by computing the quantiles of the fitted Gumbel extreme value distribution. They are reported in Table 2.

Simulation II was performed to show that the asymptotic distribution is robust against changes in the setting of the parameter vector $\gamma = (\gamma_0, \gamma_1)$ in model (11). This simulation was carried out with five different settings of $\gamma$, a sample size of n = 100 and $\{e_i\}_{i=1}^{n} \sim$ N(0, ). Table 3 indicates that the estimated location and scale parameters are robust to the setting of $\gamma$.

Simulation III was carried out to compare the performance of the proposed EL based

estimator and Chen's SIC estimator of the true time of the change $k^*$.

Table 2: The critical values of the test statistic with $\alpha$ = 0.10, 0.05 and 0.01. Columns: $\alpha$; n; critical values for each error type, (i) Normal, (ii) Log-normal, (iii) Non-homogeneous.

Chen's SIC under $H_0$ is

$$\mathrm{SIC}(n) = -2 \log L_0(\hat\gamma, \hat\sigma^2) + 3 \log n,$$

where $L_0(\hat\gamma, \hat\sigma^2)$ is the maximized likelihood function under the null hypothesis of no change, $\hat\gamma$ is the estimator of $\gamma$ and $\hat\sigma$ is the estimator of the standard deviation of the errors in model (11). For $k$ ranging from 2 to $n - 2$, Chen's SIC under $H_1$ is

$$\mathrm{SIC}(k) = -2 \log L_1(\hat\alpha_0(k), \hat\alpha_1(k), \hat\beta_0(k), \hat\beta_1(k), \hat\sigma^2) + 5 \log n,$$

where $L_1(\hat\alpha_0(k), \hat\alpha_1(k), \hat\beta_0(k), \hat\beta_1(k), \hat\sigma^2)$ is the maximized likelihood function under $H_1$. Therefore, the decision rule for selecting one of the $n - 3$ regression models is: select the model with no change if $\mathrm{SIC}(n) \le \mathrm{SIC}(k)$ for all $k$; select a model with a change at $\hat k$ if $\mathrm{SIC}(\hat k) = \min\{\mathrm{SIC}(k) : 2 \le k \le n - 2\} < \mathrm{SIC}(n)$.

Let $\xi$ be the change in slope between the two phases for the segmented linear regression model (1) with the continuity constraint (2). We examine the effect of three different values of the change in slope under two error settings: $N(0, 0.1^2)$ and $N(0, 0.5^2)$. The three different

values of the change in slope are (a) small change of slopes with $\xi$ = 0.5; (b) moderate change of slopes with $\xi$ = 2; (c) large change of slopes with $\xi$ = 4.

Table 3: Robustness analysis of $\mu$ and $\sigma$ with respect to $\gamma$ when n = 100. Columns: $\gamma = (\gamma_0, \gamma_1)$ = (-1, 1), (2, -1), (-3, -3), (8, 3), (-3, 5); rows: $\mu$, $\sigma$.

When the X values are too close together, it is hard to detect the true time of the change. The acceptable deviation of $\hat k$ from $k^*$ depends on the sample size and the range of the X values. To compare the two methods, we propose the following fine-tuned acceptable deviation D:

$$D = \left[ \frac{U - L}{A} \right],$$

where $A$ is the range of the X values, and $L = [\ln n]^2$ and $U = n - L$ are the trimming bounds in the definition of the trimmed test statistic $M_n$. For $X \sim N(0, 1)$, we take A = 6. Then when n = 50, D = 3 with L = 16 and U = 34. For the purpose of illustration, we report the simulation results for the sample size n = 50 and $k^*$ = 25, with errors generated from $N(0, 0.1^2)$ and $N(0, 0.5^2)$. Let $d = |\hat k - k^*|$ be the absolute deviation of the estimate $\hat k$ from the true time of the change $k^*$, and let RF be the relative frequency with which $d$ is no more than the acceptable deviation D = 3.

Table 4 reports the frequency distribution of $d$ and the relative frequency of $d \le 3$. The simulation results indicate that the proposed method works slightly better than the SIC method at capturing the true time of the change ($d = 0$). The relative frequency of both methods increases as the standard deviation ($\sigma$) of the errors decreases. When $\sigma$ = 0.5 (the ratio of signal to noise is 2), neither method works well, though the proposed method works much better than the SIC method for small to moderate changes of slope. When $\sigma$ = 0.1 (the ratio of signal to noise is 10), the two methods are comparable. One also notices that

as the change of slopes increases, the absolute value of the bias gets smaller and the relative frequency increases. The simulation results for sample sizes ranging from 30 to 500 show a similar pattern. From Table 4, one notices that the acceptable deviation also depends on the signal to noise ratio and the change of slopes. Detection is easier for a large change of slopes than for a small to moderate one, and easier for a high signal to noise ratio than for a low one.

Table 4: The frequency distribution of $d = |\hat k - k^*|$ and the relative frequency RF = (# of $\{d \le 3\}$ / 10)% for the ELR and SIC methods, when the sample size is n = 50 and the true time of the change is $k^*$ = 25. Columns: $d$; ELR and SIC for $\xi$ = 0.5, 2 and 4 under each of the two error settings; final row: RF (%).

Simulation IV is performed to study the size and the power of the proposed test for two types of error distributions. Table 5 shows the size and the power of the proposed test for a variety of sample sizes and true times of the change. The size is computed using the estimated critical value of the Gumbel distribution. We simulated 1000 samples from model (11) with parameter vector $\gamma = (1, 1)$, and the size is estimated by the proportion of

samples for which the null hypothesis is falsely rejected, that is, for which the test statistic is larger than the critical value. This study indicates that the proposed test statistic is able to control the size and attains a high power when the sample size is large.

Table 5: The size and the power of the test. Columns: n; $k^*$; results (%) for each error type, (i) Normal, (ii) Log-normal.

4 Applications

This section applies the proposed ELR method for segmented linear regression to the plasma osmolality data set. The data set was collected to show arginine vasopressin (AVP) concentration in plasma as a function of plasma osmolality in conscious dogs. Using the parametric maximum likelihood method under a normality assumption for the errors, Vieth [31] uses the segmented linear model to fit AVP concentration against plasma osmolality in order to determine the osmotic threshold. By its nature, our proposed ELR method does not require the normality assumption on the errors.

Figure 1(a) is the scatter plot of the data, overlaid with the fitted segmented regression function, with the estimated change point at $\hat\tau$ = 302, corresponding to the osmotic threshold and indicated by the vertical dashed line. We used the ELR method to plot $-2 \log \mathrm{ELR}(k)$ for each possible time of the change between L and U; the result is shown in Figure 1(b). The estimated

time of the change is $\hat k$ = 42, highlighted by the solid dot. Then the estimated change point is $\hat\tau$ = 302, and the least squares fitted segmented linear regression is

AVP = […] plasma osmolality + 0.52 (plasma osmolality − 302)$_+$,

where $(a)_+ = \max(0, a)$. The corresponding $R^2$ is 73% with estimated standard deviation of […].

Figure 1: (a) The scatter plot of AVP (pg/ml) versus plasma osmolality (mosm/kg) with the fitted segmented linear regression. (b) The plot of $-2 \log \mathrm{ELR}(k)$ versus all possible times of the change $k$ in [L, U].

5 Conclusion

This paper proposes a nonparametric empirical likelihood based test statistic for the detection of potential change points in segmented linear regression models. If a change point is identified, an empirical likelihood based change point estimator is defined along with the estimator of the regression coefficients.

Under the null hypothesis of no change, the simulation studies show that the proposed test statistic is asymptotically Gumbel extreme value distributed. The estimated location and scale parameters of the asymptotic distribution were shown to be robust under the different

settings of the parameter vector and the different types of error terms. The location and scale parameters are increasing functions of the sample size. The simulations further show that the proposed test is able to control the size and attains a high power when the sample size is large. Finally, it is shown that the proposed empirical likelihood method performs better than the SIC method in accurately detecting the true time of the change. However, allowing an acceptable deviation, the proposed method and Chen's SIC method are comparable overall. An empirical example analyzing the plasma osmolality data is given. The simulation and data analysis programs in R are available from the first author.

Acknowledgments: The authors wish to thank the Editor and referees for their valuable comments and suggestions that helped to improve the presentation of the paper.

References

[1] Berman, N.G., et al. (1996). Applications of segmented regression models for biomedical studies. American Journal of Physiology, 270.
[2] Chen, J. (1998). Testing for a change point in linear regression models. Communications in Statistics - Theory and Methods, 27(10).
[3] Chow, G. (1960). Tests of equality between two sets of coefficients in two linear regressions. Econometrica, 28.
[4] Csörgő, M. and Horváth, L. (1997). Limit theorems in change-point analysis. Wiley Series in Probability and Statistics.
[5] Dong, L.B. (2004). Testing for structural change in regression: an empirical likelihood approach. Econometrics Working Paper.
[6] Feder, P.I. (1975a). Asymptotic distribution theory in segmented regression problems – identified case. The Annals of Statistics, 3.

[7] Feder, P.I. (1975b). The log likelihood ratio in segmented regimes. The Annals of Statistics, 3.
[8] Fiteni, I. (2004). τ-estimators of regression models with structural change of unknown location. Journal of Econometrics, 119.
[9] Gbur, E.E., Thomas, G.L. and Miller, F.R. (1979). The use of segmented regression models in the determination of the base temperature in heat accumulation models. Agronomy Journal, 71.
[10] Hawkins, D.M. (1980). A note on continuous and discontinuous segmented regressions. Technometrics, 22.
[11] Hinkley, D.V. (1969). Inference about the intersection in two-phase regression. Biometrika, 56.
[12] Hinkley, D.V. (1971). Inference in two-phase regression. Journal of the American Statistical Association, 66.
[13] Koul, L.H. and Qian, L.F. (2002). Asymptotics of maximum likelihood estimator in a two-phase linear regression model. Journal of Statistical Planning and Inference, 108.
[14] Lund, R. and Reeves, J. (2002). Detection of undocumented change points: A revision of the two-phase regression model. Journal of Climate, 15.
[15] Luwel, K., Beem, A.L., Onghena, P. and Verschaffel, L. (2001). Using segmented linear regression models with unknown change points to analyze strategy shifts in cognitive tasks. Behavior Research Methods, Instruments, & Computers, 33, (9).
[16] Muggeo, V.M.R. (2003). Estimating regression models with unknown break-points. Statistics in Medicine, 22.

[17] Owen, A.B. (1991). Empirical likelihood for linear models. The Annals of Statistics, 19.
[18] Owen, A.B. (2001). Empirical Likelihood. Chapman & Hall/CRC.
[19] Pastor, R. and Guallar, E. (1998). Use of two-segmented logistic regression to estimate change-points in epidemiologic studies. American Journal of Epidemiology, 148.
[20] Piegorsch, W.W. and Bailer, A.J. (1997). Statistics for environmental biology and toxicology. Chapman and Hall.
[21] Piepho, H.P. and Ogutu, J.O. (2003). Inference for the break point in segmented regression with application to longitudinal data. Biometrical Journal, 45.
[22] Qian, L.F. (1998). On maximum likelihood estimation for a threshold autoregression. Journal of Statistical Planning and Inference, 75.
[23] Qian, L.F. and Yao, Q.C. (2002). Software project effort estimation using two-phase linear regression models. Proceedings of the 15th Annual Motorola Software Engineering Symposium (SES).
[24] Qian, L.F. and Ryu, S.Y. (2006). Estimating tree resin dose effect on termites. Environmetrics, 17.
[25] Quandt, R.E. (1958). The estimation of the parameters of a linear regression system obeying two separate regimes. Journal of the American Statistical Association, 53.
[26] Quandt, R.E. (1960). Tests of the hypothesis that a linear regression system obeys two separate regimes. Journal of the American Statistical Association, 55.
[27] Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6.

[28] Smith, A.M.F. and Cook, D.G. (1980). Straight lines with a change point: A Bayesian analysis of some renal transplant data. Applied Statistics, 29.
[29] Toms, J.D. and Lesperance, M.L. (2003). Piecewise regression: A tool for identifying ecological thresholds. Ecology, 84.
[30] Ulm, K.W. (1991). A statistical method for assessing a threshold in epidemiological studies. Statistics in Medicine, 10.
[31] Vieth, E. (1989). Fitting piecewise linear regression functions to biological responses. Journal of Applied Physiology, 67.
[32] Zeileis, A. (2006). Implementing a class of structural change tests: an econometric computing approach. Computational Statistics & Data Analysis, 50.

Figure 2: The histograms and Q-Q plots of the test statistic under $H_0$ with normal errors (i) $\{e_i\}_{i=1}^{n} \sim$ N(0, ) for four different sample sizes. The solid line represents the estimated Gumbel density and the dashed line represents the estimated kernel density.

Figure 3: The histograms and Q-Q plots of the test statistic under $H_0$ with log-normal errors (ii) $\{e_i\}_{i=1}^{n} \sim \log$ N(0, ) for four different sample sizes. The solid line represents the estimated Gumbel density and the dashed line represents the estimated kernel density.

Figure 4: The histograms and Q-Q plots of the test statistic under $H_0$ with non-homogeneous errors (iii) $\{e_i\}_{i=1}^{[n/2]} \sim N(0, 0.1^2)$ and $\{e_i\}_{i=[n/2]+1}^{n} \sim N(0, 1.0^2)$ for four different sample sizes. The solid line represents the estimated Gumbel density and the dashed line represents the estimated kernel density.


More information

Reliability of inference (1 of 2 lectures)

Reliability of inference (1 of 2 lectures) Reliability of inference (1 of 2 lectures) Ragnar Nymoen University of Oslo 5 March 2013 1 / 19 This lecture (#13 and 14): I The optimality of the OLS estimators and tests depend on the assumptions of

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

Asymptotical distribution free test for parameter change in a diffusion model (joint work with Y. Nishiyama) Ilia Negri

Asymptotical distribution free test for parameter change in a diffusion model (joint work with Y. Nishiyama) Ilia Negri Asymptotical distribution free test for parameter change in a diffusion model (joint work with Y. Nishiyama) Ilia Negri University of Bergamo (Italy) ilia.negri@unibg.it SAPS VIII, Le Mans 21-24 March,

More information

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department

More information

AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY

AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Econometrics Working Paper EWP0401 ISSN 1485-6441 Department of Economics AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Lauren Bin Dong & David E. A. Giles Department of Economics, University of Victoria

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Published: 26 April 2017

Published: 26 April 2017 Electronic Journal of Applied Statistical Analysis EJASA, Electron. J. App. Stat. Anal. http://siba-ese.unisalento.it/index.php/ejasa/index e-issn: 2070-5948 DOI: 10.1285/i20705948v10n1p194 Tests for smooth-abrupt

More information

Robustness and Distribution Assumptions

Robustness and Distribution Assumptions Chapter 1 Robustness and Distribution Assumptions 1.1 Introduction In statistics, one often works with model assumptions, i.e., one assumes that data follow a certain model. Then one makes use of methodology

More information

Binary choice 3.3 Maximum likelihood estimation

Binary choice 3.3 Maximum likelihood estimation Binary choice 3.3 Maximum likelihood estimation Michel Bierlaire Output of the estimation We explain here the various outputs from the maximum likelihood estimation procedure. Solution of the maximum likelihood

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

Journal of Biostatistics and Epidemiology

Journal of Biostatistics and Epidemiology Journal of Biostatistics and Epidemiology Original Article Robust correlation coefficient goodness-of-fit test for the Gumbel distribution Abbas Mahdavi 1* 1 Department of Statistics, School of Mathematical

More information

A note on profile likelihood for exponential tilt mixture models

A note on profile likelihood for exponential tilt mixture models Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential

More information

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner

More information

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III)

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December

More information

3. Linear Regression With a Single Regressor

3. Linear Regression With a Single Regressor 3. Linear Regression With a Single Regressor Econometrics: (I) Application of statistical methods in empirical research Testing economic theory with real-world data (data analysis) 56 Econometrics: (II)

More information

Adjusted Empirical Likelihood for Long-memory Time Series Models

Adjusted Empirical Likelihood for Long-memory Time Series Models Adjusted Empirical Likelihood for Long-memory Time Series Models arxiv:1604.06170v1 [stat.me] 21 Apr 2016 Ramadha D. Piyadi Gamage, Wei Ning and Arjun K. Gupta Department of Mathematics and Statistics

More information

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error

More information

Minimum distance tests and estimates based on ranks

Minimum distance tests and estimates based on ranks Minimum distance tests and estimates based on ranks Authors: Radim Navrátil Department of Mathematics and Statistics, Masaryk University Brno, Czech Republic (navratil@math.muni.cz) Abstract: It is well

More information

Studies in Nonlinear Dynamics and Econometrics

Studies in Nonlinear Dynamics and Econometrics Studies in Nonlinear Dynamics and Econometrics Quarterly Journal April 1997, Volume, Number 1 The MIT Press Studies in Nonlinear Dynamics and Econometrics (ISSN 1081-186) is a quarterly journal published

More information

Empirical likelihood for linear models in the presence of nuisance parameters

Empirical likelihood for linear models in the presence of nuisance parameters Empirical likelihood for linear models in the presence of nuisance parameters Mi-Ok Kim, Mai Zhou To cite this version: Mi-Ok Kim, Mai Zhou. Empirical likelihood for linear models in the presence of nuisance

More information

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA

More information

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl

More information

Midwest Big Data Summer School: Introduction to Statistics. Kris De Brabanter

Midwest Big Data Summer School: Introduction to Statistics. Kris De Brabanter Midwest Big Data Summer School: Introduction to Statistics Kris De Brabanter kbrabant@iastate.edu Iowa State University Department of Statistics Department of Computer Science June 20, 2016 1/27 Outline

More information

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David

More information

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Robert V. Breunig Centre for Economic Policy Research, Research School of Social Sciences and School of

More information

Variable inspection plans for continuous populations with unknown short tail distributions

Variable inspection plans for continuous populations with unknown short tail distributions Variable inspection plans for continuous populations with unknown short tail distributions Wolfgang Kössler Abstract The ordinary variable inspection plans are sensitive to deviations from the normality

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

L-momenty s rušivou regresí

L-momenty s rušivou regresí L-momenty s rušivou regresí Jan Picek, Martin Schindler e-mail: jan.picek@tul.cz TECHNICKÁ UNIVERZITA V LIBERCI ROBUST 2016 J. Picek, M. Schindler, TUL L-momenty s rušivou regresí 1/26 Motivation 1 Development

More information

Statistical Inference

Statistical Inference Statistical Inference Liu Yang Florida State University October 27, 2016 Liu Yang, Libo Wang (Florida State University) Statistical Inference October 27, 2016 1 / 27 Outline The Bayesian Lasso Trevor Park

More information

Empirical likelihood-based methods for the difference of two trimmed means

Empirical likelihood-based methods for the difference of two trimmed means Empirical likelihood-based methods for the difference of two trimmed means 24.09.2012. Latvijas Universitate Contents 1 Introduction 2 Trimmed mean 3 Empirical likelihood 4 Empirical likelihood for the

More information

AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

More information

Testing for a break in persistence under long-range dependencies and mean shifts

Testing for a break in persistence under long-range dependencies and mean shifts Testing for a break in persistence under long-range dependencies and mean shifts Philipp Sibbertsen and Juliane Willert Institute of Statistics, Faculty of Economics and Management Leibniz Universität

More information

Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links

Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links Communications of the Korean Statistical Society 2009, Vol 16, No 4, 697 705 Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links Kwang Mo Jeong a, Hyun Yung Lee 1, a a Department

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information

The assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values

The assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values Statistical Consulting Topics The Bootstrap... The bootstrap is a computer-based method for assigning measures of accuracy to statistical estimates. (Efron and Tibshrani, 1998.) What do we do when our

More information

Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution

Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution Pertanika J. Sci. & Technol. 18 (1): 209 221 (2010) ISSN: 0128-7680 Universiti Putra Malaysia Press Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

Econometrics II - EXAM Answer each question in separate sheets in three hours

Econometrics II - EXAM Answer each question in separate sheets in three hours Econometrics II - EXAM Answer each question in separate sheets in three hours. Let u and u be jointly Gaussian and independent of z in all the equations. a Investigate the identification of the following

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

Linear Model Selection and Regularization

Linear Model Selection and Regularization Linear Model Selection and Regularization Recall the linear model Y = β 0 + β 1 X 1 + + β p X p + ɛ. In the lectures that follow, we consider some approaches for extending the linear model framework. In

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 1 Bootstrapped Bias and CIs Given a multiple regression model with mean and

More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

A measure of radial asymmetry for bivariate copulas based on Sobolev norm

A measure of radial asymmetry for bivariate copulas based on Sobolev norm A measure of radial asymmetry for bivariate copulas based on Sobolev norm Ahmad Alikhani-Vafa Ali Dolati Abstract The modified Sobolev norm is used to construct an index for measuring the degree of radial

More information

Some Statistical Inferences For Two Frequency Distributions Arising In Bioinformatics

Some Statistical Inferences For Two Frequency Distributions Arising In Bioinformatics Applied Mathematics E-Notes, 14(2014), 151-160 c ISSN 1607-2510 Available free at mirror sites of http://www.math.nthu.edu.tw/ amen/ Some Statistical Inferences For Two Frequency Distributions Arising

More information

Inference via Kernel Smoothing of Bootstrap P Values

Inference via Kernel Smoothing of Bootstrap P Values Queen s Economics Department Working Paper No. 1054 Inference via Kernel Smoothing of Bootstrap P Values Jeff Racine McMaster University James G. MacKinnon Queen s University Department of Economics Queen

More information

Regression Analysis for Data Containing Outliers and High Leverage Points

Regression Analysis for Data Containing Outliers and High Leverage Points Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain

More information

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Statistica Sinica 19 (2009), 71-81 SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Song Xi Chen 1,2 and Chiu Min Wong 3 1 Iowa State University, 2 Peking University and

More information

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator by Emmanuel Flachaire Eurequa, University Paris I Panthéon-Sorbonne December 2001 Abstract Recent results of Cribari-Neto and Zarkos

More information

Testing Homogeneity Of A Large Data Set By Bootstrapping

Testing Homogeneity Of A Large Data Set By Bootstrapping Testing Homogeneity Of A Large Data Set By Bootstrapping 1 Morimune, K and 2 Hoshino, Y 1 Graduate School of Economics, Kyoto University Yoshida Honcho Sakyo Kyoto 606-8501, Japan. E-Mail: morimune@econ.kyoto-u.ac.jp

More information

Diagnostics and Remedial Measures

Diagnostics and Remedial Measures Diagnostics and Remedial Measures Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Diagnostics and Remedial Measures 1 / 72 Remedial Measures How do we know that the regression

More information

The outline for Unit 3

The outline for Unit 3 The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.

More information

Robust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study

Robust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study Science Journal of Applied Mathematics and Statistics 2014; 2(1): 20-25 Published online February 20, 2014 (http://www.sciencepublishinggroup.com/j/sjams) doi: 10.11648/j.sjams.20140201.13 Robust covariance

More information

A Measure of Robustness to Misspecification

A Measure of Robustness to Misspecification A Measure of Robustness to Misspecification Susan Athey Guido W. Imbens December 2014 Graduate School of Business, Stanford University, and NBER. Electronic correspondence: athey@stanford.edu. Graduate

More information

Diagnostics can identify two possible areas of failure of assumptions when fitting linear models.

Diagnostics can identify two possible areas of failure of assumptions when fitting linear models. 1 Transformations 1.1 Introduction Diagnostics can identify two possible areas of failure of assumptions when fitting linear models. (i) lack of Normality (ii) heterogeneity of variances It is important

More information

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Survival Analysis. Lu Tian and Richard Olshen Stanford University 1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

More information

arxiv: v1 [stat.me] 2 Mar 2015

arxiv: v1 [stat.me] 2 Mar 2015 Statistics Surveys Vol. 0 (2006) 1 8 ISSN: 1935-7516 Two samples test for discrete power-law distributions arxiv:1503.00643v1 [stat.me] 2 Mar 2015 Contents Alessandro Bessi IUSS Institute for Advanced

More information

The regression model with one fixed regressor cont d

The regression model with one fixed regressor cont d The regression model with one fixed regressor cont d 3150/4150 Lecture 4 Ragnar Nymoen 27 January 2012 The model with transformed variables Regression with transformed variables I References HGL Ch 2.8

More information

Answer Key for STAT 200B HW No. 7

Answer Key for STAT 200B HW No. 7 Answer Key for STAT 200B HW No. 7 May 5, 2007 Problem 2.2 p. 649 Assuming binomial 2-sample model ˆπ =.75, ˆπ 2 =.6. a ˆτ = ˆπ 2 ˆπ =.5. From Ex. 2.5a on page 644: ˆπ ˆπ + ˆπ 2 ˆπ 2.75.25.6.4 = + =.087;

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice

The Model Building Process Part I: Checking Model Assumptions Best Practice The Model Building Process Part I: Checking Model Assumptions Best Practice Authored by: Sarah Burke, PhD 31 July 2017 The goal of the STAT T&E COE is to assist in developing rigorous, defensible test

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Spatial and temporal extremes of wildfire sizes in Portugal ( )

Spatial and temporal extremes of wildfire sizes in Portugal ( ) International Journal of Wildland Fire 2009, 18, 983 991. doi:10.1071/wf07044_ac Accessory publication Spatial and temporal extremes of wildfire sizes in Portugal (1984 2004) P. de Zea Bermudez A, J. Mendes

More information

1 Degree distributions and data

1 Degree distributions and data 1 Degree distributions and data A great deal of effort is often spent trying to identify what functional form best describes the degree distribution of a network, particularly the upper tail of that distribution.

More information

CHANGE DETECTION IN TIME SERIES

CHANGE DETECTION IN TIME SERIES CHANGE DETECTION IN TIME SERIES Edit Gombay TIES - 2008 University of British Columbia, Kelowna June 8-13, 2008 Outline Introduction Results Examples References Introduction sunspot.year 0 50 100 150 1700

More information

INTERVAL ESTIMATION AND HYPOTHESES TESTING

INTERVAL ESTIMATION AND HYPOTHESES TESTING INTERVAL ESTIMATION AND HYPOTHESES TESTING 1. IDEA An interval rather than a point estimate is often of interest. Confidence intervals are thus important in empirical work. To construct interval estimates,

More information

Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation

Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Daniel B Rowe Division of Biostatistics Medical College of Wisconsin Technical Report 40 November 00 Division of Biostatistics

More information

Some New Aspects of Dose-Response Models with Applications to Multistage Models Having Parameters on the Boundary

Some New Aspects of Dose-Response Models with Applications to Multistage Models Having Parameters on the Boundary Some New Aspects of Dose-Response Models with Applications to Multistage Models Having Parameters on the Boundary Bimal Sinha Department of Mathematics & Statistics University of Maryland, Baltimore County,

More information

Confidence intervals for the variance component of random-effects linear models

Confidence intervals for the variance component of random-effects linear models The Stata Journal (2004) 4, Number 4, pp. 429 435 Confidence intervals for the variance component of random-effects linear models Matteo Bottai Arnold School of Public Health University of South Carolina

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) Authored by: Sarah Burke, PhD Version 1: 31 July 2017 Version 1.1: 24 October 2017 The goal of the STAT T&E COE

More information

Small Sample Corrections for LTS and MCD

Small Sample Corrections for LTS and MCD myjournal manuscript No. (will be inserted by the editor) Small Sample Corrections for LTS and MCD G. Pison, S. Van Aelst, and G. Willems Department of Mathematics and Computer Science, Universitaire Instelling

More information

Does k-th Moment Exist?

Does k-th Moment Exist? Does k-th Moment Exist? Hitomi, K. 1 and Y. Nishiyama 2 1 Kyoto Institute of Technology, Japan 2 Institute of Economic Research, Kyoto University, Japan Email: hitomi@kit.ac.jp Keywords: Existence of moments,

More information

Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions

Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session CPS040) p.4828 Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions

More information

Extending the Robust Means Modeling Framework. Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie

Extending the Robust Means Modeling Framework. Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie Extending the Robust Means Modeling Framework Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie One-way Independent Subjects Design Model: Y ij = µ + τ j + ε ij, j = 1,, J Y ij = score of the ith

More information

Issues on quantile autoregression
