Change-point Estimation via Empirical Likelihood for a Segmented Linear Regression
Zhihua Liu and Lianfen Qian
Department of Mathematical Science, Florida Atlantic University, Boca Raton, FL 33431, USA
Corresponding author: lqian@fau.edu

Abstract

For a segmented regression system with an unknown change point over two domains of a predictor, a new empirical likelihood ratio statistic is proposed to test the null hypothesis of no change. Under the null hypothesis, the proposed test statistic is shown empirically to be asymptotically Gumbel distributed, with location and scale parameters that are robust across various parameter settings and error distributions. Under the alternative hypothesis with a change point, the test statistic is used to estimate the change point between the two domains. The power analysis shows that the proposed test is tractable. An empirical example analyzing the plasma osmolality data is given.

Keywords: Empirical likelihood ratio, Gumbel extreme value distribution, segmented linear regression, change-point.

1 Introduction

In the classical regression setting, the regression model is usually assumed to have a single parametric form on the whole domain of the predictors. However, a piecewise regression model
is used to show that the parameters of the model can differ on different domains of the predictors. In the last thirty years, a considerable body of techniques has been developed for hypothesis testing, parameter estimation and related computing programs for detecting change points in piecewise regression models. One special and commonly used piecewise regression model is the two-phase linear regression model, whose regression function is piecewise linear. More precisely, let $Y$ be the response variable and $X$ a univariate predictor, so that $(X, Y)$ is a bivariate random vector with $E|Y| < \infty$. Suppose that $\{(X_i, Y_i)\}_{i=1}^{n}$ is a sequence of independent observations of $(X, Y)$ satisfying the model

$$Y_i = (\alpha_0 + \alpha_1 X_i)\, I(X_i \le \tau) + (\beta_0 + \beta_1 X_i)\, I(X_i > \tau) + e_i, \tag{1}$$

where $\{e_i\}_{i=1}^{n}$ are independent error terms with mean zero. Let $\{X_{(i)}\}_{i=1}^{n}$ be the order statistics of $\{X_i\}_{i=1}^{n}$. If there is an unknown time $k^*$ such that $X_{(k^*)} \le \tau < X_{(k^*+1)}$, then we call $k^*$ the time of the change and $\tau$ the change point.

Two-phase linear regression models have been applied widely in diverse research areas. In the environmental sciences, for example, Piegorsch and Bailer [20] illustrate the usefulness of two-phase linear regression models with a series of examples in their Section 2.2. Lund and Reeves [14] use the model to detect undocumented change points. Qian and Ryu [24] fit the model with termite survival as $Y$ and tree resin dosage as $X$. In the biological sciences, Vieth [31] applies the model to determine the osmotic threshold by fitting arginine vasopressin (AVP) concentration against plasma osmolality in the plasma of conscious dogs. In medical science, Smith and Cook [28] use a piecewise linear regression model to fit renal transplant data.
Other applications can be found in epidemiology (Ulm [30], Pastor and Guallar [19]), software engineering (Qian and Yao [23]), econometrics (Chow [3], Koul and Qian [13], Fiteni [8], Zeileis [32]) and so on. Hawkins [10] classifies the two-phase linear regression model (1) into two types: continuous and discontinuous. The model is continuous when the regression function is
continuous at the change point $\tau$; that is, the change point $\tau$ satisfies

$$\alpha_0 + \alpha_1 \tau = \beta_0 + \beta_1 \tau. \tag{2}$$

If equation (2) is not satisfied, the model is discontinuous. The continuous model is also called the segmented linear regression model (Feder [6, 7]).

Before applying model (1), it is usual to test for the existence of a change point. There are two existing types of likelihood based approaches: the Schwarz Information Criterion (SIC) method (Chen [2]) and the classical parametric likelihood approach (Quandt [25, 26]). The SIC, proposed by Schwarz [27], is a model selection criterion defined as

$$\mathrm{SIC} = -2 \log L(\hat\theta) + k \log n,$$

where $\hat\theta$ is the maximum likelihood estimator of the parameter vector, $L(\hat\theta)$ is the maximized likelihood function, $k$ is the number of free parameters in the model, and $n$ is the sample size. Chen [2] turns the hypothesis testing task into a model selection process by applying the SIC method; see Section 3 for more details.

The classical parametric likelihood approach was first proposed by Quandt [25, 26] to detect the presence of a change point in a simple linear regression model. Quandt assumes that the error terms $\{e_i\}_{i=1}^{n}$ are normally and independently distributed with mean zero and standard deviation $\sigma_1$ if $i \le k$ and $\sigma_2$ if $i > k$. The likelihood ratio test statistic is

$$\Lambda = \max_{3 \le k \le n-3} \{\lambda(k)\}, \qquad \text{with} \qquad \lambda(k) = -2 \log\left( \frac{\hat\sigma_1^{\,k}(k)\, \hat\sigma_2^{\,n-k}(k)}{\hat\sigma^{\,n}} \right),$$

where $\hat\sigma$ is the estimator of the standard deviation of the errors for simple linear regression based on all observations, and $\hat\sigma_1(k)$ and $\hat\sigma_2(k)$ are the estimators of $\sigma_1$ and $\sigma_2$ for fixed $k$, respectively. Large values of $\Lambda$ suggest the existence of a change point. Quandt [25] conjectures that the asymptotic distribution of $\lambda(k)$ is $\chi^2_4$ under the null hypothesis of no change ($H_0$) for all $k$ between 2 and $n-2$.
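As a rough numerical illustration of Quandt's statistic, the sketch below computes $\lambda(k)$ for every split $k = 3, \dots, n-3$ and takes their maximum $\Lambda$. It assumes the observations are already sorted by the predictor and that the standard deviations are estimated by maximum likelihood (residual sum of squares divided by the number of points in the fit). The function names `mle_sigma2` and `quandt_lr` and the toy data are illustrative only and do not come from Quandt's papers.

```python
import numpy as np

def mle_sigma2(x, y):
    """ML estimate of the error variance from a simple linear fit."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    return np.mean(resid ** 2)

def quandt_lr(x, y):
    """Return the splits k = 3..n-3, lambda(k) for each split, and Lambda."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    s2_all = mle_sigma2(x, y)
    ks = np.arange(3, n - 2)  # 3, 4, ..., n-3
    # -2 log(s1^k s2^(n-k) / s^n) written in terms of variances
    lam = np.array([
        n * np.log(s2_all)
        - k * np.log(mle_sigma2(x[:k], y[:k]))
        - (n - k) * np.log(mle_sigma2(x[k:], y[k:]))
        for k in ks
    ])
    return ks, lam, lam.max()

# Toy data with a kink at x = 0.2 (values chosen only for illustration)
rng = np.random.default_rng(1)
x = np.sort(rng.normal(size=60))
y = 1.0 + x + 2.0 * np.maximum(x - 0.2, 0.0) + rng.normal(scale=0.3, size=60)
ks, lam, Lambda = quandt_lr(x, y)
```

Because the no-change model is nested in the two-regime model, each $\lambda(k)$ computed this way is nonnegative.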
Under $H_0$, assuming $\sigma_1 = \sigma_2$ for the segmented (continuous two-phase) linear regression model, Hinkley [11] and [12] claim that the asymptotic distribution of $\Lambda$ is $\chi^2_1$ and $\chi^2_3$, respectively. Feder [6, 7] comments that the distribution of $\Lambda$ is not asymptotically $\chi^2$. However, Feder indicates that there is
evidence for the existence of a limiting distribution of the likelihood ratio. Lund and Reeves [14] point out that the components of $\{\lambda(k)\}$ are not independent. In fact, $\lambda(k)$ and $\lambda(k-1)$ are correlated for a fixed $k$. This correlation makes the proof of the asymptotic distribution difficult. Lund and Reeves conjecture that the asymptotic distribution of $\Lambda$ is related to the Gumbel extreme value distribution.

In this paper, we address the aforementioned asymptotic distribution problem using a recently developed nonparametric empirical likelihood approach. Empirical likelihood (EL), a nonparametric data-driven technique, was first proposed by Owen [17]. EL employs the likelihood function without specifically assuming the distribution of the data. It incorporates side information, through constraints or a prior distribution, which maximizes the efficiency of the method (Owen [18]). First, we propose an EL based test statistic for testing the null hypothesis of no change. Through simulation studies, we confirm that, under the null hypothesis, Lund and Reeves's conjecture holds for the empirical likelihood based method, although the original conjecture concerns the classical likelihood method and remains unsolved. Then, if the null hypothesis is rejected, we construct an EL based estimator of the change point for model (1) under the continuity constraint (2).

The rest of the paper is organized as follows. Section 2 proposes the empirical likelihood ratio test statistic for the segmented linear regression model and defines the estimator of the time of the change if it exists. Section 3 shows empirically that the limiting null distribution of the proposed test statistic is the Gumbel extreme value distribution. It is observed that the location and scale parameters of this asymptotic distribution are insensitive to different settings of the parameter vector and to changes in the error distribution.
The critical values for various significance levels and the size and power performance of the test are reported. Furthermore, a comparison between the proposed EL based method and Chen's [2] Schwarz information criterion (SIC) method is conducted. Section 4 presents an empirical example analyzing the plasma osmolality data using the proposed ELR method.
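To make model (1) with the continuity constraint (2) concrete, here is a minimal simulation sketch. It assumes normal errors and a standard normal predictor (one of several settings the paper mentions), and it derives $\beta_0$ from the other parameters so that the two lines meet at $\tau$. The function name `simulate_segmented` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

def simulate_segmented(n, alpha0, alpha1, beta1, tau, sigma, seed=0):
    """Draw sorted (x, y) from the two-phase model, continuous at tau."""
    rng = np.random.default_rng(seed)
    x = np.sort(rng.normal(0.0, 1.0, size=n))
    # Continuity constraint (2): alpha0 + alpha1*tau = beta0 + beta1*tau
    beta0 = alpha0 + (alpha1 - beta1) * tau
    mean = np.where(x <= tau, alpha0 + alpha1 * x, beta0 + beta1 * x)
    y = mean + rng.normal(0.0, sigma, size=n)
    # Time of the change: number of ordered predictors not exceeding tau
    k_star = int(np.sum(x <= tau))
    return x, y, k_star, beta0

x, y, k_star, beta0 = simulate_segmented(
    n=50, alpha0=1.0, alpha1=1.0, beta1=3.0, tau=0.5, sigma=0.1
)
```

Here the change in slope is `beta1 - alpha1`, and `k_star` plays the role of the time of the change for the simulated sample.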
2 EL Ratio Test and its Computing Algorithm

Assuming a known change point, Dong [5] derives an empirical likelihood type Wald (ELw) statistic to test the equality of two coefficient vectors from two linear regression models. To be more precise, let $\hat\alpha$ and $\hat\beta$ be the least squares estimators of the two regression coefficient vectors. Under the normality assumption on the errors, Dong's ELw test statistic has the form

$$\mathrm{ELw} = (\hat\alpha - \hat\beta)^{\top} \left[ \tilde\sigma_1^2 (X_1^{\top} X_1)^{-1} + \tilde\sigma_2^2 (X_2^{\top} X_2)^{-1} \right]^{-1} (\hat\alpha - \hat\beta),$$

where $X_i$ is the design matrix and $\tilde\sigma_i^2$ is the EL estimator of $\sigma_i^2$, the variance of the errors, for the $i$th regression model ($i = 1, 2$). Dong concludes that the ELw test is asymptotically $\chi^2_p$ distributed under the null hypothesis $H_0: \alpha = \beta \in \mathbb{R}^p$.

Instead of assuming a known change point, we first derive an empirical likelihood based test statistic for testing the null hypothesis of no change. If a change point does exist, we construct the EL based estimator for the time of the change ($k^*$), and hence for the change point $\tau$. Unlike Dong's method, we require neither normality of the errors nor knowledge of the time of the change between the two phases. However, we do require that the two phases be continuous at $\tau \in [X_{(k^*)}, X_{(k^*+1)})$. We address the following two research issues for model (1) with continuity constraint (2):

- To test simple linear regression versus two-phase linear regression with a single unknown change point.
- To estimate the time of the change if it exists.

Let $\alpha = (\alpha_0, \alpha_1)$ and $\beta = (\beta_0, \beta_1)$ in model (1). Then the test of no change is equivalent to the test of $H_0: \alpha = \beta$. Throughout the rest of the paper, we assume that $\{X_i\}_{i=1}^{n}$ are already ordered. For a fixed $k$, we separate the data into two groups: $\{(X_i, Y_i)\}_{i=1}^{k}$ and $\{(X_i, Y_i)\}_{i=k+1}^{n}$. For each group, we fit a simple linear regression by ordinary least squares (OLS).
Let $\hat\alpha(k) = (\hat\alpha_0(k), \hat\alpha_1(k))$ be the OLS estimator of $\alpha$ computed from $\{(X_i, Y_i)\}_{i=1}^{k}$ and $\hat\beta(k) = (\hat\beta_0(k), \hat\beta_1(k))$ be the OLS estimator of $\beta$ computed from $\{(X_i, Y_i)\}_{i=k+1}^{n}$. Then the estimated errors are

$$\hat e_i(k) = \begin{cases} Y_i - [\hat\alpha_0(k) + \hat\alpha_1(k) X_i], & i = 1, \dots, k; \\ Y_i - [\hat\beta_0(k) + \hat\beta_1(k) X_i], & i = k+1, \dots, n. \end{cases}$$

Under the null hypothesis $H_0: \alpha = \beta$, $\hat\alpha_0(k)$ and $\hat\beta_0(k)$ should be close to each other, and similarly for $\hat\alpha_1(k)$ and $\hat\beta_1(k)$. Therefore, we propose to switch the roles of the estimated regression coefficient vectors in estimating the errors for these two phases. That is, the estimated errors under $H_0$ can be represented as

$$\tilde e_i(k) = \begin{cases} Y_i - [\hat\beta_0(k) + \hat\beta_1(k) X_i], & i = 1, \dots, k; \\ Y_i - [\hat\alpha_0(k) + \hat\alpha_1(k) X_i], & i = k+1, \dots, n. \end{cases} \tag{3}$$

Notice that under $H_0$, $E[\tilde e_i(k)] = 0$ for all $k$. Following Owen (1991), when a change does occur, we should reject $H_0$ if the empirical likelihood ratio (ELR)

$$R(k) = \sup\left\{ \prod_{i=1}^{n} n w_i \;:\; \sum_{i=1}^{n} w_i \tilde e_i(k) = 0,\ w_i \ge 0,\ \sum_{i=1}^{n} w_i = 1 \right\} \tag{4}$$

is small. The corresponding logarithm of the ELR is

$$-2 \log R(k) = -2 \sup\left\{ \sum_{i=1}^{n} \log(n w_i) \;:\; \sum_{i=1}^{n} w_i \tilde e_i(k) = 0,\ w_i \ge 0,\ \sum_{i=1}^{n} w_i = 1 \right\}. \tag{5}$$

By the Lagrange multiplier method, we write

$$G = \sum_{i=1}^{n} \log(n w_i) - n\lambda \sum_{i=1}^{n} w_i \tilde e_i(k) - \gamma \left( \sum_{i=1}^{n} w_i - 1 \right),$$

where $\lambda$ and $\gamma$ are Lagrange multipliers. Taking the derivative of $G$ with respect to $w_i$, setting it equal to zero and solving (which gives $\gamma = n$), we obtain

$$w_i(k) = \frac{1}{n[1 + \lambda \tilde e_i(k)]}.$$

It follows that

$$-2 \log R(k, \lambda) = 2 \sum_{i=1}^{n} \log[1 + \lambda \tilde e_i(k)]. \tag{6}$$

Define the score function

$$\varphi(k, \lambda) = \frac{\partial\{-2 \log R(k, \lambda)\}}{2\, \partial\lambda} = \sum_{i=1}^{n} \frac{\tilde e_i(k)}{1 + \lambda \tilde e_i(k)}.$$
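The Lagrange-multiplier solution above can be sketched numerically. The block below builds the swapped residuals of (3) for a fixed $k$ and then finds the root of the score function by bisection, which works because the score is strictly decreasing in $\lambda$ on the interval where all weights stay positive. This is an illustrative NumPy stand-in, not the `el.test` routine the paper uses; the names `swapped_residuals` and `neg2_log_elr` and the toy data are assumptions.

```python
import numpy as np

def swapped_residuals(x, y, k):
    """Residuals (3): left phase uses the right-phase fit and vice versa."""
    a1, a0 = np.polyfit(x[:k], y[:k], 1)   # left-phase OLS (slope, intercept)
    b1, b0 = np.polyfit(x[k:], y[k:], 1)   # right-phase OLS
    left = y[:k] - (b0 + b1 * x[:k])
    right = y[k:] - (a0 + a1 * x[k:])
    return np.concatenate([left, right])

def neg2_log_elr(e, n_iter=200):
    """Return (-2 log R, lambda_hat) for the constraint sum(w_i * e_i) = 0."""
    e = np.asarray(e, dtype=float)
    if e.min() >= 0.0 or e.max() <= 0.0:
        return np.inf, np.nan              # zero lies outside the hull of e
    # All weights are positive only for lambda in (-1/max(e), -1/min(e))
    lo, hi = -1.0 / e.max(), -1.0 / e.min()
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(n_iter):                # bisection on the score function
        mid = 0.5 * (lo + hi)
        if np.sum(e / (1.0 + mid * e)) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * e)), lam

# Toy data from a single line, i.e. generated under the null hypothesis
rng = np.random.default_rng(0)
x = np.sort(rng.normal(size=40))
y = 1.0 + x + rng.normal(scale=0.1, size=40)
e_tilde = swapped_residuals(x, y, k=20)
stat, lam_hat = neg2_log_elr(e_tilde)
weights = 1.0 / (len(e_tilde) * (1.0 + lam_hat * e_tilde))
```

At the root, the implied weights sum to one and give the residuals a weighted mean of zero, which is a convenient sanity check on any implementation.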
Then we have the profile logarithm of the ELR for a fixed $k$:

$$-2 \log \hat R(k) = -2 \log R(k, \hat\lambda),$$

where $\hat\lambda$ is determined by $\varphi(k, \hat\lambda) = 0$. Notice that the true time of the change $k^*$ is unidentifiable under the null hypothesis $H_0$. Large values of $-2 \log \hat R(k)$ correspond to a two-tailed alternative hypothesis being true. Therefore, we propose the following test statistic:

$$M_n = \max_{3 \le k \le n-3} \{-2 \log \hat R(k)\}. \tag{7}$$

When $-2 \log \hat R(k)$ is small for each possible $k$, $M_n$ will be small, as is the case under the null hypothesis. If a change occurs at $k^*$, then $-2 \log \hat R(k^*)$ and $M_n$ should be statistically large, and we reject $H_0$. Notice that $-2 \log \hat R(k)$ is asymptotically $\chi^2_1$ for each fixed $k$, and the components of $\{-2 \log \hat R(k)\}_{k=3}^{n-3}$ are not independent. In this paper, we show through simulation that the limiting null distribution of the test statistic built from $M_n$ is a Gumbel extreme value distribution, which is similar to the parametric likelihood ratio result; see Csörgő and Horváth [4].

If the null hypothesis is rejected, we estimate the change point by maximizing $-2 \log \hat R(k)$. Simulation studies show that $-2 \log \hat R(k)$ is sensitive to outliers when $k$ is too small or too close to the sample size $n$. This phenomenon also exists for the parametric likelihood ratio approach. To overcome it, we adopt the trimmed test statistic

$$M_n = \max_{L \le k \le U} \{-2 \log \hat R(k)\}, \tag{8}$$

where the choice of $L$ and $U$ is arbitrary. Values of $L$ ranging from $[\ln n]$ to $n^{1/2}$ have been used in the literature for the parametric likelihood ratio approach. For the empirical likelihood method, we have tested various trimming portions. For the sample sizes used in the simulation, tail portions that are too small do not work well. Hence in this paper we choose $L = [\ln n]^2$ and $U = n - L$, where $[x]$ means the smallest integer larger than $x$. Thus the empirical likelihood estimator of $k^*$ is defined by

$$\hat k = \min\{k : M_n = -2 \log \hat R(k)\} \tag{9}$$
and hence the empirical likelihood estimator of $\tau$ is defined as $\hat\tau = X_{\hat k}$.

To test whether $H_0$ is true, we need to compute $M_n$. Without loss of generality, let the ordered predictor values be $X_1 \le X_2 \le \dots \le X_n$. The algorithm for computing $M_n$ contains the following steps:

1. For a fixed $k$, $k = 3, 4, \dots, n-3$, split the data into two groups referred to as the left-phase group $\{(X_i, Y_i)\}_{i=1}^{k}$ and the right-phase group $\{(X_i, Y_i)\}_{i=k+1}^{n}$.
2. For each $k$, fit a linear model to each group to obtain $\hat\alpha_0(k), \hat\alpha_1(k)$ from the left-phase group and $\hat\beta_0(k), \hat\beta_1(k)$ from the right-phase group.
3. Calculate $\tilde e_i(k) = Y_i - [\hat\beta_0(k) + \hat\beta_1(k) X_i]$ for $i = 1, \dots, k$ and $\tilde e_i(k) = Y_i - [\hat\alpha_0(k) + \hat\alpha_1(k) X_i]$ for $i = k+1, \dots, n$.
4. Use $\{\tilde e_i(k)\}_{i=1}^{n}$ as the input to the el.test function in the R package emplik to compute $-2 \log \hat R(k)$.
5. For each possible $k$, repeat steps 1 to 4 to obtain the sequence $\{-2 \log \hat R(k)\}_{k=3}^{n-3}$. The maximum of this sequence is $M_n$.

If the test statistic is larger than the critical value $G_\alpha$ at significance level $\alpha$, $H_0$ is rejected. In that case the smallest $k$ at which the maximum $M_n$ is attained, $\hat k$, is the EL estimator of $k^*$.

Remark: The mean response in a segmented regression model is a single linear piece under the null hypothesis. Hence, step 3 uses the single linear piece property to calculate the residuals, while step 4 combines the residuals from step 3 and sets the expected value of the overall residuals equal to zero through el.test.

3 Empirical Distribution of the Test Statistic

The algorithm presented above enables us to conduct simulation studies to show that the empirical distribution of the test statistic under the null hypothesis is the Gumbel extreme value distribution,
a subfamily of the Generalized Extreme Value (GEV) distribution. The GEV distribution has the cumulative distribution function

$$F(x; \mu, \sigma, \zeta) = \exp\left\{ -\left[ 1 + \zeta \left( \frac{x - \mu}{\sigma} \right) \right]^{-1/\zeta} \right\},$$

for $x \in \mathbb{R}$ and $1 + \zeta (x - \mu)/\sigma > 0$, where $\mu \in \mathbb{R}$ is the location parameter, $\sigma > 0$ is the scale parameter and $\zeta \in \mathbb{R}$ is the shape parameter. The shape parameter $\zeta$ governs the tail behavior of the distribution. As $\zeta \to 0$, the GEV distribution converges to the Gumbel (G) extreme value distribution, given by

$$F_G(x; \mu, \sigma) = \exp\left\{ -\exp\left[ -\frac{x - \mu}{\sigma} \right] \right\}. \tag{10}$$

For a Gumbel extreme value distribution, the mean and the variance are $\mu + \sigma a$ and $\sigma^2 \pi^2 / 6$, respectively, where $a \approx 0.5772$ is the Euler-Mascheroni constant. We use the fgev function in the R package evd to estimate the location $\mu$, the scale $\sigma$ and the critical values of $F_G$. The function fgev fits the GEV distribution by maximum likelihood to estimate $\mu$, $\sigma$ and $\zeta$. Estimates of $\mu$ and $\sigma$ for the Gumbel extreme value distribution are obtained by setting $\zeta = 0$.

Four simulation studies are reported in this section. We simulate 1000 samples with the sample values of the random predictor $X$ generated from $N(0, 1)$. The sample size $n$ ranges from 30 to 500.

Simulation I tests the robustness of the proposed EL based test statistic and computes its critical values for the most popular nominal levels 0.10, 0.05 and 0.01, under the null hypothesis of no change. Data are generated from the simple linear regression

$$Y_i = \gamma_0 + \gamma_1 X_i + e_i, \qquad i = 1, \dots, n, \tag{11}$$

with $\gamma = (\gamma_0, \gamma_1) = (1, 1)$, and three types of error terms are considered: (i) Normal errors: $\{e_i\}_{i=1}^{n} \sim$ N(0, ); (ii) Log-normal errors (heavy-tailed): $\{e_i\}_{i=1}^{n} \sim$ log N(0, ); and (iii) Non-homogeneous errors: $\{e_i\}_{i=1}^{[n/2]} \sim N(0, 0.1^2)$ and $\{e_i\}_{i=[n/2]+1}^{n} \sim N(0, 1.0^2)$. Table 1 shows that the estimated location and scale parameters of the asymptotic distribution increase as the sample size increases. More importantly, one observes that both the
location and scale parameters are robust against changes in the distribution of the errors. This is consistent with the well-known nonparametric nature of the empirical likelihood method.

Table 1: Robustness analysis for the estimated location and scale parameters, $\mu$ and $\sigma$ respectively, of the limiting distribution of the test statistic under three types of error distributions. Columns: $n$; $\mu$ and $\sigma$ for (i) Normal; $\mu$ and $\sigma$ for (ii) Log-normal; $\mu$ and $\sigma$ for (iii) Non-homogeneous.

The histograms and Q-Q plots of the test statistic for the 1000 replicates are shown in Figures 2-4, corresponding to the three types of error distributions. The left panels show the histograms, where the solid line represents the estimated Gumbel density and the dashed line represents the kernel density estimate with a Gaussian kernel, computed with the density function in R. The right panels show the Q-Q plots against the Gumbel distribution. The critical values of the test statistic at the nominal levels $\alpha = 0.10$, 0.05 and 0.01 can also be derived by computing the quantiles of the simulated Gumbel extreme value distribution. These critical values are reported in Table 2.

Simulation II was performed to show that the asymptotic distribution is robust against changes in the setting of the parameter vector $\gamma = (\gamma_0, \gamma_1)$ in model (11). This simulation was carried out with five different settings of $\gamma$, a sample size of $n = 100$ and $\{e_i\}_{i=1}^{n} \sim$ N(0, ). Table 3 indicates that the estimated location and scale parameters are robust to the setting of $\gamma$.

Simulation III was carried out to compare the performance of the proposed EL based
estimator and Chen's SIC estimator of the true time of the change $k^*$.

Table 2: The critical values of the test statistic with $\alpha = 0.10$, 0.05 and 0.01. Columns: $\alpha$; $n$; critical values for error distributions (i) Normal, (ii) Log-normal, (iii) Non-homogeneous.

Chen's SIC under $H_0$ is

$$\mathrm{SIC}(n) = -2 \log L_0(\hat\gamma, \hat\sigma^2) + 3 \log n,$$

where $L_0(\hat\gamma, \hat\sigma^2)$ is the maximized likelihood function under the null hypothesis of no change, $\hat\gamma$ is the estimator of $\gamma$ and $\hat\sigma$ is the estimator of the standard deviation of the errors in model (11). For $k$ ranging from 2 to $n-2$, Chen's SIC under $H_1$ is

$$\mathrm{SIC}(k) = -2 \log L_1(\hat\alpha_0(k), \hat\alpha_1(k), \hat\beta_0(k), \hat\beta_1(k), \hat\sigma^2) + 5 \log n,$$

where $L_1(\hat\alpha_0(k), \hat\alpha_1(k), \hat\beta_0(k), \hat\beta_1(k), \hat\sigma^2)$ is the maximized likelihood function under $H_1$. Therefore, the decision rule for selecting one of the $n-3$ regression models is: select the model with no change if $\mathrm{SIC}(n) \le \mathrm{SIC}(k)$ for all $k$; select a model with a change at $\hat k$ if $\mathrm{SIC}(\hat k) = \min\{\mathrm{SIC}(k) : 2 \le k \le n-2\} < \mathrm{SIC}(n)$.

Let $\xi$ be the change in slope between the two phases for the segmented linear regression model (1) with the continuity constraint (2). We examine the effect of three different values of the change in slope under these error settings: N(0, ) and N(0, ). The three different
values of the change in slope are (a) a small change of slopes with $\xi = 0.5$; (b) a moderate change of slopes with $\xi = 2$; (c) a large change of slopes with $\xi = 4$.

Table 3: Robustness analysis of $\mu$ and $\sigma$ with respect to $\gamma$ when $n = 100$. Columns: $\gamma = (\gamma_0, \gamma_1)$ = (-1, 1), (2, -1), (-3, -3), (8, 3), (-3, 5). Rows: $\mu$; $\sigma$.

When the $X$ values are too close together, it is hard to detect the true time of the change. The acceptable deviation of $\hat k$ from $k^*$ depends on the sample size and the range of the $X$ values. In order to compare the two methods, we propose the following fine-tuned acceptable deviation $D$:

$$D = \left[ \frac{U - L}{A} \right],$$

where $A$ is the range of the $X$ values, and $L = [\ln n]^2$ and $U = n - L$ are the trimming bounds in the definition of the trimmed test statistic $M_n$. For $X \sim N(0, 1)$, we take $A = 6$. Then when $n = 50$, $D = 3$ with $L = 16$ and $U = 34$.

For the purpose of illustration, we report the simulation results for the sample size $n = 50$ and $k^* = 25$ for errors generated from N(0, ) and N(0, ). Let $d = |\hat k - k^*|$ be the absolute deviation of the estimate $\hat k$ from the true time of the change $k^*$, and let RF be the relative frequency with which $d$ is no more than the acceptable deviation $D = 3$. Table 4 reports the frequency distribution of $d$ and the relative frequency of $d \le 3$. The simulation results indicate that the proposed method works slightly better than the SIC method at capturing the true time of the change ($d = 0$). The relative frequency for both methods increases as the standard deviation ($\sigma$) of the errors decreases. When $\sigma = 0.5$ (a signal-to-noise ratio of 2), neither method works well, though the proposed method works much better than the SIC method for small to moderate changes of slopes. When $\sigma = 0.1$ (a signal-to-noise ratio of 10), the two methods are comparable. One also notices that
as the change of slopes increases, the absolute deviation gets smaller and the relative frequency increases. The simulation results for various sample sizes ranging from 30 to 500 show a similar pattern. From Table 4, one notices that the acceptable deviation also depends on the signal-to-noise ratio and the change of slopes. Detection is easier for a large change of slopes than for a small to moderate one, and likewise for a larger signal-to-noise ratio.

Table 4: The frequency distribution of $d = |\hat k - k^*|$ and the relative frequency $\mathrm{RF} = \frac{\#\{d \le 3\}}{10}\,\%$ for the ELR and SIC methods, when the sample size is $n = 50$ and the true time of the change is $k^* = 25$. Columns: $d$; ELR and SIC counts for $\xi = 0.5$, $\xi = 2$ and $\xi = 4$ under each of the two error settings N(0, ) and N(0, ). Final row: RF (%).

Simulation IV is performed to study the size and the power performance of the proposed test for two types of error distributions. Table 5 shows the size and the power of the proposed test for a variety of sample sizes and true times of the change. The size is computed using the estimated critical value of the Gumbel distribution. We simulated 1000 samples from model (11) with parameter vector $\gamma = (1, 1)$, and the size is estimated by the proportion of
samples in which the null hypothesis is falsely rejected, that is, in which the test statistic is larger than the critical value. This study indicates that the proposed test statistic is able to control the size and attains high power when the sample size is large.

Table 5: The size and the power of the test, by sample size $n$ and true time of the change, for error distributions (i) Normal and (ii) Log-normal.

4 Applications

This section applies the proposed ELR method for segmented linear regression to the plasma osmolality data set. The data set was collected to show arginine vasopressin (AVP) concentration in plasma as a function of plasma osmolality in conscious dogs. Using the parametric maximum likelihood method under a normality assumption on the errors, Vieth [31] fits the segmented linear model of AVP concentration against plasma osmolality to determine the osmotic threshold. Our proposed ELR method, being nonparametric, does not require the normality assumption on the errors. Figure 1(a) is the scatter plot of the data, overlaid with the fitted segmented regression function, with the estimated change point at $\hat\tau = 302$, corresponding to the osmotic threshold and indicated by the vertical dashed line. We used the ELR method to plot $-2 \log \mathrm{ELR}(k)$ for each possible time of the change between $L$ and $U$; the result is shown in Figure 1(b). The estimated
time of the change is $\hat k = 42$, highlighted by the solid dot.

Figure 1: (a) The scatter plot of AVP (pg/ml) versus plasma osmolality (mosm/kg) with the fitted segmented linear regression. (b) The plot of $-2 \log \mathrm{ELR}(k)$ versus all possible times of the change $k$ in $[L, U]$.

Then the estimated change point is $\hat\tau = 302$, and the least squares fitted segmented linear regression is AVP = plasma osmolality + 0.52 (plasma osmolality - 302)$_+$, where $(a)_+ = \max(0, a)$. The corresponding $R^2$ is 73% with estimated standard deviation of

5 Conclusion

This paper proposes a nonparametric empirical likelihood based test statistic for the detection of potential change points in segmented linear regression models. If a change point is identified, an empirical likelihood based change point estimator is defined along with the estimator of the regression coefficients.

Under the null hypothesis of no change, the simulation studies show that the proposed test statistic is asymptotically Gumbel extreme value distributed. The estimated location and scale parameters of the asymptotic distribution were shown to be robust under the different
settings of the parameter vector and the different types of error terms. The location and scale parameters are increasing functions of the sample size. The simulation under the alternative hypothesis shows that the proposed test is able to control the size and attains high power when the sample size is large. Finally, it is shown that the proposed empirical likelihood method performs better than the SIC method in accurately detecting the true time of the change. However, when an acceptable deviation is allowed, the proposed method and Chen's SIC method are comparable overall. An empirical example analyzing the plasma osmolality data is given. The simulation and data analysis programs in R are available from the first author.

Acknowledgments: The authors wish to thank the Editor and referees for their valuable comments and suggestions that helped to improve the presentation of the paper.

References

[1] Berman, N.G., et al. (1996). Applications of segmented regression models for biomedical studies. American Journal of Physiology, 270.
[2] Chen, J. (1998). Testing for a change point in linear regression models. Communications in Statistics - Theory and Methods, 27:10.
[3] Chow, G. (1960). Tests of equality between two sets of coefficients in two linear regressions. Econometrica, 28.
[4] Csörgő, M. and Horváth, L. (1997). Limit Theorems in Change-Point Analysis. Wiley Series in Probability and Statistics.
[5] Dong, L.B. (2004). Testing for structural change in regression: an empirical likelihood approach. Econometrics Working Paper.
[6] Feder, P.I. (1975a). Asymptotic distribution theory in segmented regression problems: identified case. The Annals of Statistics, 3.
[7] Feder, P.I. (1975b). The log likelihood ratio in segmented regimes. The Annals of Statistics, 3.
[8] Fiteni, I. (2004). τ-estimators of regression models with structural change of unknown location. Journal of Econometrics, 119.
[9] Gbur, E.E., Thomas, G.L. and Miller, F.R. (1979). The use of segmented regression models in the determination of the base temperature in heat accumulation models. Agronomy Journal, 71.
[10] Hawkins, D.M. (1980). A note on continuous and discontinuous segmented regressions. Technometrics, 22.
[11] Hinkley, D.V. (1969). Inference about the intersection in two-phase regression. Biometrika, 56.
[12] Hinkley, D.V. (1971). Inference in two-phase regression. Journal of the American Statistical Association, 66.
[13] Koul, L.H. and Qian, L.F. (2002). Asymptotics of maximum likelihood estimator in a two-phase linear regression model. Journal of Statistical Planning and Inference, 108.
[14] Lund, R. and Reeves, J. (2002). Detection of undocumented change points: A revision of the two-phase regression model. Journal of Climate, 15.
[15] Luwel, K., Beem, A.L., Onghena, P. and Verschaffel, L. (2001). Using segmented linear regression models with unknown change points to analyze strategy shifts in cognitive tasks. Behavior Research Methods, Instruments, & Computers, 33, (9).
[16] Muggeo, V.M.R. (2003). Estimating regression models with unknown break-points. Statistics in Medicine, 22.
[17] Owen, A.B. (1991). Empirical likelihood for linear models. The Annals of Statistics, 19.
[18] Owen, A.B. (2001). Empirical Likelihood. Chapman & Hall/CRC.
[19] Pastor, R. and Guallar, E. (1998). Use of two-segmented logistic regression to estimate change-points in epidemiologic studies. American Journal of Epidemiology, 148.
[20] Piegorsch, W.W. and Bailer, A.J. (1997). Statistics for Environmental Biology and Toxicology. Chapman and Hall.
[21] Piepho, H.P. and Ogutu, J.O. (2003). Inference for the break point in segmented regression with application to longitudinal data. Biometrical Journal, 45.
[22] Qian, L.F. (1998). On maximum likelihood estimation for a threshold autoregression. Journal of Statistical Planning and Inference, 75.
[23] Qian, L.F. and Yao, Q.C. (2002). Software project effort estimation using two-phase linear regression models. Proceedings of the 15th Annual Motorola Software Engineering Symposium (SES).
[24] Qian, L.F. and Ryu, S.Y. (2006). Estimating tree resin dose effect on termites. Environmetrics, 17.
[25] Quandt, R.E. (1958). The estimation of the parameters of a linear regression system obeying two separate regimes. Journal of the American Statistical Association, 53.
[26] Quandt, R.E. (1960). Tests of the hypothesis that a linear regression system obeys two separate regimes. Journal of the American Statistical Association, 55.
[27] Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6.
[28] Smith, A.M.F. and Cook, D.G. (1980). Straight lines with a change point: A Bayesian analysis of some renal transplant data. Applied Statistics, 29.
[29] Toms, J.D. and Lesperance, M.L. (2003). Piecewise regression: A tool for identifying ecological thresholds. Ecology, 84.
[30] Ulm, K.W. (1991). A statistical method for assessing a threshold in epidemiological studies. Statistics in Medicine, 10.
[31] Vieth, E. (1989). Fitting piecewise linear regression functions to biological responses. Journal of Applied Physiology, 67.
[32] Zeileis, A. (2006). Implementing a class of structural change tests: an econometric computing approach. Computational Statistics & Data Analysis, 50.
Figure 2: The histograms and Q-Q plots of the test statistic under $H_0$ with normal errors (i) $\{e_i\}_{i=1}^{n} \sim$ N(0, ) for four different sample size settings. The solid line represents the estimated Gumbel density and the dashed line represents the estimated kernel density.

Figure 3: The histograms and Q-Q plots of the test statistic under $H_0$ with log-normal errors (ii) $\{e_i\}_{i=1}^{n} \sim$ log N(0, ) for four different sample size settings. The solid line represents the estimated Gumbel density and the dashed line represents the estimated kernel density.

Figure 4: The histograms and Q-Q plots of the test statistic under $H_0$ with non-homogeneous errors (iii) $\{e_i\}_{i=1}^{[n/2]} \sim$ N(0, ) and $\{e_i\}_{i=[n/2]+1}^{n} \sim N(0, 1.0^2)$ for four different sample size settings. The solid line represents the estimated Gumbel density and the dashed line represents the estimated kernel density.
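The Gumbel densities overlaid in Figures 2-4 and the critical values in Section 3 come from a fitted location $\mu$ and scale $\sigma$. The paper obtains these by maximum likelihood with fgev; as a simpler stand-in, the sketch below uses the method of moments, based on the mean $\mu + \sigma a$ and variance $\sigma^2 \pi^2 / 6$ quoted in Section 3, and then inverts the Gumbel distribution function (10) to get upper-tail critical values. The function names and the simulated draws are illustrative assumptions, not the authors' code or data.

```python
import numpy as np

def gumbel_moment_fit(sample):
    """Method-of-moments estimates (mu, sigma) for a Gumbel sample."""
    sample = np.asarray(sample, dtype=float)
    sigma = np.std(sample, ddof=1) * np.sqrt(6.0) / np.pi
    mu = np.mean(sample) - np.euler_gamma * sigma
    return mu, sigma

def gumbel_cdf(x, mu, sigma):
    """Gumbel distribution function, equation (10)."""
    return np.exp(-np.exp(-(x - mu) / sigma))

def gumbel_critical_value(mu, sigma, alpha):
    """Upper-tail critical value c with F_G(c) = 1 - alpha."""
    return mu - sigma * np.log(-np.log(1.0 - alpha))

# Illustrative draws standing in for simulated test statistics
rng = np.random.default_rng(0)
draws = rng.gumbel(loc=2.0, scale=0.5, size=200_000)
mu_hat, sigma_hat = gumbel_moment_fit(draws)
crit = {a: gumbel_critical_value(mu_hat, sigma_hat, a) for a in (0.10, 0.05, 0.01)}
```

Moment estimates are generally less efficient than maximum likelihood estimates, so they are best treated as a quick check or as starting values for a likelihood fit.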
More informationAsymptotical distribution free test for parameter change in a diffusion model (joint work with Y. Nishiyama) Ilia Negri
Asymptotical distribution free test for parameter change in a diffusion model (joint work with Y. Nishiyama) Ilia Negri University of Bergamo (Italy) ilia.negri@unibg.it SAPS VIII, Le Mans 21-24 March,
More informationTESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST
Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department
More informationAN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY
Econometrics Working Paper EWP0401 ISSN 1485-6441 Department of Economics AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Lauren Bin Dong & David E. A. Giles Department of Economics, University of Victoria
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More informationPublished: 26 April 2017
Electronic Journal of Applied Statistical Analysis EJASA, Electron. J. App. Stat. Anal. http://siba-ese.unisalento.it/index.php/ejasa/index e-issn: 2070-5948 DOI: 10.1285/i20705948v10n1p194 Tests for smooth-abrupt
More informationRobustness and Distribution Assumptions
Chapter 1 Robustness and Distribution Assumptions 1.1 Introduction In statistics, one often works with model assumptions, i.e., one assumes that data follow a certain model. Then one makes use of methodology
More informationBinary choice 3.3 Maximum likelihood estimation
Binary choice 3.3 Maximum likelihood estimation Michel Bierlaire Output of the estimation We explain here the various outputs from the maximum likelihood estimation procedure. Solution of the maximum likelihood
More information11 Survival Analysis and Empirical Likelihood
11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with
More informationJournal of Biostatistics and Epidemiology
Journal of Biostatistics and Epidemiology Original Article Robust correlation coefficient goodness-of-fit test for the Gumbel distribution Abbas Mahdavi 1* 1 Department of Statistics, School of Mathematical
More informationA note on profile likelihood for exponential tilt mixture models
Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential
More informationResearch Article A Nonparametric Two-Sample Wald Test of Equality of Variances
Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner
More informationChapter 4: Constrained estimators and tests in the multiple linear regression model (Part III)
Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December
More information3. Linear Regression With a Single Regressor
3. Linear Regression With a Single Regressor Econometrics: (I) Application of statistical methods in empirical research Testing economic theory with real-world data (data analysis) 56 Econometrics: (II)
More informationAdjusted Empirical Likelihood for Long-memory Time Series Models
Adjusted Empirical Likelihood for Long-memory Time Series Models arxiv:1604.06170v1 [stat.me] 21 Apr 2016 Ramadha D. Piyadi Gamage, Wei Ning and Arjun K. Gupta Department of Mathematics and Statistics
More informationLeast Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions
Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error
More informationMinimum distance tests and estimates based on ranks
Minimum distance tests and estimates based on ranks Authors: Radim Navrátil Department of Mathematics and Statistics, Masaryk University Brno, Czech Republic (navratil@math.muni.cz) Abstract: It is well
More informationStudies in Nonlinear Dynamics and Econometrics
Studies in Nonlinear Dynamics and Econometrics Quarterly Journal April 1997, Volume, Number 1 The MIT Press Studies in Nonlinear Dynamics and Econometrics (ISSN 1081-186) is a quarterly journal published
More informationEmpirical likelihood for linear models in the presence of nuisance parameters
Empirical likelihood for linear models in the presence of nuisance parameters Mi-Ok Kim, Mai Zhou To cite this version: Mi-Ok Kim, Mai Zhou. Empirical likelihood for linear models in the presence of nuisance
More informationAnalysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems
Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA
More informationA NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL
Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl
More informationMidwest Big Data Summer School: Introduction to Statistics. Kris De Brabanter
Midwest Big Data Summer School: Introduction to Statistics Kris De Brabanter kbrabant@iastate.edu Iowa State University Department of Statistics Department of Computer Science June 20, 2016 1/27 Outline
More informationMinimum Hellinger Distance Estimation in a. Semiparametric Mixture Model
Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.
More informationHypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations
Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate
More informationA nonparametric two-sample wald test of equality of variances
University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David
More informationDo Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods
Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Robert V. Breunig Centre for Economic Policy Research, Research School of Social Sciences and School of
More informationVariable inspection plans for continuous populations with unknown short tail distributions
Variable inspection plans for continuous populations with unknown short tail distributions Wolfgang Kössler Abstract The ordinary variable inspection plans are sensitive to deviations from the normality
More informationDiscussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon
Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics
More informationL-momenty s rušivou regresí
L-momenty s rušivou regresí Jan Picek, Martin Schindler e-mail: jan.picek@tul.cz TECHNICKÁ UNIVERZITA V LIBERCI ROBUST 2016 J. Picek, M. Schindler, TUL L-momenty s rušivou regresí 1/26 Motivation 1 Development
More informationStatistical Inference
Statistical Inference Liu Yang Florida State University October 27, 2016 Liu Yang, Libo Wang (Florida State University) Statistical Inference October 27, 2016 1 / 27 Outline The Bayesian Lasso Trevor Park
More informationEmpirical likelihood-based methods for the difference of two trimmed means
Empirical likelihood-based methods for the difference of two trimmed means 24.09.2012. Latvijas Universitate Contents 1 Introduction 2 Trimmed mean 3 Empirical likelihood 4 Empirical likelihood for the
More informationAFT Models and Empirical Likelihood
AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t
More informationTesting for a break in persistence under long-range dependencies and mean shifts
Testing for a break in persistence under long-range dependencies and mean shifts Philipp Sibbertsen and Juliane Willert Institute of Statistics, Faculty of Economics and Management Leibniz Universität
More informationGoodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links
Communications of the Korean Statistical Society 2009, Vol 16, No 4, 697 705 Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links Kwang Mo Jeong a, Hyun Yung Lee 1, a a Department
More informationQuick Review on Linear Multiple Regression
Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan
More informationEconometrics of Panel Data
Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao
More informationThe assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values
Statistical Consulting Topics The Bootstrap... The bootstrap is a computer-based method for assigning measures of accuracy to statistical estimates. (Efron and Tibshrani, 1998.) What do we do when our
More informationBootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution
Pertanika J. Sci. & Technol. 18 (1): 209 221 (2010) ISSN: 0128-7680 Universiti Putra Malaysia Press Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution
More informationModel Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao
Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley
More informationEconometrics II - EXAM Answer each question in separate sheets in three hours
Econometrics II - EXAM Answer each question in separate sheets in three hours. Let u and u be jointly Gaussian and independent of z in all the equations. a Investigate the identification of the following
More informationSurvival Analysis Math 434 Fall 2011
Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup
More informationLinear Model Selection and Regularization
Linear Model Selection and Regularization Recall the linear model Y = β 0 + β 1 X 1 + + β p X p + ɛ. In the lectures that follow, we consider some approaches for extending the linear model framework. In
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2
MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 1 Bootstrapped Bias and CIs Given a multiple regression model with mean and
More informationStatistical Distribution Assumptions of General Linear Models
Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions
More informationA measure of radial asymmetry for bivariate copulas based on Sobolev norm
A measure of radial asymmetry for bivariate copulas based on Sobolev norm Ahmad Alikhani-Vafa Ali Dolati Abstract The modified Sobolev norm is used to construct an index for measuring the degree of radial
More informationSome Statistical Inferences For Two Frequency Distributions Arising In Bioinformatics
Applied Mathematics E-Notes, 14(2014), 151-160 c ISSN 1607-2510 Available free at mirror sites of http://www.math.nthu.edu.tw/ amen/ Some Statistical Inferences For Two Frequency Distributions Arising
More informationInference via Kernel Smoothing of Bootstrap P Values
Queen s Economics Department Working Paper No. 1054 Inference via Kernel Smoothing of Bootstrap P Values Jeff Racine McMaster University James G. MacKinnon Queen s University Department of Economics Queen
More informationRegression Analysis for Data Containing Outliers and High Leverage Points
Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain
More informationSMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES
Statistica Sinica 19 (2009), 71-81 SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Song Xi Chen 1,2 and Chiu Min Wong 3 1 Iowa State University, 2 Peking University and
More informationBootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator
Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator by Emmanuel Flachaire Eurequa, University Paris I Panthéon-Sorbonne December 2001 Abstract Recent results of Cribari-Neto and Zarkos
More informationTesting Homogeneity Of A Large Data Set By Bootstrapping
Testing Homogeneity Of A Large Data Set By Bootstrapping 1 Morimune, K and 2 Hoshino, Y 1 Graduate School of Economics, Kyoto University Yoshida Honcho Sakyo Kyoto 606-8501, Japan. E-Mail: morimune@econ.kyoto-u.ac.jp
More informationDiagnostics and Remedial Measures
Diagnostics and Remedial Measures Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Diagnostics and Remedial Measures 1 / 72 Remedial Measures How do we know that the regression
More informationThe outline for Unit 3
The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.
More informationRobust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study
Science Journal of Applied Mathematics and Statistics 2014; 2(1): 20-25 Published online February 20, 2014 (http://www.sciencepublishinggroup.com/j/sjams) doi: 10.11648/j.sjams.20140201.13 Robust covariance
More informationA Measure of Robustness to Misspecification
A Measure of Robustness to Misspecification Susan Athey Guido W. Imbens December 2014 Graduate School of Business, Stanford University, and NBER. Electronic correspondence: athey@stanford.edu. Graduate
More informationDiagnostics can identify two possible areas of failure of assumptions when fitting linear models.
1 Transformations 1.1 Introduction Diagnostics can identify two possible areas of failure of assumptions when fitting linear models. (i) lack of Normality (ii) heterogeneity of variances It is important
More informationSurvival Analysis. Lu Tian and Richard Olshen Stanford University
1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival
More informationarxiv: v1 [stat.me] 2 Mar 2015
Statistics Surveys Vol. 0 (2006) 1 8 ISSN: 1935-7516 Two samples test for discrete power-law distributions arxiv:1503.00643v1 [stat.me] 2 Mar 2015 Contents Alessandro Bessi IUSS Institute for Advanced
More informationThe regression model with one fixed regressor cont d
The regression model with one fixed regressor cont d 3150/4150 Lecture 4 Ragnar Nymoen 27 January 2012 The model with transformed variables Regression with transformed variables I References HGL Ch 2.8
More informationAnswer Key for STAT 200B HW No. 7
Answer Key for STAT 200B HW No. 7 May 5, 2007 Problem 2.2 p. 649 Assuming binomial 2-sample model ˆπ =.75, ˆπ 2 =.6. a ˆτ = ˆπ 2 ˆπ =.5. From Ex. 2.5a on page 644: ˆπ ˆπ + ˆπ 2 ˆπ 2.75.25.6.4 = + =.087;
More informationThe Model Building Process Part I: Checking Model Assumptions Best Practice
The Model Building Process Part I: Checking Model Assumptions Best Practice Authored by: Sarah Burke, PhD 31 July 2017 The goal of the STAT T&E COE is to assist in developing rigorous, defensible test
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More informationSpatial and temporal extremes of wildfire sizes in Portugal ( )
International Journal of Wildland Fire 2009, 18, 983 991. doi:10.1071/wf07044_ac Accessory publication Spatial and temporal extremes of wildfire sizes in Portugal (1984 2004) P. de Zea Bermudez A, J. Mendes
More information1 Degree distributions and data
1 Degree distributions and data A great deal of effort is often spent trying to identify what functional form best describes the degree distribution of a network, particularly the upper tail of that distribution.
More informationCHANGE DETECTION IN TIME SERIES
CHANGE DETECTION IN TIME SERIES Edit Gombay TIES - 2008 University of British Columbia, Kelowna June 8-13, 2008 Outline Introduction Results Examples References Introduction sunspot.year 0 50 100 150 1700
More informationINTERVAL ESTIMATION AND HYPOTHESES TESTING
INTERVAL ESTIMATION AND HYPOTHESES TESTING 1. IDEA An interval rather than a point estimate is often of interest. Confidence intervals are thus important in empirical work. To construct interval estimates,
More informationMultivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation
Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Daniel B Rowe Division of Biostatistics Medical College of Wisconsin Technical Report 40 November 00 Division of Biostatistics
More informationSome New Aspects of Dose-Response Models with Applications to Multistage Models Having Parameters on the Boundary
Some New Aspects of Dose-Response Models with Applications to Multistage Models Having Parameters on the Boundary Bimal Sinha Department of Mathematics & Statistics University of Maryland, Baltimore County,
More informationConfidence intervals for the variance component of random-effects linear models
The Stata Journal (2004) 4, Number 4, pp. 429 435 Confidence intervals for the variance component of random-effects linear models Matteo Bottai Arnold School of Public Health University of South Carolina
More informationQuantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be
Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F
More informationGeneralized Linear Models
Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n
More informationThe Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)
The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) Authored by: Sarah Burke, PhD Version 1: 31 July 2017 Version 1.1: 24 October 2017 The goal of the STAT T&E COE
More informationSmall Sample Corrections for LTS and MCD
myjournal manuscript No. (will be inserted by the editor) Small Sample Corrections for LTS and MCD G. Pison, S. Van Aelst, and G. Willems Department of Mathematics and Computer Science, Universitaire Instelling
More informationDoes k-th Moment Exist?
Does k-th Moment Exist? Hitomi, K. 1 and Y. Nishiyama 2 1 Kyoto Institute of Technology, Japan 2 Institute of Economic Research, Kyoto University, Japan Email: hitomi@kit.ac.jp Keywords: Existence of moments,
More informationStatistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions
Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session CPS040) p.4828 Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions
More informationExtending the Robust Means Modeling Framework. Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie
Extending the Robust Means Modeling Framework Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie One-way Independent Subjects Design Model: Y ij = µ + τ j + ε ij, j = 1,, J Y ij = score of the ith
More informationIssues on quantile autoregression
Issues on quantile autoregression Jianqing Fan and Yingying Fan We congratulate Koenker and Xiao on their interesting and important contribution to the quantile autoregression (QAR). The paper provides
More informationOn the econometrics of the Koyck model
On the econometrics of the Koyck model Philip Hans Franses and Rutger van Oest Econometric Institute, Erasmus University Rotterdam P.O. Box 1738, NL-3000 DR, Rotterdam, The Netherlands Econometric Institute
More informationWeighted empirical likelihood estimates and their robustness properties
Computational Statistics & Data Analysis ( ) www.elsevier.com/locate/csda Weighted empirical likelihood estimates and their robustness properties N.L. Glenn a,, Yichuan Zhao b a Department of Statistics,
More informationWEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION
WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION Michael Amiguet 1, Alfio Marazzi 1, Victor Yohai 2 1 - University of Lausanne, Institute for Social and Preventive Medicine, Lausanne, Switzerland 2 - University
More informationMonitoring Wafer Geometric Quality using Additive Gaussian Process
Monitoring Wafer Geometric Quality using Additive Gaussian Process Linmiao Zhang 1 Kaibo Wang 2 Nan Chen 1 1 Department of Industrial and Systems Engineering, National University of Singapore 2 Department
More information10/05/2016. Computational Methods for Data Analysis. Massimo Poesio SUPPORT VECTOR MACHINES. Support Vector Machines Linear classifiers
Computational Methods for Data Analysis Massimo Poesio SUPPORT VECTOR MACHINES Support Vector Machines Linear classifiers 1 Linear Classifiers denotes +1 denotes -1 w x + b>0 f(x,w,b) = sign(w x + b) How
More informationModified Kolmogorov-Smirnov Test of Goodness of Fit. Catalonia-BarcelonaTECH, Spain
152/304 CoDaWork 2017 Abbadia San Salvatore (IT) Modified Kolmogorov-Smirnov Test of Goodness of Fit G.S. Monti 1, G. Mateu-Figueras 2, M. I. Ortego 3, V. Pawlowsky-Glahn 2 and J. J. Egozcue 3 1 Department
More informationTesting Error Correction in Panel data
University of Vienna, Dept. of Economics Master in Economics Vienna 2010 The Model (1) Westerlund (2007) consider the following DGP: y it = φ 1i + φ 2i t + z it (1) x it = x it 1 + υ it (2) where the stochastic
More informationSupport Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina
Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes Introduction Method Theoretical Results Simulation Studies Application Conclusions Introduction Introduction For survival data,
More informationLQ-Moments for Statistical Analysis of Extreme Events
Journal of Modern Applied Statistical Methods Volume 6 Issue Article 5--007 LQ-Moments for Statistical Analysis of Extreme Events Ani Shabri Universiti Teknologi Malaysia Abdul Aziz Jemain Universiti Kebangsaan
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject
More informationA Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints
Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note
More informationTests for Assessment of Agreement Using Probability Criteria
Tests for Assessment of Agreement Using Probability Criteria Pankaj K. Choudhary Department of Mathematical Sciences, University of Texas at Dallas Richardson, TX 75083-0688; pankaj@utdallas.edu H. N.
More informationChange Point Analysis of Extreme Values
Change Point Analysis of Extreme Values TIES 2008 p. 1/? Change Point Analysis of Extreme Values Goedele Dierckx Economische Hogeschool Sint Aloysius, Brussels, Belgium e-mail: goedele.dierckx@hubrussel.be
More informationModel Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model
Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population
More information