Weighted empirical likelihood estimates and their robustness properties

Computational Statistics & Data Analysis

Weighted empirical likelihood estimates and their robustness properties

N.L. Glenn (a,*), Yichuan Zhao (b)

(a) Department of Statistics, University of South Carolina, Columbia, SC 29208, USA
(b) Department of Mathematics & Statistics, Georgia State University, Atlanta, GA 30303, USA

* Corresponding author. E-mail address: nglenn@stat.sc.edu (N.L. Glenn).

Abstract

Maximum likelihood methods are by far the most popular methods for deriving statistical estimators. However, parametric likelihoods require distributional specifications. The empirical likelihood is a nonparametric likelihood function that does not require such distributional assumptions, but is otherwise analogous to its parametric counterpart. Both likelihoods assume that the random variables are independent with a common distribution. A nonparametric likelihood function for data that are independent, but not necessarily identically distributed, is introduced. The contaminated normal density is used to compare the robustness properties of weighted empirical likelihood estimators to those of empirical likelihood estimators. It is shown that as the contamination level of the sample increases, the root mean squared error of the empirical likelihood estimator for the mean increases. Conversely, the root mean squared error of the weighted empirical likelihood estimator for the mean remains closer to the theoretical root mean squared error. © 2006 Elsevier B.V. All rights reserved.

Keywords: Empirical likelihood; Weighted likelihood; Contaminated normal distribution

1. Introduction

Likelihood-based methods are effective in finding efficient estimators, constructing tests with good power properties, and quantifying uncertainty through confidence intervals and confidence regions (Owen, 2001). Nevertheless, when the distribution is misspecified, parametric likelihood-based estimates are inefficient. In this case the empirical likelihood is preferred because it does not require such distributional assumptions. Weighted empirical likelihood requires even fewer distributional assumptions, since it relaxes the identically distributed assumption. Hence, it can offset problems arising from auxiliary information, such as contamination, by modifying the constraints or the objective function. We develop weighted empirical likelihood for point estimators; the approach extends to confidence intervals, confidence regions, and other likelihood-based methods.

Empirical likelihood methods extend classical maximum likelihood methods for random samples from a common distribution of known functional form to the situation where the form of the distribution is unknown. In many applications, the observations may be from different distributions having a common mean. For example, if two types of measuring instruments have different variabilities and these instruments are used to obtain data, a fraction of these data comes from a more variable distribution. Classical robustness considered the problem of estimating the mean in one such case. In particular, Tukey (1960) introduced the contaminated normal family of densities CN(γ, σ²):

f_{\gamma,\sigma}(x) = \frac{1-\gamma}{\sqrt{2\pi}}\, e^{-x^2/2} + \frac{\gamma}{\sigma\sqrt{2\pi}}\, e^{-x^2/(2\sigma^2)},   (1)

where γ, 0 ≤ γ ≤ 1, is the contamination parameter and σ is the scale parameter. The robustness implications of contaminated data were studied by Lehmann (1983), Gastwirth and Cohen (1970), Andrews and Mallows (1974), and more recently by Taskinen et al. (2003). Contaminated data also arise in regression (Sinha and Wiens, 2002) and in financial data (Ellis et al., 2003).
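To make the contaminated normal model concrete, the following sketch shows how CN(γ, σ²) samples of the kind studied in Section 4 can be generated, and how the density in Eq. (1) is evaluated. It is a minimal illustration in Python (the paper's own simulations used S-PLUS), and the function names are ours:

```python
import numpy as np

def cn_density(x, gamma, sigma):
    """Contaminated normal density CN(gamma, sigma^2) of Eq. (1):
    a (1 - gamma)/gamma mixture of N(0, 1) and N(0, sigma^2)."""
    phi1 = np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)
    phi2 = np.exp(-x**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))
    return (1.0 - gamma) * phi1 + gamma * phi2

def cn_sample(n, gamma, sigma, rng=None):
    """Each observation is N(0, sigma^2) with probability gamma,
    and N(0, 1) otherwise."""
    rng = np.random.default_rng() if rng is None else rng
    scale = np.where(rng.random(n) < gamma, sigma, 1.0)
    return rng.normal(0.0, scale)

# A 10%-contaminated sample with sigma^2 = 9, as in Section 4:
x = cn_sample(100, gamma=0.10, sigma=3.0, rng=np.random.default_rng(1))
```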

We compare the properties of the weighted empirical likelihood estimate to those of the usual empirical likelihood, the trimmed mean, and the Winsorized mean in the context of robust estimation. Results indicate that the weighted empirical likelihood estimate is somewhat more robust to contamination than the usual empirical likelihood estimate. As the sample size increases, weighted empirical likelihood estimators are comparable to the trimmed mean and the Winsorized mean. The disadvantage of the trimmed mean and the Winsorized mean is that the amount of trim and the number of observations replaced are somewhat arbitrary.

In Section 2, we review the empirical likelihood approach and introduce the weighted empirical likelihood for the mean. The major difference between weighted empirical likelihood and empirical likelihood is that the former incorporates a weight vector that weighs each observation's contribution to the likelihood function. Section 3 details how to obtain the weight vector, presents the nonlinear programming problem, then solves it to obtain the weighted empirical likelihood estimator for the mean. Section 4 explores the properties of the weighted empirical likelihood estimator when estimating the mean in the contaminated normal model. Concluding remarks are presented in Section 5.

2. Definitions

2.1. Empirical likelihood

For a random sample X = {X_1, X_2, ..., X_n} of size n, the parametric likelihood

L(\theta) = L(\theta; X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} f(X_i; \theta), \quad \theta \in \Theta,   (2)

provided {X_1, X_2, ..., X_n} are independent, is a function of θ, which takes values in the space Θ. When the density function f(x; θ) is unknown, one can use the empirical likelihood method (Owen, 1988, 2001).

Definition 2.1. Suppose {X_1, X_2, ..., X_n} is a random sample from an unknown distribution. Let the parameter θ denote the mean. Suppose p_i is the probability mass placed on X_i, with \sum_{i=1}^{n} p_i = 1 and p_i ≥ 0. Let t(p) = \sum_{i=1}^{n} p_i X_i denote the value t assumes at p. The empirical likelihood (Owen, 1988, 2001) for θ is defined as

L(\theta) = \max_{p\,:\,t(p)=\theta} \prod_{i=1}^{n} p_i.   (3)

For all θ in the convex hull of X,

\max_{p\,:\,t(p)=\theta} \prod_{i=1}^{n} p_i \le \prod_{i=1}^{n} \frac{1}{n} = L(\hat{\theta}),   (4)

where \hat{\theta} = \sum_{i=1}^{n} X_i / n. Therefore, L(\hat{\theta}) = n^{-n} = \max_{\theta} L(\theta). The corresponding empirical log-likelihood is

l(\theta) = \log\bigl( L(\theta) / L(\hat{\theta}) \bigr).   (5)

To determine p_i, 1 ≤ i ≤ n, we consider the following nonlinear programming problem:

\max_{p} \sum_{i=1}^{n} \log(n p_i) \quad \text{subject to} \quad \sum_{i=1}^{n} p_i X_i = \theta, \quad \sum_{i=1}^{n} p_i = 1, \quad p_i \ge 0.   (6)

Example 2.1 illustrates the definition.

Example 2.1. Let θ equal the population mean μ. Plot the empirical likelihood for the population mean, Eq. (3), for the small random sample X = {1, 3, 7} with sample mean x̄ = 11/3 ≈ 3.67.

Solution: We use an equilateral triangle to represent the probability vector p = (p_1, p_2, p_3) that corresponds to a specific μ. Since n = 3, p = (p_1, p_2, p_3) can be represented in barycentric coordinates. Each data point is associated with a vertex of an equilateral triangle. Without loss of generality, suppose that X_1 = 1 and X_2 = 3 are associated with the bottom left and right vertices, respectively, and that X_3 = 7 is associated with the top vertex. The p_i's, i = 1, 2, 3, represent probabilities, and p = (p_1, p_2, p_3) gives a multinomial distribution on the n = 3 points. Hence, the probability vectors that correspond to 1 and 3 are (1, 0, 0) and (0, 1, 0), respectively. For μ values other than {1, 3, 7}, a convex combination determines the corresponding probability vectors:

p_\gamma = \gamma p_i + (1 - \gamma) p_j,   (7)

where γ ∈ [0, 1] and i, j ∈ {1, 2, 3}, i ≠ j. Since the empirical likelihood is supported on the data, values of μ that lie outside the convex hull of the data are not considered. All probability vectors corresponding to specific mean values lie on lines that intersect distinct sides of the equilateral triangle. Fig. 1 is a plot of the product of the elements of p_γ for five possible values of μ. The plot in Fig. 1 uses integer values for convenience, but real numbers could have been used. The maximum value of each curve is the value of the empirical likelihood for that particular value of the mean. Hence, the empirical likelihood is a profile likelihood. The empirical likelihood ratio is the empirical likelihood divided by the largest possible value of the empirical likelihood, n^{-n}.

Example 2.1 describes one approach for finding the empirical likelihood value for the mean. It is useful to summarize the steps involved in obtaining the empirical likelihood function for the mean (a numerical sketch follows the list):

1. Choose an interval of hypothesized mean values.
2. For each mean value μ in the interval, find several corresponding probability vectors p_γ, γ ∈ [0, 1].
3. For each μ, graph γ vs. the product of the elements of p_γ; see Fig. 1.
4. Find the maximum of each curve; see Fig. 1.
5. Each optimal value found in the previous step is the empirical likelihood for the corresponding μ. The empirical likelihood ratio function is the empirical likelihood divided by the nonparametric maximum likelihood estimator of the cumulative distribution function; see Fig. 2.
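These steps can be carried out numerically. For n = 3 the set of probability vectors with t(p) = μ is exactly the chord of the simplex whose endpoints lie on two distinct sides of the triangle, so profiling along Eq. (7)-style convex combinations of those endpoints recovers the empirical likelihood. A minimal Python sketch for the data of Example 2.1 (the helper names are ours):

```python
import numpy as np

X = np.array([1.0, 3.0, 7.0])

def edge_point(i, j, mu):
    """Probability vector supported on {X[i], X[j]} with mean mu,
    i.e. the Eq. (7) combination g*e_i + (1 - g)*e_j."""
    g = (X[j] - mu) / (X[j] - X[i])
    p = np.zeros(3)
    p[i], p[j] = g, 1.0 - g
    return p

def el_for_mean(mu, n_grid=2001):
    """Empirical likelihood L(mu) of Eq. (3) for n = 3: maximize the
    product of p along the chord of the simplex on which t(p) = mu."""
    if mu <= X[1]:   # chord endpoints lie on sides (X1,X2) and (X1,X3)
        a, b = edge_point(0, 1, mu), edge_point(0, 2, mu)
    else:            # ... on sides (X2,X3) and (X1,X3)
        a, b = edge_point(1, 2, mu), edge_point(0, 2, mu)
    g = np.linspace(0.0, 1.0, n_grid)[:, None]
    p = g * a + (1.0 - g) * b          # points p_gamma along the chord
    return p.prod(axis=1).max()

# Empirical likelihood ratio values, as plotted in Fig. 2: L(mu) / n^{-n}
for mu in (2, 3, 4, 5, 6):
    print(mu, el_for_mean(mu) * 3.0**3)
```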

Fig. 1. Plots of γ ∈ [0, 1] vs. the product of the elements of p_γ for X = {1, 3, 7} and μ = 2, 3, 4, 5, 6. The indicated maxima (filled points) are the values of the empirical likelihood function for the corresponding mean values μ.

Fig. 2. Empirical likelihood ratio curve for the parameter μ for the data in Example 2.1. Note that the curve reaches its maximum value at the sample mean of 11/3 ≈ 3.67.

Definition 2.1 assumes the X_i are independent and identically distributed. Often one has a sample where the X_i's are independent, but not identically distributed. Choi et al. (2000) use a weighted parametric likelihood in the nonidentical distribution case. Weighted empirical likelihood extends the parametric method of Choi et al. (2000) to the nonparametric setting by incorporating a weight vector that tilts the log-empirical likelihood function, allowing a different weight to be assigned to each term in the sum that forms the log empirical likelihood ratio objective function. This methodology generalizes Eq. (6).

2.2. Weighted empirical likelihood

To extend the procedure of Choi et al. (2000) to empirical likelihood, we weight the contribution of each data point to the log-empirical likelihood function. Weighting the data's contribution amounts to starting with a discrete uniform distribution on n points and tilting the distribution to achieve a desired empirical log-likelihood. The resulting tilted distribution is a multinomial distribution on n points. Notice that each term in the objective function Eq. (6) receives the

same weight. Each term in an analogous weighted empirical likelihood ratio function receives weight w_i, with \sum_{i=1}^{n} w_i = 1 and w_i > 0, as demonstrated in the weighted empirical likelihood definition for the mean:

Definition 2.2. Suppose X = {X_1, X_2, ..., X_n} come from distributions with a common mean θ and different variances. Suppose p_i is the probability mass placed on X_i, with \sum_{i=1}^{n} p_i = 1 and p_i ≥ 0. Let t(p) = \sum_{i=1}^{n} p_i X_i denote the value t takes on when p_i is the probability mass placed on X_i. Given the weight vector w, \sum_{i=1}^{n} w_i = 1, w_i > 0, the weighted empirical likelihood for the parameter θ is defined as

WEL(\theta) = \max_{p\,:\,t(p)=\theta} \prod_{i=1}^{n} p_i^{n w_i}.   (8)

For all θ in the convex hull of X,

\prod_{i=1}^{n} p_i^{n w_i} \le \prod_{i=1}^{n} w_i^{n w_i}.   (9)

Therefore,

\max_{p\,:\,t(p)=\theta} \prod_{i=1}^{n} p_i^{n w_i} \le \prod_{i=1}^{n} w_i^{n w_i} = WEL(\hat{\theta}),

where \hat{\theta} = \sum_{i=1}^{n} w_i X_i. From this, we obtain the weighted empirical log-likelihood ratio for the mean:

wel(\theta) = \log\bigl( WEL(\theta) / WEL(\hat{\theta}) \bigr) = \max_{p\,:\,t(p)=\theta} \sum_{i=1}^{n} n w_i \log(p_i / w_i),   (10)

where

WEL(\hat{\theta}) = \prod_{i=1}^{n} w_i^{n w_i}.   (11)

Each term in the last equality of Eq. (10) receives weight w_i, scaled by n. When w_i = 1/n, (10) reduces to (6).

3. The weighted empirical likelihood estimator

3.1. The weight vector w

Consider the case where σ > 1. We use the data-labeling rule of Tietjen and Moore (1972) to detect the contamination. After detecting the contamination we define the weight vector

w = w_b + c\,d,   (12)

where w_b = (1/n, ..., 1/n)^T, c is a scalar, and d is a vector of unit length.

Theorem 3.1. If X = {X_1, X_2, ..., X_n} is a sample of size n, and k is the number of points labeled as contamination, then d is the vector of unit length defined as

d = \frac{1}{\sqrt{kn/(n-k)}}\, y,   (13)

where y is the rank-ordered vector

y = \bigl( \underbrace{k/(n-k), \ldots, k/(n-k)}_{n-k}, \underbrace{-1, \ldots, -1}_{k} \bigr)^T.

The rank-ordered vector y contains n − k elements equal to k/(n − k) and k elements equal to −1. See Appendix A for the proof. The weight vector decreases the contribution of the contamination to the likelihood function. The scalar c is a positive constant that is bounded above by a function of the sample size.

Theorem 3.2. Suppose X = {X_1, X_2, ..., X_n} is a sample of size n. Without loss of generality, assume X is contaminated by one point. If w is defined as in Eq. (12), then

0 \le c < \frac{1}{n}\sqrt{\frac{n}{n-1}}.   (14)

See Appendix A for the proof.
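Given the number k of labeled points, constructing w is direct. A minimal sketch (Python; it assumes the sample has been rank ordered so that the k points labeled as contamination by the Tietjen–Moore rule occupy the last k positions, and it picks a c just below the Theorem 3.2 bound):

```python
import numpy as np

def weight_vector(n, k, c=None):
    """Weight vector w = w_b + c*d of Eq. (12), with d from Theorem 3.1.

    Assumes the k points labeled as contamination occupy the last k
    positions of the rank-ordered sample."""
    if c is None:
        c = 0.99 / n * np.sqrt(n / (n - 1.0))  # just below the Theorem 3.2 bound
    y = np.concatenate([np.full(n - k, k / (n - k)), np.full(k, -1.0)])
    d = y / np.linalg.norm(y)        # unit vector; ||y|| = sqrt(kn/(n-k))
    w = np.full(n, 1.0 / n) + c * d  # Eq. (12); sum(d) = 0 keeps sum(w) = 1
    assert np.isclose(w.sum(), 1.0) and (w > 0).all()
    return w

w = weight_vector(n=10, k=1)  # down-weights the single labeled outlier
```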

3.2. Obtaining weighted empirical likelihood estimators using nonlinear programming

Suppose w is determined as in Section 3.1, p is defined as in Definition 2.2, and μ is the population mean. Determining a weighted empirical likelihood estimator for μ begins with solving the following nonlinear programming problem:

\max_{p} \sum_{i=1}^{n} n w_i \log(p_i / w_i) \quad \text{subject to} \quad \sum_{i=1}^{n} p_i X_i = \mu, \quad \sum_{i=1}^{n} p_i = 1, \quad p_i \ge 0.   (15)

The above nonlinear programming problem yields optimal probability vectors p for weighted empirical likelihood. To formulate the objective function in (15), assume that the weight vector w is given; it is computed as explained in Section 3.1. As with empirical likelihood, we maximize the product of the elements of the weighted empirical likelihood probability vector, since the maximum indicates the most plausible parameter value. WEL(θ) and WEL(θ̂) are defined as in Eqs. (8) and (11), and the maximum of the weighted empirical likelihood ratio is denoted

\max_{p} \prod_{i=1}^{n} p_i^{n w_i} \Big/ \prod_{i=1}^{n} w_i^{n w_i}.   (16)

As usual, we maximize the logarithm of Eq. (16), which yields

\max_{p} \sum_{i=1}^{n} n w_i \log(p_i / w_i).   (17)

The steps outlined in the above paragraphs yield the objective function in Eq. (15). Because the weighted empirical likelihood objective function is a strictly concave function on a convex set of probability vectors, the Karush–Kuhn–Tucker theorem states that a unique global maximum exists (Nocedal and Wright, 1999). The form of this optimum is derived in Section 3.3.
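Before deriving the closed form of the optimum, note that (15) can also be solved directly with an off-the-shelf constrained optimizer, which gives a useful numerical cross-check of the derivation in Section 3.3. A minimal SciPy sketch (the function name is ours):

```python
import numpy as np
from scipy.optimize import minimize

def wel_probs_direct(mu, x, w):
    """Solve the nonlinear program (15) numerically: maximize
    sum_i n*w_i*log(p_i/w_i) subject to sum(p) = 1 and sum(p*x) = mu."""
    n = len(x)
    objective = lambda p: -np.sum(n * w * np.log(p / w))
    constraints = ({"type": "eq", "fun": lambda p: p.sum() - 1.0},
                   {"type": "eq", "fun": lambda p: p @ x - mu})
    res = minimize(objective, w.copy(), method="SLSQP",
                   constraints=constraints, bounds=[(1e-10, 1.0)] * n)
    return res.x

p_star = wel_probs_direct(3.0, np.array([1.0, 3.0, 7.0]), np.full(3, 1 / 3))
```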

3.3. Karush–Kuhn–Tucker theorem

Obtaining univariate weighted empirical likelihood point estimators starts with solving the nonlinear programming problem presented in Section 3.2. The first step is to determine the Lagrangian for (15), which is

\mathcal{L}(p, \lambda_1, \lambda_2 \mid w, x, \mu) = \sum_{i=1}^{n} n w_i \log(p_i / w_i) + \lambda_1 \Bigl(1 - \sum_{i=1}^{n} p_i\Bigr) + \lambda_2 \Bigl(\mu - \sum_{i=1}^{n} X_i p_i\Bigr).

In the following paragraphs, we use the above equation to derive the form of the optimal weighted empirical likelihood probability vectors. The optimality conditions for the constrained optimization problem (15) are the Karush–Kuhn–Tucker conditions. Therefore, a first-order necessary condition is that

\nabla \mathcal{L}(p^*, \lambda_1^*, \lambda_2^* \mid w, x, \mu) = 0,   (18)

where p^* is the optimal probability vector for the weighted empirical likelihood, and λ^* = (λ_1^*, λ_2^*) is a Lagrange multiplier vector such that the Karush–Kuhn–Tucker conditions are satisfied at (p^*, λ^*). In order to derive the form of the optimal probability vector for weighted empirical likelihood, derive an expression for the Lagrange multiplier λ_1 as follows. Setting the gradient of the Lagrangian, Eq. (18), to zero gives

\frac{\partial \mathcal{L}}{\partial p_i} = \frac{n w_i}{p_i} - \lambda_1 - \lambda_2 X_i = 0, \quad i = 1, 2, \ldots, n.   (19)

Multiplying Eq. (19) by p_i yields

n w_i - \lambda_1 p_i - \lambda_2 X_i p_i = 0.   (20)

Since both w and p satisfy the axioms of probability vectors, summing each term in (20) gives n \sum_{i=1}^{n} w_i - \lambda_1 - \lambda_2 \mu = 0. Therefore,

\lambda_1 = n \sum_{i=1}^{n} w_i - \lambda_2 \mu   (21)

is an expression for λ_1. Solving Eq. (20) for n w_i yields n w_i = \lambda_1 p_i + \lambda_2 X_i p_i. Replacing λ_1 in this equation by the right-hand side of Eq. (21) yields n w_i = (n - \lambda_2 \mu) p_i + \lambda_2 X_i p_i, using \sum_{j} w_j = 1. The final step is to solve for p_i. Therefore,

p_i = \frac{n w_i}{n \sum_{j=1}^{n} w_j - \lambda_2 (\mu - X_i)}   (22)

is the form of the ith element of the optimal probability vector for the weighted empirical likelihood.

To evaluate Eq. (22), we provide values for w_i, μ, and λ_2. First consider w_i, the ith (1 ≤ i ≤ n) element of the weight vector w. We use scale estimation to find an appropriate w, as detailed in Section 3.1. A scale estimator is a statistic that is equivariant under scale transformations. That is, if the parameter is transformed, then the answer is also transformed (Glenn, 2002).

After determining w, we consider the parameter μ. As the values of μ must lie in the convex hull of the data, μ must be in (X_(1), X_(n)) in order to satisfy the constraint \sum_{i=1}^{n} p_i X_i = \mu with p_i ≥ 0 and \sum_{i=1}^{n} p_i = 1. The global maximum of the weighted empirical likelihood function is μ^*. To find μ^* given w, note that

wel(\mu \mid w, x) = \sum_{i=1}^{n} n w_i \log\bigl( p_i^*(\mu) / w_i \bigr),   (23)

where p_i^*(μ) is the ith element of p^*. It satisfies the constraints \sum_{i=1}^{n} p_i^* = 1 and \sum_{i=1}^{n} p_i^* X_i = \mu for a fixed μ. Therefore,

\mu^* = \arg\max_{\mu}\, wel(\mu \mid w, x).   (24)

The Lagrange multiplier λ_2 is chosen such that the constraints

\sum_{i=1}^{n} p_i = 1   (25)

and

\sum_{i=1}^{n} p_i X_i = \mu   (26)

are satisfied. Suppose p_i is defined as in Eq. (22). Let λ_2^* be a Lagrange multiplier value that allows the constraints (25) and (26) to be satisfied. The first of three steps for choosing λ_2^* is to recall Eq. (22), the form of the elements of weighted empirical likelihood's optimal probability vector. From Eqs. (22) and (25) we define the function

f(\lambda_2) = \sum_{i=1}^{n} \frac{n w_i}{n \sum_{j=1}^{n} w_j - \lambda_2 (\mu - X_i)} - 1.   (27)

The Lagrange multiplier λ_2^* is a root of f(λ_2) = 0 that satisfies the constraints (25) and (26). Hence, f(λ_2^*) = 0. Notice that λ_2 = 0 is always a solution of Eq. (27). In order to find the root λ_2^* of f(λ_2), take the derivative

f'(\lambda_2) = \sum_{i=1}^{n} \frac{n w_i (\mu - X_i)}{\bigl\{ n \sum_{j=1}^{n} w_j - \lambda_2 (\mu - X_i) \bigr\}^2},   (28)

then evaluate it at λ_2 = 0. The process of finding the root λ_2^* of f(λ_2) separates into three cases. All three cases use the fact that λ_2 = 0 is always a solution. In Cases II and III there are two roots, with the nonzero root either negative or positive.

Case I: f'(0) = 0. In this case, Eq. (27) has one root. Therefore, λ_2^* = 0.
Case II: f'(0) > 0. This implies that \sum_{i=1}^{n} w_i (\mu - X_i) > 0. Therefore, λ_2^* < 0.
Case III: f'(0) < 0. This implies that \sum_{i=1}^{n} w_i (\mu - X_i) < 0. Therefore, λ_2^* > 0.

From the two previous steps, one can infer whether λ_2^* is positive, negative, or zero. The third and final step for choosing λ_2^* is to find its exact value. We apply the bisection method to

g(\lambda_2) = \sum_{i=1}^{n} \frac{n w_i}{n \sum_{j=1}^{n} w_j - \lambda_2 (\mu - X_i)} - 1 = 0,   (29)

which uses the lower (λ_−) and upper (λ_+) bounds of λ_2. To derive the bounds, notice that

μ > X_i implies λ_2 < 1/(μ − X_i);
μ < X_i implies λ_2 > 1/(μ − X_i);
μ = X_i leaves λ_2 unrestricted.

The bounds are λ_− = 1/(μ − X_(n)) and λ_+ = 1/(μ − X_(1)). Therefore, λ_2^* is the value of λ_2 that lies in the interval (λ_−, λ_+) and allows the constraints Eqs. (25) and (26) to be satisfied. The above argument focuses on Eq. (25); however, when Eq. (25) is satisfied, Eq. (26) is implicitly satisfied.
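The procedure of this section can be summarized in code. The sketch below (Python; the names are ours) implements the optimal probabilities of Eq. (22), classifies Cases I–III by the sign of f'(0) from Eq. (28), and finds the nonzero root by bisection. For numerical safety it brackets the root with the positivity limits of the denominators in Eq. (22), a slightly wider interval that contains (λ_−, λ_+):

```python
import numpy as np

def p_opt(lam2, mu, x, w):
    """Optimal probability vector of Eq. (22)."""
    n = len(x)
    return n * w / (n * w.sum() - lam2 * (mu - x))

def solve_lambda2(mu, x, w, iters=200):
    """Nonzero root of f(lam2) = sum_i p_i(lam2) - 1, Eqs. (27)-(29).
    lam2 = 0 is always a root (there p = w); the sign of f'(0) tells on
    which side of zero the second root lies (Cases I-III)."""
    n, s = len(x), w.sum()
    f = lambda lam: p_opt(lam, mu, x, w).sum() - 1.0
    fp0 = np.sum(w * (mu - x)) / (n * s**2)   # f'(0) from Eq. (28)
    if np.isclose(fp0, 0.0):                  # Case I: mu = sum(w * x)
        return 0.0
    shrink = 1.0 - 1e-9
    if fp0 > 0:                               # Case II: root is negative
        lo, hi = n * s / (mu - x.max()) * shrink, -1e-12
    else:                                     # Case III: root is positive
        lo, hi = 1e-12, n * s / (mu - x.min()) * shrink
    for _ in range(iters):                    # plain bisection
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def wel(mu, x, w):
    """Weighted empirical log-likelihood ratio wel(mu), Eqs. (10)/(23)."""
    p = p_opt(solve_lambda2(mu, x, w), mu, x, w)
    return np.sum(len(x) * w * np.log(p / w))

# Eq. (24): maximize over a grid inside the convex hull (X_(1), X_(n)).
x = np.sort(np.array([1.0, 3.0, 7.0, 30.0]))
w = np.array([0.3, 0.3, 0.3, 0.1])            # e.g., outlier down-weighted
grid = np.linspace(x.min() + 0.01, x.max() - 0.01, 500)
mu_star = grid[np.argmax([wel(m, x, w) for m in grid])]
```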

4. Empirical results

The root mean squared error of an estimator, the square root of the average squared difference between the estimator and the parameter, is a good measure of the performance of an estimator. To compare robustness properties of the weighted empirical likelihood estimator for the mean to those of the empirical likelihood estimator for the mean, the trimmed mean, and the Winsorized mean, we compute point estimates of the mean for the stated estimators for various sample sizes. We then compare the root mean squared errors of the mean in each case. In the first case, the samples are not contaminated; see the first row of Tables 1–4. We simulate 1000 samples of sizes n = 10, 20, 50, 100 from a standard normal distribution. In the second case, the level of contamination is 10%. We simulate 1000 samples of sizes n = 10, 20, 50, 100 from a contaminated normal distribution in which an observation is from a N(0, 9) distribution with probability 0.10 and otherwise from a N(0, 1) distribution. In the third case, the level of contamination is 20%. We simulate 1000 samples of sizes n = 10, 20, 50, 100 from a contaminated normal distribution in which an observation is from a N(0, 9) distribution with probability 0.20 and otherwise from N(0, 1).

For each level of contamination, Tables 1–4 contain the theoretical root mean squared error of the mean and the root mean squared errors of the weighted empirical likelihood, empirical likelihood, trimmed, and Winsorized estimators of the mean. The tables also contain the fraction of times the Shapiro–Wilk test fails to reject the null hypothesis of standard normality. Simulations were carried out with an S-PLUS program that is available from the authors.

When there is no contamination, the theoretical root mean squared error of the mean given n = 10 and σ² = 1 is \sigma_{\bar{X}} = \sqrt{\sigma^2/n} = \sqrt{1/10} \approx 0.316. In the case of contamination, the theoretical mean squared error of the mean is

\mathrm{MSE} = (1 - \gamma)/n + \gamma \sigma^2 / n.   (30)

The theoretical root mean squared error of the mean is the square root of Eq. (30). To simulate the root mean squared errors of the weighted empirical likelihood, empirical likelihood, trimmed, and Winsorized means, we estimate the mean for each sample using each method. For each method we square the errors, sum the squared errors, then divide by the number of simulations. Finally, we take the square root of the average squared errors.

The simulation study indicates that as the contamination level increases, the empirical likelihood's root mean squared error increases. However, the weighted empirical likelihood's root mean squared error remains closer to the theoretical root mean squared error. As the sample size increases, the Winsorized mean is comparable to weighted empirical likelihood. In some cases involving larger sample sizes, the trimmed mean performs better than the weighted empirical likelihood estimator. However, the amount of trim is somewhat arbitrary. We chose 5%, as it is typically used.
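The simulation design is straightforward to reproduce in outline. The sketch below uses Python with NumPy/SciPy rather than the authors' S-PLUS; for brevity it compares only the sample mean and the 5% trimmed mean against the theoretical RMSE of Eq. (30) and records the Shapiro–Wilk rejection rate (the WEL and EL estimators of Section 3 would slot into the same loop):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def theoretical_rmse(n, gamma, sigma2=9.0):
    """Square root of Eq. (30): MSE = ((1 - gamma) + gamma*sigma^2)/n."""
    return np.sqrt(((1.0 - gamma) + gamma * sigma2) / n)

def simulate(n, gamma, reps=1000, trim=0.05):
    """RMSEs of the sample mean and trimmed mean over `reps` contaminated
    samples, plus the fraction of Shapiro-Wilk rejections at level 0.05."""
    means, tmeans, rejections = [], [], 0
    for _ in range(reps):
        scale = np.where(rng.random(n) < gamma, 3.0, 1.0)  # N(0,9) w.p. gamma
        x = rng.normal(0.0, scale)
        means.append(x.mean())          # true mean is 0, so errors = estimates
        tmeans.append(stats.trim_mean(x, trim))
        rejections += stats.shapiro(x).pvalue < 0.05
    rmse = lambda v: float(np.sqrt(np.mean(np.square(v))))
    return rmse(means), rmse(tmeans), rejections / reps

for gamma in (0.0, 0.10, 0.20):
    print(gamma, theoretical_rmse(10, gamma), simulate(10, gamma))
```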
When contamination is not present, weighted empirical likelihood reduces to empirical likelihood. The Shapiro–Wilk test, which is commonly regarded as one of the most powerful tests of normality, has low power to differentiate a contaminated normal from a standard normal distribution in all cases. This indicates that the weighted empirical likelihood estimator for the mean is superior to the empirical likelihood estimator for the mean precisely when the Shapiro–Wilk test is incapable of differentiating a contaminated normal from a standard normal distribution. Therefore, the weighted empirical likelihood is less sensitive to contamination than the empirical likelihood. However, the relative robustness of weighted empirical likelihood did not increase with the amount of contamination. In an unreported simulation we kept track of which observations came from the contaminant, and found that the robustness of the weighted empirical likelihood relative to that of the empirical likelihood increased with the level of contamination.

Table 1
Results from simulating 1000 samples of size n = 10 from a N(0, 1) distribution, and 1000 samples of size n = 10 from a contaminated normal distribution

Cont. (%) | RMSE | WEL (EL) | TM (WM) | SW
[numeric entries not recoverable from the source]

Table 2
Results from simulating 1000 samples of size n = 20 from a N(0, 1) distribution, and 1000 samples of size n = 20 from a contaminated normal distribution

Cont. (%) | RMSE | WEL (EL) | TM (WM) | SW
[numeric entries not recoverable from the source]

Table 3
Results from simulating 1000 samples of size n = 50 from a N(0, 1) distribution, and 1000 samples of size n = 50 from a contaminated normal distribution

Cont. (%) | RMSE | WEL (EL) | TM (WM) | SW
[numeric entries not recoverable from the source]

Table 4
Results from simulating 1000 samples of size n = 100 from a N(0, 1) distribution, and 1000 samples of size n = 100 from a contaminated normal distribution

Cont. (%) | RMSE | WEL (EL) | TM (WM) | SW
[numeric entries not recoverable from the source]

Notes for Tables 1–4: The contamination is from a N(0, 9) distribution with probabilities 0.10 and 0.20. RMSE is the theoretical root mean squared error of the mean. WEL is the root mean squared error for the weighted empirical likelihood estimate of the mean; the empirical likelihood's root mean squared error is in parentheses. TM is the root mean squared error for the trimmed mean; the root mean squared error for the Winsorized mean is in parentheses. SW is the fraction of times the Shapiro–Wilk test rejects the null hypothesis of standard normality.

5. Discussion

We construct the weighted empirical likelihood function, a nonparametric likelihood function that is suitable for data that are independent but not necessarily identically distributed. We compare the robustness properties of weighted empirical likelihood to those of empirical likelihood, the trimmed mean, and the Winsorized mean by comparing the root mean

squared errors of the mean for each method. Using the contaminated normal distribution, we showed that as the level of contamination increases, the weighted empirical likelihood's root mean squared error for the mean remains closer to the theoretical root mean squared error for the mean than the empirical likelihood's does. The weighted empirical likelihood is comparable to the trimmed mean and the Winsorized mean as the sample size increases.

We also conducted a Shapiro–Wilk test on the data. The Shapiro–Wilk test was not capable of differentiating a contaminated normal from a standard normal even though the sample sizes were small. When the Shapiro–Wilk test did not detect nonnormality, the weighted empirical likelihood estimator was less sensitive to contamination than the empirical likelihood estimator. In future research, we will explore large-sample properties of the weighted empirical likelihood estimator.

Acknowledgments

The authors would like to thank Professor Joseph Gastwirth of the Department of Statistics, George Washington University, for suggestions regarding robustness. The authors also thank Professors David W. Scott and Katherine Ensor of the Department of Statistics, Rice University, and Professor Yin Zhang of the Department of Computational and Applied Mathematics, Rice University.

Appendix A. Proofs

Proof of Theorem 3.1. By the definition of a unit vector, d = (1/\|y\|)\, y. Since \|y\| = \sqrt{kn/(n-k)},

d = \frac{1}{\sqrt{kn/(n-k)}} \bigl( \underbrace{k/(n-k), \ldots, k/(n-k)}_{n-k}, \underbrace{-1, \ldots, -1}_{k} \bigr)^T.

This completes the proof.

Proof of Theorem 3.2. Without loss of generality, assume that X_n is the contamination. From Eq. (12) with k = 1,

w = \bigl( 1/n, \ldots, 1/n \bigr)^T + c \Bigl( \tfrac{1}{\sqrt{n(n-1)}}, \ldots, \tfrac{1}{\sqrt{n(n-1)}}, -\sqrt{\tfrac{n-1}{n}} \Bigr)^T.   (31)

Therefore

w[n] = \frac{1}{n} - c \sqrt{\frac{n-1}{n}}   (32)

is the element of w that corresponds to X_n. In order for the contamination to receive a weight less than or equal to that of the other data, the following must hold:

0 < w[n] \le \frac{1}{n}.   (33)

Substituting Eq. (32) into Eq. (33) yields the inequality

0 < \frac{1}{n} - c \sqrt{\frac{n-1}{n}} \le \frac{1}{n}.   (34)

Solving (34) for c yields

0 \le c < \frac{1}{n} \sqrt{\frac{n}{n-1}}.   (35)

This completes the proof.

References

Andrews, D.F., Mallows, C.L., 1974. Scale mixtures of normal distributions. J. Roy. Statist. Soc. Ser. B 36.
Choi, E., Hall, P., Presnell, B., 2000. Rendering parametric procedures more robust by empirically tilting the model. Biometrika 87 (2).
Ellis, S., Steyn, F., Venter, H., 2003. Fitting a Pareto-normal-Pareto distribution to the residuals of financial data. Comput. Statist. 18.
Gastwirth, J., Cohen, M., 1970. Small sample behavior of some robust linear estimators of location. J. Amer. Statist. Assoc. 65.
Glenn, N., 2002. Robust empirical likelihood. Ph.D. Thesis, Rice University, Houston, TX.
Lehmann, E.L., 1983. Theory of Point Estimation. Wiley, Inc., Canada.
Nocedal, J., Wright, S.J., 1999. Numerical Optimization. Springer, New York.
Owen, A., 1988. Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75.
Owen, A., 2001. Empirical Likelihood. Chapman & Hall/CRC Press, Boca Raton, FL.
Sinha, S., Wiens, D.P., 2002. Minimax weights for generalised M-estimation in biased regression models. Canadian J. Statist. 30 (3).
Taskinen, S., Kankainen, A., Oja, H., 2003. Sign test of independence between two random vectors. Statist. Probab. Lett. 62 (1).
Tietjen, G.L., Moore, R.H., 1972. Some Grubbs-type statistics for the detection of several outliers. Technometrics 14.
Tukey, J.W., 1960. A survey of sampling from contaminated distributions. In: Olkin, I., et al. (Eds.), Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press, Stanford, CA.


Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption Alisa A. Gorbunova and Boris Yu. Lemeshko Novosibirsk State Technical University Department of Applied Mathematics,

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Independent Component (IC) Models: New Extensions of the Multinormal Model

Independent Component (IC) Models: New Extensions of the Multinormal Model Independent Component (IC) Models: New Extensions of the Multinormal Model Davy Paindaveine (joint with Klaus Nordhausen, Hannu Oja, and Sara Taskinen) School of Public Health, ULB, April 2008 My research

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

Outlier detection and variable selection via difference based regression model and penalized regression

Outlier detection and variable selection via difference based regression model and penalized regression Journal of the Korean Data & Information Science Society 2018, 29(3), 815 825 http://dx.doi.org/10.7465/jkdi.2018.29.3.815 한국데이터정보과학회지 Outlier detection and variable selection via difference based regression

More information

Weighted tests of homogeneity for testing the number of components in a mixture

Weighted tests of homogeneity for testing the number of components in a mixture Computational Statistics & Data Analysis 41 (2003) 367 378 www.elsevier.com/locate/csda Weighted tests of homogeneity for testing the number of components in a mixture Edward Susko Department of Mathematics

More information

IE 5531: Engineering Optimization I

IE 5531: Engineering Optimization I IE 5531: Engineering Optimization I Lecture 12: Nonlinear optimization, continued Prof. John Gunnar Carlsson October 20, 2010 Prof. John Gunnar Carlsson IE 5531: Engineering Optimization I October 20,

More information

STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples.

STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. Rebecca Barter March 16, 2015 The χ 2 distribution The χ 2 distribution We have seen several instances

More information

Linear & nonlinear classifiers

Linear & nonlinear classifiers Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

I.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec - Spring 2010

I.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec - Spring 2010 I.3. LMI DUALITY Didier HENRION henrion@laas.fr EECI Graduate School on Control Supélec - Spring 2010 Primal and dual For primal problem p = inf x g 0 (x) s.t. g i (x) 0 define Lagrangian L(x, z) = g 0

More information

Maximum likelihood estimation

Maximum likelihood estimation Maximum likelihood estimation Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Maximum likelihood estimation 1/26 Outline 1 Statistical concepts 2 A short review of convex analysis and optimization

More information

A scale-free goodness-of-t statistic for the exponential distribution based on maximum correlations

A scale-free goodness-of-t statistic for the exponential distribution based on maximum correlations Journal of Statistical Planning and Inference 18 (22) 85 97 www.elsevier.com/locate/jspi A scale-free goodness-of-t statistic for the exponential distribution based on maximum correlations J. Fortiana,

More information

Duality Theory of Constrained Optimization

Duality Theory of Constrained Optimization Duality Theory of Constrained Optimization Robert M. Freund April, 2014 c 2014 Massachusetts Institute of Technology. All rights reserved. 1 2 1 The Practical Importance of Duality Duality is pervasive

More information

Maximum Likelihood, Logistic Regression, and Stochastic Gradient Training

Maximum Likelihood, Logistic Regression, and Stochastic Gradient Training Maximum Likelihood, Logistic Regression, and Stochastic Gradient Training Charles Elkan elkan@cs.ucsd.edu January 17, 2013 1 Principle of maximum likelihood Consider a family of probability distributions

More information