# On robust and efficient estimation of the center of. Symmetry.

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 On robust and efficient estimation of the center of symmetry Howard D. Bondell Department of Statistics, North Carolina State University Raleigh, NC , U.S.A ( Abstract In this paper, a class of estimators of the center of symmetry based on the empirical characteristic function is examined. In the spirit of the Hodges-Lehmann estimator, the resulting procedures are shown to be a function of the pairwise averages. The proposed procedures are also shown to have an equivalent representation as the minimizers of certain distances between two corresponding kernel density estimators. An alternative characterization of the Hodges-Lehmann estimator is established upon the use of a particularly simple choice of kernel. Keywords: Characteristic function; Efficiency; Hodges-Lehmann estimator; Robustness; Symmetry. 1

2 1 Introduction Many robust and nonparametric approaches to estimation of an unknown point of symmetry, θ, have been proposed in the literature. Among them are the Hodges- Lehmann estimator (Hodges and Lehmann, 1963) and the location M-estimators of Huber (1981). The use of characteristic function based procedures as a technique for robust estimation in the fully parametric setting is discussed by Heathcote (1977), Feuerverger and McDunnough (1981), and others. The use of empirical characteristic function procedures to test the hypothesis of distributional symmetry appears in Heathcote, Rachev, and Cheng (1995) and Henze, Klar, and Meintanis (2003). However, the use of a characteristic function approach to estimation under the simple assumption of distributional symmetry has not received attention. In this paper, a class of procedures based on minimizing a weighted norm of the imaginary part of the empirical characteristic function is investigated as an estimator of the center of symmetry. In the spirit of the Hodges-Lehmann estimator, the resulting procedures are shown to be a function of the pairwise averages. The proposed procedures are also shown to be equivalently represented as the minimizers of L 2 distances between two corresponding kernel density estimators. In view of this representation, an alternative characterization of the Hodges-Lehmann estimator is also established upon the choice of a particularly simple kernel. In the next section, the resulting family of procedures is described and is shown to be based on the Walsh averages. In section 3 the equivalence with an intuitive density estimation problem is shown. The asymptotic distribution is derived via standard theory for U-statistics in section 4. Specific examples, including a generalization of the Hodges-Lehmann estimator, are discussed in section 5, along with a small simulation study. 2

3 2 The estimating procedure Given a sample (x 1,..., x n ), the empirical characteristic function for each t IR is computed as: n n ˆf(t) = n 1 exp(itx j ) = n 1 {cos(tx j ) + i sin(tx j )}, (2.1) j=1 j=1 where i 1 is the imaginary unit. If the underlying distribution were symmetric about zero, the characteristic function would be strictly real. This is the idea behind the tests of symmetry of Heathcote et al. (1995) and Henze et al. (2003), along with the family of estimators of this paper. Heuristically, one would like to find the location parameter whose shift minimizes the imaginary part of the resulting characteristic function in some sense. The choice of objective function to be used here is a weighted integral involving the imaginary part of the characteristic function. Specifically, let σ 2 = Var (X). The chosen quantity to minimize is: [ 2 n D(θ) n 1 sin {t(x j θ)/σ}] w(t) dt, (2.2) j=1 where w(t) is the density function corresponding to some distribution symmetric around zero. Using the identity sin x sin y = {cos(x y) cos(x + y)}/2 one obtains: D(θ) = 1 n n 2 j=1 k=1 [cos {t(x j x k )/σ} cos {t(x j + x k 2θ)/σ}] w(t) dt. Using the symmetry of the distribution W, and the fact that the first term is independent of θ, the minimization problem can be expressed in the following form. The estimator, θ of the center of symmetry, θ, is given by: n n [ { ( θ arg min 1 ˆf 2 xj + x k w θ σ 2 j=1 k=1 )}] θ, (2.3) 3

4 where ˆf w ( ) represents the characteristic function of the distribution with density w(t). To derive the asymptotic distribution in section 4, it is useful to examine the estimating equation. The first order condition associated with (2.3) yields the estimating equation: n n j=1 k=1 { ( 2 xj + x k Ψ σ 2 )} θ = 0, (2.4) where Ψ(t) = ( / t) ˆf w (t), assuming that the derivative exists. Although it is possible that the estimating equation have multiple solutions, the estimator itself is defined as the minimizer of the objective function. 3 Representation via density estimation The estimator defined as the minimizer of (2.2) derived via the characteristic function has an equivalent representation as the minimizer of the L 2 distance between two kernel density estimates for a corresponding choice of kernel. Specifically, the relationship is given by the following theorem. Theorem 1 Let g(x) be a square-integrable density function symmetric around zero with corresponding characteristic function φ(t). Suppose that w(t) = k 1 φ 2 (t) for all t, with k = φ 2 (t) dt. Then the objective function based on the imaginary part of the empirical characteristic function using the weight function w(t) can alternatively be expressed as where and y j (x j θ)/σ. D(θ) = π 2k { fn (x) f n ( x)} 2 dx, f n (x) = n 1 n j=1 g(x y j), 4

5 Proof. First note that { n 1 n j=1 sin(ty j)} 2 w(t) = 1 4k n 1 n j=1 exp(ity j) φ(t) n 1 n j=1 exp( ity j) φ(t) 2. Now n 1 n j=1 exp(ity j) φ(t) is the characteristic function of the convolution of the distribution whose density is given by g(x) with the empirical distribution of y 1,..., y n, whereas n 1 n j=1 exp( ity j) φ(t) is the characteristic function of the convolution with the empirical distribution of y 1,..., y n. The densities of these two convolutions are given by f n (x) and f n ( x), respectively. Now for square-integrable densities f(x) and setting ˆf(t) exp(itx)f(x) dx, the Plancherel-Parseval Relation, gives ˆf(t) 2 dt = 2π f(x) 2 dx. The theorem then follows. Remark: Note that f n is a nonparametric kernel density estimator with kernel g(x) applied to the standardized data, whereas f n is the same density estimator applied to the standardized data after reflection about the origin. This representation in terms of an L 2 distance between the corresponding density estimates yields an alternative characterization of the estimator. One instead could have started by defining an estimator as the point which minimizes the distance between the kernel density estimator and its reflected version, yielding the identical procedure for particular choices of kernels and the corresponding weight functions. Two examples of this dual representation are given in section 5, although others are possible. 5

6 4 Asymptotic Distribution Using a first order Taylor expansion of the estimating equation in (2.4), one obtains the standard asymptotic representation: n 1/2 ( θ θ) a 1 n 1/2 V n, (4.1) where and a { ( )}] 2 [Ψ θ E X1 + X 2 θ, (4.2) σ 2 V n n 2 n n j=1 k=1 { ( 2 xj + x k Ψ σ 2 )} θ. (4.3) Under regularity conditions, one may interchange the derivative and expectation to compute the value of a. An example where this does not apply is given in section 5.2, where the direct computation of (4.2) is used. Meanwhile, V n in (4.3) is a V-statistic of order two, so the distribution of n 1/2 V n can be derived via general theory for the corresponding U-statistic (see, for example, Serfling, 1980). Specifically, for a V-statistic, V n, based on a symmetric kernel, h(x 1, x 2 ), with E{h(X 1, X 2 )} = 0 and E{h(X 1, X 2 )} 2 <, one obtains that V n is asymptotically normal with mean zero and Var (V n ) = 4ξ 1 /n + O(n 2 ), (4.4) where ξ 1 = E{h(X 1, X 2 )h(x 1, X 3 )} = E{h 1 (X)} 2 and h 1 (t) E X {h(t, X)}. Hence the asymptotic distribution of the estimator follows directly as n 1/2 ( θ θ) N(0, a 2 4ξ 1 ). Note that in this setting, the kernel is taken to be 6

7 h(x 1, x 2 ) Ψ { ( 2 x1 +x 2 θ )}. σ 2 To conduct inference, a consistent estimate of ξ 1 is given by n 3 n n n h(x i, x j )h(x i, x k ). (4.5) i=1 j=1 k=1 Alternative estimators of ξ 1 are given by Sen (1960), or a jackknife version by Arvesen (1969). In practice, σ 2 is typically unknown and must be estimated. If one chooses an auxiliary root-n consistent estimate then the asymptotic distribution is unaffected by the estimation of σ 2. A good choice due to its high degree of robustness would be the Median Absolute Deviation (MAD) as in Huber (1981). 5 Examples 5.1 Gaussian weight function A particularly useful choice of w(t) is to set, for c > 0, w(t) = (c/π) 1/2 exp( ct 2 ), (5.1) i.e. the density of N (0, 1/2c). For this particular choice of weight function, one obtains ˆf w (t) = exp( t 2 /4c), φ(t) = exp( ct 2 /2), (5.2) g(t) = (2πc) 1/2 exp( t 2 /2c). Thus, for this choice of weight function on the frequencies, it is equivalent to comparing density estimates using a Gaussian kernel with variance c, which then, as in density estimation, plays the role of the bandwidth. One can choose c based on efficiency considerations, i.e. to achieve a desired asymptotic relative efficiency at a given model 7

8 such as the normal as in typical robust procedures involving a tuning constant. Alternatively, one can use bandwidth selection techniques from density estimation to choose an appropriate value. In this paper, the approach based on asymptotic efficiency is further discussed. The score function involved in the estimating equation (2.4) for this choice of weight function is given by Ψ(t) = (t/2c) exp( t 2 /4c). From this, one can interchange differentiation and integration in (4.2) to obtain where Z = 2 ( X 1 +X 2 2 θ ) /σ. { Z 2 c a = E σc 2 } exp( Z 2 /2c), (5.3) For this choice of weight function, an explicit expression can be derived for the asymptotic variance of the estimator θ 1 under the assumption of normality of the underlying distribution. This allows one to choose the constant c based on a desired efficiency level. Under the assumption that X N(θ, σ 2 ), to compute the asymptotic distribution, the random variable Z in (5.3) is a standard normal, and by a standard calculation, one then obtains c a = σ 2 (c + 1) 3. (5.4) After some straightforward, but tedious, algebra one also obtains, ξ 1 = 2c {(2c + 1)(2c + 3)} 3/2. (5.5) Hence using (5.4) and (5.5), along with the representation given by (4.1), the asymptotic variance under normality has the following simple form: { } 3 2c + 2 n Var ( θ 1 θ) σ 2. (5.6) (2c + 1)(2c + 3) 8

9 As expected, as c the variance converges to σ 2, as θ 1 tends toward the mean of the pairwise averages, which is just the mean, x, itself. To achieve an asymptotic efficiency level of 0 < α < 1, using (5.6), one would choose c = (2 1 α 2/3 ) 1 1. For 95% efficiency at the normal distribution, c A generalization of the Hodges-Lehmann estimator An alternate choice of w(t) is given by w(t) = (πc) 1 {1 cos(ct)}/t 2. (5.7) This distribution is sometimes known as Polya s distribution in reference to its use in the proof of Polya s criterion to construct characteristic functions (Durrett, 1996, chapter 2). Although this choice of weight function may seem strange at first, the reason for this choice shall become apparent as both its corresponding kernel and the connection with the Hodges-Lehmann estimator are demonstrated. For this particular choice, using the fact that 1 cos(t) = 2 sin 2 (t/2), one obtains ( ˆf w (t) = 1 t ) +, c φ(t) = sin(ct/2), (5.8) ct/2 g(t) = c 1 I{t [ c/2, c/2]}. From (5.8), it follows that using the Polya weight function on the frequencies is equivalent to comparing density estimates using a Uniform kernel of width c. The fact that this simple choice of kernel for the density estimation view arises from the choice of Polya s distribution for a weight function makes it intuitively appealing. In addition, the form of the characteristic function of this distribution, ˆfw (t), is now shown to be intimately tied to the Hodges-Lehmann estimator. From (2.3), the estimator, θ 2, derived from using this choice of weight function or, equivalently, the uniform kernel density estimator, is then given as the solution of the 9

10 following minimization problem: { n n θ 2 arg min 1 θ j=1 k=1 ( 1 2 (x ) } + j + x k )/2 θ. (5.9) σc As the median of the pairwise averages, a variant of the Hodges-Lehmann estimator, θ HL can be defined as θ HL arg min θ n j=1 k=1 n (x j + x k )/2 θ. (5.10) Clearly, for sufficiently large c, the two estimators, θ 2 and θ HL will coincide exactly. Hence an alternative view of the Hodges-Lehmann estimator is taken as the limiting value of the family of estimators indexed by the constant c. The estimator θ 2 defined by (5.9) is the solution to the estimating equation as in (2.4) with Ψ(t) = sign(t) I{ t c} = I{0 t c} I{ c t 0}. (5.11) Assuming the symmetry of the underlying distribution, direct use of (4.1) - (4.4) after some calculations then yields the asymptotic variance of θ 2 as n Var ( θ 2 θ) {F(x + cσ) + F(x cσ) 2F(x)} 2 f(x) dx 4 [ {f(x) f(x + cσ)} f(x) dx ] 2, (5.12) where f denotes the density function of the underlying random variable X, and F is the corresponding distribution function. Note that as expected, as c, (5.12) converges to the variance of the Hodges-Lehmann estimator. To be used in practice, estimation of the numerator of (5.12) can be done by substitution of the empirical distribution function for the distribution function, or can be done via the empirical moments as discussed at the end of section 4. To estimate the denominator, a kernel density estimate can be plugged in for the density. Alternatively, one could use a normal approximation or a bootstrap approach to the distribution of 10

11 the estimating function itself and construct confidence intervals and tests of hypothesis based on the distribution of the estimating function as discussed in Jiang and Kalbfleisch (2004). 5.3 Simulation study A small simulation study was carried out to access the efficiency and robustness of the estimator, θ 1, based on the Gaussian weight function, along with the accuracy of the asymptotic approximation. The Hodges-Lehmann estimator, which is equivalent to θ 2 for large c, is compared along with the mean, median, and Huber location estimator. For the Huber estimator, as well as the estimator θ 1, the tuning parameters are chosen to yield 95% asymptotic efficiency at the normal distribution, and the median absolute deviation is used as an auxiliary estimate of scale. Five different choices of symmetric distributions are simulated. The distributions considered are: 1. normal 2. t-distribution with 3 degrees of freedom 3. two-sided exponential 4. a 90/10 mixture of two normals with identical means, but the smaller component having a standard deviation three times the larger component 5. an 80/20 mixture as in the previous case. For each setup, 5000 samples of size 20 are generated and the variances are shown in Table 1. Note that the asymptotic variance formula would yield a value of 1/.95 = for θ 1 under the normal distribution, which agrees very closely with the simulated value of 1.06 even for samples of size 20. Hence the asymptotic approximation appears to be 11

12 reasonable. Overall, the estimator θ 1 performs similarly to both the Huber estimator and the Hodges-Lehmann estimator under the various symmetric distributions in terms of efficiency. (**** TABLE 1 GOES HERE ****) Additionally, contaminated asymmetric distributions are simulated by mixtures of normals with different means. In this situation a contamination bias is created in each of the estimators. Table 2 shows the mean squared error for each of the estimators under 5% contamination. Each setup contaminates a standard normal with a 5% proportion of a shifted normal for various shifts. Table 3 shows the corresponding results for 10% contamination. In both cases, the newly proposed estimator has smaller mean squared error than the other estimators, particularly for more extreme contamination. Note that θ 1 has a redescending influence function so that as the contamination gets further from the true mean, the influence returns to zero. The other estimators in the study do not have this redescending property, but one can instead use alternative M-estimators that do, but their performance is not investigated in this study. (**** TABLES 2 AND 3 GO HERE ****) 6 Discussion This paper has proposed a new class of estimation procedures for estimating the center of a symmetric distribution. This class of estimators has been shown to be equivalently represented as a solution to an optimization problem based on either the 12

13 characteristic function or a seemingly unrelated kernel density estimate. The proposed estimators exhibit performance that compete with the currently implemented procedures, while their intuitive construction and dual interpretation makes them a interesting alternative. As with the typical location estimators, the estimation procedures proposed in this paper have a straightforward extension to the (multiple) regression setting under the assumption of symmetry about zero for the distribution of the residuals. Setting θ = β T x, one can then minimize the objective function in terms of the regression coefficients. The use of these estimators in regression deserve further investigation. References Arvesen, J. N. (1969), Jackknifing U-statistics, Ann. Math. Statist. 40, Durrett, R. (1996), Probability: Theory and Examples, Belmont, CA: Duxbury Press. Feuerverger, A. and McDunnough, P. (1981), On some fourier methods for inference, J. Am. Statist. Assoc. 76, Heathcote, C. R. (1977), The integrated squared error estimation of parameters, Biometrika 64, Heathcote, C. R., Rachev, S. T. and Cheng, B. (1995), Testing multivariate symmetry, J. Multivariate Anal. 54, Henze, N., Klar, B. and Meintanis, S. G. (2003), Invariant tests for symmetry about an unspecified point based on the empirical characteristic function, J. Multivariate Anal. 87, Hodges, J. L. and Lehmann, E. L. (1963), Estimates of location based on rank tests, Ann. Math. Statist. 34, Huber, P. J. (1981). Robust Statistics, New York: Wiley. Jiang, W. and Kalbfleisch, J. (2004), Resampling methods for estimating functions with U- statistic structure, University of Michigan Department of Biostatistics Working Paper 13

14 Series, Working Paper 33. Sen, P. K. (1960), On some convergence properties of U-statistics, Calcutta Statist. Assoc. Bull. 10, Serfling, R. J. (1980), Approximation Theorems of Mathematical Statistics, New York: Wiley. 14

15 Table 1: Variances under various symmetric distributions for samples of size 20. The table entries denote n times the variance. N (0,1) t (3) DE.9 N (0,1) +.1 N (0,9).8 N (0,1) +.2 N (0,9) θ x Median HL Huber Table 2: Mean squared error under various contaminating distributions for samples of size 20 with 5% contamination. The table entries denote n times the mean squared error. N (1,1) N (2,1) N (3,1) N (4,1) N (5,1) θ x Median HL Huber

16 Table 3: Mean squared error under various contaminating distributions for samples of size 20 with 10% contamination. The table entries denote n times the mean squared error. N (1,1) N (2,1) N (3,1) N (4,1) N (5,1) θ x Median HL Huber

### Asymptotic Relative Efficiency in Estimation

Asymptotic Relative Efficiency in Estimation Robert Serfling University of Texas at Dallas October 2009 Prepared for forthcoming INTERNATIONAL ENCYCLOPEDIA OF STATISTICAL SCIENCES, to be published by Springer

### GOODNESS OF FIT TESTS FOR PARAMETRIC REGRESSION MODELS BASED ON EMPIRICAL CHARACTERISTIC FUNCTIONS

K Y B E R N E T I K A V O L U M E 4 5 2 0 0 9, N U M B E R 6, P A G E S 9 6 0 9 7 1 GOODNESS OF FIT TESTS FOR PARAMETRIC REGRESSION MODELS BASED ON EMPIRICAL CHARACTERISTIC FUNCTIONS Marie Hušková and

### Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

### the convolution of f and g) given by

09:53 /5/2000 TOPIC Characteristic functions, cont d This lecture develops an inversion formula for recovering the density of a smooth random variable X from its characteristic function, and uses that

### A TEST OF FIT FOR THE GENERALIZED PARETO DISTRIBUTION BASED ON TRANSFORMS

A TEST OF FIT FOR THE GENERALIZED PARETO DISTRIBUTION BASED ON TRANSFORMS Dimitrios Konstantinides, Simos G. Meintanis Department of Statistics and Acturial Science, University of the Aegean, Karlovassi,

### Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Weihua Zhou 1 University of North Carolina at Charlotte and Robert Serfling 2 University of Texas at Dallas Final revision for

### A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random

### σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

### Testing Statistical Hypotheses

E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

### Contents 1. Contents

Contents 1 Contents 1 One-Sample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 One-sample Z-test (see Chapter 0.3.1)...... 4 1.1.2 One-sample t-test................. 6 1.1.3 Large sample

### 5 Introduction to the Theory of Order Statistics and Rank Statistics

5 Introduction to the Theory of Order Statistics and Rank Statistics This section will contain a summary of important definitions and theorems that will be useful for understanding the theory of order

### Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta

Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta November 29, 2007 Outline Overview of Kernel Density

### 8.7 Taylor s Inequality Math 2300 Section 005 Calculus II. f(x) = ln(1 + x) f(0) = 0

8.7 Taylor s Inequality Math 00 Section 005 Calculus II Name: ANSWER KEY Taylor s Inequality: If f (n+) is continuous and f (n+) < M between the center a and some point x, then f(x) T n (x) M x a n+ (n

### Nonparametric tests. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 704: Data Analysis I

1 / 16 Nonparametric tests Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I Nonparametric one and two-sample tests 2 / 16 If data do not come from a normal

### TESTING FOR EQUAL DISTRIBUTIONS IN HIGH DIMENSION

TESTING FOR EQUAL DISTRIBUTIONS IN HIGH DIMENSION Gábor J. Székely Bowling Green State University Maria L. Rizzo Ohio University October 30, 2004 Abstract We propose a new nonparametric test for equality

### Unsupervised Learning Techniques Class 07, 1 March 2006 Andrea Caponnetto

Unsupervised Learning Techniques 9.520 Class 07, 1 March 2006 Andrea Caponnetto About this class Goal To introduce some methods for unsupervised learning: Gaussian Mixtures, K-Means, ISOMAP, HLLE, Laplacian

### Testing Statistical Hypotheses

E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

### Smooth simultaneous confidence bands for cumulative distribution functions

Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang

### Geographically Weighted Regression as a Statistical Model

Geographically Weighted Regression as a Statistical Model Chris Brunsdon Stewart Fotheringham Martin Charlton October 6, 2000 Spatial Analysis Research Group Department of Geography University of Newcastle-upon-Tyne

### A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

### Density Estimation. Seungjin Choi

Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

### Testing against a linear regression model using ideas from shape-restricted estimation

Testing against a linear regression model using ideas from shape-restricted estimation arxiv:1311.6849v2 [stat.me] 27 Jun 2014 Bodhisattva Sen and Mary Meyer Columbia University, New York and Colorado

### Estimation of a quadratic regression functional using the sinc kernel

Estimation of a quadratic regression functional using the sinc kernel Nicolai Bissantz Hajo Holzmann Institute for Mathematical Stochastics, Georg-August-University Göttingen, Maschmühlenweg 8 10, D-37073

### Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

### STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht

### O Combining cross-validation and plug-in methods - for kernel density bandwidth selection O

O Combining cross-validation and plug-in methods - for kernel density selection O Carlos Tenreiro CMUC and DMUC, University of Coimbra PhD Program UC UP February 18, 2011 1 Overview The nonparametric problem

### Math 127: Course Summary

Math 27: Course Summary Rich Schwartz October 27, 2009 General Information: M27 is a course in functional analysis. Functional analysis deals with normed, infinite dimensional vector spaces. Usually, these

### Reproducing Kernel Hilbert Spaces

9.520: Statistical Learning Theory and Applications February 10th, 2010 Reproducing Kernel Hilbert Spaces Lecturer: Lorenzo Rosasco Scribe: Greg Durrett 1 Introduction In the previous two lectures, we

### Variance Function Estimation in Multivariate Nonparametric Regression

Variance Function Estimation in Multivariate Nonparametric Regression T. Tony Cai 1, Michael Levine Lie Wang 1 Abstract Variance function estimation in multivariate nonparametric regression is considered

### ON THE ESTIMATION OF EXTREME TAIL PROBABILITIES. By Peter Hall and Ishay Weissman Australian National University and Technion

The Annals of Statistics 1997, Vol. 25, No. 3, 1311 1326 ON THE ESTIMATION OF EXTREME TAIL PROBABILITIES By Peter Hall and Ishay Weissman Australian National University and Technion Applications of extreme

### 1. Fisher Information

1. Fisher Information Let f(x θ) be a density function with the property that log f(x θ) is differentiable in θ throughout the open p-dimensional parameter set Θ R p ; then the score statistic (or score

### Estimating and Testing Quantile-based Process Capability Indices for Processes with Skewed Distributions

Journal of Data Science 8(2010), 253-268 Estimating and Testing Quantile-based Process Capability Indices for Processes with Skewed Distributions Cheng Peng University of Southern Maine Abstract: This

### UNIVERSITÄT POTSDAM Institut für Mathematik

UNIVERSITÄT POTSDAM Institut für Mathematik Testing the Acceleration Function in Life Time Models Hannelore Liero Matthias Liero Mathematische Statistik und Wahrscheinlichkeitstheorie Universität Potsdam

### Discussion Paper No. 28

Discussion Paper No. 28 Asymptotic Property of Wrapped Cauchy Kernel Density Estimation on the Circle Yasuhito Tsuruta Masahiko Sagae Asymptotic Property of Wrapped Cauchy Kernel Density Estimation on

### The loss function and estimating equations

Chapter 6 he loss function and estimating equations 6 Loss functions Up until now our main focus has been on parameter estimating via the maximum likelihood However, the negative maximum likelihood is

### Advanced Statistics II: Non Parametric Tests

Advanced Statistics II: Non Parametric Tests Aurélien Garivier ParisTech February 27, 2011 Outline Fitting a distribution Rank Tests for the comparison of two samples Two unrelated samples: Mann-Whitney

### Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.

Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it

### Information Measure Estimation and Applications: Boosting the Effective Sample Size from n to n ln n

Information Measure Estimation and Applications: Boosting the Effective Sample Size from n to n ln n Jiantao Jiao (Stanford EE) Joint work with: Kartik Venkat Yanjun Han Tsachy Weissman Stanford EE Tsinghua

### Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

### IEOR 165 Lecture 7 1 Bias-Variance Tradeoff

IEOR 165 Lecture 7 Bias-Variance Tradeoff 1 Bias-Variance Tradeoff Consider the case of parametric regression with β R, and suppose we would like to analyze the error of the estimate ˆβ in comparison to

### Motivation. A Nonparametric test for the existence of moments. Purpose. Outline

A Nonparametric test for the existence of moments Does the k-th moment exist? K. Hitomi 1, Y. Nishiyama 2 and K. Nagai 3 1 Kyoto Institute of Technology, 2 Kyoto University 3 Yokohama National University

### One-Sample Numerical Data

One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

### Smooth nonparametric estimation of a quantile function under right censoring using beta kernels

Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Chanseok Park 1 Department of Mathematical Sciences, Clemson University, Clemson, SC 29634 Short Title: Smooth

### What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

### Robust Outcome Analysis for Observational Studies Designed Using Propensity Score Matching

The work of Kosten and McKean was partially supported by NIAAA Grant 1R21AA017906-01A1 Robust Outcome Analysis for Observational Studies Designed Using Propensity Score Matching Bradley E. Huitema Western

### PRINCIPLES OF STATISTICAL INFERENCE

Advanced Series on Statistical Science & Applied Probability PRINCIPLES OF STATISTICAL INFERENCE from a Neo-Fisherian Perspective Luigi Pace Department of Statistics University ofudine, Italy Alessandra

### A Bootstrap Test for Conditional Symmetry

ANNALS OF ECONOMICS AND FINANCE 6, 51 61 005) A Bootstrap Test for Conditional Symmetry Liangjun Su Guanghua School of Management, Peking University E-mail: lsu@gsm.pku.edu.cn and Sainan Jin Guanghua School

### Antonietta Mira. University of Pavia, Italy. Abstract. We propose a test based on Bonferroni's measure of skewness.

Distribution-free test for symmetry based on Bonferroni's Measure Antonietta Mira University of Pavia, Italy Abstract We propose a test based on Bonferroni's measure of skewness. The test detects the asymmetry

### Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

### Lawrence D. Brown* and Daniel McCarthy*

Comments on the paper, An adaptive resampling test for detecting the presence of significant predictors by I. W. McKeague and M. Qian Lawrence D. Brown* and Daniel McCarthy* ABSTRACT: This commentary deals

### Estimation of Treatment Effects under Essential Heterogeneity

Estimation of Treatment Effects under Essential Heterogeneity James Heckman University of Chicago and American Bar Foundation Sergio Urzua University of Chicago Edward Vytlacil Columbia University March

### Institute of Statistics Mimeo Series No Simultaneous regression shrinkage, variable selection and clustering of predictors with OSCAR

DEPARTMENT OF STATISTICS North Carolina State University 2501 Founders Drive, Campus Box 8203 Raleigh, NC 27695-8203 Institute of Statistics Mimeo Series No. 2583 Simultaneous regression shrinkage, variable

### Non-asymptotic Analysis of Bandlimited Functions. Andrei Osipov Research Report YALEU/DCS/TR-1449 Yale University January 12, 2012

Prolate spheroidal wave functions PSWFs play an important role in various areas, from physics e.g. wave phenomena, fluid dynamics to engineering e.g. signal processing, filter design. Even though the significance

### Effects of Outliers and Multicollinearity on Some Estimators of Linear Regression Model

204 Effects of Outliers and Multicollinearity on Some Estimators of Linear Regression Model S. A. Ibrahim 1 ; W. B. Yahya 2 1 Department of Physical Sciences, Al-Hikmah University, Ilorin, Nigeria. e-mail:

### Estimation of a Two-component Mixture Model

Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint

### AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

### Statistica Sinica Preprint No: SS

Statistica Sinica Preprint No: SS-017-0013 Title A Bootstrap Method for Constructing Pointwise and Uniform Confidence Bands for Conditional Quantile Functions Manuscript ID SS-017-0013 URL http://wwwstatsinicaedutw/statistica/

### Exchangeability. Peter Orbanz. Columbia University

Exchangeability Peter Orbanz Columbia University PARAMETERS AND PATTERNS Parameters P(X θ) = Probability[data pattern] 3 2 1 0 1 2 3 5 0 5 Inference idea data = underlying pattern + independent noise Peter

### GAUSSIAN PROCESS REGRESSION

GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The

### REFERENCES AND FURTHER STUDIES

REFERENCES AND FURTHER STUDIES by..0. on /0/. For personal use only.. Afifi, A. A., and Azen, S. P. (), Statistical Analysis A Computer Oriented Approach, Academic Press, New York.. Alvarez, A. R., Welter,

### Hypothesis Testing. Econ 690. Purdue University. Justin L. Tobias (Purdue) Testing 1 / 33

Hypothesis Testing Econ 690 Purdue University Justin L. Tobias (Purdue) Testing 1 / 33 Outline 1 Basic Testing Framework 2 Testing with HPD intervals 3 Example 4 Savage Dickey Density Ratio 5 Bartlett

### Solutions for Econometrics I Homework No.3

Solutions for Econometrics I Homework No3 due 6-3-15 Feldkircher, Forstner, Ghoddusi, Pichler, Reiss, Yan, Zeugner April 7, 6 Exercise 31 We have the following model: y T N 1 X T N Nk β Nk 1 + u T N 1

### Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

### Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference

1 / 171 Bootstrap inference Francisco Cribari-Neto Departamento de Estatística Universidade Federal de Pernambuco Recife / PE, Brazil email: cribari@gmail.com October 2013 2 / 171 Unpaid advertisement

### Types of Real Integrals

Math B: Complex Variables Types of Real Integrals p(x) I. Integrals of the form P.V. dx where p(x) and q(x) are polynomials and q(x) q(x) has no eros (for < x < ) and evaluate its integral along the fol-

### Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate

### The algorithm of noisy k-means

Noisy k-means The algorithm of noisy k-means Camille Brunet LAREMA Université d Angers 2 Boulevard Lavoisier, 49045 Angers Cedex, France Sébastien Loustau LAREMA Université d Angers 2 Boulevard Lavoisier,

### GARCH Models Estimation and Inference

GARCH Models Estimation and Inference Eduardo Rossi University of Pavia December 013 Rossi GARCH Financial Econometrics - 013 1 / 1 Likelihood function The procedure most often used in estimating θ 0 in

### 1. Use the properties of exponents to simplify the following expression, writing your answer with only positive exponents.

Math120 - Precalculus. Final Review. Fall, 2011 Prepared by Dr. P. Babaali 1 Algebra 1. Use the properties of exponents to simplify the following expression, writing your answer with only positive exponents.

### X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence

### Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October

### EXACT AND ASYMPTOTICALLY ROBUST PERMUTATION TESTS. Eun Yi Chung Joseph P. Romano

EXACT AND ASYMPTOTICALLY ROBUST PERMUTATION TESTS By Eun Yi Chung Joseph P. Romano Technical Report No. 20-05 May 20 Department of Statistics STANFORD UNIVERSITY Stanford, California 94305-4065 EXACT AND

### Let R be the line parameterized by x. Let f be a complex function on R that is integrable. The Fourier transform ˆf = F f is. e ikx f(x) dx. (1.

Chapter 1 Fourier transforms 1.1 Introduction Let R be the line parameterized by x. Let f be a complex function on R that is integrable. The Fourier transform ˆf = F f is ˆf(k) = e ikx f(x) dx. (1.1) It

### From Fourier to Wavelets in 60 Slides

From Fourier to Wavelets in 60 Slides Bernhard G. Bodmann Math Department, UH September 20, 2008 B. G. Bodmann (UH Math) From Fourier to Wavelets in 60 Slides September 20, 2008 1 / 62 Outline 1 From Fourier

### Characteristic Functions and the Central Limit Theorem

Chapter 6 Characteristic Functions and the Central Limit Theorem 6.1 Characteristic Functions 6.1.1 Transforms and Characteristic Functions. There are several transforms or generating functions used in

### Local Quantile Regression

Wolfgang K. Härdle Vladimir Spokoiny Weining Wang Ladislaus von Bortkiewicz Chair of Statistics C.A.S.E. - Center for Applied Statistics and Economics Humboldt-Universität zu Berlin http://ise.wiwi.hu-berlin.de

### Nonlinear Regression

Nonlinear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Nonlinear Regression 1 / 36 Nonlinear Regression 1 Introduction

### Lecture 1 Measure concentration

CSE 29: Learning Theory Fall 2006 Lecture Measure concentration Lecturer: Sanjoy Dasgupta Scribe: Nakul Verma, Aaron Arvey, and Paul Ruvolo. Concentration of measure: examples We start with some examples

### For a stochastic process {Y t : t = 0, ±1, ±2, ±3, }, the mean function is defined by (2.2.1) ± 2..., γ t,

CHAPTER 2 FUNDAMENTAL CONCEPTS This chapter describes the fundamental concepts in the theory of time series models. In particular, we introduce the concepts of stochastic processes, mean and covariance

### Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond January 18, 2017 Contents 1 Batch and Recursive Estimation 2 Towards Bayesian Filtering 3 Kalman Filter and Bayesian Filtering and Smoothing

### Why experimenters should not randomize, and what they should do instead

Why experimenters should not randomize, and what they should do instead Maximilian Kasy Department of Economics, Harvard University Maximilian Kasy (Harvard) Experimental design 1 / 42 project STAR Introduction

### A NORMAL SCALE MIXTURE REPRESENTATION OF THE LOGISTIC DISTRIBUTION ABSTRACT

#1070R A NORMAL SCALE MIXTURE REPRESENTATION OF THE LOGISTIC DISTRIBUTION LEONARD A. STEFANSKI Department of Statistics North Carolina State University Raleigh, NC 27695 ABSTRACT In this paper it is shown

### Reproducing Kernel Hilbert Spaces

Reproducing Kernel Hilbert Spaces Lorenzo Rosasco 9.520 Class 03 February 12, 2007 About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing Kernel Hilbert

### Exam C Solutions Spring 2005

Exam C Solutions Spring 005 Question # The CDF is F( x) = 4 ( + x) Observation (x) F(x) compare to: Maximum difference 0. 0.58 0, 0. 0.58 0.7 0.880 0., 0.4 0.680 0.9 0.93 0.4, 0.6 0.53. 0.949 0.6, 0.8

### Understanding Ding s Apparent Paradox

Submitted to Statistical Science Understanding Ding s Apparent Paradox Peter M. Aronow and Molly R. Offer-Westort Yale University 1. INTRODUCTION We are grateful for the opportunity to comment on A Paradox

### The lasso. Patrick Breheny. February 15. The lasso Convex optimization Soft thresholding

Patrick Breheny February 15 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/24 Introduction Last week, we introduced penalized regression and discussed ridge regression, in which the penalty

### Regression models for multivariate ordered responses via the Plackett distribution

Journal of Multivariate Analysis 99 (2008) 2472 2478 www.elsevier.com/locate/jmva Regression models for multivariate ordered responses via the Plackett distribution A. Forcina a,, V. Dardanoni b a Dipartimento

### D-Optimal Designs for Second-Order Response Surface Models with Qualitative Factors

Journal of Data Science 920), 39-53 D-Optimal Designs for Second-Order Response Surface Models with Qualitative Factors Chuan-Pin Lee and Mong-Na Lo Huang National Sun Yat-sen University Abstract: Central

### Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

### 17 The functional equation

18.785 Number theory I Fall 16 Lecture #17 11/8/16 17 The functional equation In the previous lecture we proved that the iemann zeta function ζ(s) has an Euler product and an analytic continuation to the

### Parametric Inference

Parametric Inference Moulinath Banerjee University of Michigan April 14, 2004 1 General Discussion The object of statistical inference is to glean information about an underlying population based on a

### Previous lecture. Single variant association. Use genome-wide SNPs to account for confounding (population substructure)

Previous lecture Single variant association Use genome-wide SNPs to account for confounding (population substructure) Estimation of effect size and winner s curse Meta-Analysis Today s outline P-value

### Quantile Regression: Inference

Quantile Regression: Inference Roger Koenker University of Illinois, Urbana-Champaign Aarhus: 21 June 2010 Roger Koenker (UIUC) Introduction Aarhus: 21.6.2010 1 / 28 Inference for Quantile Regression Asymptotics

### Application of Homogeneity Tests: Problems and Solution

Application of Homogeneity Tests: Problems and Solution Boris Yu. Lemeshko (B), Irina V. Veretelnikova, Stanislav B. Lemeshko, and Alena Yu. Novikova Novosibirsk State Technical University, Novosibirsk,

### Locally Robust Semiparametric Estimation

Locally Robust Semiparametric Estimation Victor Chernozhukov Juan Carlos Escanciano Hidehiko Ichimura Whitney K. Newey The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper

### Math 12 Final Exam Review 1

Math 12 Final Exam Review 1 Part One Calculators are NOT PERMITTED for this part of the exam. 1. a) The sine of angle θ is 1 What are the 2 possible values of θ in the domain 0 θ 2π? 2 b) Draw these angles

### Lecture Notes 15 Prediction Chapters 13, 22, 20.4.

Lecture Notes 15 Prediction Chapters 13, 22, 20.4. 1 Introduction Prediction is covered in detail in 36-707, 36-701, 36-715, 10/36-702. Here, we will just give an introduction. We observe training data

### Nonparametric Methods

Nonparametric Methods Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania July 28, 2009 Michael R. Roberts Nonparametric Methods 1/42 Overview Great for data analysis