On a Nonparametric Notion of Residual and its Applications

Bodhisattva Sen and Gábor Székely
Columbia University and National Science Foundation
September 16, 2014
(arXiv v1 [stat.ME], 12 Sep 2014. Supported by NSF Grant DMS.)

Abstract

Let (X, Z) be a continuous random vector in R × R^d, d ≥ 1. In this paper, we define the notion of a nonparametric residual of X on Z that is always independent of the predictor Z. We study its properties and show that the proposed notion of residual matches with the usual residual (error) in a multivariate normal regression model. Given a random vector (X, Y, Z) in R × R × R^d, we use this notion of residual to show that the conditional independence between X and Y, given Z, is equivalent to the mutual independence of the residuals (of X on Z and Y on Z) and Z. This result is used to develop a test for conditional independence.

1 Introduction

Let (X, Z) be a random vector in R × R^d = R^{d+1}, d ≥ 1. We assume that (X, Z) has a joint density on R^{d+1}. If we want to predict X using Z we usually formulate the following regression problem:

    X = m(Z) + ε,    (1.1)

where m(z) = E(X | Z = z) is the conditional mean of X given Z = z, and ε := X − m(Z) is the residual (although ε is usually called the error, and its estimate the residual, for this paper we feel that the term residual is more appropriate). Typically we further assume that the residual ε is independent of Z. However,

intuitively, we are just trying to break the information in (X, Z) into two parts: a part that contains all relevant information about X, and the residual (the left over) which does not have anything to do with the relationship between X and Z. In this paper we address the following question: given any random vector (X, Z), how do we define the notion of a residual of X on Z that matches this intuition? Thus, formally, we want to find a function ϕ : R^{d+1} → R such that the residual ϕ(X, Z) satisfies the following two conditions:

(C.1) the residual ϕ(X, Z) is independent of the predictor Z, i.e.,

    ϕ(X, Z) ⊥ Z, and    (1.2)

(C.2) the information content of (X, Z) is the same as that of (ϕ(X, Z), Z), i.e.,

    σ(X, Z) = σ(ϕ(X, Z), Z),    (1.3)

where σ(X, Z) denotes the σ-field generated by X and Z. We can also express (1.3) as: there exists a measurable function h : R^{d+1} → R such that

    X = h(Z, ϕ(X, Z));    (1.4)

see, e.g., Theorem 20.1 of Billingsley (1995).

In this paper we propose a notion of a residual that satisfies (slightly stronger forms of) the above two conditions, under any joint distribution of X and Z; see Section 2 for the details. We investigate the properties of this notion of residual in Section 2, and show that it indeed reduces to the usual residual (error) in a multivariate normal regression model.

Suppose now that (X, Y, Z) has a joint density on R × R × R^d = R^{d+2}. The assumption of conditional independence means that X is independent of Y given Z, i.e., X ⊥ Y | Z. Conditional independence is an important concept in modeling causal relations and in graphical models. However, there are very few easily implementable statistical tests for checking the assumption of conditional independence: Fukumizu et al. (2007) propose a measure of conditional dependence of random variables, based on normalized cross-covariance operators on reproducing kernel Hilbert spaces; Zhang et al.
(2012) propose another kernel-based conditional independence test; Székely and Rizzo (2014) investigate a method that is easy to compute and can capture non-linear dependencies but does not completely characterize conditional independence; see also Györfi and Walk (2012) and the references therein.

In Section 3 we use the notion of residual defined in Section 2 to show that the conditional independence between X and Y, given Z, is equivalent to the mutual independence of the residuals (of X on Z and Y on Z) and Z. This reduction immediately allows the practitioner to use any of the numerous statistical methods available for testing mutual independence of random variables/vectors to test the hypothesis of conditional independence. In particular, in Section 3.1, we use this result to propose a test for conditional independence using the energy statistic (see Székely and Rizzo (2005) and Rizzo and Székely (2010)). The resulting testing procedure is tuning-parameter free (an advantage over existing methods), once the residuals have been estimated. The use of the energy statistic to test the mutual independence of two or more random variables (or vectors) is also new, and is outlined in Section 3.1. The estimation of the proposed residual from data is briefly described in Section 3.2. We end with a brief discussion in Section 4, where we point to some open research problems and outline an idea, using the proposed residuals, to define (and test) a nonparametric notion of partial correlation.

2 A nonparametric notion of residual

Conditions (C.1)-(C.2) do not necessarily lead to a unique choice for ϕ. To find a meaningful and unique function ϕ that satisfies conditions (C.1)-(C.2) we impose the following natural restriction on ϕ. We assume that

(C.3) x ↦ ϕ(x, z) is strictly increasing on its support, for every fixed z ∈ R^d.

Note that condition (C.3) is a slight strengthening of condition (C.2). Suppose that a function ϕ satisfies conditions (C.1) and (C.3). Then any strictly monotone transformation of ϕ(·, z) would again satisfy (C.1) and (C.3). Thus, conditions (C.1) and (C.3) do not uniquely specify ϕ. To handle this identifiability issue, we replace condition (C.1) with (C.4), described below.
First observe that, by condition (C.1), the conditional distribution of the random variable ϕ(X, Z) given Z = z does not depend on z. We assume that

(C.4) ϕ(X, Z) | Z = z is uniformly distributed on (0, 1), for all z ∈ R^d.

Condition (C.4) is again quite natural: we usually assume that the residual has a fixed distribution, e.g., in regression we assume that the (standardized) residual is normally distributed with zero mean and unit variance. Note that condition (C.4) is slightly stronger than (C.1) and will help us uniquely identify ϕ. The following result shows that, indeed, under conditions (C.3)-(C.4), a unique ϕ exists, and gives its form.

Lemma 2.1. Let F_{X|Z}(· | z) denote the conditional distribution function of X | Z = z. Under conditions (C.3) and (C.4), we have a unique choice of ϕ(x, z), given by

    ϕ(x, z) = F_{X|Z}(x | z).    (2.1)

Also, h(z, u) can be taken as

    h(z, u) = F_{X|Z}^{-1}(u | z).    (2.2)

Proof. Fix z in the support of Z and let u ∈ (0, 1). Write ϕ_z(x) = ϕ(x, z). By condition (C.4), we have P[ϕ(X, Z) ≤ u | Z = z] = u. On the other hand, by (C.3),

    P[ϕ(X, Z) ≤ u | Z = z] = P[X ≤ ϕ_z^{-1}(u) | Z = z] = F_{X|Z}(ϕ_z^{-1}(u) | z).

Thus, we have F_{X|Z}(ϕ_z^{-1}(u) | z) = u, for all u ∈ (0, 1), which is equivalent to ϕ_z(x) = F_{X|Z}(x | z). Let h be as defined in (2.2). Then

    h(z, ϕ(x, z)) = F_{X|Z}^{-1}(ϕ(x, z) | z) = F_{X|Z}^{-1}(F_{X|Z}(x | z) | z) = x,

as required.

Thus, from the above lemma, we conclude that in the nonparametric setup, if we want a notion of residual satisfying conditions (C.3)-(C.4), then the residual has to be F_{X|Z}(X | Z).

Remark 2.2. Let us first consider the example when (X, Z) follows a multivariate Gaussian distribution, i.e.,

    (X, Z)^T ∼ N_{d+1}( (µ_1, µ_2)^T, Σ ),  where  Σ := [ σ_11  σ_12^T ; σ_12  Σ_22 ],

µ_1 ∈ R, µ_2 ∈ R^d, and Σ is a (d+1) × (d+1) positive definite matrix with σ_11 > 0, σ_12 ∈ R^{d×1} and Σ_22 ∈ R^{d×d}. Then the conditional distribution of X given Z = z is N(µ_1 + σ_12^T Σ_22^{-1}(z − µ_2), σ_11 − σ_12^T Σ_22^{-1} σ_12). Therefore, we have the following representation in the form of (1.1):

    X = µ_1 + σ_12^T Σ_22^{-1}(Z − µ_2) + ( X − µ_1 − σ_12^T Σ_22^{-1}(Z − µ_2) ),

where the usual residual is X − µ_1 − σ_12^T Σ_22^{-1}(Z − µ_2), which is known to be independent of Z. In this case, using Lemma 2.1, we get

    ϕ(X, Z) = Φ( (X − µ_1 − σ_12^T Σ_22^{-1}(Z − µ_2)) / (σ_11 − σ_12^T Σ_22^{-1} σ_12)^{1/2} ),

where Φ(·) is the distribution function of the standard normal distribution. Thus ϕ(X, Z) is just a fixed strictly increasing transformation of the usual residual, and the two notions of residual essentially coincide.

Remark 2.3. The above notion of residual does not extend so easily to the case of discrete random variables. Conditions (C.1) and (C.2) are equivalent to the fact that σ(X, Z) factorizes into two independent sub-σ-fields as σ(X, Z) = σ(ϕ(X, Z)) ⊗ σ(Z). This may not always be possible, as can be seen from the following simple example. Let (X, Z) take values in {0, 1}^2 such that P[X = i, Z = j] > 0 for all i, j ∈ {0, 1}. Then it can be shown that such a factorization exists if and only if X and Z are independent, in which case ϕ(X, Z) = X.

Remark 2.4. Lemma 2.1 also gives a way to generate X using Z and the residual. We can first generate Z, following its marginal distribution, and an independent Uniform(0, 1) random variable U, which will act as the residual. Then (1.4), where h is defined in (2.2), shows that we can generate X = F_{X|Z}^{-1}(U | Z).

In practice, we need to estimate the residual F_{X|Z}(X | Z) from observed data, which can be done both parametrically and nonparametrically. If we have a parametric model for F_{X|Z}(· | ·), we can estimate the parameters using, e.g., maximum likelihood. If we do not want to assume any structure on F_{X|Z}(· | ·), we can use any nonparametric smoothing method, e.g., standard kernel methods, for estimation; see Bergsma (2011) for such an implementation. We will discuss the estimation of the residuals in more detail in Section 3.2.

3 Testing Conditional Independence

Suppose now that (X, Y, Z) has a joint density on R × R × R^d = R^{d+2}. We want to test the hypothesis of conditional independence between X and Y given Z, i.e., X ⊥ Y | Z.
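As a numerical sanity check of Lemma 2.1 in the Gaussian setting of Remark 2.2, the following sketch (parameter values are illustrative; Φ is evaluated with the standard library's math.erf) verifies that the residual ϕ(X, Z) is approximately Uniform(0, 1) and uncorrelated with Z.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
n = 100_000

# Illustrative parameters for a bivariate normal (X, Z), i.e., d = 1.
mu1, mu2 = 1.0, -0.5
s11, s12, s22 = 2.0, 0.8, 1.0          # Var(X), Cov(X, Z), Var(Z)
cov = np.array([[s11, s12], [s12, s22]])
X, Z = rng.multivariate_normal([mu1, mu2], cov, size=n).T

# Nonparametric residual of Lemma 2.1: phi(X, Z) = F_{X|Z}(X | Z), which here
# equals Phi applied to the standardized usual (linear-regression) residual.
resid = X - mu1 - (s12 / s22) * (Z - mu2)
sd = sqrt(s11 - s12 ** 2 / s22)
Phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0))))
phi = Phi(resid / sd)

# Mean ~ 1/2 and variance ~ 1/12 (Uniform(0,1)), correlation with Z ~ 0.
print(phi.mean(), phi.var(), np.corrcoef(phi, Z)[0, 1])
```

The same check run with any other positive definite Σ behaves identically, since ϕ(X, Z) is a fixed increasing transform of the usual residual.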
Recently this problem has received some attention in the statistical literature; see, e.g., Su and White (2007), Huang (2010), Song (2009). Although there is a large body of work on testing mutual independence of two random

vectors, testing conditional independence has received relatively less attention, albeit being a central notion in modeling causal relations and in graphical modeling (see Lauritzen (1996), Pearl (2000)). The conditional independence assumption also has important implications in insurance markets and, more generally, in economic theory (see Chiappori and Salanié (2000)) and in the literature on program evaluation (see Heckman et al. (1997)), among other fields.

In this section we state a simple result that reduces testing the conditional independence hypothesis H_0 : X ⊥ Y | Z to a problem of testing mutual independence between three random variables/vectors that involve our notion of residual. We also briefly describe a procedure to test the mutual independence of the three random variables/vectors. We start with the statement of the crucial lemma.

Lemma 3.1. Suppose that (X, Y, Z) has a continuous joint density on R^{d+2}. Then, X ⊥ Y | Z if and only if F_{X|Z}(X | Z), F_{Y|Z}(Y | Z) and Z are mutually independent.

Proof. Let us make the change of variables

    (X, Y, Z) ↦ (U, V, Z) := (F_{X|Z}(X | Z), F_{Y|Z}(Y | Z), Z).

The joint density of (U, V, Z) can be expressed as

    f_{(U,V,Z)}(u, v, z) = f(x, y, z) / ( f_{X|Z=z}(x) f_{Y|Z=z}(y) ) = f_{(X,Y)|Z=z}(x, y) f_Z(z) / ( f_{X|Z=z}(x) f_{Y|Z=z}(y) ),    (3.1)

where x = F_{X|Z=z}^{-1}(u) and y = F_{Y|Z=z}^{-1}(v). Note that as the Jacobian matrix is upper-triangular, its determinant is the product of the diagonal entries of the matrix, namely, f_{X|Z=z}(x), f_{Y|Z=z}(y) and 1. If X ⊥ Y | Z, then f_{(U,V,Z)}(u, v, z) reduces to just f_Z(z), for u, v ∈ (0, 1), from the definition of conditional independence, which shows that U, V, Z are independent (note that it is easy to show that U, V are marginally Uniform(0, 1)). Conversely, if U, V, Z are independent, then f_{(U,V,Z)}(u, v, z) = f_Z(z) for u, v ∈ (0, 1), which from (3.1) easily shows that X ⊥ Y | Z.

Remark 3.2.
Bergsma (2011) developed a test for conditional independence by testing the independence of F_{X|Z}(X | Z) and F_{Y|Z}(Y | Z). However, as the following example illustrates, the independence of F_{X|Z}(X | Z) and F_{Y|Z}(Y | Z) is not enough to guarantee that X ⊥ Y | Z. Let W_1, W_2, W_3 be i.i.d. Uniform(0, 1)

random variables. Let X = W_1 + W_3, Y = W_2 and Z = (W_1 + W_2) mod 1, where mod stands for the modulo (sometimes called modulus) operation that finds the remainder of the division of W_1 + W_2 by 1. Clearly, the random vector (X, Y, Z) has a continuous joint density. Note that Z is independent of W_i, for i = 1, 2. Hence, X, Y and Z are pairwise independent. Thus, F_{X|Z}(X | Z) = F_X(X) and F_{Y|Z}(Y | Z) = F_Y(Y), where F_X and F_Y are the marginal distribution functions of X and Y, respectively. From the independence of X and Y, F_X(X) and F_Y(Y) are independent. On the other hand, the value of W_1 is clearly determined by Y and Z, i.e., W_1 = Z − Y if Y ≤ Z, and W_1 = Z − Y + 1 if Y > Z. Consequently, X and Y are not conditionally independent given Z. To see this, note that since E[X | Y, Z] = W_1 + E[W_3] = W_1 + 1/2, for every z ∈ (0, 1),

    E[X | Y, Z = z] = z − Y + 1/2,  if Y ≤ z;  and  z − Y + 3/2,  if Y > z,

which obviously depends on Y.

Remark 3.3. We can extend the above result to the case when X and Y are random vectors in R^p and R^q, respectively. In that case we define the conditional multivariate distribution transform F_{X|Z} by successively conditioning on the coordinate random variables, i.e., if X = (X_1, X_2) then we can define F_{X|Z} as (F_{X_2|X_1,Z}, F_{X_1|Z}). With this definition, Lemma 3.1 still holds.

To use Lemma 3.1 to test the conditional independence between X and Y given Z, we need to first estimate the residuals F_{X|Z}(X | Z) and F_{Y|Z}(Y | Z) from observed data, which can be done by any nonparametric smoothing method, e.g., standard kernel methods (see Section 3.2). Then, any procedure for testing the mutual independence of F_{X|Z}(X | Z), F_{Y|Z}(Y | Z) and Z can be used. In this paper we advocate the use of the energy statistic (see Rizzo and Székely (2010)), described briefly in the next subsection, to test the mutual independence of the three random variables/vectors.
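The counterexample of Remark 3.2 is easy to verify by simulation. The sketch below (sample size and slab width are arbitrary choices) confirms that X, Y, Z are pairwise (nearly) uncorrelated, while conditionally on Z being near 1/2 the mean of X moves with Y.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
W1, W2, W3 = rng.uniform(size=(3, n))

X = W1 + W3
Y = W2
Z = (W1 + W2) % 1.0

# Pairwise independence: all pairwise correlations are near zero.
print(np.corrcoef(X, Y)[0, 1], np.corrcoef(X, Z)[0, 1], np.corrcoef(Y, Z)[0, 1])

# Conditional dependence: restrict to a thin slab Z ~ 1/2, where
# E[X | Y, Z = z] = z - Y + 1/2 for Y <= z, so the conditional mean
# of X drops as Y grows.
slab = np.abs(Z - 0.5) < 0.05
low = X[slab & (Y < 0.2)].mean()                 # small Y
mid = X[slab & (Y > 0.3) & (Y < 0.45)].mean()    # larger Y (still <= z)
print(low - mid)   # clearly positive: X depends on Y given Z
```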
3.1 Testing mutual independence of three or more random vectors

Testing independence of two random variables (or vectors) has received much recent attention in the statistical literature; see, e.g., Székely et al. (2007), Gretton et al. (2005), and the references therein. To test the mutual independence of the above three random variables we use the methodology of Rizzo and Székely (2010)

(see also Székely et al. (2007), Székely and Rizzo (2009)) developed in the context of testing the equality of the distributions of two or more random variables (or vectors). In the following we briefly describe our procedure in the general setup.

Suppose that we have r ≥ 3 random variables (or vectors) T_1, T_2, ..., T_r. We write T := (T_1, T_2, ..., T_r) and introduce T_ind := (T_1*, T_2*, ..., T_r*), where T_j* is an i.i.d. copy of T_j, j = 1, 2, ..., r, but in T_ind the coordinates T_1*, T_2*, ..., T_r* are independent. To test the mutual independence of T_1, T_2, ..., T_r, all we need to test now is whether T and T_ind are identically distributed. We can test this by considering the following energy statistic (see Székely and Rizzo (2005) and Rizzo and Székely (2010)):

    Λ(T) = 2 E‖T − T_ind‖ − E‖T − T′‖ − E‖T_ind − T′_ind‖,    (3.2)

where the ′ denotes an i.i.d. copy, and ‖·‖ denotes the Euclidean norm. Note that Λ(T) is always nonnegative, and equals 0 if and only if T and T_ind are identically distributed, i.e., if and only if T_1, T_2, ..., T_r are mutually independent (see Corollary 1 of Székely and Rizzo (2005)). We use the sample version of (3.2) as our test statistic. In the following we briefly describe our procedure; a complete analysis of our procedure is beyond the scope of the paper and will be a topic for future research.

Under the null hypothesis of conditional independence, we have

    T := (F_{X|Z}(X | Z), F_{Y|Z}(Y | Z), Z) =_d (U_1, U_2, Z) =: T_ind,    (3.3)

where U_1, U_2 are i.i.d. Uniform(0, 1), independent of Z. We observe i.i.d. data {(X_i, Y_i, Z_i) : i = 1, ..., n} from the joint distribution of (X, Y, Z) and we are interested in testing the distributional equality (3.3). Suppose first that the conditional distribution functions F_{X|Z}(· | ·) and F_{Y|Z}(· | ·) are known. Then we can estimate P_T, the distribution of T, by its empirical counterpart

    P̂_T := (1/n) Σ_{i=1}^n δ_{(F_{X|Z}(X_i | Z_i), F_{Y|Z}(Y_i | Z_i), Z_i)},

where δ_x denotes the Dirac measure at x.
On the other hand, we can estimate P_{T_ind} by

    P̂_{T_ind} := (1/n^3) Σ_{i,j,k=1}^n δ_{(F_{X|Z}(X_i | Z_i), F_{Y|Z}(Y_j | Z_j), Z_k)}.

Hence, we can estimate E‖T − T_ind‖ by

    V_1 := ∫∫ ‖t − t′‖ dP̂_T(t) dP̂_{T_ind}(t′)
         = (1/n^4) Σ_{i,j,k,l=1}^n ‖(F_{X|Z}(X_i | Z_i), F_{Y|Z}(Y_i | Z_i), Z_i) − (F_{X|Z}(X_j | Z_j), F_{Y|Z}(Y_k | Z_k), Z_l)‖.
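These multiple sums are plain averages over the atoms of P̂_T and P̂_{T_ind}, so the statistic V = 2V_1 − V_2 − V_3 (with V_2 and V_3 the analogous within-sample averages) can be sketched directly. In the sketch below the residuals are treated as known and simply simulated, and n is kept very small because the sums range over up to n^6 index tuples.

```python
import numpy as np

def mean_pairwise_dist(A, B):
    """Average Euclidean distance between the rows of A and the rows of B."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return D.mean()

def energy_mutual_independence(u, v, z):
    """Sample version V = 2*V1 - V2 - V3 of the energy statistic (3.2)
    for mutual independence of the three coordinates (u_i, v_i, z_i)."""
    n = len(u)
    T = np.column_stack([u, v, z])                      # atoms of P_T
    i, j, k = np.meshgrid(np.arange(n), np.arange(n), np.arange(n),
                          indexing="ij")
    Tind = np.column_stack([u[i.ravel()], v[j.ravel()], z[k.ravel()]])
    V1 = mean_pairwise_dist(T, Tind)
    V2 = mean_pairwise_dist(T, T)
    V3 = mean_pairwise_dist(Tind, Tind)
    return 2.0 * V1 - V2 - V3

rng = np.random.default_rng(2)
n = 12                                # n^3 = 1728 atoms for P_Tind
z = rng.normal(size=n)
u, v = rng.uniform(size=(2, n))       # "residuals", known/simulated here

# Nonnegative; converges to Lambda(T) = 0 here as n grows, since the
# three coordinates are mutually independent by construction.
print(energy_mutual_independence(u, v, z))
```

A production implementation would avoid the brute-force n^4 and n^6 sums; the sketch only illustrates the definitions.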

Similarly, we can estimate E‖T − T′‖ and E‖T_ind − T′_ind‖ by

    V_2 := ∫∫ ‖t − t′‖ dP̂_T(t) dP̂_T(t′)  and  V_3 := ∫∫ ‖t − t′‖ dP̂_{T_ind}(t) dP̂_{T_ind}(t′),

respectively. Thus our test statistic, the sample version of (3.2), reduces to

    V := 2 V_1 − V_2 − V_3.

As F_{X|Z} and F_{Y|Z} are unknown, we can replace them by their estimates F̂_{X|Z} and F̂_{Y|Z}, respectively. We can plug these estimates into P̂_T and P̂_{T_ind} above to obtain our final test statistic. To obtain the critical value of the proposed test, we can simply use a permutation test: consider the permuted data

    {(F̂_{X|Z}(X_{π_1(i)} | Z_{π_1(i)}), F̂_{Y|Z}(Y_{π_2(i)} | Z_{π_2(i)}), Z_i) : i = 1, ..., n},

where π_1 and π_2 are random permutations of {1, ..., n}, and compute the test statistic V using this permuted data set. Repeating this numerous times, with different π_1 and π_2, would give the null distribution of our test statistic. Note that this method does not involve recomputing the transformations F̂_{X|Z}(· | ·) and F̂_{Y|Z}(· | ·), as they remain fixed.

3.2 Nonparametric estimation of the residuals

The nonparametric estimation of the conditional distribution functions involves smoothing. In the following we briefly describe the standard approach to estimating the conditional distribution functions using the kernel smoothing method (see also Lee et al. (2006), Yu and Jones (1998), Hall et al. (1999)). For notational simplicity, we restrict to the case d = 1, i.e., Z is a real-valued random variable. Given an i.i.d. sample {(X_i, Z_i) : i = 1, ..., n} from f_{X,Z}, the joint density of (X, Z), we can use the following kernel density estimator of f_{X,Z}:

    f̂_n(x, z) = (1/(n h_{1,n} h_{2,n})) Σ_{i=1}^n k((x − X_i)/h_{1,n}) k((z − Z_i)/h_{2,n}),

where k is a symmetric probability density on R (e.g., the standard normal density function), and h_{i,n}, i = 1, 2, are the smoothing bandwidths. It can be shown that if n h_{1,n} h_{2,n} → ∞ and max{h_{1,n}, h_{2,n}} → 0 as n → ∞, then f̂_n(x, z) →_P f_{X,Z}(x, z).
In fact, the theoretical properties of the kernel density estimator are very well studied; see, e.g., Fan and Gijbels (1996) and Einmahl and Mason (2005) and the references therein. For convenience of notation, we will write h_{i,n} as h_i, i = 1, 2. The conditional density of X given Z can then be estimated by

    f̂_{X|Z}(x | z) = f̂_n(x, z) / f̂_Z(z)
                   = [ (1/(n h_1 h_2)) Σ_{i=1}^n k((x − X_i)/h_1) k((z − Z_i)/h_2) ] / [ (1/(n h_2)) Σ_{i=1}^n k((z − Z_i)/h_2) ].
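A minimal implementation of this kernel scheme, going directly to the conditional distribution function it induces (the Gaussian kernel, the fixed bandwidths, and the toy model X | Z = z ∼ N(z, 1) below are illustrative choices, not data-driven ones):

```python
import numpy as np
from math import erf, sqrt

def cond_cdf_kernel(x, z, X, Z, h1, h2):
    """Kernel estimate of F_{X|Z}(x | z) = sum_i w_i(z) * K((x - X_i)/h1),
    with Gaussian kernel k and its distribution function K = Phi."""
    Kvals = 0.5 * (1.0 + np.vectorize(erf)((x - X) / (h1 * sqrt(2.0))))
    kz = np.exp(-0.5 * ((z - Z) / h2) ** 2)   # unnormalized Gaussian kernel in z
    w = kz / kz.sum()                         # weights w_i(z), summing to one
    return float(np.sum(w * Kvals))

# Toy check: X | Z = z ~ N(z, 1), so F_{X|Z}(0 | 0) should be about 1/2.
rng = np.random.default_rng(3)
n = 5_000
Z = rng.uniform(-1, 1, size=n)
X = Z + rng.normal(size=n)
print(cond_cdf_kernel(0.0, 0.0, X, Z, h1=0.2, h2=0.2))  # roughly 0.5
```

In practice the bandwidths h_1, h_2 would be chosen by cross-validation or a plug-in rule rather than fixed by hand.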

Thus the conditional distribution function of X given Z can be estimated as

    F̂_{X|Z}(x | z) = ∫_{−∞}^x f̂_{X|Z}(t | z) dt
                   = [ (1/(n h_2)) Σ_{i=1}^n K((x − X_i)/h_1) k((z − Z_i)/h_2) ] / [ (1/(n h_2)) Σ_{i=1}^n k((z − Z_i)/h_2) ]
                   = Σ_{i=1}^n w_i(z) K((x − X_i)/h_1),

where K is the distribution function corresponding to k (i.e., K(u) = ∫_{−∞}^u k(v) dv) and

    w_i(z) = k((z − Z_i)/h_2) / Σ_{j=1}^n k((z − Z_j)/h_2)

are weights that sum to one for every z.

4 Discussion

Given a random vector (X, Z) in R × R^d = R^{d+1}, we have defined the notion of a nonparametric residual of X on Z as F_{X|Z}(X | Z), which is always independent of the predictor Z. We have studied some of its properties and showed that it indeed reduces to the usual residual in a multivariate normal regression model. However, the estimation of F_{X|Z}(· | ·) requires nonparametric smoothing techniques, and hence suffers from the curse of dimensionality. One natural way of mitigating this curse of dimensionality could be to use dimension reduction techniques in estimating the residual F_{X|Z}(X | Z).

Suppose now that (X, Y, Z) has a joint density on R × R × R^d = R^{d+2}. We have used this notion of residual to show that the conditional independence between X and Y, given Z, is equivalent to the mutual independence of the residuals F_{X|Z}(X | Z) and F_{Y|Z}(Y | Z) and the predictor Z. We have used this result to propose a test for conditional independence, based on the energy statistic. The asymptotic theory of the proposed test statistic is, however, unknown, and will be a topic for future research.

We can also use these residuals to come up with a nonparametric notion of partial correlation. The partial correlation of X and Y measures the degree of association between X and Y, removing the effect of Z. In the nonparametric setting, this reduces to measuring the dependence between the residuals F_{X|Z}(X | Z) and F_{Y|Z}(Y | Z). We can use distance covariance (Székely et al. (2007)), or any other measure of dependence, for this purpose.
We can also test for zero partial correlation by testing for the independence of the residuals F_{X|Z}(X | Z) and F_{Y|Z}(Y | Z).

Acknowledgements: The first author would like to thank Arnab Sen for many helpful discussions, and for his help in writing parts of the paper. He would also like to thank Probal Chaudhuri for motivating the problem. The research of both authors is supported by the National Science Foundation.
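The distance-covariance route mentioned above is straightforward to sketch: the sample distance covariance of Székely et al. (2007) is the mean of the entrywise product of double-centered pairwise-distance matrices. The sketch below treats the residuals as known (they are simply simulated), with one independent and one artificially dependent pair for contrast.

```python
import numpy as np

def dcov2(a, b):
    """Squared sample distance covariance (Székely et al., 2007)
    of two one-dimensional samples a and b of equal length."""
    A = np.abs(a[:, None] - a[None, :])
    B = np.abs(b[:, None] - b[None, :])
    A = A - A.mean(axis=0) - A.mean(axis=1)[:, None] + A.mean()
    B = B - B.mean(axis=0) - B.mean(axis=1)[:, None] + B.mean()
    return (A * B).mean()

rng = np.random.default_rng(4)
n = 500
u = rng.uniform(size=n)            # residual F_{X|Z}(X|Z), simulated here
v_ind = rng.uniform(size=n)        # an independent residual F_{Y|Z}(Y|Z)
v_dep = (u + 0.1 * rng.normal(size=n)) % 1.0   # a strongly dependent residual

print(dcov2(u, v_ind), dcov2(u, v_dep))   # the second is much larger
```

A permutation test on dcov2 of the two estimated residuals would then give the test for zero nonparametric partial correlation suggested above.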

References

Bergsma, W. P. (2011). Nonparametric testing of conditional independence by means of the partial copula.

Billingsley, P. (1995). Probability and Measure. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, New York, third edition.

Chiappori, P. and Salanié, B. (2000). Testing for asymmetric information in insurance markets. Journal of Political Economy, 108(1).

Einmahl, U. and Mason, D. M. (2005). Uniform in bandwidth consistency of kernel-type function estimators. Ann. Statist., 33(3).

Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications, volume 66 of Monographs on Statistics and Applied Probability. Chapman & Hall, London.

Fukumizu, K., Gretton, A., Sun, X., and Schölkopf, B. (2007). Kernel measures of conditional dependence. In Advances in Neural Information Processing Systems.

Gretton, A., Bousquet, O., Smola, A., and Schölkopf, B. (2005). Measuring statistical dependence with Hilbert-Schmidt norms. In Proceedings of the Conference on Algorithmic Learning Theory (ALT).

Györfi, L. and Walk, H. (2012). Strongly consistent nonparametric tests of conditional independence. Statist. Probab. Lett., 82(6).

Hall, P., Wolff, R. C. L., and Yao, Q. (1999). Methods for estimating a conditional distribution function. J. Amer. Statist. Assoc., 94(445).

Heckman, J., Ichimura, H., and Todd, P. (1997). Matching as an econometric evaluation estimator: Evidence from evaluating a job training programme. The Review of Economic Studies, 64(4):605.

Huang, T.-M. (2010). Testing conditional independence using maximal nonlinear conditional correlation. Ann. Statist., 38(4).

Lauritzen, S. L. (1996). Graphical Models, volume 17 of Oxford Statistical Science Series. The Clarendon Press, Oxford University Press.

Lee, Y. K., Lee, E. R., and Park, B. U. (2006). Conditional quantile estimation by local logistic regression. J. Nonparametr. Stat., 18(4-6).

Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge.

Rizzo, M. L. and Székely, G. J. (2010). DISCO analysis: a nonparametric extension of analysis of variance. Ann. Appl. Stat., 4(2).

Song, K. (2009). Testing conditional independence via Rosenblatt transforms. Ann. Statist., 37(6B).

Su, L. and White, H. (2007). A consistent characteristic function-based test for conditional independence. J. Econometrics, 141(2).

Székely, G. J. and Rizzo, M. L. (2005). A new test for multivariate normality. J. Multivariate Anal., 93(1).

Székely, G. J. and Rizzo, M. L. (2009). Brownian distance covariance. Ann. Appl. Stat., 3(4).

Székely, G. J. and Rizzo, M. L. (2014). Partial distance correlation with methods for dissimilarities. To appear in Ann. Statist.

Székely, G. J., Rizzo, M. L., and Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. Ann. Statist., 35(6).

Yu, K. and Jones, M. C. (1998). Local linear quantile regression. J. Amer. Statist. Assoc., 93(441).

Zhang, K., Peters, J., Janzing, D., and Schölkopf, B. (2012). Kernel-based conditional independence test and application in causal discovery. arXiv preprint.


More information

Worst-Case Bounds for Gaussian Process Models

Worst-Case Bounds for Gaussian Process Models Worst-Case Bounds for Gaussian Process Models Sham M. Kakade University of Pennsylvania Matthias W. Seeger UC Berkeley Abstract Dean P. Foster University of Pennsylvania We present a competitive analysis

More information

3 Comparison with Other Dummy Variable Methods

3 Comparison with Other Dummy Variable Methods Stats 300C: Theory of Statistics Spring 2018 Lecture 11 April 25, 2018 Prof. Emmanuel Candès Scribe: Emmanuel Candès, Michael Celentano, Zijun Gao, Shuangning Li 1 Outline Agenda: Knockoffs 1. Introduction

More information

AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

More information

Discriminative Direction for Kernel Classifiers

Discriminative Direction for Kernel Classifiers Discriminative Direction for Kernel Classifiers Polina Golland Artificial Intelligence Lab Massachusetts Institute of Technology Cambridge, MA 02139 polina@ai.mit.edu Abstract In many scientific and engineering

More information

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the

More information

PARSIMONIOUS MULTIVARIATE COPULA MODEL FOR DENSITY ESTIMATION. Alireza Bayestehtashk and Izhak Shafran

PARSIMONIOUS MULTIVARIATE COPULA MODEL FOR DENSITY ESTIMATION. Alireza Bayestehtashk and Izhak Shafran PARSIMONIOUS MULTIVARIATE COPULA MODEL FOR DENSITY ESTIMATION Alireza Bayestehtashk and Izhak Shafran Center for Spoken Language Understanding, Oregon Health & Science University, Portland, Oregon, USA

More information

Bivariate Paired Numerical Data

Bivariate Paired Numerical Data Bivariate Paired Numerical Data Pearson s correlation, Spearman s ρ and Kendall s τ, tests of independence University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

Estimation of a Two-component Mixture Model

Estimation of a Two-component Mixture Model Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint

More information

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Kosuke Imai Department of Politics Princeton University November 13, 2013 So far, we have essentially assumed

More information

Kernel Methods. Lecture 4: Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton

Kernel Methods. Lecture 4: Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton Kernel Methods Lecture 4: Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton Alexander J. Smola Statistical Machine Learning Program Canberra,

More information

Local Polynomial Modelling and Its Applications

Local Polynomial Modelling and Its Applications Local Polynomial Modelling and Its Applications J. Fan Department of Statistics University of North Carolina Chapel Hill, USA and I. Gijbels Institute of Statistics Catholic University oflouvain Louvain-la-Neuve,

More information

Bickel Rosenblatt test

Bickel Rosenblatt test University of Latvia 28.05.2011. A classical Let X 1,..., X n be i.i.d. random variables with a continuous probability density function f. Consider a simple hypothesis H 0 : f = f 0 with a significance

More information

Nonparametric Modal Regression

Nonparametric Modal Regression Nonparametric Modal Regression Summary In this article, we propose a new nonparametric modal regression model, which aims to estimate the mode of the conditional density of Y given predictors X. The nonparametric

More information

Big Hypothesis Testing with Kernel Embeddings

Big Hypothesis Testing with Kernel Embeddings Big Hypothesis Testing with Kernel Embeddings Dino Sejdinovic Department of Statistics University of Oxford 9 January 2015 UCL Workshop on the Theory of Big Data D. Sejdinovic (Statistics, Oxford) Big

More information

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract Journal of Data Science,17(1). P. 145-160,2019 DOI:10.6339/JDS.201901_17(1).0007 WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION Wei Xiong *, Maozai Tian 2 1 School of Statistics, University of

More information

Hilbert Space Representations of Probability Distributions

Hilbert Space Representations of Probability Distributions Hilbert Space Representations of Probability Distributions Arthur Gretton joint work with Karsten Borgwardt, Kenji Fukumizu, Malte Rasch, Bernhard Schölkopf, Alex Smola, Le Song, Choon Hui Teo Max Planck

More information

Unsupervised Learning with Permuted Data

Unsupervised Learning with Permuted Data Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University

More information

Testing against a linear regression model using ideas from shape-restricted estimation

Testing against a linear regression model using ideas from shape-restricted estimation Testing against a linear regression model using ideas from shape-restricted estimation arxiv:1311.6849v2 [stat.me] 27 Jun 2014 Bodhisattva Sen and Mary Meyer Columbia University, New York and Colorado

More information

Department of Economics, Vanderbilt University While it is known that pseudo-out-of-sample methods are not optimal for

Department of Economics, Vanderbilt University While it is known that pseudo-out-of-sample methods are not optimal for Comment Atsushi Inoue Department of Economics, Vanderbilt University (atsushi.inoue@vanderbilt.edu) While it is known that pseudo-out-of-sample methods are not optimal for comparing models, they are nevertheless

More information

Nonparametric Covariate Adjustment for Receiver Operating Characteristic Curves

Nonparametric Covariate Adjustment for Receiver Operating Characteristic Curves Nonparametric Covariate Adjustment for Receiver Operating Characteristic Curves Radu Craiu Department of Statistics University of Toronto joint with: Ben Reiser (Haifa) and Fang Yao (Toronto) IMS - Pacific

More information

Density Estimation (II)

Density Estimation (II) Density Estimation (II) Yesterday Overview & Issues Histogram Kernel estimators Ideogram Today Further development of optimization Estimating variance and bias Adaptive kernels Multivariate kernel estimation

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas 0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity

More information

Can we do statistical inference in a non-asymptotic way? 1

Can we do statistical inference in a non-asymptotic way? 1 Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning Multivariate Gaussians Mark Schmidt University of British Columbia Winter 2019 Last Time: Multivariate Gaussian http://personal.kenyon.edu/hartlaub/mellonproject/bivariate2.html

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Generated Covariates in Nonparametric Estimation: A Short Review.

Generated Covariates in Nonparametric Estimation: A Short Review. Generated Covariates in Nonparametric Estimation: A Short Review. Enno Mammen, Christoph Rothe, and Melanie Schienle Abstract In many applications, covariates are not observed but have to be estimated

More information

Statistical Inference

Statistical Inference Statistical Inference Liu Yang Florida State University October 27, 2016 Liu Yang, Libo Wang (Florida State University) Statistical Inference October 27, 2016 1 / 27 Outline The Bayesian Lasso Trevor Park

More information

Package dhsic. R topics documented: July 27, 2017

Package dhsic. R topics documented: July 27, 2017 Package dhsic July 27, 2017 Type Package Title Independence Testing via Hilbert Schmidt Independence Criterion Version 2.0 Date 2017-07-24 Author Niklas Pfister and Jonas Peters Maintainer Niklas Pfister

More information

A consistent test of independence based on a sign covariance related to Kendall s tau

A consistent test of independence based on a sign covariance related to Kendall s tau Contents 1 Introduction.......................................... 1 2 Definition of τ and statement of its properties...................... 3 3 Comparison to other tests..................................

More information

CS 7140: Advanced Machine Learning

CS 7140: Advanced Machine Learning Instructor CS 714: Advanced Machine Learning Lecture 3: Gaussian Processes (17 Jan, 218) Jan-Willem van de Meent (j.vandemeent@northeastern.edu) Scribes Mo Han (han.m@husky.neu.edu) Guillem Reus Muns (reusmuns.g@husky.neu.edu)

More information

Gibbs Sampling in Endogenous Variables Models

Gibbs Sampling in Endogenous Variables Models Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take

More information

Supplemental Appendix to "Alternative Assumptions to Identify LATE in Fuzzy Regression Discontinuity Designs"

Supplemental Appendix to Alternative Assumptions to Identify LATE in Fuzzy Regression Discontinuity Designs Supplemental Appendix to "Alternative Assumptions to Identify LATE in Fuzzy Regression Discontinuity Designs" Yingying Dong University of California Irvine February 2018 Abstract This document provides

More information

Improving linear quantile regression for

Improving linear quantile regression for Improving linear quantile regression for replicated data arxiv:1901.0369v1 [stat.ap] 16 Jan 2019 Kaushik Jana 1 and Debasis Sengupta 2 1 Imperial College London, UK 2 Indian Statistical Institute, Kolkata,

More information

Pointwise convergence rates and central limit theorems for kernel density estimators in linear processes

Pointwise convergence rates and central limit theorems for kernel density estimators in linear processes Pointwise convergence rates and central limit theorems for kernel density estimators in linear processes Anton Schick Binghamton University Wolfgang Wefelmeyer Universität zu Köln Abstract Convergence

More information

Reliability of Coherent Systems with Dependent Component Lifetimes

Reliability of Coherent Systems with Dependent Component Lifetimes Reliability of Coherent Systems with Dependent Component Lifetimes M. Burkschat Abstract In reliability theory, coherent systems represent a classical framework for describing the structure of technical

More information

PRECISE ASYMPTOTIC IN THE LAW OF THE ITERATED LOGARITHM AND COMPLETE CONVERGENCE FOR U-STATISTICS

PRECISE ASYMPTOTIC IN THE LAW OF THE ITERATED LOGARITHM AND COMPLETE CONVERGENCE FOR U-STATISTICS PRECISE ASYMPTOTIC IN THE LAW OF THE ITERATED LOGARITHM AND COMPLETE CONVERGENCE FOR U-STATISTICS REZA HASHEMI and MOLUD ABDOLAHI Department of Statistics, Faculty of Science, Razi University, 67149, Kermanshah,

More information

Local linear multiple regression with variable. bandwidth in the presence of heteroscedasticity

Local linear multiple regression with variable. bandwidth in the presence of heteroscedasticity Local linear multiple regression with variable bandwidth in the presence of heteroscedasticity Azhong Ye 1 Rob J Hyndman 2 Zinai Li 3 23 January 2006 Abstract: We present local linear estimator with variable

More information

Kernel Measures of Conditional Dependence

Kernel Measures of Conditional Dependence Kernel Measures of Conditional Dependence Kenji Fukumizu Institute of Statistical Mathematics 4-6-7 Minami-Azabu, Minato-ku Tokyo 6-8569 Japan fukumizu@ism.ac.jp Arthur Gretton Max-Planck Institute for

More information

Chain Plot: A Tool for Exploiting Bivariate Temporal Structures

Chain Plot: A Tool for Exploiting Bivariate Temporal Structures Chain Plot: A Tool for Exploiting Bivariate Temporal Structures C.C. Taylor Dept. of Statistics, University of Leeds, Leeds LS2 9JT, UK A. Zempléni Dept. of Probability Theory & Statistics, Eötvös Loránd

More information

Joint Gaussian Graphical Model Review Series I

Joint Gaussian Graphical Model Review Series I Joint Gaussian Graphical Model Review Series I Probability Foundations Beilun Wang Advisor: Yanjun Qi 1 Department of Computer Science, University of Virginia http://jointggm.org/ June 23rd, 2017 Beilun

More information

Financial Econometrics

Financial Econometrics Financial Econometrics Nonlinear time series analysis Gerald P. Dwyer Trinity College, Dublin January 2016 Outline 1 Nonlinearity Does nonlinearity matter? Nonlinear models Tests for nonlinearity Forecasting

More information

Understanding Ding s Apparent Paradox

Understanding Ding s Apparent Paradox Submitted to Statistical Science Understanding Ding s Apparent Paradox Peter M. Aronow and Molly R. Offer-Westort Yale University 1. INTRODUCTION We are grateful for the opportunity to comment on A Paradox

More information

Smooth functions and local extreme values

Smooth functions and local extreme values Smooth functions and local extreme values A. Kovac 1 Department of Mathematics University of Bristol Abstract Given a sample of n observations y 1,..., y n at time points t 1,..., t n we consider the problem

More information

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. We consider asymptotic and finite-sample confidence bounds in instrumental

More information

On Testing Independence and Goodness-of-fit in Linear Models

On Testing Independence and Goodness-of-fit in Linear Models On Testing Independence and Goodness-of-fit in Linear Models arxiv:1302.5831v2 [stat.me] 5 May 2014 Arnab Sen and Bodhisattva Sen University of Minnesota and Columbia University Abstract We consider a

More information

Nonparametric Regression

Nonparametric Regression Nonparametric Regression Econ 674 Purdue University April 8, 2009 Justin L. Tobias (Purdue) Nonparametric Regression April 8, 2009 1 / 31 Consider the univariate nonparametric regression model: where y

More information

On robust and efficient estimation of the center of. Symmetry.

On robust and efficient estimation of the center of. Symmetry. On robust and efficient estimation of the center of symmetry Howard D. Bondell Department of Statistics, North Carolina State University Raleigh, NC 27695-8203, U.S.A (email: bondell@stat.ncsu.edu) Abstract

More information

Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity

Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity Zhengyu Zhang School of Economics Shanghai University of Finance and Economics zy.zhang@mail.shufe.edu.cn

More information

A General Overview of Parametric Estimation and Inference Techniques.

A General Overview of Parametric Estimation and Inference Techniques. A General Overview of Parametric Estimation and Inference Techniques. Moulinath Banerjee University of Michigan September 11, 2012 The object of statistical inference is to glean information about an underlying

More information

Maximum Mean Discrepancy

Maximum Mean Discrepancy Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia

More information

Statistical Convergence of Kernel CCA

Statistical Convergence of Kernel CCA Statistical Convergence of Kernel CCA Kenji Fukumizu Institute of Statistical Mathematics Tokyo 106-8569 Japan fukumizu@ism.ac.jp Francis R. Bach Centre de Morphologie Mathematique Ecole des Mines de Paris,

More information

arxiv: v2 [stat.ml] 3 Sep 2015

arxiv: v2 [stat.ml] 3 Sep 2015 Nonparametric Independence Testing for Small Sample Sizes arxiv:46.9v [stat.ml] 3 Sep 5 Aaditya Ramdas Dept. of Statistics and Machine Learning Dept. Carnegie Mellon University aramdas@cs.cmu.edu September

More information

A Fully Nonparametric Modeling Approach to. BNP Binary Regression

A Fully Nonparametric Modeling Approach to. BNP Binary Regression A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation

More information

arxiv: v1 [math.st] 24 Aug 2017

arxiv: v1 [math.st] 24 Aug 2017 Submitted to the Annals of Statistics MULTIVARIATE DEPENDENCY MEASURE BASED ON COPULA AND GAUSSIAN KERNEL arxiv:1708.07485v1 math.st] 24 Aug 2017 By AngshumanRoy and Alok Goswami and C. A. Murthy We propose

More information

Density estimators for the convolution of discrete and continuous random variables

Density estimators for the convolution of discrete and continuous random variables Density estimators for the convolution of discrete and continuous random variables Ursula U Müller Texas A&M University Anton Schick Binghamton University Wolfgang Wefelmeyer Universität zu Köln Abstract

More information

Smooth simultaneous confidence bands for cumulative distribution functions

Smooth simultaneous confidence bands for cumulative distribution functions Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang

More information

Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation

Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation COMPSTAT 2010 Revised version; August 13, 2010 Michael G.B. Blum 1 Laboratoire TIMC-IMAG, CNRS, UJF Grenoble

More information

Causal Effect Identification in Alternative Acyclic Directed Mixed Graphs

Causal Effect Identification in Alternative Acyclic Directed Mixed Graphs Proceedings of Machine Learning Research vol 73:21-32, 2017 AMBN 2017 Causal Effect Identification in Alternative Acyclic Directed Mixed Graphs Jose M. Peña Linköping University Linköping (Sweden) jose.m.pena@liu.se

More information

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy Computation Of Asymptotic Distribution For Semiparametric GMM Estimators Hidehiko Ichimura Graduate School of Public Policy and Graduate School of Economics University of Tokyo A Conference in honor of

More information

Hypothesis Testing with Kernel Embeddings on Interdependent Data

Hypothesis Testing with Kernel Embeddings on Interdependent Data Hypothesis Testing with Kernel Embeddings on Interdependent Data Dino Sejdinovic Department of Statistics University of Oxford joint work with Kacper Chwialkowski and Arthur Gretton (Gatsby Unit, UCL)

More information

On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

More information

arxiv: v1 [cs.lg] 22 Jun 2009

arxiv: v1 [cs.lg] 22 Jun 2009 Bayesian two-sample tests arxiv:0906.4032v1 [cs.lg] 22 Jun 2009 Karsten M. Borgwardt 1 and Zoubin Ghahramani 2 1 Max-Planck-Institutes Tübingen, 2 University of Cambridge June 22, 2009 Abstract In this

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Optimal Kernel Shapes for Local Linear Regression

Optimal Kernel Shapes for Local Linear Regression Optimal Kernel Shapes for Local Linear Regression Dirk Ormoneit Trevor Hastie Department of Statistics Stanford University Stanford, CA 94305-4065 ormoneit@stat.stanjord.edu Abstract Local linear regression

More information

Time Series and Forecasting Lecture 4 NonLinear Time Series

Time Series and Forecasting Lecture 4 NonLinear Time Series Time Series and Forecasting Lecture 4 NonLinear Time Series Bruce E. Hansen Summer School in Economics and Econometrics University of Crete July 23-27, 2012 Bruce Hansen (University of Wisconsin) Foundations

More information

Unsupervised Kernel Dimension Reduction Supplemental Material

Unsupervised Kernel Dimension Reduction Supplemental Material Unsupervised Kernel Dimension Reduction Supplemental Material Meihong Wang Dept. of Computer Science U. of Southern California Los Angeles, CA meihongw@usc.edu Fei Sha Dept. of Computer Science U. of Southern

More information

An Introduction to Multivariate Statistical Analysis

An Introduction to Multivariate Statistical Analysis An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

ECE 4400:693 - Information Theory

ECE 4400:693 - Information Theory ECE 4400:693 - Information Theory Dr. Nghi Tran Lecture 8: Differential Entropy Dr. Nghi Tran (ECE-University of Akron) ECE 4400:693 Lecture 1 / 43 Outline 1 Review: Entropy of discrete RVs 2 Differential

More information

Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model.

Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model. Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model By Michael Levine Purdue University Technical Report #14-03 Department of

More information

A Brief Introduction to Copulas

A Brief Introduction to Copulas A Brief Introduction to Copulas Speaker: Hua, Lei February 24, 2009 Department of Statistics University of British Columbia Outline Introduction Definition Properties Archimedean Copulas Constructing Copulas

More information