arxiv: v2 [stat.me] 27 Apr 2018

Size: px
Start display at page:

Download "arxiv: v2 [stat.me] 27 Apr 2018"

Transcription

1 Nonasymtotic estimation and suort recovery for high dimensional sarse covariance matrices arxiv: v [statme] 7 Ar Introduction Adam B Kashlak and Linglong Kong Deartment of Mathematical and Statistical Sciences, CAB 63 University of Alberta Edmonton, AB, Canada 6G G1 kashlak@ualbertaca url: htts://sitesualbertaca/~kashlak/ Abstract: We roose a general framework for nonasymtotic covariance matrix estimation making use of concentration inequality-based confidence sets We secify this framework for the estimation of large sarse covariance matrices through incororation of ast thresholding estimators with key emhasis on suort recovery his technique goes beyond ast results for thresholding estimators by allowing for a wide range of distributional assumtions beyond merely sub-gaussian tails his methodology can furthermore be adated to a wide range of other estimators and settings he usage of nonasymtotic dimension-free confidence sets yields good theoretical erformance hrough extensive simulations, it is demonstrated to have suerior erformance when comared with other such methods In the context of suort recovery, we are able to secify a false ositive rate and otimize to maximize the true recoveries MSC 010 subject classifications: Primary 6H1; secondary 6G15 Keywords and hrases: Concentration Inequality, Confidence Region, Random Matrix, Schatten Norm, Log Concave Measure, Sub-Exonential Measure Covariance matrices and accurate estimators of such objects are of critical imortance in statistics Various standard techniques including rincile comonents analysis and linear and quadratic discriminant analysis rely on an accurate estimate of the covariance structure of the data Alications can range from genetics and medical imaging data to climate and other tyes of data Furthermore, in the era of high dimensional data, classical asymtotic estimators erform oorly in alications Stein, 1975; Johnstone, 001 Hence, we roose a general methodology for nonasymtotic covariance matrix estimation making use of confidence balls constructed from concentration inequalities While this is a general framework with many otential alications, we secifically consider the use of thresholding estimators for sarse covariance matrices with a view towards suort recovery that is, determining which variable airs are correlated Many estimators for the covariance matrix have been roosed working under the assumtion of sarsity Pourahmadi, 011, which is, in a qualitative sense, the case when most of the off-diagonal entries are zero or negligible Beyond mere theoretical interest, the assumtion of sarsity is widely alicable to real data analysis as the ractitioner may believe that many of the variable airings will be uncorrelated hus, it is desirable to tailor covariance estimation rocedures given this assumtion of sarsity Sarsity in the simlest sense imlies some bound on the number of non-zero entries in the columns of a covariance matrix hus, given a Σ R d d with entries σ i,j for i, j = 1,, d, there exists some constant k > 0 such that max j=1,,d d 1[σ i,j 0] k his can be generalized to aroximate sarsity as in Rothman, Levina and Zhu 009 by max j=1,,d d σ i,j q k for some q [0, 1 Furthermore, Cai and Liu 011 define a broader aroximately sarse class by bounding weighted column sums of Σ In El Karoui 008, a similar notion referred to as β-sarsity is defined Such classes of sarse covariance matrices allow for good theoretical erformance of estimators One class of estimators are shrinkage estimators that follow a James-Stein aroach by shrinking estimated eigenvalues, eigenvectors, or the matrix itself towards some desired target Haff, 1980; Dey and Srinivasan, 1985; Daniels and Kass, 1999, 001; Ledoit and Wolf, 004; Hoff, 009; Johnstone and Lu, 01 Another class of sarse estimators are those that regularize the estimate with lasso-style enalties Rothman, 01; Bien and ibshirani, 011 Yet another class consists of thresholding estimators, which declare the covariance between 1

2 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation two variables to be zero, if the estimated value is smaller than some threshold Bickel and Levina, 008a,b; Rothman, Levina and Zhu, 009; Cai and Liu, 011 Beyond these, there are other methods such as banding and taering, which aly only when the variables are ordered or a notation of roximity exists eg satial, time series, or longitudinal data As we will not assume such an ordering and strive to construct a methodology that is ermutation invariant with resect to the variables, these aroaches will not be considered Lastly, there has also been substantial work into the estimation of the recision or inverse covariance matrix While it is easily ossible that our aroach could be adated to this setting, it will not be considered in this article and will, hence, be reserved for future research In this article, we roose of novel aroach to the estimation of sarse covariance matrices making use of concentration inequality based confidence sets such as those constructed in Kashlak, Aston and Nickl 017 for the functional data setting In short, consider a samle of real vector valued data X 1,, X n R d with mean zero and unknown covariance matrix Σ Concentration inequalities are used to construct a non-asymtotic confidence set for Σ about the emirical estimate of the unknown covariance matrix, ˆΣ em = n 1 n X i XX i X where X = n 1 n X i is the samle mean While, it has been noted for examle, see Cai and Liu 011 that ˆΣ em may be a oor estimator when the dimension d is large and Σ is sarse, the confidence set is still valid given a desired coverage of 1 α o construct a better estimator, we roose to search this confidence set for an estimator ˆΣ s which otimizes some sarsity criterion to be concretely defined later his estimation method adats to the uncertainty of ˆΣ em in the high dimensional setting, d n, by widening the confidence set and thus allowing our sarse estimator to lie far away from the emirical estimate Furthermore, given some distributional assumtions, the concentration inequalities rovide us with non-asymtotic dimension-free confidence sets allowing for very desirable convergence results Many established methods for sarse estimation make use of a regularization or enalization term incororated to enforce sarsity Rothman, 01; Bien and ibshirani, 011 In some sense, our roosed method can be considered to be in this class of estimators However, we do not enforce sarsity via some lasso-style enalization term, but enforce it through the choice of α he larger our 1 α-confidence set is, the sarser our estimator is allowed to be hus, as with other regularized estimators, the ractitioner will have to make a choice of α to achieve the desired level of sarsity A major contribution of this work is develoing a method with the ability to avoid costly cross-validation of the tuning arameter and maintain strong finite samle erformance he secific focus as discussed below and in Aendix B is accurate suort recovery, which is the identification of the non-zero entries in the covariance matrix Our methodology allows for fixing a false ositive rate ercentage of zero entries incorrectly said to be non-zero and otimizing over the true ositive rate ercentage of correctly identified non-zero entries Furthermore, our estimation technique imlements a binary search rocedure resulting in a highly efficient algorithm esecially when comared to the more laborious otimization required by lasso enalization In Section, the general estimation rocedure is outlined, and it is secified for tuning threshold estimators with concentration methods Section 3 discusses our aroach to fixing a certain false ositive rate when attemting to recover the suort of the covariance matrix In Section 4, three different tyes of concentration are considered for secifically log concave measures, sub-exonential distributions, and bounded random variables Lastly, Section 5 details comrehensive simulations comaring our concentration aroach to sarse estimation to standard techniques such as thresholding and enalization Beyond simulation exeriments, a real data set of gene exressions for small round blue cell tumours from the study of Khan et al 001 is considered 11 Notation and Definitions When defining a Banach sace matrices, there are many matrix norms that can be considered In the article, the main norms of interest are the -Schatten norms, which will be denoted and are defined as follows Definition 11 -Schatten Norm For an arbitrary matrix Σ R k l and [1,, the -Schatten norm is 1/ Σ = tr Σ Σ / min{k,l} 1/ = ν l = ν i

3 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 3 where ν = ν 1,, ν min{k,l} is the vector of singular values of Σ and where l is the standard l norm in R d In the covariance matrix case where Σ R d d is symmetric and ositive semi-definite, Σ = tr Σ 1/ = λ l where λ is the vector of eigenvalues of Σ he 1-Schatten norm is referred to as the trace norm and the -Schatten norm as the Hilbert-Schmidt or Frobenius norm For =, we have the usual oerator norm for Σ : R l R k with resect to the l norm, Σ = su Σu l = ν l = max ν i, u l =1,,min{k,l} which is similarly the maximal eigenvalue when Σ is symmetric ositive semi-definite he definition of the -Schatten norm involves taking the square root of a symmetric matrix In general, a matrix square root is only unique u to unitary transformations However, for symmetric ositive semi-definite matrices, we will only require the unique symmetric ositive semi-definite square root defined as follows Definition 1 Matrix Square Root Let A R d d be a symmetric ositive semi-definite matrix with eigen-decomosition A = UDU where U the orthonormal matrix of eigenvectors and D the diagonal matrix of eigenvalues, λ 1,, λ d hen, A = UD U where D is the diagonal matrix with entries λ 1,, λ d Another family of norms that will be used is the collection of entrywise matrix norms denoted, which are written in terms of l norms of the entries Definition 13, q-entrywise norm For an arbitrary matrix Σ R k l with entries σ i,j and, q [1, ], the, q-entrywise norm is 1/ k l Σ,q = σ q i,j /q, j=1 with the usual modification in the case that = and/or q = When = q, these are the l norms of a given matrix treated as a vector in R k,l Note that the -Schatten norm coincides with the, -entrywise norm 1 Main contributions and connections to ast work he main contribution of this work is the construction of a general framework for tuning threshold estimators for suort recovery and estimation of sarse covariance matrices for sub-gaussian, sub-exonential, and bounded data his framework has the otential to extend to other distributional assumtions and to other estimation techniques It offers finite samle guarantees and a much faster comute time than costly otimization and cross validation methods Past work on thresholding estimators for sarse covariance estimation began with solely considering Gaussian data and then extending to sub-gaussian tails Bickel and Levina, 008a,b; Rothman, Levina and Zhu, 009 he more recent work of Cai and Liu 011 also rovides theoretical results for sub-gaussian data as well as certain olynomial-tye tails However, only Gaussian data is considered in their numerical simulations In this article, we consider sub-gaussian, heavier tailed sub-exonential, and bounded data he rincial focus of this work is to use non-asymtotic concentration inequalities to guarantee finite samle erformance Past articles are focused on roximity of their estimator to truth in oerator norm as the main metric of success due to convergence in oerator norm imlying convergence of the eigenvalues and eigenvectors While asymtotically, such methods have elegant theoretical convergence roerties, for finite samles one can achieve better erformance in oerator norm distance by simly choosing the emirical diagonal matrix as an estimator ie the emirical estimator with off-diagonal entries set to zero In Aendix B, we rerun some of the numerical simulations from Rothman, Levina and Zhu 009 and demonstrate that for Gaussian data the emirical diagonal matrix achieves better erformance than all of the universal threshold estimators for data in R 500 for a samle of size n = 100 For sub-exonential data albeit outside of the scoe of their methodology resented in Rothman, Levina and Zhu 009 the emirical diagonal matrix dominated all threshold estimators in oerator norm distance even when d < n We thus strongly argue that the main metric of success for such sarse estimators is suort recovery of the true covariance matrix

4 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 4 able 1 Secific metrics d, and deviation thresholds r α given secific assumtions on the data X i discussed in subsequent sections Assumtion on X i dˆσ s, Σ r α Log Concave Measure ˆΣ s Σ /c0 n log α Bounded in norm ˆΣ s Σ U n log α Sub-Exonential Measure ˆΣ s Σ max{ K log α/ n, K log α} he main theoretical results of this work are heorem 3, which establishes how to fix a false ositive rate for threshold estimators devoid of any distributional assumtions, and heorem 45, which demonstrates suort recovery both zero and non-zero entries in the secific case that the data has a strongly log concave measure In the case that the data is instead sub-exonential or bounded, we do not achieve a similar limit theorem, but are still able to achieve good erformance in numerical simulations Of indeendent interest is Lemma 34, which establishes a symmetrization result for random sarse covariance estimators For suort recovery, let Σ be the true covariance matrix and ˆΣ s be the sarse estimator We define a true ositive to be a air i, j with i, j = 1,, d such that the i, jth entries σ i,j 0 and ˆσ s i,j 0 We similarly define a false ositive to be a air i, j such that σ i,j = 0 and ˆσ s i,j 0 As often in statistical ractice, we aim to maximize the true ositive rate while controlling the false ositive rate Unlike other methods that only rovide a single estimator with no ability to control the false ositive rate, our framework can be tuned to a desired false ositive rate Sarse Estimation Procedure Let X 1,, X n R d be a samle of n indeendent and identically distributed mean zero random vectors with unknown d d covariance matrix Σ Define the emirical estimate of Σ to be ˆΣ em = n 1 n X i XX i X where X = n 1 n X i he goal of the following rocedure is to construct a sarse estimator, ˆΣ s, for Σ by first constructing a non-asymtotic confidence set for Σ based on ˆΣ em and then searching this set for the sarsest member A search method using threshold estimators is outlined in Section 1 o construct such a confidence set about ˆΣ em, concentration inequalities are emloyed Secific inequalities are chosen based on data assumtions and are discussed in subsequent Sections 41 and 4 In general, the inequalities all take a similar form Let d, be some metric measuring the distance between two covariance matrices, and let ψ : R R be monotonically increasing hen, the general form of the concentration inequalities is P dσ, ˆΣ em EdΣ, ˆΣ em + r e ψr, which is a bound on the tail of the distribution of dσ, ˆΣ em as it deviates above its mean hus, to construct a 1 α-confidence set, the variable r = r α is chosen such that ex ψr α = α his r α will be referred to as the deviation threshold able 1 contains some exlicit choices for the metric and deviation threshold given secific assumtions on the X i, which are used to drive the choice of concentration inequality Log concave and sub-exonential measures are discussed searately in Section 4 Now, let ˆΣ s be our sarse estimator for Σ We want these two to be close in the sense of the above confidence set and therefore choose a ˆΣ s such that dˆσ s, ˆΣ em r α Consequently, we have that dˆσ s, Σ EdˆΣ em, Σ + r α P P dˆσ s, ˆΣ em + dˆσ em, Σ EdˆΣ em, Σ + r α P dˆσ em, Σ EdˆΣ em, Σ + r α ex ψr α = α Hence, we choose a ˆΣ s close enough to ˆΣ em to share its elegant concentration roerties, but far enough away to result in a better estimator for Σ

5 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 5 o actually identify such a ˆΣ s, we require some criterion to otimize over all elements of the confidence set Section 1 takes insiration from thresholding techniques for sarse covariance estimation Bickel and Levina, 008a; Rothman, Levina and Zhu, 009; Cai and Liu, 011 and imlements them inside this concentration framework It begins with ˆΣ em and attemts to threshold it as much as ossible while still remaining in the confidence set 1 hresholding within confidence sets A generalized thresholding oerator, as defined in Rothman, Levina and Zhu 009, is s λ : R R such that s λ z z, s λ z = 0 for z λ, and s λ z z λ, which will aly element-wise to a matrix In the ast, such an oerator is alied to the emirical estimate ˆΣ em for some λ generally chosen via cross validation Instead of directly choosing a threshold λ, our aroach is to choose a confidence level 1 α and find the largest λ such that ds λ ˆΣ em, ˆΣ em r α ˆΣ s 0 Set 0 = ˆΣ diag ˆΣ em ˆΣ diag to be the emirical estimator normalized to have a diagonal of s ones Initialize the threshold to λ = 05 and write ˆΣ λ = s λˆσ em Let k = 1 be the number of stes of the recursion Choose an α and comute r α In Section 3, we will use the desired false ositive rate to motivate a choice for α 1 Increase k k + 1, then udate λ as follows a if dˆσ s λ, ˆΣ em r α, set λ λ + k 1 b Otherwise, set λ λ k 1 Reeat ste 1 until k has reached the desired number of iterations Generally, as few as k = 10 will suffice 3 he resulting estimator is ˆΣ s = ˆΣ diag ˆΣ s λ ˆΣ diag s where ˆΣ λ is the final matrix resulting from this recursion Remark 1 Positive Definite Estimators If ˆΣ s is not ositive semi-definite, then it can be rojected onto the sace of ositive semi-definite matrices A standard ast aroach is to ma the negative eigenvalues to zero or to their absolute value, which maintains the eigen-structure However, such a rojection will have an adverse effect on the suort recovery roblem as the estimator will no longer be sarse An alternative is to ma ˆΣ s ˆΣ s + γi d for some γ > 0 large enough to make the result ositive definite his will not effect the recovered suort of the matrix More clever rojections may also be ossible In the case that the metric d, is a monotonically increasing function of the Hilbert-Schmidt / Frobenius norm ˆΣ s λ ˆΣ em or another entrywise norm, then the sequence dˆσ s λ, ˆΣ em will be increasing in λ Proosition In the context of the above algorithm, if λ 1 > λ, then for any, q, we have Proof As λ 1 > λ, the entries of the matrix entries of ˆΣ s λ 1 ˆΣ em,q ˆΣ s λ ˆΣ em,q ˆΣ s λ 1 ˆΣ em are equal to or larger than in absolute value the ˆΣ s λ ˆΣ em Hence ˆΣ s λ 1 ˆΣ em,q ˆΣ s λ ˆΣ em,q by definition 13 his roerty guarantees that the above algorithm will find the sarsest ˆΣ s in the confidence set in the sense of having the largest threshold ossible However, for an arbitrary metric or secifically other -Schatten norms, this sequence may not necessarily be strictly increasing in λ Another commonly used norm, which will be shown in Section 5 to give suerior erformance in simulation, is the oerator norm ˆΣ s λ ˆΣ em, which does not yield a monotonically increasing sequence hough, this sequence is roughly increasing in the sense that it is lower bounded by definition by the maximum l s norm of the columns of ˆΣ λ ˆΣ em, which is an increasing sequence Furthermore, it is uer bounded by the l 1 s norm of the columns of ˆΣ λ ˆΣ em, which follows from the Gershgorin circle theorem Iserles, 009, and which is also an increasing sequence

6 3 Fixing a false ositive rate AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 6 For many sarse matrix estimation methods, theorems demonstrating sarsistency are roved hese indicate that in some asymtotic sense, the correct suort of the true matrix will eventually be recovered generally as n and d grow together at some rate However, none rovide a method for fixing a false ositive rate and finding an estimator that satisfies such a rate, which is certainly of interest to any ractitioner with a finite fixed samle size Hence, we resent a method for tuning our arameter α to a desired false ositive rate for the covariance estimator Before roceeding, we will require a class of sarse matrices similar to those from Bickel and Levina 008a,b; Rothman, Levina and Zhu 009; Cai and Liu 011 Secifically, let Uk, δ = {Σ R d d : max,,d j=1 d 1[σ i,j 0] k, and if σ i,j 0, then σ i,j δ > 0} 31 For the results regarding the false ositive rate, we are not concerned with the lower bound δ and only with k, the maximum number of non-zero entries er column or row As long as k increases more slowly than the dimension d, which is made secific below, we can achieve a desired false ositive rate without interference For an estimator Σ R d d, the false ositive rate is ρ Σ = #{ σ i,j 0 σ i,j = 0, i > j} dd 1/ where σ i,j is the ijth entry of the true covariance matrix and σ i,j is the ijth entry of the estimator Σ Hence, we are counting the number of non-zero entries in our estimator that should have been zero For notation, let ˆΣ em em be the usual emirical estimate of the covariance matrix Let ˆΣ 0 be the emirical estimator with all off em diagonal entries set to zero thus guaranteeing a false ositive rate of zero For η 05, let ˆΣ η be the emirical estimator after alication of the strong threshold oerator with threshold M η = quantile ˆσ i,j, η : i > j}, which removes 1001 η% of the off diagonal entries achieving a false ositive rate of aroximately 1 η due to the following lemma Lemma 31 Let Σ Uk, δ from Equation 31 Let the η [05, 1 threshold, M η, be the η quantile of ˆσ i,j em with i > j, and let the corresonding thresholded estimator be ˆΣ η = s Mη ˆΣ em with ijth entry denoted ˆσ η i,j hen, #{i, j i > j, ˆσ η i,j > 0, σ i,j = 0} as η dd 1/ as d as long as k = Od ν for ν < 1 Proof For Σ Uk, δ, we have that #{i, j σ i,j 0} kd = Od 1+ν and #{i, j σ i,j = 0} = Od Hence, the ratio of true zeros to true non-zeros is of order Od ν 1, which decays to zero as d he robustness of the η-quantile gives the desired convergence as long as no more than 1 η of the samle is contaminated By the sarsity assumtion that k = od, we have the desired result We cannot extend this lemma from η 05 to arbitrary quantiles ie false ositive rates without some additional assumtion on k/d, the relative amount of true ositive entries However, using the matrices, ˆΣ em em η and ˆΣ 0, as reference oints, we can interolate via the following theorem to achieve any desired false ositive rate heorem 3 Let Σ Uk, δ from Equation 31 with k = Od ν for ν < Given a desired false ositive rate, ρ 0, 05], and η = ρ a 05, 1] for some a Z + em, let ˆΣ ρ be the hard thresholded emirical estimator that achieves a false ositive rate of ρ hen, em ˆΣ ρ Σ 0 η ˆΣ em ρ Σ 0 K 1nρ d 1/4 + K nρ 1/4 d + ond where K 1, K are universal constants η

7 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 7 Remark 33 he above heorem 3 is wholly uninteresting for large values of n However, its ower arises in the non-asymtotic realm of interest namely when d n and also from highlighting the interlay between the dimension, samle size, and ρ, the sarseness of the estimator Furthermore, this result does not require any distributional assumtion It also does not require any assumtion on the lower bound δ on the non-zero σ i,j as it is only concerned with the σ i,j that are zero he roof of the above theorem relies on the following lemma involving symmetrization of random covariance matrices, which may be of indeendent interest Lemma 34 Let R R d d be a real valued symmetric random matrix with zero diagonal and mean zero entries bounded by 1 and not necessarily iid, and let B {0, 1} d d be an iid symmetric Bernoulli random matrix with entries b i,j = b j,i Bernoulli ρ for ρ 0, 1 Denoting the entrywise or Hadamard roduct by, let A = R B Let E { 1, 1} d d be a symmetric random matrix with iid Rademacher entries ε i,j for j < i and ε i,j = ε j,i hen, E A E K 1 d ρ 1/4 + K d 3/4 ρ where K 1, K are universal constants 4 Estimation of sarse covariance matrices he following three subsections detail different assumtions on the data under scrutiny and the secific concentration results that aly in these cases We consider sub-gaussian concentration for log concave measures and for bounded random variables We also consider sub-exonential concentration However, this collection is by no means exhaustive Given the wide variety of concentration inequalities being develoed, our aroach can be alied much more widely than to merely these three settings 41 Log Concave Measures In this section, the general methods from Section are secialized for an iid samle X 1,, X n R d whose common measure µ is strongly log-concave his roerty imlies dimension-free sub-gaussian concentration and includes such common distributions as the multivariate Gaussian, Chi, and Dirichlet distributions Definition 41 Strongly log-concave measure A measure µ on R d is strongly log-concave if there exists a c > 0 such that dµ = e Ux dx and HessU ci d 0 ie is non-negative definite where HessU is the d d matrix of second derivatives From Corollary D5, let X 1,, X n R d have measures µ 1,, µ n, which are all strongly log-concave with coefficients c 1,, c n Let ν = µ 1 µ n be the roduct measure on R d n hen, for any 1-Lischitz φ : R d n R and for any r > 0, P φx 1,, X n EφX 1,, X n + r e mini cir / his follows from heorem D4 and the other results contained within Aendix D1 For a detailed exosition of how Gaussian concentration is established for log concave measures, see Chater 5 of Ledoux 001 Examle 4 Multivariate Gaussian measure For an arbitrary Gaussian measure on R d with mean zero and covariance Σ, we have that Ux = x Σ 1 x/ hus, HessU = Σ 1, and letting {λ i } d be the eigenvalues of Σ, any 0 < c min,,d λ 1 i = max i λ i 1 satisfies the above definition Alying Proosition D5, we have that for X 1,, X n R d iid multivariate Gaussian random variables, where λ 0 = max,,d λ i P φx 1,, X n EφX 1,, X n + r e r /λ 0 Examle 43 Dirichlet Distribution For X Dirichlet α 1,, α d with α i > 1 for all i, the exonent U = d 1 α i log x i and HessU is a diagonal matrix with entries α 1 1/x 1,, α d 1/x d hus, the maximum c to ensure that HessU ci d 0 for all values of x i 0, 1 with d x i = 1 is c = min,,d α i 1

8 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 8 o make use of the above result, we must choose a suitable Lischitz function φ Let X 1,, X n, X R d be indeendent and identically distributed random variables with covariance Σ and with a common strongly log-concave measure µ with coefficient c > 0 Let λ 1 λ n be the eigenvalues of Σ and Λ = λ 1,, λ n For some [1, ], let be the -Schatten norm, which in this case is Σ = Λ l Note that XX = X l for any [1, ] Define the function φ to be 1 φx 1,, X n = n X i EXX i EX For {1,, }, we have that φ is Lischitz with coefficient φ Li = n with resect to the Frobenius or Hilbert-Schmidt metric, which is established in Proosition C5 for = and = and in Proosition C for = 1 hat is, let X 1,, X n, Y 1,, Y n R d, and denote X = X 1,, X n and Y = Y 1,, Y n, then φx φy n d, X, Y = 1 n X i Y i l From here, the rocedure outlined in Section can be imlemented with the given φ and r α = /nc 0 log α In many cases, including the two examles above, the constructed confidence set is comletely dimensionfree hus, even mild assumtions on the relationshi between the samle size n and the dimension d, such as log d = on 1/3 from the adative soft thresholding estimator of Cai and Liu 011, are not needed to rove consistency in our setting Furthermore, the concentration inequalities immediately give us a fast rate of convergence as long as log α = on with a roof rovided in Aendix A heorem 44 Let X 1,, X n R d be iid with common measure µ Let µ be strictly log concave with some fixed constant c 0 from Definition 41 hen, for α 0, 1, [1, ], and r α = /nc 0 log α, ˆΣs su P Σ O n 1 + n 1/4 log α α ˆΣ s : ˆΣ s ˆΣ em r α A second and arguably more imortant issue, see Aendix B, in the setting of sarse covariance estimation is that of suort recovery or sarsistency Lam and Fan, 009; Rothman, Levina and Zhu, 009 o recover the suort of a covariance matrix that is, determine which entries σ i,j 0 we will require a class of sarse matrices from Equation 31 In ast work, a notation of aroximate sarsity is considered where the first condition in Uk, δ is relaced with max,,d d σ i,j q < k for q [0, 1 However, once we bound the non-zero entries away from zero by some δ, such aroximate sarsity imlies standard sarsity with q = 0 It is worth noting that the above Proosition 44 does not require such a sarsity class, because our estimator is forced to remain close enough to ˆΣ em to follow ˆΣ em s convergence to Σ heorem 45 Let X 1,, X n R d be iid with common measure µ Let µ be strictly log concave with some fixed constant c 0 from Definition 41 Furthermore, let Σ Uk, δ and let δ = On 1+ε for any ε > 0 hen, for ˆΣ s denoting the concentration estimator using the hard thresholding estimation from Section 1 with the oerator norm metric, lim P suˆσ s suσ = 0 n where suσ = {i, j : σ i,j 0} Remark 46 Note that the condition that δ = On 1+ε allows for a much quicker decay of the non-zero entries of Σ than in El Karoui 008 where the lower bound is of the form Cn α0 with 0 < α 0 < It is also much quicker than the similar rate achieved in Rothman, Levina and Zhu 009 where the lower bound is any τ such that nτ increases faster than logd with the enforced asymtotic condition that logd/n = o1 resulting in a rate no faster than n

9 4 Sub-Exonential Distributions AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 9 Comared with the reviously discussed measures with sub-gaussian concentration, there exists a larger class of measures with sub-exonential concentration Such measures can be secified as those that satisfy the Poincaré or sectral ga inequality Bobkov and Ledoux, 1997; Ledoux, 001; Gozlan, 010 For a random variable X on R d with measure µ, this is Var fx C f dµ for some C > 0 and for all locally Lischitz functions f If X satisfies such an inequality, then see heorem D6 or Chater 5 of Ledoux 001 for for X 1,, X n R d iid coies of X and for some Lischitz function φ : R d n R, P φx 1,, X n EφX 1,, X n + r ex 1K { } rb min, r where K > 0 in a constant deending only on C and a i φ, b max iφ,,n As in Section 41, φ is chosen to be 1 φx 1,, X n = n X i EXX i EX, which is Lischitz with constant n his results in values of a = 1 and b = n for the above coefficients Hence, the deviation threshold in this setting is r α = max{ K log α/ n, K log α} While an otimal or reasonable value for K may not be known, it makes little difference given the roosed rocedure for choosing α detailed in Section 3 for a desired false ositive rate his is because the term K log α will be equivalently tuned to determine the otimal size of the constructed confidence set As the deviation threshold in this setting is bounded below by a constant K log α, we do not achieve the nice convergence results as in the log concave setting However, the dimension-free concentration still allows for good erformance in simulation settings as will be seen in the Section 5 43 Bounded Random Variables In this section, we consider random variables that are bounded in some norm Consider a Banach sace B, and a collection of iid random variables X 1,, X n B such that X i U for all i = 1,, n Given only this assumtion, the bounded differences inequality, detailed in Aendix D3 and in Section 334 of Giné and Nickl 016, can be alied in this secific setting It rovides sub-gaussian concentration for such random variables Secifically, let X 1,, X n R d be iid with X i l U for i = 1,, n hen, for any [1, ], X i X i U, and ˆΣem P Σ E ˆΣ em Σ + r e nr /U his follows immediately from heorem D8 Hence, for any collection of real valued random vectors bounded in Euclidean norm, the bounded differences inequality can be alied to the emirical estimate for any of the -Schatten norms he deviation threshold is r α = U n log α However, unlike in the revious setting, the bounds may not necessarily be dimension free Examle 47 Distributions on the Hyercube If the comonents X i,j 1 such as for multivariate uniform or Rademacher random variables, then U = d Consequently, the deviation threshold r α = O d/n is not dimension free While this makes estimation with resect to oerator norm distance challenging, we can still use heorem 3 to fix the false ositive rate a

10 5 Numerical simulations AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 10 In the following subsections, we aly the methods from the revious sections to three multivariate distributions of interest: the Gaussian, Lalace, and Rademacher distributions In doing so, we aly heorem 3 to analytically determine the ideal confidence ball radius in order to construct a sarse estimator of Σ We also comare the suort recovery of our aroach against enalized estimators and standard alication of universal threshold estimators As mentioned before, our roosed concentration confidence set based method has a similar feel to regularized / enalized estimators as the larger the constructed confidence set is, the sarser the returned estimator will be hus, we comare our aroach with the following lasso style estimator from the R ackage PDSCE Rothman, 013, which otimizes ˆΣ PDS = arg min Σ 0 { Σ ˆΣ } em τ log detσ + λ Σ l 1 with τ, λ > 0 Here, the log det term is used to enforce ositive definiteness of the final solution, and l 1 is the lasso style enalty, which enforces sarsity he similar method from the R ackage scov Bien and ibshirani, 01, which uses a majorize-minimize algorithm to determine { ˆΣ MMA ˆΣem = arg min tr Σ 1 } log detσ 1 + λ Σ l 1 Σ 0 for some enalization λ > 0, was also considered but roved to run too slowly on high dimensional matrices ie d 00 to be included in the numerical exeriments Of course, we also comare our method against the four universal thresholding estimators alied to the emirical covariance matrix from Rothman, Levina and Zhu, 009, Hard, Soft, SCAD, and Adative LASSO: ˆΣ Hard λ ˆΣ SCAD λ = ˆΣ Soft λ ˆΣ Adt λ = {ˆσ i,j 1[ˆσ i,j > λ]} i,j ˆσ Soft i,j a 1 a ˆσ i,j λ + λ ˆσ Hard i,j = {signˆσ i,j ˆσ i,j λ + } i,j for ˆσ i,j λ for λ < ˆσ i,j aλ for ˆσ i,j > aλ = {signˆσ i,j ˆσ i,j λ η+1 ˆσ i,j η + } i,j where ˆσ i,j is the i, jth entry of the emirical covariance estimate, a = 37, and η = 1 he arameter λ > 0 is the threshold, which is chosen in ractice via cross validation with resect to the Hilbert-Schmidt norm Briefly, the data is slit in half, two emirical estimators are formed, one is thresholded, and λ is selected to minimize the Hilbert-Schmidt distance between the one emirical estimate and the other thresholded estimate 51 Multivariate Gaussian Data Let X 1,, X n R d be indeendent and identically distributed mean zero random vectors with a strictly log concave measure and covariance matrix Σ By Corollary D5, there exists a constant c 0 > 0 such that P φx EφX + r e nr /c 0 where φx = ˆΣ em Σ where ˆΣ em = n 1 n X i XX i X is the emirical estimate of the covariance matrix his results in the size 1 α confidence set for Σ { C 1 α = Σ : ˆΣ em Σ E ˆΣ em Σ + } c 0 /n log α for α 0, 1 In the notation of Section, r α = c 0 /n log α

11 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 11 able Percentage of false and true ositives for multivariate Gaussian data and Σ tri-diagonal with diagonal entries 1 and off-diagonal entries 03 False Positive % rue Positive % Dimension CoM 1% CoM 5% PDS Hard Soft SCAD Adt able 3 Percentage of false and true ositives for multivariate Lalace data and Σ tri-diagonal with diagonal entries 1 and off-diagonal entries 03 False Positive % rue Positive % Dimension CoM 1% CoM 5% PDS Hard Soft SCAD Adt In the multivariate Gaussian case, c 0 is the maximal eigenvalue of the covariance matrix Σ We avoid any issues of estimating c 0 in ractice Regardless of our choice for c 0 tuning the regularization arameter α to a secific false ositive rate negates the need for an accurate estimate of c 0 able dislays false ositive and true ositive ercentages for seven sarse estimators comuted over 100 relications of a random samle of size n = 50 of d = 50, 100, 00, 500 dimensional multivariate Gaussian data with a tri-diagonal covariance matrix Σ whose diagonal entries are 1 and whose off-diagonal entries are 03 We can clearly see that the concentration-based estimator aroaches the desired false ositive rate either 1% or 5% as the dimension increases In contrast, the thresholding estimators with threshold λ chosen via cross validation generally start with higher false ositive ercentages, which tend to zero as the dimension increases As noted in revious work, hard thresholding is overly aggressive he PDS method is very stable across changes in the dimension and maintains a constant 34% false ositive rate and 50% true ositive rate 5 Multivariate Lalace Data here are many ossible ways to extend the univariate Lalace distribution, also referred to as the double exonential distribution, onto R d For the following simulation study, we choose the extension detailed in Eltoft, Kim and Lee 006 Namely, let Z N 0, σ and let V Exonential 1 hen, X = V Z Lalace σ/, which has df fx = σ 1 ex x /σ and variance Var X = σ For the multivariate setting, now let Z R d be multivariate Gaussian with zero mean and covariance Σ and, once again, let V Exonential 1 hen, we declare X = V Z to have a multivariate Lalace distribution with zero mean and covariance Σ able 3 dislays false ositive and true ositive ercentages for seven sarse estimators comuted over 100 relications of a random samle of size n = 50 of d = 50, 100, 00, 500 dimensional multivariate Lalace data with a tri-diagonal covariance matrix Σ whose diagonal entries are 1 and whose off-diagonal entries are 03 Similarly to the revious setting, the concentration-based estimator aroaches the desired false ositive rate either 1% or 5% as the dimension increases All universal thresholding estimators set most of the entries in the matrix to zero when threshold λ chosen via cross validation he PDS method is still stable across changes in the dimension but fixates on a much higher false ositive rate around 15% and a similar true ositive rate of 51%

12 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 1 Multivariate Gaussian Data Multivariate Lalace Data rue Positive % Adt Hard SCAD Soft PDS rue Positive % All hresholds PDS False Positive % False Positive % Multivariate Rademacher Data rue Positive % Soft SCAD PDS 10 0 Adt Hard False Positive % Fig 1 A line demarcating the trade-off between false and true ositive recoveries for multivariate Gaussian to left, Lalace to right, and Rademacher bottom data from 100 relications of samle size n = 50 and dimension d = 100 he lines were formed by varying the false ositive rate for the concentration estimator he blue dots indicate the erformance of ast methods for sarse covariance estimation

13 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 13 able 4 Percentage of false and true ositives for multivariate Rademacher data and Σ tri-diagonal with diagonal entries 1 and off-diagonal entries 03 False Positive % rue Positive % Dimension CoM 1% CoM 5% PDS Hard Soft SCAD Adt High Dimensional Binary Vectors Random binary vectors fall into the category of bounded random variables, which have sub-gaussian concentration as a consequence of the bounded differences inequality an extension of Hölder s inequality as discussed in Section 43 he result is a slightly different form for the deviation threshold And while the concentration behaviour in this setting relies on the dimension and is oor for roducing an estimator that is close in oerator or Hilbert-Schmidt norm, our suort recovery methodology is still able to erform well in this setting able 4 dislays false ositive and true ositive ercentages for seven sarse estimators comuted over 100 relications of a random samle of size n = 50 of d = 50, 100, 00, 500 dimensional multivariate Rademacher data with a tri-diagonal covariance matrix Σ whose diagonal entries are 1 and whose off-diagonal entries are 03 As a consequence of the bounded differences inequality, this case also exhibits sub-gaussian behaviour As a consequence, able 4 and similar to able Concentration estimators erform better as d increases; hreshold estimators are overly aggressive as d increases; And the PDS method s suort recover is unaffected by the change in d 54 Small Round Blue-Cell umour Data Following the same analysis erformed in Rothman, Levina and Zhu 009 and subsequently in Cai and Liu 011, we will consider the dataset resulting from the small round blue-cell tumour SRBC microarray exeriment Khan et al, 001 he data set consists of a training set of 64 vectors containing 308 gene exressions he data contains four tyes of tumours denoted EWS, BL-NHL, NB, and RMS As erformed in the two revious aers, the genes are ranked by their resective amount of discriminative information according to their F -statistic 1 k m=1 n m x m x F = k 1 1 n k k m=1 n m 1ˆσ m where x is the samle mean, k = 4 is the number of classes, n = 64 is the samle size, n m is the samle size of class m, and likewise, x m and ˆσ m are, resectively, the samle mean and variance of class m he to 40 and bottom 160 scoring genes were selected to rovide a mix of the most and least informative genes able 5 dislays the results of alying the four threshold estimators with cross validation, the PDS method, and our concentration-based thresholding with the sub-gaussian formula and with false ositive rates of 10, 5, and 1 ercent he ercentage of matrix entries that are retained for the most informative block and the least informative block are tabulated Deending on the chosen false ositive rate, our concentration-based estimators give similar results to Soft and SCAD thresholding PDS is the least conservative of the methods as it kees the most entries Hard and Adative LASSO thresholding are the most aggressive methods It is also worth noting that our method is comutationally efficient enough to consider the entire matrix at once In fact, it took only 1313 seconds to comute ˆΣ s on an Intel i7-7567u CPU, 350GHz In contrast, the PDS method, which still has significantly faster run times than cross validating the threshold estimators, took over 101 minutes to finish False ositive rates of 5%, 1%, and 01% were tested he fraction of non-zero entries in ˆΣ s was 86%, 0%, and 0%, resectively For comarison, the fraction of non-zero

14 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 14 able 5 he ercentages of non-zero off-diagonal entries in the six resective covariance estimates artitioned into two arts: the informative art is the block of the highest scoring genes; the uninformative art is the remaining matrix entries non-zero % CoM 10% CoM 5% CoM 1% PDS Informative 303% 56% 85% 473% Uninformative 54% 7% 04% 156% Hard Soft SCAD Adt Informative 60% 47% 13% 99% Uninformative 03% 3% 18% 07% Fig A density lot of the number of non-zero entries in ˆΣ s R artitioned into 1 1 blocks for false ositive rates of 5%, 1%, and 01% entries retained by PDS was 177% If such an analysis is meant to lead to follow-u research on secific gene airings, then culling as many false ositives as ossible is of critical imortance he sarse covariance estimator was artitioned into 1 1 blocks and the number of non-zero entries was tabulated for each he results are dislayed in Figure 6 Summary and Extensions In this article, we were concerned with the suort recovery and estimation of an unknown sarse covariance matrix through the search of non-asymtotic confidence sets constructed via concentration inequalities Our method is able to construct an estimator with a secific false ositive rate agnostic to distributional assumtions via heorem 3 Furthermore, this aroach was secifically shown in heorem 45 to give desirable convergence results for recovery of the suort in the case of log concave measures Numerical erformance on simulated data detailed in Section 5 rovided similar and often suerior erformance to ast aroaches in the log concave, sub-exonential, and bounded data settings While our focus was on sarsity, the generic rocedure of searching the confidence set with some structural criterion in mind can be alied in much more generality Given some assumed form or roerty of the unknown true covariance matrix, a similar search rocedure may be formulated to recover such matrices Our method is similar to the enalization estimators such as lasso in the sense that the arameter α can be tweaked to determine the amount of sarsity to allow However, unlike some of the comlicated otimizations surrounding lasso enalization, our estimator is fast to comute even in high dimensions where such comlicated otimization stes required by lasso enalties can be comutationally intractable his was the case with the majorize-minimization algorithm whose R instantiation became intractably slow when d = 00 As was mentioned in the introduction, the emirical covariance is known in cases of sarsity and high dimension to be a oor estimator We nevertheless chose it as our starting oint for the search rocedure as its form, being a sum of iid random variables, fits nicely into the concentration aradigm However, this does not imly that other starting oints should not be considered Constructing concentration inequality

15 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 15 based confidence sets about more comlicated estimators may not be straightforward But, if some such set can be constructed, then alying our search technique will likely result in even better estimates of the true covariance matrix his methodology is also extendable beyond the setting of covariance matrices to other matrix estimation settings of interest Most readily would be the case of matrix regression For examle, let X R n q, Y R n, E R n, and B R q We could consider the model Y = XB + E where B is the unknown regression oerator and the entries of E are assumed to be a matrix of n centred iid random vectors in R with either sub-gaussian or sub-exonential tails and common covariance matrix Σ R In this case, we could adat the resented theory to the suort recovery and estimation of B under the assumtion of sarsity Even with starting at the emirical covariance matrix, the method of Section 1 was shown in the numerical simulations to have suerior erformance in suort recovery when comared with other methods as it is able to be tuned to a desired false ositive rate as we done in Section 3 It is feasible that a hybrid method is ossible where our method is used to recover the suort of Σ followed by some other technique for imroving uon the non-zero entries Aendix A: Proofs Proof of Lemma 34 his roof follows from the result of Lata la 005 heorem also found in heorem 38 of ao 01 without the assumtion of iid entries in the random matrix We first aly the exectation with resect to E and use the result from Lata la 005 E A E = E A E E A E [ E A E E A E d K 1 E max,,d ] j=1 a i,j + K E d i,j=1 a 4 i,j with K 1, K universal constants For the second term in the above equation, we have via Jensen s inequality and the fact that a i,j 1 that d d E d ρ = dρ i,j=1 a 4 i,j i,j=1 Ea 4 i,j For the first term in the above equation, we make use of the fact that a i,j 1 and that only ρ are non-zero resulting in d d d E max E a i,j,,d j=1 a i,j j=1 d E a i,j + i,j d i,j k=1 d ρ + d 3 d ρ dρ + d 3/ ρ a i,j a i,k Combining the above results and udating the constants K 1, K as necessary gives the desired result [ E A E K 1 dρ + K d ρ] 3/ K1 d ρ 1/4 + K d 3/4 ρ

16 AB Kashlak and L Kong/Nonasymtotic suort recovery and estimation 16 Proof of heorem 3 Without loss of generality, we can normalize ˆΣ em such that the diagonal entries are em 1 hus ˆΣ 0 = I d, the d dimensional identity matrix, and the off-diagonal entries of all matrices considered will be bounded in absolute value by one For the emirical covariance estimator, ˆΣ em em ˆΣ 0 = ˆΣ em 1 We can decomose ˆΣ em into three arts: the diagonal of ones; the off-diagonal terms corresonding to σ i,j 0; and the off-diagonal terms corresonding to σ i,j = 0 he number of non-zero off-diagonal terms is bounded in each row/column by k Hence, ˆΣ em em ˆΣ 0 ˆΣ em 0 em + ˆΣ =0 em k + ˆΣ =0 em where ˆΣ =0 has entries ˆσ i,j such that Eˆσ i,j = 0 Let the entrywise or Hadamard roduct of two similar matrices A and B be A B with entry ijth entry em a i,j b i,j i,j For ease of notation, we denote Π 0 = ˆΣ =0 Let Π 1 be the result of randomly removing half of the entries from Π 0, which is Π 1 = Π 0 B where B {0, 1} d d is a symmetric random matrix with iid Bernoulli entries Considering the corresonding symmetric Rademacher random matrix, E = B 1, we then have that E Π 1 = E Π 0 B = 1 E Π 0 ± Π 0 E where the ± comes from the symmetry of E hus, E Π 1 1 E Π 0 1 E Π 0 E his idea can be iterated Let Π m = Π 0 B 1 B m with the B i iid coies of B from before hen, similarly, Moreover, E Π m 1 E Π m Π m 1 E m E Π m 1 E Π m 1 1 Π m 1 E m m 1 E Πm m E Π 0 m+j E Π j E Alying Lemma 34 m times and udating universal constants K 1, K as necessary results in hus, for ρ = m, we have m 1 E Π [ m m E Π 0 m+j K 1 d j/4 + K d 3/4 j/] j=0 j=0 K 1 d m/4 + K d 3/4 m/ E Π m ρe Π 0 K 1 d ρ 1/4 + K d 3/4 ρ em We want to relace the Π m with ˆΣ ρ Σ 0 and similarly for Π 0 he off-diagonal entries such that σ i,j 0 can contribute at most k = od ν, ν < 1, to the oerator norm Hence, E ˆΣ em ρ Σ 0 ρe ˆΣ em Σ 0 K 1 d ρ 1/4 + K d 3/4 ρ ρod ν We lastly aly the crude but effective in the non-asymtotic setting bound ˆΣ em d/n almost surely Dividing by E ˆΣ em Σ 0 results in E ˆΣ em ρ Σ 0 ρ E ˆΣ em Σ 0 K 1nd ρ 1/4 + K nd 1/4 ρ + ond ν 1

Estimation of the large covariance matrix with two-step monotone missing data

Estimation of the large covariance matrix with two-step monotone missing data Estimation of the large covariance matrix with two-ste monotone missing data Masashi Hyodo, Nobumichi Shutoh 2, Takashi Seo, and Tatjana Pavlenko 3 Deartment of Mathematical Information Science, Tokyo

More information

4. Score normalization technical details We now discuss the technical details of the score normalization method.

4. Score normalization technical details We now discuss the technical details of the score normalization method. SMT SCORING SYSTEM This document describes the scoring system for the Stanford Math Tournament We begin by giving an overview of the changes to scoring and a non-technical descrition of the scoring rules

More information

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO) Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment

More information

arxiv: v2 [stat.me] 3 Nov 2014

arxiv: v2 [stat.me] 3 Nov 2014 onarametric Stein-tye Shrinkage Covariance Matrix Estimators in High-Dimensional Settings Anestis Touloumis Cancer Research UK Cambridge Institute University of Cambridge Cambridge CB2 0RE, U.K. Anestis.Touloumis@cruk.cam.ac.uk

More information

Convex Optimization methods for Computing Channel Capacity

Convex Optimization methods for Computing Channel Capacity Convex Otimization methods for Comuting Channel Caacity Abhishek Sinha Laboratory for Information and Decision Systems (LIDS), MIT sinhaa@mit.edu May 15, 2014 We consider a classical comutational roblem

More information

A multiple testing approach to the regularisation of large sample correlation matrices

A multiple testing approach to the regularisation of large sample correlation matrices A multile testing aroach to the regularisation of large samle correlation matrices Natalia Bailey Queen Mary, University of London M. Hashem Pesaran University of Southern California, USA, and rinity College,

More information

Sums of independent random variables

Sums of independent random variables 3 Sums of indeendent random variables This lecture collects a number of estimates for sums of indeendent random variables with values in a Banach sace E. We concentrate on sums of the form N γ nx n, where

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra Numerous alications in statistics, articularly in the fitting of linear models. Notation and conventions: Elements of a matrix A are denoted by a ij, where i indexes the rows and

More information

Robustness of classifiers to uniform l p and Gaussian noise Supplementary material

Robustness of classifiers to uniform l p and Gaussian noise Supplementary material Robustness of classifiers to uniform l and Gaussian noise Sulementary material Jean-Yves Franceschi Ecole Normale Suérieure de Lyon LIP UMR 5668 Omar Fawzi Ecole Normale Suérieure de Lyon LIP UMR 5668

More information

Elementary theory of L p spaces

Elementary theory of L p spaces CHAPTER 3 Elementary theory of L saces 3.1 Convexity. Jensen, Hölder, Minkowski inequality. We begin with two definitions. A set A R d is said to be convex if, for any x 0, x 1 2 A x = x 0 + (x 1 x 0 )

More information

Approximating min-max k-clustering

Approximating min-max k-clustering Aroximating min-max k-clustering Asaf Levin July 24, 2007 Abstract We consider the roblems of set artitioning into k clusters with minimum total cost and minimum of the maximum cost of a cluster. The cost

More information

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Journal of Modern Alied Statistical Methods Volume Issue Article 7 --03 A Comarison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Ghadban Khalaf King Khalid University, Saudi

More information

Analysis of some entrance probabilities for killed birth-death processes

Analysis of some entrance probabilities for killed birth-death processes Analysis of some entrance robabilities for killed birth-death rocesses Master s Thesis O.J.G. van der Velde Suervisor: Dr. F.M. Sieksma July 5, 207 Mathematical Institute, Leiden University Contents Introduction

More information

On split sample and randomized confidence intervals for binomial proportions

On split sample and randomized confidence intervals for binomial proportions On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have

More information

An Analysis of Reliable Classifiers through ROC Isometrics

An Analysis of Reliable Classifiers through ROC Isometrics An Analysis of Reliable Classifiers through ROC Isometrics Stijn Vanderlooy s.vanderlooy@cs.unimaas.nl Ida G. Srinkhuizen-Kuyer kuyer@cs.unimaas.nl Evgueni N. Smirnov smirnov@cs.unimaas.nl MICC-IKAT, Universiteit

More information

The spectrum of kernel random matrices

The spectrum of kernel random matrices The sectrum of kernel random matrices Noureddine El Karoui Deartment of Statistics, University of California, Berkeley Abstract We lace ourselves in the setting of high-dimensional statistical inference,

More information

On Wald-Type Optimal Stopping for Brownian Motion

On Wald-Type Optimal Stopping for Brownian Motion J Al Probab Vol 34, No 1, 1997, (66-73) Prerint Ser No 1, 1994, Math Inst Aarhus On Wald-Tye Otimal Stoing for Brownian Motion S RAVRSN and PSKIR The solution is resented to all otimal stoing roblems of

More information

ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS

ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 000-9939XX)0000-0 ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS WILLIAM D. BANKS AND ASMA HARCHARRAS

More information

Uncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning

Uncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning TNN-2009-P-1186.R2 1 Uncorrelated Multilinear Princial Comonent Analysis for Unsuervised Multilinear Subsace Learning Haiing Lu, K. N. Plataniotis and A. N. Venetsanooulos The Edward S. Rogers Sr. Deartment

More information

Various Proofs for the Decrease Monotonicity of the Schatten s Power Norm, Various Families of R n Norms and Some Open Problems

Various Proofs for the Decrease Monotonicity of the Schatten s Power Norm, Various Families of R n Norms and Some Open Problems Int. J. Oen Problems Comt. Math., Vol. 3, No. 2, June 2010 ISSN 1998-6262; Coyright c ICSRS Publication, 2010 www.i-csrs.org Various Proofs for the Decrease Monotonicity of the Schatten s Power Norm, Various

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

MATH 2710: NOTES FOR ANALYSIS

MATH 2710: NOTES FOR ANALYSIS MATH 270: NOTES FOR ANALYSIS The main ideas we will learn from analysis center around the idea of a limit. Limits occurs in several settings. We will start with finite limits of sequences, then cover infinite

More information

Sharp gradient estimate and spectral rigidity for p-laplacian

Sharp gradient estimate and spectral rigidity for p-laplacian Shar gradient estimate and sectral rigidity for -Lalacian Chiung-Jue Anna Sung and Jiaing Wang To aear in ath. Research Letters. Abstract We derive a shar gradient estimate for ositive eigenfunctions of

More information

Uniform Law on the Unit Sphere of a Banach Space

Uniform Law on the Unit Sphere of a Banach Space Uniform Law on the Unit Shere of a Banach Sace by Bernard Beauzamy Société de Calcul Mathématique SA Faubourg Saint Honoré 75008 Paris France Setember 008 Abstract We investigate the construction of a

More information

COMMUNICATION BETWEEN SHAREHOLDERS 1

COMMUNICATION BETWEEN SHAREHOLDERS 1 COMMUNICATION BTWN SHARHOLDRS 1 A B. O A : A D Lemma B.1. U to µ Z r 2 σ2 Z + σ2 X 2r ω 2 an additive constant that does not deend on a or θ, the agents ayoffs can be written as: 2r rθa ω2 + θ µ Y rcov

More information

Covariance Matrix Estimation for Reinforcement Learning

Covariance Matrix Estimation for Reinforcement Learning Covariance Matrix Estimation for Reinforcement Learning Tomer Lancewicki Deartment of Electrical Engineering and Comuter Science University of Tennessee Knoxville, TN 37996 tlancewi@utk.edu Itamar Arel

More information

High-dimensional Ordinary Least-squares Projection for Screening Variables

High-dimensional Ordinary Least-squares Projection for Screening Variables High-dimensional Ordinary Least-squares Projection for Screening Variables Xiangyu Wang and Chenlei Leng arxiv:1506.01782v1 [stat.me] 5 Jun 2015 Abstract Variable selection is a challenging issue in statistical

More information

Notes on Instrumental Variables Methods

Notes on Instrumental Variables Methods Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of

More information

Towards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK

Towards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK Towards understanding the Lorenz curve using the Uniform distribution Chris J. Stehens Newcastle City Council, Newcastle uon Tyne, UK (For the Gini-Lorenz Conference, University of Siena, Italy, May 2005)

More information

Projected Principal Component Analysis. Yuan Liao

Projected Principal Component Analysis. Yuan Liao Projected Princial Comonent Analysis Yuan Liao University of Maryland with Jianqing Fan and Weichen Wang January 3, 2015 High dimensional factor analysis and PCA Factor analysis and PCA are useful tools

More information

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek Use of Transformations and the Reeated Statement in PROC GLM in SAS Ed Stanek Introduction We describe how the Reeated Statement in PROC GLM in SAS transforms the data to rovide tests of hyotheses of interest.

More information

t 0 Xt sup X t p c p inf t 0

t 0 Xt sup X t p c p inf t 0 SHARP MAXIMAL L -ESTIMATES FOR MARTINGALES RODRIGO BAÑUELOS AND ADAM OSȨKOWSKI ABSTRACT. Let X be a suermartingale starting from 0 which has only nonnegative jums. For each 0 < < we determine the best

More information

Finding a sparse vector in a subspace: linear sparsity using alternating directions

Finding a sparse vector in a subspace: linear sparsity using alternating directions IEEE TRANSACTION ON INFORMATION THEORY VOL XX NO XX 06 Finding a sarse vector in a subsace: linear sarsity using alternating directions Qing Qu Student Member IEEE Ju Sun Student Member IEEE and John Wright

More information

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules

CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules. Introduction: The is widely used in industry to monitor the number of fraction nonconforming units. A nonconforming unit is

More information

Hidden Predictors: A Factor Analysis Primer

Hidden Predictors: A Factor Analysis Primer Hidden Predictors: A Factor Analysis Primer Ryan C Sanchez Western Washington University Factor Analysis is a owerful statistical method in the modern research sychologist s toolbag When used roerly, factor

More information

Estimating function analysis for a class of Tweedie regression models

Estimating function analysis for a class of Tweedie regression models Title Estimating function analysis for a class of Tweedie regression models Author Wagner Hugo Bonat Deartamento de Estatística - DEST, Laboratório de Estatística e Geoinformação - LEG, Universidade Federal

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

1 Extremum Estimators

1 Extremum Estimators FINC 9311-21 Financial Econometrics Handout Jialin Yu 1 Extremum Estimators Let θ 0 be a vector of k 1 unknown arameters. Extremum estimators: estimators obtained by maximizing or minimizing some objective

More information

Positivity, local smoothing and Harnack inequalities for very fast diffusion equations

Positivity, local smoothing and Harnack inequalities for very fast diffusion equations Positivity, local smoothing and Harnack inequalities for very fast diffusion equations Dedicated to Luis Caffarelli for his ucoming 60 th birthday Matteo Bonforte a, b and Juan Luis Vázquez a, c Abstract

More information

Chapter 3. GMM: Selected Topics

Chapter 3. GMM: Selected Topics Chater 3. GMM: Selected oics Contents Otimal Instruments. he issue of interest..............................2 Otimal Instruments under the i:i:d: assumtion..............2. he basic result............................2.2

More information

IMPROVED BOUNDS IN THE SCALED ENFLO TYPE INEQUALITY FOR BANACH SPACES

IMPROVED BOUNDS IN THE SCALED ENFLO TYPE INEQUALITY FOR BANACH SPACES IMPROVED BOUNDS IN THE SCALED ENFLO TYPE INEQUALITY FOR BANACH SPACES OHAD GILADI AND ASSAF NAOR Abstract. It is shown that if (, ) is a Banach sace with Rademacher tye 1 then for every n N there exists

More information

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III AI*IA 23 Fusion of Multile Pattern Classifiers PART III AI*IA 23 Tutorial on Fusion of Multile Pattern Classifiers by F. Roli 49 Methods for fusing multile classifiers Methods for fusing multile classifiers

More information

where x i is the ith coordinate of x R N. 1. Show that the following upper bound holds for the growth function of H:

where x i is the ith coordinate of x R N. 1. Show that the following upper bound holds for the growth function of H: Mehryar Mohri Foundations of Machine Learning Courant Institute of Mathematical Sciences Homework assignment 2 October 25, 2017 Due: November 08, 2017 A. Growth function Growth function of stum functions.

More information

Asymptotically Optimal Simulation Allocation under Dependent Sampling

Asymptotically Optimal Simulation Allocation under Dependent Sampling Asymtotically Otimal Simulation Allocation under Deendent Samling Xiaoing Xiong The Robert H. Smith School of Business, University of Maryland, College Park, MD 20742-1815, USA, xiaoingx@yahoo.com Sandee

More information

Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process

Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process P. Mantalos a1, K. Mattheou b, A. Karagrigoriou b a.deartment of Statistics University of Lund

More information

Solution sheet ξi ξ < ξ i+1 0 otherwise ξ ξ i N i,p 1 (ξ) + where 0 0

Solution sheet ξi ξ < ξ i+1 0 otherwise ξ ξ i N i,p 1 (ξ) + where 0 0 Advanced Finite Elements MA5337 - WS7/8 Solution sheet This exercise sheets deals with B-slines and NURBS, which are the basis of isogeometric analysis as they will later relace the olynomial ansatz-functions

More information

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables Imroved Bounds on Bell Numbers and on Moments of Sums of Random Variables Daniel Berend Tamir Tassa Abstract We rovide bounds for moments of sums of sequences of indeendent random variables. Concentrating

More information

Adaptive estimation with change detection for streaming data

Adaptive estimation with change detection for streaming data Adative estimation with change detection for streaming data A thesis resented for the degree of Doctor of Philosohy of the University of London and the Diloma of Imerial College by Dean Adam Bodenham Deartment

More information

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i

For q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i Comuting with Haar Functions Sami Khuri Deartment of Mathematics and Comuter Science San Jose State University One Washington Square San Jose, CA 9519-0103, USA khuri@juiter.sjsu.edu Fax: (40)94-500 Keywords:

More information

Khinchine inequality for slightly dependent random variables

Khinchine inequality for slightly dependent random variables arxiv:170808095v1 [mathpr] 7 Aug 017 Khinchine inequality for slightly deendent random variables Susanna Sektor Abstract We rove a Khintchine tye inequality under the assumtion that the sum of Rademacher

More information

A Recursive Block Incomplete Factorization. Preconditioner for Adaptive Filtering Problem

A Recursive Block Incomplete Factorization. Preconditioner for Adaptive Filtering Problem Alied Mathematical Sciences, Vol. 7, 03, no. 63, 3-3 HIKARI Ltd, www.m-hiari.com A Recursive Bloc Incomlete Factorization Preconditioner for Adative Filtering Problem Shazia Javed School of Mathematical

More information

Elementary Analysis in Q p

Elementary Analysis in Q p Elementary Analysis in Q Hannah Hutter, May Szedlák, Phili Wirth November 17, 2011 This reort follows very closely the book of Svetlana Katok 1. 1 Sequences and Series In this section we will see some

More information

q-ary Symmetric Channel for Large q

q-ary Symmetric Channel for Large q List-Message Passing Achieves Caacity on the q-ary Symmetric Channel for Large q Fan Zhang and Henry D Pfister Deartment of Electrical and Comuter Engineering, Texas A&M University {fanzhang,hfister}@tamuedu

More information

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management

More information

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP Submitted to the Annals of Statistics arxiv: arxiv:1706.07237 CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP By Johannes Tewes, Dimitris N. Politis and Daniel J. Nordman Ruhr-Universität

More information

Research Article An iterative Algorithm for Hemicontractive Mappings in Banach Spaces

Research Article An iterative Algorithm for Hemicontractive Mappings in Banach Spaces Abstract and Alied Analysis Volume 2012, Article ID 264103, 11 ages doi:10.1155/2012/264103 Research Article An iterative Algorithm for Hemicontractive Maings in Banach Saces Youli Yu, 1 Zhitao Wu, 2 and

More information

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley Elements of Asymtotic Theory James L. Powell Deartment of Economics University of California, Berkeley Objectives of Asymtotic Theory While exact results are available for, say, the distribution of the

More information

A New Asymmetric Interaction Ridge (AIR) Regression Method

A New Asymmetric Interaction Ridge (AIR) Regression Method A New Asymmetric Interaction Ridge (AIR) Regression Method by Kristofer Månsson, Ghazi Shukur, and Pär Sölander The Swedish Retail Institute, HUI Research, Stockholm, Sweden. Deartment of Economics and

More information

Applications to stochastic PDE

Applications to stochastic PDE 15 Alications to stochastic PE In this final lecture we resent some alications of the theory develoed in this course to stochastic artial differential equations. We concentrate on two secific examles:

More information

Estimating Time-Series Models

Estimating Time-Series Models Estimating ime-series Models he Box-Jenkins methodology for tting a model to a scalar time series fx t g consists of ve stes:. Decide on the order of di erencing d that is needed to roduce a stationary

More information

Uniformly best wavenumber approximations by spatial central difference operators: An initial investigation

Uniformly best wavenumber approximations by spatial central difference operators: An initial investigation Uniformly best wavenumber aroximations by satial central difference oerators: An initial investigation Vitor Linders and Jan Nordström Abstract A characterisation theorem for best uniform wavenumber aroximations

More information

BEST CONSTANT IN POINCARÉ INEQUALITIES WITH TRACES: A FREE DISCONTINUITY APPROACH

BEST CONSTANT IN POINCARÉ INEQUALITIES WITH TRACES: A FREE DISCONTINUITY APPROACH BEST CONSTANT IN POINCARÉ INEQUALITIES WITH TRACES: A FREE DISCONTINUITY APPROACH DORIN BUCUR, ALESSANDRO GIACOMINI, AND PAOLA TREBESCHI Abstract For Ω R N oen bounded and with a Lischitz boundary, and

More information

Chater Matrix Norms and Singular Value Decomosition Introduction In this lecture, we introduce the notion of a norm for matrices The singular value de

Chater Matrix Norms and Singular Value Decomosition Introduction In this lecture, we introduce the notion of a norm for matrices The singular value de Lectures on Dynamic Systems and Control Mohammed Dahleh Munther A Dahleh George Verghese Deartment of Electrical Engineering and Comuter Science Massachuasetts Institute of Technology c Chater Matrix Norms

More information

Improving AOR Method for a Class of Two-by-Two Linear Systems

Improving AOR Method for a Class of Two-by-Two Linear Systems Alied Mathematics 2 2 236-24 doi:4236/am22226 Published Online February 2 (htt://scirporg/journal/am) Imroving AOR Method for a Class of To-by-To Linear Systems Abstract Cuixia Li Shiliang Wu 2 College

More information

Maxisets for μ-thresholding rules

Maxisets for μ-thresholding rules Test 008 7: 33 349 DOI 0.007/s749-006-0035-5 ORIGINAL PAPER Maxisets for μ-thresholding rules Florent Autin Received: 3 January 005 / Acceted: 8 June 006 / Published online: March 007 Sociedad de Estadística

More information

Preconditioning techniques for Newton s method for the incompressible Navier Stokes equations

Preconditioning techniques for Newton s method for the incompressible Navier Stokes equations Preconditioning techniques for Newton s method for the incomressible Navier Stokes equations H. C. ELMAN 1, D. LOGHIN 2 and A. J. WATHEN 3 1 Deartment of Comuter Science, University of Maryland, College

More information

Introduction to MVC. least common denominator of all non-identical-zero minors of all order of G(s). Example: The minor of order 2: 1 2 ( s 1)

Introduction to MVC. least common denominator of all non-identical-zero minors of all order of G(s). Example: The minor of order 2: 1 2 ( s 1) Introduction to MVC Definition---Proerness and strictly roerness A system G(s) is roer if all its elements { gij ( s)} are roer, and strictly roer if all its elements are strictly roer. Definition---Causal

More information

Proximal methods for the latent group lasso penalty

Proximal methods for the latent group lasso penalty Proximal methods for the latent grou lasso enalty The MIT Faculty has made this article oenly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher Villa,

More information

Scaling Multiple Point Statistics for Non-Stationary Geostatistical Modeling

Scaling Multiple Point Statistics for Non-Stationary Geostatistical Modeling Scaling Multile Point Statistics or Non-Stationary Geostatistical Modeling Julián M. Ortiz, Steven Lyster and Clayton V. Deutsch Centre or Comutational Geostatistics Deartment o Civil & Environmental Engineering

More information

Topic 7: Using identity types

Topic 7: Using identity types Toic 7: Using identity tyes June 10, 2014 Now we would like to learn how to use identity tyes and how to do some actual mathematics with them. By now we have essentially introduced all inference rules

More information

On information plus noise kernel random matrices

On information plus noise kernel random matrices On information lus noise kernel random matrices Noureddine El Karoui Deartment of Statistics, UC Berkeley First version: July 009 This version: February 5, 00 Abstract Kernel random matrices have attracted

More information

arxiv: v1 [stat.ml] 10 Mar 2016

arxiv: v1 [stat.ml] 10 Mar 2016 Global and Local Uncertainty Princiles for Signals on Grahs Nathanael Perraudin, Benjamin Ricaud, David I Shuman, and Pierre Vandergheynst March, 206 arxiv:603.03030v [stat.ml] 0 Mar 206 Abstract Uncertainty

More information

Linear diophantine equations for discrete tomography

Linear diophantine equations for discrete tomography Journal of X-Ray Science and Technology 10 001 59 66 59 IOS Press Linear diohantine euations for discrete tomograhy Yangbo Ye a,gewang b and Jiehua Zhu a a Deartment of Mathematics, The University of Iowa,

More information

2-D Analysis for Iterative Learning Controller for Discrete-Time Systems With Variable Initial Conditions Yong FANG 1, and Tommy W. S.

2-D Analysis for Iterative Learning Controller for Discrete-Time Systems With Variable Initial Conditions Yong FANG 1, and Tommy W. S. -D Analysis for Iterative Learning Controller for Discrete-ime Systems With Variable Initial Conditions Yong FANG, and ommy W. S. Chow Abstract In this aer, an iterative learning controller alying to linear

More information

Probability Estimates for Multi-class Classification by Pairwise Coupling

Probability Estimates for Multi-class Classification by Pairwise Coupling Probability Estimates for Multi-class Classification by Pairwise Couling Ting-Fan Wu Chih-Jen Lin Deartment of Comuter Science National Taiwan University Taiei 06, Taiwan Ruby C. Weng Deartment of Statistics

More information

A Note on Guaranteed Sparse Recovery via l 1 -Minimization

A Note on Guaranteed Sparse Recovery via l 1 -Minimization A Note on Guaranteed Sarse Recovery via l -Minimization Simon Foucart, Université Pierre et Marie Curie Abstract It is roved that every s-sarse vector x C N can be recovered from the measurement vector

More information

Journal of Mathematical Analysis and Applications

Journal of Mathematical Analysis and Applications J. Math. Anal. Al. 44 (3) 3 38 Contents lists available at SciVerse ScienceDirect Journal of Mathematical Analysis and Alications journal homeage: www.elsevier.com/locate/jmaa Maximal surface area of a

More information

Spectral Clustering based on the graph p-laplacian

Spectral Clustering based on the graph p-laplacian Sectral Clustering based on the grah -Lalacian Thomas Bühler tb@cs.uni-sb.de Matthias Hein hein@cs.uni-sb.de Saarland University Comuter Science Deartment Camus E 663 Saarbrücken Germany Abstract We resent

More information

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley

Elements of Asymptotic Theory. James L. Powell Department of Economics University of California, Berkeley Elements of Asymtotic Theory James L. Powell Deartment of Economics University of California, Berkeley Objectives of Asymtotic Theory While exact results are available for, say, the distribution of the

More information

On Z p -norms of random vectors

On Z p -norms of random vectors On Z -norms of random vectors Rafa l Lata la Abstract To any n-dimensional random vector X we may associate its L -centroid body Z X and the corresonding norm. We formulate a conjecture concerning the

More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell February 10, 2010 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

Yixi Shi. Jose Blanchet. IEOR Department Columbia University New York, NY 10027, USA. IEOR Department Columbia University New York, NY 10027, USA

Yixi Shi. Jose Blanchet. IEOR Department Columbia University New York, NY 10027, USA. IEOR Department Columbia University New York, NY 10027, USA Proceedings of the 2011 Winter Simulation Conference S. Jain, R. R. Creasey, J. Himmelsach, K. P. White, and M. Fu, eds. EFFICIENT RARE EVENT SIMULATION FOR HEAVY-TAILED SYSTEMS VIA CROSS ENTROPY Jose

More information

Optimal Estimation and Completion of Matrices with Biclustering Structures

Optimal Estimation and Completion of Matrices with Biclustering Structures Journal of Machine Learning Research 17 (2016) 1-29 Submitted 12/15; Revised 5/16; Published 9/16 Otimal Estimation and Comletion of Matrices with Biclustering Structures Chao Gao Yu Lu Yale University

More information

State Estimation with ARMarkov Models

State Estimation with ARMarkov Models Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,

More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell October 25, 2009 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

Finite Mixture EFA in Mplus

Finite Mixture EFA in Mplus Finite Mixture EFA in Mlus November 16, 2007 In this document we describe the Mixture EFA model estimated in Mlus. Four tyes of deendent variables are ossible in this model: normally distributed, ordered

More information

Chapter 7: Special Distributions

Chapter 7: Special Distributions This chater first resents some imortant distributions, and then develos the largesamle distribution theory which is crucial in estimation and statistical inference Discrete distributions The Bernoulli

More information

Hotelling s Two- Sample T 2

Hotelling s Two- Sample T 2 Chater 600 Hotelling s Two- Samle T Introduction This module calculates ower for the Hotelling s two-grou, T-squared (T) test statistic. Hotelling s T is an extension of the univariate two-samle t-test

More information

Distributed Rule-Based Inference in the Presence of Redundant Information

Distributed Rule-Based Inference in the Presence of Redundant Information istribution Statement : roved for ublic release; distribution is unlimited. istributed Rule-ased Inference in the Presence of Redundant Information June 8, 004 William J. Farrell III Lockheed Martin dvanced

More information

Bayesian Spatially Varying Coefficient Models in the Presence of Collinearity

Bayesian Spatially Varying Coefficient Models in the Presence of Collinearity Bayesian Satially Varying Coefficient Models in the Presence of Collinearity David C. Wheeler 1, Catherine A. Calder 1 he Ohio State University 1 Abstract he belief that relationshis between exlanatory

More information

HENSEL S LEMMA KEITH CONRAD

HENSEL S LEMMA KEITH CONRAD HENSEL S LEMMA KEITH CONRAD 1. Introduction In the -adic integers, congruences are aroximations: for a and b in Z, a b mod n is the same as a b 1/ n. Turning information modulo one ower of into similar

More information

GENERALIZED NORMS INEQUALITIES FOR ABSOLUTE VALUE OPERATORS

GENERALIZED NORMS INEQUALITIES FOR ABSOLUTE VALUE OPERATORS International Journal of Analysis Alications ISSN 9-8639 Volume 5, Number (04), -9 htt://www.etamaths.com GENERALIZED NORMS INEQUALITIES FOR ABSOLUTE VALUE OPERATORS ILYAS ALI, HU YANG, ABDUL SHAKOOR Abstract.

More information

THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT

THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT ZANE LI Let e(z) := e 2πiz and for g : [0, ] C and J [0, ], define the extension oerator E J g(x) := g(t)e(tx + t 2 x 2 ) dt. J For a ositive weight ν

More information

Correspondence Between Fractal-Wavelet. Transforms and Iterated Function Systems. With Grey Level Maps. F. Mendivil and E.R.

Correspondence Between Fractal-Wavelet. Transforms and Iterated Function Systems. With Grey Level Maps. F. Mendivil and E.R. 1 Corresondence Between Fractal-Wavelet Transforms and Iterated Function Systems With Grey Level Mas F. Mendivil and E.R. Vrscay Deartment of Alied Mathematics Faculty of Mathematics University of Waterloo

More information

Online Learning for Sparse PCA in High Dimensions: Exact Dynamics and Phase Transitions

Online Learning for Sparse PCA in High Dimensions: Exact Dynamics and Phase Transitions Online Learning for Sarse PCA in High Dimensions: Eact Dynamics and Phase Transitions Chuang Wang and Yue M. Lu John A. Paulson School of Engineering and Alied Sciences Harvard University Abstract We study

More information

Multi-Operation Multi-Machine Scheduling

Multi-Operation Multi-Machine Scheduling Multi-Oeration Multi-Machine Scheduling Weizhen Mao he College of William and Mary, Williamsburg VA 3185, USA Abstract. In the multi-oeration scheduling that arises in industrial engineering, each job

More information

c Copyright by Helen J. Elwood December, 2011

c Copyright by Helen J. Elwood December, 2011 c Coyright by Helen J. Elwood December, 2011 CONSTRUCTING COMPLEX EQUIANGULAR PARSEVAL FRAMES A Dissertation Presented to the Faculty of the Deartment of Mathematics University of Houston In Partial Fulfillment

More information

Quantitative estimates of propagation of chaos for stochastic systems with W 1, kernels

Quantitative estimates of propagation of chaos for stochastic systems with W 1, kernels oname manuscrit o. will be inserted by the editor) Quantitative estimates of roagation of chaos for stochastic systems with W, kernels Pierre-Emmanuel Jabin Zhenfu Wang Received: date / Acceted: date Abstract

More information

Positive Definite Uncertain Homogeneous Matrix Polynomials: Analysis and Application

Positive Definite Uncertain Homogeneous Matrix Polynomials: Analysis and Application BULGARIA ACADEMY OF SCIECES CYBEREICS AD IFORMAIO ECHOLOGIES Volume 9 o 3 Sofia 009 Positive Definite Uncertain Homogeneous Matrix Polynomials: Analysis and Alication Svetoslav Savov Institute of Information

More information