SUPPLEMENTARY MATERIAL FOR: ADAPTIVE ROBUST VARIABLE SELECTION


By Jianqing Fan, Yingying Fan and Emre Barut

Princeton University, University of Southern California and IBM T.J. Watson Research Center

APPENDIX A: ADDITIONAL PROOFS

A.1. Proof of Theorem 3. This proof is motivated by Pollard (1991). Throughout the proof we use $C > 0$ to denote a generic positive constant. Let $\theta = V_n^{-1}(\beta_1 - \beta_1^*)$, so that $\beta_1 = \beta_1^* + V_n\theta$. Letting $G_n(\theta) = L_n(\beta_1, 0) - L_n(\beta_1^*, 0)$, we have

(A.1)  $G_n(\theta) = \|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1 + n\lambda_n\big(\|d_0 \circ (\beta_1^* + V_n\theta)\|_1 - \|d_0 \circ \beta_1^*\|_1\big),$

where we have used the shorthand notation $\|\rho_\tau(u)\|_1 = \sum_{i=1}^n \rho_\tau(u_i)$ for any vector $u = (u_1, \ldots, u_n)^T$. Since $L_n(\beta_1, 0)$ is minimized at $\beta_1 = \hat\beta_1^o$, it follows that $G_n(\theta)$ is minimized at $\hat\theta_n = V_n^{-1}(\hat\beta_1^o - \beta_1^*)$. We consider $\theta$ over the convex open set $B_n^0 = \{\theta \in \mathbb{R}^s : \|\theta\|_2 < c_6\sqrt{s}\}$, with some constant $c_6 > 0$ independent of $s$. The idea of the proof is to approximate the stochastic function $G_n(\theta)$ by a quadratic function whose minimizer possesses the desired asymptotic normality. Since $G_n(\theta)$ and the quadratic approximation are close, the minimizer of $G_n(\theta)$ enjoys the same asymptotic normality.

Now we proceed to prove Theorem 3. Decompose $G_n(\theta)$ into its mean and its centered stochastic component:

(A.2)  $G_n(\theta) = Q_n(\theta) + T_n(\theta),$

where $Q_n(\theta) = E[G_n(\theta)]$ and

(A.3)  $T_n(\theta) = \|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1 - E\big[\|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1\big].$
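For the reader's convenience we recall the check function behind $L_n$ and the identity that drives the expansion (A.4) below. This display is an editorial restatement of standard facts about quantile loss, not part of the original supplement.

```latex
% Check function and its a.e. derivative:
\rho_\tau(u) = u\bigl(\tau - \mathbf{1}\{u < 0\}\bigr), \qquad
\rho_\tau'(u) = \tau - \mathbf{1}\{u < 0\} \quad (u \neq 0),
% so E[\rho_\tau'(\varepsilon_i)] = 0 and Var[\rho_\tau'(\varepsilon_i)] = \tau(1-\tau)
% when \varepsilon_i has \tau-quantile zero. Knight's identity: for all u, v,
\rho_\tau(u - v) - \rho_\tau(u)
  = -v\,\rho_\tau'(u) + \int_0^v \bigl(\mathbf{1}\{u \le t\} - \mathbf{1}\{u \le 0\}\bigr)\,dt,
% whose expectation \int_0^v (F_i(t) - F_i(0))\,dt = (1/2) f_i(0) v^2 + O(|v|^3)
% yields the quadratic mean behavior used in (A.4).
```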

We first deal with the mean component $Q_n(\theta)$. Since $\|\theta\|_2 < c_6\sqrt{s}$ over the set $B_n^0$, it follows from the Cauchy-Schwarz inequality and the assumption of the theorem that

$\|H^{1/2}Z_n\theta\|_\infty \le \|\theta\|_2 \max_i \|f_i(0)^{1/2}Z_{n,i}\|_2 = o\big((s^3\log s)^{-1}\big).$

Then, since $f_i(u)$ is bounded in a small neighborhood of 0, by using an argument similar to that for equation (7.3) of the main paper and noting that $\sum_{i=1}^n f_i(0)(Z_{n,i}^T\theta)^2 = \theta^T Z_n^T H Z_n\theta = \|\theta\|_2^2$, we can show that

(A.4)  $E\big[\|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1\big] = \frac{1}{2}\|\theta\|_2^2 + O\Big(\sum_{i=1}^n f_i(0)|Z_{n,i}^T\theta|^3\Big).$

Furthermore, since $\sum_{i=1}^n f_i(0)(Z_{n,i}^T\theta)^2 = \|\theta\|_2^2 < c_6^2 s$, it follows that

(A.5)  $\sum_{i=1}^n f_i(0)|Z_{n,i}^T\theta|^3 \le C\|H^{1/2}Z_n\theta\|_\infty \sum_{i=1}^n f_i(0)(Z_{n,i}^T\theta)^2 = o\big((s^2\log s)^{-1}\big).$

Next, we deal with the penalty term in the expected value $Q_n(\theta)$. Since $\frac{1}{n}S^THS$ has bounded eigenvalues by Condition 2, it follows from the assumption of the theorem that, for any $\theta \in B_n^0$,

$\|V_n\theta\|_\infty \le \|V_n\theta\|_2 \le Cn^{-1/2}\|\theta\|_2 = o\big(\min_{1\le j\le s}|\beta_j^*|\big).$

Hence $\mathrm{sgn}(\beta_1^* + V_n\theta) = \mathrm{sgn}(\beta_1^*)$ and

(A.6)  $\|d_0 \circ (\beta_1^* + V_n\theta)\|_1 - \|d_0 \circ \beta_1^*\|_1 = \tilde d_0^T V_n\theta,$

where $\tilde d_0$ is the $s$-vector with $j$th component $d_j\,\mathrm{sgn}(\beta_j^*)$. Combining (A.4)-(A.6) yields

(A.7)  $Q_n(\theta) = \frac{1}{2}\|\theta\|_2^2 + n\lambda_n \tilde d_0^T V_n\theta + o(1),$

where the $o(1)$ term is uniform over all $\theta \in B_n^0$.

We now deal with the stochastic part $T_n(\theta)$. Define $D = (\rho_\tau'(\varepsilon_1), \ldots, \rho_\tau'(\varepsilon_n))^T$, $W_n = -Z_n^T D$, and

$R_n(\theta) = \|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1 - W_n^T\theta.$
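As an editorial aside, the definitions of $W_n$ and $R_n(\theta)$ just given encode the following pointwise expansion, with $R_{n,i}(\theta)$ as in Lemma 3 below:

```latex
% For each i, with R_{n,i}(\theta) as in Lemma 3:
\rho_\tau(\varepsilon_i - Z_{n,i}^T\theta)
  = \rho_\tau(\varepsilon_i) - \rho_\tau'(\varepsilon_i)\,Z_{n,i}^T\theta + R_{n,i}(\theta).
% Summing over i and using W_n = -Z_n^T D gives
\|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1
  = W_n^T\theta + R_n(\theta),
% i.e., the linear (first-order) term plus the remainder controlled in Lemma 3.
```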

Then $E[W_n^T\theta] = 0$ and

(A.8)  $T_n(\theta) = W_n^T\theta + r_n(\theta)$, where $r_n(\theta) = R_n(\theta) - E[R_n(\theta)]$.

Here $W_n^T\theta$ can be regarded as the first-order approximation of $\|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1$. We next show that $r_n(\theta)$ is uniformly small. By Lemma 3, there exists a diverging sequence $b_n$ such that for any $\epsilon > 0$,

(A.9)  $P\big(|r_n(\theta)| \ge \epsilon\big) \le \exp(-C\epsilon b_n s^2\log s).$

Define $\lambda_n(\theta) = G_n(\theta) - n\lambda_n \tilde d_0^T V_n\theta - W_n^T\theta$ and $\lambda(\theta) = \frac{1}{2}\|\theta\|_2^2$. Then by definition (A.1), $\lambda_n(\theta)$ and $\lambda(\theta)$ are both convex functions on the set $B_n^0$. Furthermore, by definition we can write $r_n(\theta) = \lambda_n(\theta) - \lambda(\theta) - o(1)$. Since $\|\theta\|_2 < c_6\sqrt{s}$ for all $\theta \in B_n^0$, we have that for any $\theta_1, \theta_2 \in B_n^0$,

$|\lambda(\theta_1) - \lambda(\theta_2)| = \tfrac12\big|(\theta_1 + \theta_2)^T(\theta_1 - \theta_2)\big| \le \tfrac12\|\theta_1 + \theta_2\|_2\|\theta_1 - \theta_2\|_2 \le C\sqrt{s}\,\|\theta_1 - \theta_2\|_2.$

Thus the above result and (A.9) show that the conditions of Lemma 4 are satisfied. Then, for any compact set $K_s = \{\|\theta\|_2 \le c_4\sqrt{s}\} \subset B_n^0$ with some constant $0 < c_4 < c_6$,

(A.10)  $\sup_{\theta \in K_s} |r_n(\theta)| = o_p(1).$

Combining (A.2), (A.7) and (A.8), we can write

(A.11)  $G_n(\theta) = \frac{1}{2}\|\theta\|_2^2 + n\lambda_n \tilde d_0^T V_n\theta + W_n^T\theta + r_n(\theta) + o(1)$

(A.12)  $= \frac{1}{2}\|\theta - \eta_n\|_2^2 - \frac{1}{2}\|\eta_n\|_2^2 + r_n(\theta) + o(1),$

where $\eta_n = -(n\lambda_n V_n \tilde d_0 + W_n)$. By a classical weak convergence result, it is easy to see that $c^T(Z_n^TZ_n)^{-1/2}W_n \xrightarrow{D} N(0, \tau(1-\tau))$ for any $c \in \mathbb{R}^s$ satisfying $c^Tc = 1$. It follows immediately that

(A.13)  $c^T(Z_n^TZ_n)^{-1/2}\big(\eta_n + n\lambda_n V_n \tilde d_0\big) \xrightarrow{D} N(0, \tau(1-\tau)).$
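The passage from (A.11) to (A.12) is completing the square; spelled out for convenience (editorial):

```latex
% Write c = n\lambda_n V_n \tilde d_0 + W_n, so that \eta_n = -c. Then
\tfrac12\|\theta\|_2^2 + c^T\theta
  = \tfrac12\|\theta + c\|_2^2 - \tfrac12\|c\|_2^2
  = \tfrac12\|\theta - \eta_n\|_2^2 - \tfrac12\|\eta_n\|_2^2,
% so, up to r_n(\theta) + o(1), G_n is a quadratic minimized exactly at \theta = \eta_n.
```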

It remains to show that the minimizer $\hat\theta_n$ of $G_n(\theta)$ is close to $\eta_n$; that is, for any $\epsilon > 0$,

(A.14)  $P\big(\|\hat\theta_n - \eta_n\|_2 \ge \epsilon\big) \to 0.$

Theorem 3 will then follow from (A.13) and Slutsky's lemma.

We now proceed to prove (A.14). First, let $B_n^1$ be the ball with center $\eta_n$ and radius $\epsilon$. Since $c^T(Z_n^TZ_n)^{-1/2}W_n$ is asymptotically $N(0, \tau(1-\tau))$ for any $c \in \mathbb{R}^s$ with $c^Tc = 1$, and $nV_nV_n^T = (\frac{1}{n}S^THS)^{-1}$ has bounded eigenvalues, we can bound $\eta_n$ as

(A.15)  $\|\eta_n\|_2 \le \|W_n\|_2 + n\lambda_n\|V_n \tilde d_0\|_2 \le O_p(\sqrt{s}) + C\lambda_n\sqrt{n}\,\|\tilde d_0\|_2 = C\sqrt{s}\,O_p(1),$

where the last step is by the assumption $\lambda_n\sqrt{n}\,\|\tilde d_0\|_2 = O(\sqrt{s})$ of the theorem. Since $c_6$ in the definition of $B_n^0$ can be chosen much larger than $C$, it follows that the compact set $K_s = \{\|\theta\|_2 \le c_4\sqrt{s}\} \subset B_n^0$, with $c_4$ large enough, covers the ball $B_n^1$ with probability arbitrarily close to 1. Therefore, by (A.10),

(A.16)  $\sup_{\theta \in B_n^1} |r_n(\theta)| \le \sup_{\theta \in K_s} |r_n(\theta)| = o_p(1).$

Now we are ready to prove (A.14). Consider the behavior of $G_n(\theta)$ outside the ball $B_n^1$. Let $\theta = \eta_n + \kappa u \in \mathbb{R}^s$ be a vector outside $B_n^1$, where $u \in \mathbb{R}^s$ is a unit vector and $\kappa > \epsilon$ is a constant, with $\epsilon$ the radius of $B_n^1$. Define $\theta^*$ as the boundary point of $B_n^1$ that lies on the line segment connecting $\eta_n$ and $\theta$, so that $\theta^* = \eta_n + \epsilon u = (1 - \epsilon/\kappa)\eta_n + (\epsilon/\kappa)\theta$. By the convexity of $G_n$, (A.12) and (A.16),

$\frac{\epsilon}{\kappa}G_n(\theta) + \Big(1 - \frac{\epsilon}{\kappa}\Big)G_n(\eta_n) \ge G_n(\theta^*) \ge \frac{\epsilon^2}{2} + G_n(\eta_n) - 2\Delta_n,$

where $\Delta_n = \sup_{\theta \in B_n^1}|r_n(\theta)| + o(1) = o_p(1)$. Since $\epsilon < \kappa$, it follows that for large enough $n$,

(A.17)  $\inf_{\{\|\theta - \eta_n\|_2 > \epsilon\}} G_n(\theta) \ge G_n(\eta_n) + \frac{\kappa}{\epsilon}\Big[\frac{\epsilon^2}{2} - o_p(1)\Big] > G_n(\eta_n)$

with probability tending to 1. Since $\hat\theta_n$ minimizes $G_n(\theta)$, this establishes (A.14) and proves Theorem 3.
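The conclusion of Theorem 3 can be checked numerically in a toy setting. The sketch below is editorial, not the authors' code: it takes $s$ fixed, $\lambda_n = 0$ (so $\eta_n = -W_n$), i.i.d. standard normal errors (hence $\tau = 0.5$ and $f_i(0) = 1/\sqrt{2\pi}$), and uses statsmodels' QuantReg for the oracle median regression fit.

```python
# Editorial toy check of Theorem 3: the standardized oracle median-regression
# estimator c^T (Z_n^T Z_n)^{-1/2} V_n^{-1} (bhat - beta*) should have variance
# close to tau * (1 - tau), as in (A.13) with lambda_n = 0.
import numpy as np
from statsmodels.regression.quantile_regression import QuantReg

def inv_sqrt(M):
    """Symmetric inverse square root via an eigendecomposition."""
    w, U = np.linalg.eigh(M)
    return U @ np.diag(w ** -0.5) @ U.T

rng = np.random.default_rng(0)
n, s, tau, reps = 400, 3, 0.5, 500
beta_star = np.array([1.0, -2.0, 0.5])
S = rng.normal(size=(n, s))
f0 = 1.0 / np.sqrt(2.0 * np.pi)      # N(0,1) density at its median
Vn = inv_sqrt(f0 * (S.T @ S))        # V_n = (S^T H S)^{-1/2} with H = f0 * I_n
Zn = S @ Vn                          # then Z_n^T H Z_n = I_s
A = inv_sqrt(Zn.T @ Zn)              # (Z_n^T Z_n)^{-1/2}, as in (A.13)
c = np.ones(s) / np.sqrt(s)          # a fixed unit vector c

tstats = []
for _ in range(reps):
    y = S @ beta_star + rng.normal(size=n)   # errors with tau-quantile 0
    bhat = QuantReg(y, S).fit(q=tau).params  # oracle fit on the true support
    theta_hat = np.linalg.solve(Vn, bhat - beta_star)
    tstats.append(c @ A @ theta_hat)

# Empirical variance should be comparable to tau * (1 - tau) = 0.25:
print(np.var(tstats), tau * (1 - tau))
```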

A.2. Proof of Theorem 5. The proof of Theorem 5 follows from those of Theorems 3 and 4. We use $C$ to denote a generic constant throughout. By Theorem 4, with asymptotic probability one there exists a global minimizer $\hat\beta = (\hat\beta_1^T, 0^T)^T$ of $\tilde L_n(\beta)$ with $\|\hat\beta_1 - \beta_1^*\|_2 \le a_n$. Next we study the asymptotic normality of $\hat\beta_1$.

Following (A.1) in the proof of Theorem 3, define $\tilde G_n(\theta) = \tilde L_n(\beta_1^* + V_n\theta, 0) - \tilde L_n(\beta_1^*, 0)$, where $V_n$ and $\theta$ are the same as in the proof of Theorem 3. Then $\tilde\theta_n = V_n^{-1}(\hat\beta_1 - \beta_1^*)$ is a global minimizer of $\tilde G_n(\theta)$. The idea of the proof is to show that $G_n(\theta)$ in (A.1) and $\tilde G_n(\theta)$ are uniformly close to each other. Since $G_n(\theta)$ can be well approximated by a sequence of quadratic functions, $\tilde G_n(\theta)$ can be well approximated by the same sequence, and thus the minimizer of $\tilde G_n(\theta)$ enjoys the same asymptotic properties as that of $G_n(\theta)$.

We now proceed to prove that $G_n(\theta)$ and $\tilde G_n(\theta)$ are uniformly close. To this end, first note that for any $\beta_1$ with $\|\beta_1\|_2 < C\sqrt{s}$,

(A.18)  $|\tilde L_n(\beta_1, 0) - L_n(\beta_1, 0)| \le n\lambda_n\|\beta_1\|_2\,\|\hat d_0 - d_0\|_2 \le Cn\lambda_n\sqrt{s}\,\|\hat d_0 - d_0\|_2.$

For $1 \le j \le s$, by the mean-value theorem,

(A.19)  $d_j - \hat d_j = p_{\lambda_n}'(|\beta_j^*|) - p_{\lambda_n}'(|\hat\beta_j^{ini}|) = p_{\lambda_n}''(|\tilde\beta_j|)\big(|\beta_j^*| - |\hat\beta_j^{ini}|\big),$

where $|\tilde\beta_j|$ lies on the segment connecting $|\beta_j^*|$ and $|\hat\beta_j^{ini}|$. By Condition 5 and the triangle inequality, with asymptotic probability one,

$|\tilde\beta_j| \ge |\beta_j^*| - \big|\hat\beta_j^{ini} - \beta_j^*\big| > |\beta_j^*| - C_2\sqrt{s(\log p)/n} > \tfrac12\min_{1\le j\le s}|\beta_j^*|.$

This together with Condition 6 ensures that $p_{\lambda_n}''(|\tilde\beta_j|) = o_p\big(s^{-1}\lambda_n^{-1}(n\log p)^{-1/2}\big)$. Thus, in view of (A.19),

$\|\hat d_0 - d_0\|_2 \le o_p\big(s^{-1}\lambda_n^{-1}(n\log p)^{-1/2}\big)\,\|\hat\beta_1^{ini} - \beta_1^*\|_2 = o_p\big(s^{-1/2}\lambda_n^{-1}n^{-1}\big).$

Since $\|\theta\|_2 \le C\sqrt{s}$ ensures $\|\beta_1\|_2 \le C\sqrt{s}$ for $\beta_1 = \beta_1^* + V_n\theta$, the above inequality combined with (A.18) entails that

$\sup_{\theta \in B_n^0} |\tilde G_n(\theta) - G_n(\theta)| \le 2\sup_{\|\beta_1\|_2 \le C\sqrt{s}} |\tilde L_n(\beta_1, 0) - L_n(\beta_1, 0)| = o_p(1),$

where $B_n^0$ is defined in the proof of Theorem 3.
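For completeness, the rate bookkeeping behind the last two displays, assuming (as in the proof) that the initial estimator satisfies $\|\hat\beta_1^{ini} - \beta_1^*\|_2 = O_p(\sqrt{s\log p/n})$ (editorial):

```latex
\|\hat d_0 - d_0\|_2
  \le o_p\bigl(s^{-1}\lambda_n^{-1}(n\log p)^{-1/2}\bigr)\cdot O_p\bigl(\sqrt{s\log p/n}\bigr)
  = o_p\bigl(s^{-1/2}\lambda_n^{-1}n^{-1}\bigr),
% so the penalty discrepancy in (A.18) satisfies
C n\lambda_n\sqrt{s}\,\|\hat d_0 - d_0\|_2
  = C n\lambda_n\sqrt{s}\cdot o_p\bigl(s^{-1/2}\lambda_n^{-1}n^{-1}\bigr)
  = o_p(1).
```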

Therefore, by the above result and (A.17), for any $\epsilon > 0$ arbitrarily small,

$\inf_{\{\|\theta - \eta_n\|_2 > \epsilon\}} \tilde G_n(\theta) \ge \inf_{\{\|\theta - \eta_n\|_2 > \epsilon\}} G_n(\theta) - \sup_{\theta \in B_n^0}|\tilde G_n(\theta) - G_n(\theta)| \ge G_n(\eta_n) + \frac{\kappa}{\epsilon}\Big[\frac{\epsilon^2}{2} - o_p(1)\Big] - o_p(1) \ge \tilde G_n(\eta_n) + \frac{\kappa}{\epsilon}\Big[\frac{\epsilon^2}{2} - o_p(1)\Big] - o_p(1).$

Then it follows immediately that the minimizer satisfies $\|\tilde\theta_n - \eta_n\|_2 \le \epsilon$ with asymptotic probability one. Thus $\tilde\theta_n - \eta_n = o_p(1)$, which completes the proof of Theorem 5.

A.3. Lemmas.

Lemma 3. Assume that the conditions of Theorem 3 hold. Let $R_{n,i}(\theta) = \rho_\tau(\varepsilon_i - Z_{n,i}^T\theta) - \rho_\tau(\varepsilon_i) + \rho_\tau'(\varepsilon_i)Z_{n,i}^T\theta$ and $R_n(\theta) = \sum_{i=1}^n R_{n,i}(\theta)$. Then for any $\epsilon > 0$,

$P\big(|R_n(\theta) - E[R_n(\theta)]| \ge \epsilon\big) \le \exp(-C\epsilon b_n s^2 \log s),$

where $b_n$ is some diverging sequence such that $b_n s^{7/2}\log s \max_i \|Z_{n,i}\|_2 \to 0$, and $C > 0$ is some constant.

Proof. Let $\xi_i = R_{n,i}(\theta) - E[R_{n,i}(\theta)]$, so that $R_n(\theta) - E[R_n(\theta)] = \sum_{i=1}^n \xi_i$. Since the $R_{n,i}(\theta)$'s are independent, by Markov's inequality we obtain that for any $\epsilon > 0$ and $t > 0$,

(A.20)  $P\big(R_n(\theta) - E[R_n(\theta)] \ge \epsilon\big) \le e^{-t\epsilon} E\Big[\exp\Big(t\sum_{i=1}^n \xi_i\Big)\Big] = \exp\Big(-t\epsilon - t\sum_{i=1}^n E[R_{n,i}(\theta)]\Big)\prod_{i=1}^n E\big[\exp(tR_{n,i}(\theta))\big].$

We next study $E[R_{n,i}(\theta)]$ and $E[\exp(tR_{n,i}(\theta))]$ in (A.20). Using an argument similar to that for (A.4), we can prove that

$E[R_{n,i}(\theta)] = E\big[\rho_\tau(\varepsilon_i - Z_{n,i}^T\theta) - \rho_\tau(\varepsilon_i)\big] = \tfrac12 f_i(0)(Z_{n,i}^T\theta)^2 + O\big(|Z_{n,i}^T\theta|^3\big),$

where the $O(\cdot)$ term is uniform over all $i$. Thus it follows from the definition of $Z_{n,i}$ that

(A.21)  $t\sum_{i=1}^n E[R_{n,i}(\theta)] = \frac{t}{2}\|\theta\|_2^2 + O\Big(t\sum_{i=1}^n |Z_{n,i}^T\theta|^3\Big).$
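Editorial note: the quadratic term in (A.21) is cancelled by the matching term in the moment generating function bound (A.22) derived next, so that only the cubic remainders survive in (A.23):

```latex
% Exponent after combining (A.20), (A.21) and (A.22):
-t\epsilon
  - \tfrac{t}{2}\|\theta\|_2^2   % from -t \sum_i E[R_{n,i}(\theta)], i.e. (A.21)
  + \tfrac{t}{2}\|\theta\|_2^2   % from \prod_i E[\exp(t R_{n,i}(\theta))], i.e. (A.22)
  + O\Bigl(t^2\sum_{i=1}^n |Z_{n,i}^T\theta|^3\Bigr)
  = -t\epsilon + O\Bigl(t^2\sum_{i=1}^n |Z_{n,i}^T\theta|^3\Bigr).
```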

Now we consider $E[\exp(tR_{n,i}(\theta))]$. If $Z_{n,i}^T\theta > 0$, then by definition

$R_{n,i}(\theta) = \big(Z_{n,i}^T\theta - \varepsilon_i\big)\,\mathbf{1}\{0 \le \varepsilon_i \le Z_{n,i}^T\theta\},$

so that $0 \le R_{n,i}(\theta) \le Z_{n,i}^T\theta$ and, for the choice of $t$ made below, $t|Z_{n,i}^T\theta| = o(1)$. By Condition 1 and a Taylor expansion of the exponential, it follows that

$E\big[\exp(tR_{n,i}(\theta))\big] \le 1 + tE[R_{n,i}(\theta)] + Ct^2 E\big[R_{n,i}(\theta)^2\big] \le 1 + \tfrac12 f_i(0)t(Z_{n,i}^T\theta)^2 + O\big(t^2|Z_{n,i}^T\theta|^3\big).$

When $Z_{n,i}^T\theta < 0$, we get the same bound using a similar argument. Since $\prod_{i=1}^n(1 + x_i) \le \exp(\sum_{i=1}^n x_i)$ for $x_i > 0$, in view of (A.5) and the above inequality we obtain

(A.22)  $\prod_{i=1}^n E\big[\exp(tR_{n,i}(\theta))\big] \le \exp\Big(\frac{t}{2}\|\theta\|_2^2 + O\Big(t^2\sum_{i=1}^n |Z_{n,i}^T\theta|^3\Big)\Big).$

Substituting (A.21) and (A.22) into (A.20) gives

(A.23)  $P\big(R_n(\theta) - E[R_n(\theta)] \ge \epsilon\big) \le \exp\Big(-t\epsilon + O\Big(t^2\sum_{i=1}^n |Z_{n,i}^T\theta|^3\Big)\Big).$

Choosing $t = 2s^2\log s\, b_n$ with $b_n$ such that $b_n s^{7/2}\log s\max_i\|Z_{n,i}\|_2 \to 0$, and using an idea similar to that for (A.5), we obtain

$t\sum_{i=1}^n |Z_{n,i}^T\theta|^3 \le Ct\max_i|Z_{n,i}^T\theta| \sum_{i=1}^n f_i(0)(Z_{n,i}^T\theta)^2 \le Cts^{3/2}\max_i\|Z_{n,i}\|_2 \to 0.$

Plugging this into (A.23) yields

$P\big(R_n(\theta) - E[R_n(\theta)] \ge \epsilon\big) \le \exp(-C\epsilon b_n s^2\log s).$

Repeating the same argument for $P\big(R_n(\theta) - E[R_n(\theta)] \le -\epsilon\big)$ completes the proof.

Lemma 4. Let $\lambda(\theta)$ be a positive function defined on the convex, open set $\Theta_s = \{\theta \in \mathbb{R}^s : \|\theta\|_2 < c_6\sqrt{s}\}$, and let $\{\lambda_n(\theta) : \theta \in \Theta_s\}$ be a sequence of random convex functions defined on $\Theta_s$, where $c_6 > 0$ is some constant. Suppose that there exists some diverging sequence $b_n$ such that for every $\theta \in \Theta_s$ the following holds for all $\epsilon > 0$:

$P\big(|\lambda_n(\theta) - \lambda(\theta)| \ge \epsilon\big) \le c_9\exp(-c_7 s^2\log s\, b_n\epsilon),$

where $c_7, c_9$ are two positive constants. Let $K_s$ be a compact set in $\mathbb{R}^s$ of the form $K_s = \{\|\theta\|_2 \le c_4\sqrt{s}\} \subset \Theta_s$, where $0 < c_4 < c_6$ is some constant. If, for some constant $c_8 > 0$, $|\lambda(\theta_1) - \lambda(\theta_2)| \le c_8\sqrt{s}\,\|\theta_1 - \theta_2\|_2$ for any $\theta_1, \theta_2 \in \Theta_s$, then

$\sup_{\theta \in K_s} |\lambda_n(\theta) - \lambda(\theta)| = o_p(1).$

Proof. The proof is an extension of the convexity lemma in Pollard (1991). The basic idea is to show that $K_s$ can be covered by a controlled number of cubes, and that $\lambda_n(\theta)$ and $\lambda(\theta)$ are uniformly close over the set of vertices of these cubes. Within each cube, the values of both $\lambda_n(\theta)$ and $\lambda(\theta)$ do not change significantly, so $\lambda_n(\theta)$ and $\lambda(\theta)$ are uniformly close over $K_s$. In this proof we use $C$ to denote a generic positive constant.

We proceed to prove the lemma. Since $|\lambda(\theta_1) - \lambda(\theta_2)| \le c_8\sqrt{s}\,\|\theta_1 - \theta_2\|_2$ for any $\theta_1, \theta_2 \in \Theta_s$, it follows that for a fixed $\epsilon > 0$, the function $\lambda(\theta)$ varies by at most $\epsilon/s$ over each cube of side $\delta = \epsilon/(c_8 s^2)$ that intersects $K_s$. Note that $K_s$ can be covered by fewer than $(2c_4\sqrt{s}/\delta)^s = (2c_4c_8 s^{5/2}/\epsilon)^s$ such cubes, so in total there are fewer than $2^s(2c_4c_8 s^{5/2}/\epsilon)^s$ vertices. Denote by $V_s$ the set of all such vertices whose cubes intersect $K_s$. Since $c_6$ can be chosen much larger than $c_4$, and the edge of each cube, $\delta = \epsilon/(c_8 s^2)$, is small and decreases with $s$, all vertices in $V_s$ fall in $\Theta_s$ as well. Thus, by the pointwise convergence assumption of the lemma and the union bound, for any $\epsilon > 0$, as $b_n \to \infty$,

$P\Big(\max_{\theta \in V_s} |\lambda_n(\theta) - \lambda(\theta)| \ge \epsilon/s\Big) \le c_9\exp\big(Cs\log(Cs) - c_7 s\log s\, b_n\epsilon\big) \to 0.$

Therefore,

(A.24)  $M_n \equiv \max_{\theta \in V_s} |\lambda_n(\theta) - \lambda(\theta)| = o_p(\epsilon/s).$

Any $\theta \in K_s$ falls into one of the cubes and can thus be written as a convex combination of that cube's vertices $\{\theta_i\} \subset V_s$; that is, $\theta = \sum_i \alpha_i\theta_i$ with $\alpha_i \in [0, 1]$ and $\sum_i \alpha_i = 1$. Then, by the convexity of $\lambda_n(\theta)$ and (A.24),

$\lambda_n(\theta) \le \sum_i \alpha_i\lambda_n(\theta_i) = \sum_i \alpha_i\big\{\lambda_n(\theta_i) - \lambda(\theta_i) + \lambda(\theta_i) - \lambda(\theta) + \lambda(\theta)\big\} \le \max_i|\lambda_n(\theta_i) - \lambda(\theta_i)| + \epsilon/s + \lambda(\theta) \le M_n + \epsilon/s + \lambda(\theta).$

Thus,

(A.25)  $P\Big(\sup_{\theta \in K_s}\big(\lambda_n(\theta) - \lambda(\theta)\big) > 2\epsilon/s\Big) \to 0.$
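The vertex count behind the union bound leading to (A.24), spelled out (editorial):

```latex
% With cube side \delta = \epsilon/(c_8 s^2) and K_s of radius c_4\sqrt{s},
|V_s| \le 2^s\Bigl(\tfrac{2c_4\sqrt{s}}{\delta}\Bigr)^{s}
       = 2^s\Bigl(\tfrac{2c_4c_8 s^{5/2}}{\epsilon}\Bigr)^{s}
       = \exp\bigl(O(s\log(Cs))\bigr) \quad \text{for fixed } \epsilon,
% so the pointwise tail bound applied at accuracy \epsilon/s, plus a union bound, give
P\Bigl(\max_{\theta\in V_s}|\lambda_n(\theta)-\lambda(\theta)| \ge \epsilon/s\Bigr)
  \le |V_s|\,c_9\exp\bigl(-c_7 s\log s\, b_n \epsilon\bigr) \to 0
  \quad\text{as } b_n \to \infty.
```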

Next we prove the lower bound. Each $\theta \in K_s$ lies within a $\delta$-cube with a vertex $\theta_0 \in V_s$; thus $\theta = \theta_0 + \sum_{i=1}^s \delta_i e_i$ with $|\delta_i| < \delta$, where $e_1, \ldots, e_s$ denote the $s$ coordinate directions. Without loss of generality, suppose $0 \le \delta_i < \delta$ for each $i$. Define $\theta_i$ to be the vertex $\theta_0 - \delta e_i \in V_s$. Then $\theta_0$ can be written as a convex combination of $\theta$ and the $\theta_i$:

$\theta_0 = \frac{\delta}{\delta + \sum_j \delta_j}\,\theta + \sum_i \frac{\delta_i}{\delta + \sum_j \delta_j}\,\theta_i.$

Denote $a = \delta/(\delta + \sum_j \delta_j)$ and $a_i = \delta_i/(\delta + \sum_j \delta_j)$. Since $0 \le \delta_j < \delta$, the coefficient of $\theta$ can be bounded as $a \ge \frac{1}{1+s}$. Since $\lambda_n$ is convex, by (A.24) we obtain

$a\lambda_n(\theta) \ge \lambda_n(\theta_0) - \sum_i a_i\lambda_n(\theta_i) \ge \lambda(\theta_0) - \sum_i a_i\lambda(\theta_i) - 2M_n \ge \lambda(\theta) - \frac{\epsilon}{s} - \sum_i a_i\Big(\lambda(\theta) + \frac{\epsilon}{s}\Big) - 2M_n \ge a\lambda(\theta) - \frac{2\epsilon}{s} - 2M_n.$

Since $a \ge (1+s)^{-1}$, it follows from the above inequality and (A.24) that

$\lambda_n(\theta) \ge \lambda(\theta) - \frac{2\epsilon(s+1)}{s} - 2(s+1)M_n = \lambda(\theta) - \frac{2\epsilon(s+1)}{s} - o_p(\epsilon).$

Therefore,

(A.26)  $P\Big(\inf_{\theta \in K_s}\big(\lambda_n(\theta) - \lambda(\theta)\big) < -\frac{3(1+s)\epsilon}{s}\Big) \to 0.$

Since $\epsilon$ can be chosen arbitrarily small, the uniform convergence result follows by combining (A.25) with (A.26).

A.4. Proof of Proposition 1. Since $s$ is finite, the sum of the probabilities below is of the same order as their maximum. The distribution result, equation (3.1) in the main paper, entails that

$P\Big(\big\|\tfrac{1}{n}S^T\varepsilon\big\|_\infty > z\Big) = P\Big(\max_{1\le j\le s}|x_j^T\varepsilon| > nz\Big) \le \sum_{j=1}^s P\big(|x_j^T\varepsilon| > nz\big) \le \sum_{j=1}^s \|x_j\|_\alpha^\alpha\, c_\alpha (Cnz)^{-\alpha},$

where $C > 0$ is some generic constant. For any sequence $b_n$ with $b_n \to \infty$, letting $z = n^{-1}b_n\tilde b_n$ with $\tilde b_n = \big(\sum_{j=1}^s \|x_j\|_\alpha^\alpha\big)^{1/\alpha}$, we have

$P\big(\|n^{-1}S^T\varepsilon\|_\infty > n^{-1}b_n\tilde b_n\big) \to 0.$

In other words, $\|n^{-1}S^T\varepsilon\|_\infty = O_p(n^{-1}\tilde b_n)$. Hence, for the condition

$\hat\beta_M + \lambda_n\,\mathrm{sgn}(\hat\beta_M) = \beta_M + n^{-1}S^T\varepsilon$

to have a solution $\hat\beta_M = (\hat\beta_1, \ldots, \hat\beta_s)^T$ with $\hat\beta_j > 0$ for all $j = 1, \ldots, s$, the necessary conditions are $n\beta_0\tilde b_n^{-1} \to \infty$ and $\lambda_n < \beta_0$. Combining these conditions, we have

(A.27)  $\lambda_n < \beta_0 = n^{-1}b_n\tilde b_n,$ with some diverging sequence $b_n$.

We next check whether the condition $\|Q^T\varepsilon\|_\infty \le n\lambda_n$ is satisfied. Combining (3.1) and (A.27) ensures that for $j > s$,

$P\big(|x_j^T\varepsilon| > n\lambda_n\big) \ge P\big(|x_j^T\varepsilon| > b_n\tilde b_n\big) \ge C\|x_j\|_\alpha^\alpha (b_n\tilde b_n)^{-\alpha}.$

Since we assumed $\mathrm{supp}(x_j) \cap \mathrm{supp}(x_i) = \emptyset$ for any $i \ne j \in \{s+1, \ldots, p\}$, it follows that $Q^T\varepsilon$ is a vector of independent random variables with components $x_j^T\varepsilon$. Then,

$P\big(\|Q^T\varepsilon\|_\infty > n\lambda_n\big) = 1 - P\big(\|Q^T\varepsilon\|_\infty \le n\lambda_n\big) = 1 - \prod_{j>s}\big(1 - P(|x_j^T\varepsilon| > n\lambda_n)\big) \ge 1 - \exp\Big(\sum_{j>s}\log\big(1 - C\|x_j\|_\alpha^\alpha(b_n\tilde b_n)^{-\alpha}\big)\Big).$

Since $\log(1+x) \le x$ for all $x > -1$, we have

$\sum_{j>s}\log\big(1 - C\|x_j\|_\alpha^\alpha(b_n\tilde b_n)^{-\alpha}\big) \le -C\sum_{j>s}\|x_j\|_\alpha^\alpha(b_n\tilde b_n)^{-\alpha}.$

Combining the above two inequalities, if $\sum_{j>s}\|x_j\|_\alpha^\alpha(b_n\tilde b_n)^{-\alpha} \ge c_0/C$ for some $c_0 \in (0, \infty]$, then $P\big(\|Q^T\varepsilon\|_\infty > n\lambda_n\big) \ge 1 - e^{-c_0}$. That is, with probability at least $1 - e^{-c_0}$, the condition $\|Q^T\varepsilon\|_\infty \le n\lambda_n$ fails to hold, and Lasso does not have the model selection oracle property. In fact, if $b_n \le O\big(\{\sum_{j>s}\|x_j\|_\alpha^\alpha\}^{1/\alpha}\tilde b_n^{-1}\big)$, or equivalently by (A.27),

$\beta_0 \le O\Big(n^{-1}\Big\{\sum_{j>s}\|x_j\|_\alpha^\alpha\Big\}^{1/\alpha}\Big),$

then $c_0/C \in (0, \infty]$. In other words, unless we have

(A.28)  $n\beta_0 \gg \Big\{\sum_{j>s}\|x_j\|_\alpha^\alpha\Big\}^{1/\alpha},$

Lasso is not able to recover the true model and the correct signs. Finally, since $\|x_j\|_2^2 = n$, $|\mathrm{supp}(x_j)| = O(n^{1/2})$ and $\max_{i,j}|x_{ij}| = O(n^{1/4})$, the nonzero components of $Q$ are all of the same order $O(n^{1/4})$. Consequently, the $\|x_j\|_\alpha^\alpha$ must all be of the same order $O(n^{(2+\alpha)/4})$. Then $n^{-1}\{\sum_{j>s}\|x_j\|_\alpha^\alpha\}^{1/\alpha} = O(n^{1/\alpha - 3/4})$, and the condition (A.28) above becomes $n^{3/4 - 1/\alpha}\beta_0 \to \infty$.

APPENDIX B: REAL DATA EXAMPLE

In this section, we use expression quantitative trait loci (eQTL) mapping to illustrate the performance of R-Lasso and AR-Lasso. eQTL studies aim at finding the variations in genotype in a certain region of a chromosome that are associated with gene expression levels. In this study, we conducted a cis-eQTL mapping for the gene CHRNA6 (cholinergic receptor, nicotinic, alpha 6). CHRNA6 is located on chromosome 8, at cytogenetic location 8p11. CHRNA6 is thought to be related to the activation of dopamine-releasing neurons by nicotine (Thorgeirsson et al., 2010), and it has therefore been the subject of many nicotine addiction studies on people of western European heritage (Saccone et al., 2009; Thorgeirsson et al., 2010).

The data are from 90 individuals in the international HapMap project (The International HapMap Consortium, 2005), all of western European ancestry, and are available at ftp://ftp.sanger.ac.uk/pub/genevar/. The normalized expression data were generated with an Illumina Sentrix Human-6 Expression Bead Chip (Stranger et al., 2007). The SNPs under investigation are located within 1 megabase upstream and downstream of the transcription start site (TSS) of CHRNA6; in this range, there were 554 SNPs. The additive coding for SNPs was employed, with 0, 1 and 2 representing the major, heterozygous, and minor populations, respectively. We further screened the SNPs using a variant of the sure independence screening (SIS) method of Fan and Lv (2008), keeping the 100 SNPs with the highest marginal correlation with the gene expression levels. Finally, we applied Lasso, SCAD, R-Lasso and AR-Lasso to the screened variables. The quantile parameter $\tau$ was set to 0.5 for R-Lasso and AR-Lasso, corresponding to median regression. The tuning parameter for all methods was chosen using five-fold cross-validation. The selected SNPs, as well as their regression coefficients and distances from the transcription start site, are given in Table 1.
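The screening-plus-regression pipeline described above can be sketched in code. The sketch below is editorial only: geno (an n x 554 additive-coded SNP matrix) and expr (the n normalized CHRNA6 expression values) are hypothetical stand-ins for the HapMap data, scikit-learn's QuantileRegressor (L1-penalized median regression) plays the role of R-Lasso, and the SCAD and AR-Lasso fits with data-driven weights are omitted.

```python
# Editorial sketch of the eQTL pipeline, not the authors' implementation.
# Assumed inputs: geno, shape (n, 554), additive SNP coding in {0, 1, 2};
#                 expr, shape (n,), normalized CHRNA6 expression levels.
import numpy as np
from sklearn.linear_model import LassoCV, QuantileRegressor
from sklearn.model_selection import GridSearchCV

def screen_top_k(X, y, k=100):
    """SIS-style screening: keep the k SNPs with the largest |marginal correlation|."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    corr = np.abs(Xc.T @ yc) / denom
    return np.argsort(corr)[::-1][:k]

def fit_models(geno, expr):
    keep = screen_top_k(geno, expr, k=100)      # keep the top 100 screened SNPs
    X = geno[:, keep]
    lasso = LassoCV(cv=5).fit(X, expr)          # Lasso with five-fold CV
    rlasso = GridSearchCV(                      # L1-penalized median regression
        QuantileRegressor(quantile=0.5, solver="highs"),
        {"alpha": np.logspace(-3, 0, 10)},
        cv=5,
    ).fit(X, expr)
    return keep, lasso.coef_, rlasso.best_estimator_.coef_
```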

Table 1: Selected SNPs for the eQTL study. Columns: SNP identifier; regression coefficients under Lasso, SCAD, R-Lasso and AR-Lasso; distance from the TSS in kb. [The individual SNP identifiers and coefficient values are not recoverable from this transcription.]

It is seen that the robust regression methods R-Lasso and AR-Lasso found more of the variables to be significant: R-Lasso and AR-Lasso selected 21 and 15 variables, respectively, whereas Lasso and SCAD found only 15 and 5 variables to be significant. Only 4 SNPs were included in all of the models. Furthermore, none of these SNPs was covered in the previous study of Saccone et al. (2009). We speculate that the difference arises because the previous studies focused on SNPs only 50 kb upstream and downstream of the gene, and because those studies did not consider multiple regression, which makes significant use of the correlation structure in the data. In addition, among all the SNPs

that are chosen by these four methods, only one appears in the paper by Saccone et al. (2009), and only R-Lasso and AR-Lasso found this SNP to be important. As was observed in the finite-sample simulations, SCAD and AR-Lasso consistently chose a smaller set of variables than their counterparts. Furthermore, almost two thirds of the selected SNPs lie to the left of the transcription site. We would also like to note that the residuals from the fitted regressions had very heavy right tails, which suggests that, at least for this particular eQTL study, it is much more reasonable to use methods based on quantile regression. The QQ-plots of the residuals from the different regression methods are shown in Figure 1.

[Fig 1: QQ-plots of residuals for different methods in the eQTL study: (a) residuals from Lasso; (b) residuals from SCAD; (c) residuals from R-Lasso; (d) residuals from AR-Lasso. Each panel plots the quantiles of the residuals against standard normal quantiles.]
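Plots like those in Figure 1 can be produced generically as follows. This is an editorial sketch; residuals stands for the residual vector from any of the four fits.

```python
# Editorial sketch: normal QQ-plot of regression residuals, as in Figure 1.
import matplotlib.pyplot as plt
from scipy import stats

def qq_plot(residuals, title):
    """Plot sample quantiles of `residuals` against standard normal quantiles."""
    fig, ax = plt.subplots()
    stats.probplot(residuals, dist="norm", plot=ax)
    ax.set_title(title)           # e.g. "Residuals from AR-Lasso"
    fig.tight_layout()
    return fig
```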

REFERENCES

Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space (with discussion). J. Roy. Statist. Soc. Ser. B 70, 849-911.

The International HapMap Consortium (2005). A haplotype map of the human genome. Nature 437, 1299-1320.

Nolan, J. P. Stable Distributions: Models for Heavy Tailed Data. Birkhäuser. In progress; Chapter 1 online at academic2.american.edu/~jpnolan.

Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory 7, 186-199.

Saccone, N. L., Saccone, S. F., Hinrichs, A. L., Stitzel, J. A., Duan, W., Pergadia, M. L., Agrawal, A., Breslau, N., Grucza, R. A., Hatsukami, D., Johnson, E. O., Madden, P. A. F., Swan, G. E., Wang, J. C., Goate, A. M., Rice, J. P. and Bierut, L. J. (2009). Multiple distinct risk loci for nicotine dependence identified by dense coverage of the complete family of nicotinic receptor subunit (CHRN) genes. Am. J. Med. Genet. Part B 150B.

Stranger, B., Forrest, S., Dunning, M., Ingle, C., Beazley, C., Thorne, N., Redon, R., Bird, C., de Grassi, A., Lee, C., Tyler-Smith, C., Carter, N., Scherer, S., Tavaré, S., Deloukas, P., Hurles, M. and Dermitzakis, E. (2007). Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315, 848-853.

Thorgeirsson, T. et al. (2010). Sequence variants at CHRNB3-CHRNA6 and CYP2A6 affect smoking behavior. Nat. Genet. 42, 448-453.

J. Fan
Department of Operations Research and Financial Engineering
Princeton University
Princeton, New Jersey
USA
E-mail: jqfan@princeton.edu

Y. Fan
Data Sciences and Operations Department
Marshall School of Business
University of Southern California
Los Angeles, California
USA
E-mail: fanyingy@marshall.usc.edu

E. Barut
IBM T.J. Watson Research Center
Yorktown Heights, New York
USA
E-mail: abarut@us.ibm.com
