SUPPLEMENTARY MATERIAL FOR: ADAPTIVE ROBUST VARIABLE SELECTION
By Jianqing Fan, Yingying Fan and Emre Barut

Princeton University, University of Southern California and IBM T.J. Watson Research Center

APPENDIX A: ADDITIONAL PROOFS

A.1. Proof of Theorem 3. This proof is motivated by Pollard (1991). In the proof we use $C > 0$ to denote a generic positive constant. Let $\theta = V_n^{-1}(\beta_1 - \beta_1^*)$, so that $\beta_1 = \beta_1^* + V_n\theta$. Letting $G_n(\theta) = L_n(\beta_1^* + V_n\theta, \mathbf{0}) - L_n(\beta_1^*, \mathbf{0})$, we have that

(A.1)  $G_n(\theta) = \|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1 + n\lambda_n\left(\|d_0 \circ (\beta_1^* + V_n\theta)\|_1 - \|d_0 \circ \beta_1^*\|_1\right)$,

where we have used the shorthand notation $\|\rho_\tau(u)\|_1 = \sum_{i=1}^n \rho_\tau(u_i)$ for any vector $u = (u_1, \ldots, u_n)^T$, and $\circ$ denotes the componentwise product. Since $L_n(\beta_1, \mathbf{0})$ is minimized at $\beta_1 = \widehat\beta{}_1^{\,o}$, it follows that $G_n(\theta)$ is minimized at $\widehat\theta_n = V_n^{-1}(\widehat\beta{}_1^{\,o} - \beta_1^*)$. We consider $\theta$ over the convex open set $B_n^0 = \{\theta \in \mathbb{R}^s : \|\theta\|_2 < c_6\sqrt{s}\}$, with some constant $c_6 > 0$ independent of $s$. The idea of the proof is to approximate the stochastic function $G_n(\theta)$ by a quadratic function whose minimizer is shown to possess asymptotic normality. Since $G_n(\theta)$ and the quadratic approximation are close, the minimizer of $G_n(\theta)$ enjoys the same asymptotic normality.

Now, we proceed to prove Theorem 3. Decompose $G_n(\theta)$ into its mean and centered stochastic component:

(A.2)  $G_n(\theta) = Q_n(\theta) + T_n(\theta)$,

where $Q_n(\theta) = E[G_n(\theta)]$ and

(A.3)  $T_n(\theta) = \|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1 - E\left[\|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1\right]$.
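The decomposition (A.2)–(A.3) and the expansions below can be organized around the following elementary identity for the check loss (a standard fact, recorded here as a reader's aid; it is not quoted verbatim in the original argument):

```latex
% For any u, v in R, with rho_tau(u) = u(tau - 1{u < 0}):
\[
\rho_\tau(u - v) - \rho_\tau(u)
  \;=\; -\,v\,\bigl(\tau - 1\{u < 0\}\bigr)
        \;+\; \int_0^v \bigl(1\{u \le z\} - 1\{u \le 0\}\bigr)\,dz .
\]
```

Applied with $u = \varepsilon_i$ and $v = Z_{n,i}^T\theta$ and summed over $i$, the first term is the linear part whose centered sum drives $T_n(\theta)$, while the expectation of the integral term, $\int_0^v (F_i(z) - F_i(0))\,dz \approx \tfrac12 f_i(0)v^2$, produces the quadratic part of $Q_n(\theta)$.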
We first deal with the mean component $Q_n(\theta)$. Since $\|\theta\|_2 < c_6\sqrt{s}$ over the set $B_n^0$, it follows from the Cauchy–Schwarz inequality and the assumption of the theorem that

$\|H^{1/2}Z_n\theta\|_\infty \le \|\theta\|_2 \max_i \|H^{1/2}Z_{n,i}\|_2 = o\left((s^3\log s)^{-1}\right)$.

Then, since each $f_i$ is bounded in a small neighborhood of 0, by using a similar argument to that for equation (7.3) of the main paper and noting that $\sum_{i=1}^n f_i(0)(Z_{n,i}^T\theta)^2 = \theta^T Z_n^T H Z_n\theta = \|\theta\|_2^2$, we can show that

(A.4)  $E\left[\|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1\right] = \frac{1}{2}\|\theta\|_2^2 + O\left(\sum_{i=1}^n f_i(0)|Z_{n,i}^T\theta|^3\right)$.

Furthermore, since $\sum_{i=1}^n f_i(0)(Z_{n,i}^T\theta)^2 = \|\theta\|_2^2 < c_6^2 s$, it follows that

(A.5)  $\sum_{i=1}^n f_i(0)|Z_{n,i}^T\theta|^3 \le C\|H^{1/2}Z_n\theta\|_\infty \sum_{i=1}^n f_i(0)(Z_{n,i}^T\theta)^2 = o\left((s^2\log s)^{-1}\right)$.

Next, we deal with the penalty term in the expected value $Q_n(\theta)$. Since $\frac{1}{n}S^T H S$ has bounded eigenvalues by Condition 2, it follows from the assumption of the theorem that, for any $\theta \in B_n^0$,

$\|V_n\theta\|_\infty \le \|V_n\theta\|_2 \le Cn^{-1/2}\|\theta\|_2 = o\left(\min_{1\le j\le s}|\beta_j^*|\right)$.

Hence $\mathrm{sgn}(\beta_1^* + V_n\theta) = \mathrm{sgn}(\beta_1^*)$ and

(A.6)  $\|d_0 \circ (\beta_1^* + V_n\theta)\|_1 - \|d_0 \circ \beta_1^*\|_1 = \widetilde d_0^{\,T} V_n\theta$,

where $\widetilde d_0$ is an $s$-vector with $j$th component $d_j\,\mathrm{sgn}(\beta_j^*)$. Combining (A.4)–(A.6) yields

(A.7)  $Q_n(\theta) = \frac{1}{2}\|\theta\|_2^2 + n\lambda_n \widetilde d_0^{\,T} V_n\theta + o(1)$,

where the $o(1)$ term is uniform over all $\theta \in B_n^0$.

We now deal with the stochastic part $T_n(\theta)$. Define $D = (\rho_\tau'(\varepsilon_1), \ldots, \rho_\tau'(\varepsilon_n))^T$ with $\rho_\tau'(u) = \tau - 1\{u < 0\}$, $W_n = -Z_n^T D$, and

$R_n(\theta) = \|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1 - W_n^T\theta$.
Then $E[W_n^T\theta] = 0$ and

(A.8)  $T_n(\theta) = W_n^T\theta + r_n(\theta)$,  where  $r_n(\theta) = R_n(\theta) - E[R_n(\theta)]$.

Here, $W_n^T\theta$ can be regarded as the first-order approximation of $\|\rho_\tau(\varepsilon - Z_n\theta)\|_1 - \|\rho_\tau(\varepsilon)\|_1$. We next show that $r_n(\theta)$ is uniformly small. By Lemma 3, there exists a diverging sequence $b_n$ such that for any $\epsilon > 0$,

(A.9)  $P\left(|r_n(\theta)| \ge \epsilon\right) \le \exp\left(-C\epsilon b_n s^2\log s\right)$.

Define $\lambda_n(\theta) = G_n(\theta) - n\lambda_n\widetilde d_0^{\,T} V_n\theta - W_n^T\theta$ and $\lambda(\theta) = \frac{1}{2}\|\theta\|_2^2$. Then by definition (A.1), $\lambda_n(\theta)$ and $\lambda(\theta)$ are both convex functions on the set $B_n^0$. Furthermore, by definition we can write $r_n(\theta)$ as $r_n(\theta) = \lambda_n(\theta) - \lambda(\theta) + o(1)$. Since $\|\theta\|_2 < c_6\sqrt{s}$ for all $\theta \in B_n^0$, by Condition 1 we have that for any $\theta_1, \theta_2 \in B_n^0$,

$|\lambda(\theta_1) - \lambda(\theta_2)| = \frac{1}{2}\left|(\theta_1 + \theta_2)^T(\theta_1 - \theta_2)\right| \le \frac{1}{2}\|\theta_1 + \theta_2\|_2\|\theta_1 - \theta_2\|_2 \le C\sqrt{s}\,\|\theta_1 - \theta_2\|_2$.

Thus, the above result and (A.9) indicate that the conditions of Lemma 4 are satisfied. Then, for any compact set $K_s = \{\|\theta\|_2 \le c_4\sqrt{s}\} \subset B_n^0$ with some constant $0 < c_4 < c_6$,

(A.10)  $\sup_{\theta \in K_s} |r_n(\theta)| = o_p(1)$.

Combining (A.2), (A.7) and (A.8), we can write

(A.11)  $G_n(\theta) = \frac{1}{2}\|\theta\|_2^2 + n\lambda_n\widetilde d_0^{\,T} V_n\theta + W_n^T\theta + r_n(\theta) + o(1)$

(A.12)  $\phantom{G_n(\theta)} = \frac{1}{2}\|\theta - \eta_n\|_2^2 - \frac{1}{2}\|\eta_n\|_2^2 + r_n(\theta) + o(1)$,

where $\eta_n = -(n\lambda_n V_n\widetilde d_0 + W_n)$. By a classical weak convergence result, it is easy to see that $c^T(Z_n^T Z_n)^{-1/2}W_n \xrightarrow{\mathcal{D}} N(0, \tau(1-\tau))$ for any $c \in \mathbb{R}^s$ satisfying $c^T c = 1$. It follows immediately that

(A.13)  $c^T(Z_n^T Z_n)^{-1/2}\left(\eta_n + n\lambda_n V_n\widetilde d_0\right) \xrightarrow{\mathcal{D}} N\left(0, \tau(1-\tau)\right)$.
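The step from (A.11) to (A.12) is an exact completing-the-square computation; written out for the reader's convenience (this display is an addition, with $\widetilde d_0$ the sign-adjusted weight vector from (A.6)):

```latex
\[
\tfrac12\|\theta\|_2^2 + \bigl(n\lambda_n V_n \widetilde d_0 + W_n\bigr)^T\theta
  \;=\; \tfrac12\,\bigl\|\theta - \eta_n\bigr\|_2^2 \;-\; \tfrac12\,\|\eta_n\|_2^2,
\qquad
\eta_n \;=\; -\bigl(n\lambda_n V_n \widetilde d_0 + W_n\bigr).
\]
```

Hence the quadratic part of $G_n(\theta)$ is uniquely minimized at $\theta = \eta_n$, which is why the rest of the proof only needs $r_n(\theta)$ to be uniformly small near $\eta_n$.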
We only need to show that the minimizer $\widehat\theta_n$ of $G_n(\theta)$ is close to $\eta_n$; that is, for any $\epsilon > 0$,

(A.14)  $P\left(\|\widehat\theta_n - \eta_n\|_2 \ge \epsilon\right) \to 0$.

Theorem 3 will then follow from (A.13) and Slutsky's lemma.

We now proceed to prove (A.14). First, let $B_n^1$ be the ball with center $\eta_n$ and radius $\epsilon$. Since $c^T(Z_n^TZ_n)^{-1/2}W_n$ has an asymptotically normal distribution for any $c \in \mathbb{R}^s$ with $c^Tc = 1$, and $nV_nV_n^T = \left(\frac{1}{n}S^THS\right)^{-1}$ has bounded eigenvalues, we can bound $\eta_n$ as

(A.15)  $\|\eta_n\|_2 \le \|W_n\|_2 + n\lambda_n\|V_n\widetilde d_0\|_2 \le O_p(\sqrt{s}) + C\lambda_n\sqrt{n}\,\|\widetilde d_0\|_2 = C\sqrt{s}\,O_p(1)$,

where the last step is by the assumption $\lambda_n\sqrt{n}\,\|\widetilde d_0\|_2 = O(\sqrt{s})$ of the theorem. Since $c_6$ in the definition of $B_n^0$ can be chosen to be much larger than $C$, it follows that for each fixed $s$, the compact set $K_s = \{\|\theta\|_2 \le c_4\sqrt{s}\} \subset B_n^0$ with $c_4$ large enough covers the ball $B_n^1$ with probability arbitrarily close to 1. Therefore, by (A.10),

(A.16)  $\sup_{\theta \in B_n^1} |r_n(\theta)| \le \sup_{\theta \in K_s} |r_n(\theta)| = o_p(1)$.

Now we are ready to prove (A.14). Consider the behavior of $G_n(\theta)$ outside of the ball $B_n^1$. Let $\theta = \eta_n + \kappa u \in \mathbb{R}^s$ be a vector outside the ball $B_n^1$, where $u \in \mathbb{R}^s$ is a unit vector and $\kappa$ is a constant satisfying $\kappa > \epsilon$, with $\epsilon$ the radius of $B_n^1$. Define $\theta^*$ as the boundary point of $B_n^1$ that lies on the line segment connecting $\eta_n$ and $\theta$. Then we can write $\theta^* = \eta_n + \epsilon u = (1 - \epsilon/\kappa)\eta_n + (\epsilon/\kappa)\theta$. By the convexity of $G_n$, (A.12) and (A.16),

$\frac{\epsilon}{\kappa}G_n(\theta) + \left(1 - \frac{\epsilon}{\kappa}\right)G_n(\eta_n) \ge G_n(\theta^*) \ge G_n(\eta_n) + \frac{\epsilon^2}{2} - o_p(1)$.

Since $\epsilon < \kappa$, it follows that for large enough $n$,

(A.17)  $\inf_{\|\theta - \eta_n\|_2 > \epsilon} G_n(\theta) \ge G_n(\eta_n) + \frac{\kappa}{\epsilon}\left[\frac{\epsilon^2}{2} - o_p(1)\right] > G_n(\eta_n)$

with probability tending to one. This establishes (A.14) and proves Theorem 3.
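The second-order behavior of the expected check loss that drives (A.4) and (A.7) can be checked numerically. The sketch below is an addition, not part of the original proof; it assumes a standard normal error (whose $\tau = 0.5$ quantile is 0) and compares a midpoint-rule approximation of $E[\rho_\tau(\varepsilon - t) - \rho_\tau(\varepsilon)]$ with the predicted leading term $f(0)t^2/2$:

```python
import math

# Numerical sanity check of the expansion used in (A.4):
#   E[rho_tau(eps - t) - rho_tau(eps)] = (f(0)/2) t^2 + O(|t|^3),
# for eps ~ N(0,1) and tau = 0.5 (so the tau-quantile of eps is 0).

def rho(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def expected_excess_loss(t, tau=0.5, lo=-10.0, hi=10.0, m=200_000):
    """Midpoint-rule approximation of E[rho_tau(eps - t) - rho_tau(eps)]."""
    h = (hi - lo) / m
    total = 0.0
    for k in range(m):
        x = lo + (k + 0.5) * h
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += (rho(x - t, tau) - rho(x, tau)) * pdf * h
    return total

t = 0.2
approx = expected_excess_loss(t)
predicted = (1.0 / math.sqrt(2.0 * math.pi)) * t * t / 2.0  # f(0) t^2 / 2
print(approx, predicted)
```

For small $t$ the two quantities agree to within the cubic remainder, as the expansion predicts.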
A.2. Proof of Theorem 5. The proof of Theorem 5 follows from those of Theorems 3 and 4. We use $C$ to denote a generic positive constant in the proof. By Theorem 4, with asymptotic probability one there exists a global minimizer $\widehat\beta = (\widehat\beta_1^T, \mathbf{0}^T)^T$ of $\widetilde L_n(\beta)$ with $\|\widehat\beta_1 - \beta_1^*\|_2 = O(a_n)$. Next we study the asymptotic normality of $\widehat\beta_1$.

Following (A.1) in the proof of Theorem 3, define $\widetilde G_n(\theta) = \widetilde L_n(\beta_1^* + V_n\theta, \mathbf{0}) - \widetilde L_n(\beta_1^*, \mathbf{0})$, where $V_n$ and $\theta$ are the same as in the proof of Theorem 3. Then $\widetilde\theta_n = V_n^{-1}(\widehat\beta_1 - \beta_1^*)$ is a global minimizer of $\widetilde G_n(\theta)$. The idea of the proof is to show that $G_n(\theta)$ in (A.1) and $\widetilde G_n(\theta)$ are uniformly close to each other. Since $G_n(\theta)$ can be well approximated by a sequence of quadratic functions, $\widetilde G_n(\theta)$ can be well approximated by the same sequence of quadratic functions. Thus, the minimizer of $\widetilde G_n(\theta)$ enjoys the same asymptotic properties as that of $G_n(\theta)$.

We now proceed to prove that $G_n(\theta)$ and $\widetilde G_n(\theta)$ are uniformly close. To this end, first note that for any $\beta_1$ with $\|\beta_1\|_2 < C\sqrt{s}$,

(A.18)  $\left|\widetilde L_n(\beta_1, \mathbf{0}) - L_n(\beta_1, \mathbf{0})\right| \le n\lambda_n\|\beta_1\|_2\|\widehat d_0 - d_0\|_2 \le Cn\lambda_n\sqrt{s}\,\|\widehat d_0 - d_0\|_2$.

For $1 \le j \le s$, by the mean-value theorem,

(A.19)  $d_j - \widehat d_j = p_{\lambda_n}'(|\beta_j^*|) - p_{\lambda_n}'(|\widehat\beta{}_j^{\,\mathrm{ini}}|) = p_{\lambda_n}''(|\bar\beta_j|)\left(|\beta_j^*| - |\widehat\beta{}_j^{\,\mathrm{ini}}|\right)$,

where $|\bar\beta_j|$ lies on the segment connecting $|\beta_j^*|$ and $|\widehat\beta{}_j^{\,\mathrm{ini}}|$. By Condition 5 and the triangle inequality, with asymptotic probability one,

$|\bar\beta_j| \ge |\beta_j^*| - \left||\widehat\beta{}_j^{\,\mathrm{ini}}| - |\beta_j^*|\right| \ge |\beta_j^*| - C_2\sqrt{s(\log p)/n} > \frac{1}{2}\min_{1\le j\le s}|\beta_j^*|$.

This together with Condition 6 ensures that $p_{\lambda_n}''(|\bar\beta_j|) = o_p\left(s^{-1}\lambda_n^{-1}(n\log p)^{-1/2}\right)$. Thus, in view of (A.19),

$\|\widehat d_0 - d_0\|_2 \le o_p\left(s^{-1}\lambda_n^{-1}(n\log p)^{-1/2}\right)\|\widehat\beta{}_1^{\,\mathrm{ini}} - \beta_1^*\|_2 = o_p\left(s^{-1/2}\lambda_n^{-1}n^{-1}\right)$.

Since $\|\theta\|_2 \le C\sqrt{s}$ ensures $\|\beta_1\|_2 \le C\sqrt{s}$, the above inequality combined with (A.18) entails that

$\sup_{\theta \in B_n^0}\left|\widetilde G_n(\theta) - G_n(\theta)\right| \le 2\sup_{\|\beta_1\|_2 \le C\sqrt{s}}\left|\widetilde L_n(\beta_1, \mathbf{0}) - L_n(\beta_1, \mathbf{0})\right| = o_p(1)$,
where $B_n^0$ is defined in the proof of Theorem 3. Therefore, by the above result and (A.17), for any $\theta \in B_n^0$ with $\|\theta - \eta_n\|_2 > \epsilon$ and $\epsilon > 0$ arbitrarily small,

$\inf_{\|\theta - \eta_n\|_2 > \epsilon} \widetilde G_n(\theta) \ge \inf_{\|\theta - \eta_n\|_2 > \epsilon} G_n(\theta) - \sup_{\theta \in B_n^0}\left|\widetilde G_n(\theta) - G_n(\theta)\right| \ge G_n(\eta_n) + \frac{\kappa}{\epsilon}\left[\frac{\epsilon^2}{2} - o_p(1)\right] - o_p(1) \ge \widetilde G_n(\eta_n) + \frac{\kappa}{\epsilon}\left[\frac{\epsilon^2}{2} - o_p(1)\right] - o_p(1)$.

Then it follows immediately that the minimizer satisfies $\|\widetilde\theta_n - \eta_n\|_2 \le \epsilon$ with asymptotic probability one. Thus $\widetilde\theta_n - \eta_n = o_p(1)$. The proof of Theorem 5 is completed.

A.3. Lemmas.

Lemma 3. Assume the conditions of Theorem 3 hold. Let $R_{n,i}(\theta) = \rho_\tau(\varepsilon_i - Z_{n,i}^T\theta) - \rho_\tau(\varepsilon_i) + \rho_\tau'(\varepsilon_i)Z_{n,i}^T\theta$ and $R_n(\theta) = \sum_{i=1}^n R_{n,i}(\theta)$. Then for any $\epsilon > 0$,

$P\left(\left|R_n(\theta) - E[R_n(\theta)]\right| \ge \epsilon\right) \le \exp\left(-C\epsilon b_n s^2\log s\right)$,

where $b_n$ is some diverging sequence such that $b_n s^{7/2}(\log s)\max_i\|Z_{n,i}\|_2 \to 0$, and $C > 0$ is some constant.

Proof. Let $\xi_i = R_{n,i}(\theta) - E[R_{n,i}(\theta)]$. Then $R_n(\theta) - E[R_n(\theta)] = \sum_{i=1}^n \xi_i$. Since the $R_{n,i}(\theta)$'s are independent, by Markov's inequality we obtain that for any $\epsilon > 0$ and $t > 0$,

(A.20)  $P\left(R_n(\theta) - E[R_n(\theta)] \ge \epsilon\right) \le e^{-t\epsilon}E\left[\exp\left(t\sum_{i=1}^n \xi_i\right)\right] = \exp\left(-t\epsilon - t\sum_{i=1}^n E[R_{n,i}(\theta)]\right)\prod_{i=1}^n E\left[\exp\left(tR_{n,i}(\theta)\right)\right]$.

We next study $E[R_{n,i}(\theta)]$ and $E[\exp(tR_{n,i}(\theta))]$ in (A.20). Using a similar argument to that for (A.4), we can prove that

$E[R_{n,i}(\theta)] = E\left[\rho_\tau(\varepsilon_i - Z_{n,i}^T\theta) - \rho_\tau(\varepsilon_i)\right] = \frac{f_i(0)}{2}(Z_{n,i}^T\theta)^2 + O\left(|Z_{n,i}^T\theta|^3\right)$,

where the $O$ term is uniform over all $i$. Thus, it follows from the definition of $Z_{n,i}$ that

(A.21)  $t\sum_{i=1}^n E[R_{n,i}(\theta)] = \frac{t}{2}\|\theta\|_2^2 + O\left(t\sum_{i=1}^n |Z_{n,i}^T\theta|^3\right)$.
Now we consider $E[\exp(tR_{n,i}(\theta))]$. If $Z_{n,i}^T\theta > 0$, then by definition $R_{n,i}(\theta) = (Z_{n,i}^T\theta - \varepsilon_i)1\{0 \le \varepsilon_i \le Z_{n,i}^T\theta\}$. By Condition 1 and Taylor expansion, it follows that

$E\left[\exp(tR_{n,i}(\theta))\right] \le 1 + tE[R_{n,i}(\theta)] + O\left(t^2|Z_{n,i}^T\theta|^3\right)$.

When $Z_{n,i}^T\theta < 0$, we can get the same result using a similar argument. Since $\prod_{i=1}^n(1 + x_i) \le \exp\left(\sum_{i=1}^n x_i\right)$ for $x_i > 0$, in view of (A.5) and the above inequality we obtain that

(A.22)  $\prod_{i=1}^n E\left[\exp(tR_{n,i}(\theta))\right] \le \exp\left(\sum_{i=1}^n\left(E\left[\exp(tR_{n,i}(\theta))\right] - 1\right)\right) \le \exp\left(t\sum_{i=1}^n E[R_{n,i}(\theta)] + O\left(t^2\sum_{i=1}^n|Z_{n,i}^T\theta|^3\right)\right)$.

Substituting (A.21) and (A.22) into (A.20) gives

(A.23)  $P\left(R_n(\theta) - E[R_n(\theta)] \ge \epsilon\right) \le \exp\left(-t\epsilon + O\left(t^2\sum_{i=1}^n|Z_{n,i}^T\theta|^3\right)\right)$.

Choosing $t = 2s^2(\log s)b_n$ with $b_n$ such that $b_n s^{7/2}(\log s)\max_i\|Z_{n,i}\|_2 \to 0$, and using a similar idea to that for (A.5), we obtain that

$t\sum_{i=1}^n|Z_{n,i}^T\theta|^3 \le Ct\max_i|Z_{n,i}^T\theta|\sum_{i=1}^n f_i(0)(Z_{n,i}^T\theta)^2 \le Cts^{3/2}\max_i\|Z_{n,i}\|_2 \to 0$.

Plugging this into (A.23) yields

$P\left(R_n(\theta) - E[R_n(\theta)] \ge \epsilon\right) \le \exp\left(-C\epsilon b_n s^2\log s\right)$.

Repeating the same argument for $P\left(R_n(\theta) - E[R_n(\theta)] \le -\epsilon\right)$ completes the proof.

Lemma 4. Let $\lambda(\theta)$ be a positive function defined on a convex open subset $\Theta_s = \{\theta \in \mathbb{R}^s : \|\theta\|_2 < c_6\sqrt{s}\}$ of $\mathbb{R}^s$, and let $\{\lambda_n(\theta) : \theta \in \Theta_s\}$ be a sequence of random convex functions defined on $\Theta_s$, where $c_6 > 0$ is some constant. Suppose that there exists some $b_n$ such that for every $\theta \in \Theta_s$, the following holds for all $\epsilon > 0$:

$P\left(|\lambda_n(\theta) - \lambda(\theta)| \ge \epsilon\right) \le c_9\exp\left(-c_7 s^2(\log s)b_n\epsilon\right)$,
where $c_7, c_9$ are two positive constants. Let $K_s$ be a compact set in $\mathbb{R}^s$ of the form $K_s = \{\|\theta\|_2 \le c_4\sqrt{s}\} \subset \Theta_s$, where $0 < c_4 < c_6$ is some constant. If, for some constant $c_8 > 0$, $|\lambda(\theta_1) - \lambda(\theta_2)| \le c_8\sqrt{s}\,\|\theta_1 - \theta_2\|_2$ for any $\theta_1, \theta_2 \in \Theta_s$, then

$\sup_{\theta \in K_s}|\lambda_n(\theta) - \lambda(\theta)| = o_p(1)$.

Proof. The proof is an extension of the convexity lemma in Pollard (1991). The basic idea is to prove that $K_s$ can be covered by a number of cubes, and that $\lambda_n(\theta)$ and $\lambda(\theta)$ are uniformly close over the set of vertices of these cubes. Within each cube, the values of both $\lambda_n(\theta)$ and $\lambda(\theta)$ do not change significantly. Thus $\lambda_n(\theta)$ and $\lambda(\theta)$ are uniformly close over $K_s$. In this proof we use $C$ to denote some generic positive constant.

We proceed to prove the lemma. Since $|\lambda(\theta_1) - \lambda(\theta_2)| \le c_8\sqrt{s}\,\|\theta_1 - \theta_2\|_2$ for any $\theta_1, \theta_2 \in \Theta_s$, it follows that for a fixed $\epsilon > 0$, the function $\lambda(\theta)$ varies by less than $\epsilon/s$ over each cube of side $\delta = \epsilon/(c_8 s^2)$ that intersects $K_s$. Note that $K_s$ can be covered by fewer than $(2c_4\sqrt{s}/\delta)^s = (2c_4c_8 s^{5/2}/\epsilon)^s$ such cubes. Then in total there are fewer than $2^s(2c_4c_8 s^{5/2}/\epsilon)^s$ vertices. Denote by $V_s$ the set of all such vertices whose cubes intersect $K_s$. Since $c_6$ can be much larger than $c_4$ and the edge of each cube, $\delta = \epsilon/(c_8 s^2)$, is small and decreases with $s$, all vertices in $V_s$ fall in $\Theta_s$ as well. Thus, by the pointwise convergence assumption of the lemma, it is easy to derive that for any $\epsilon > 0$, as $b_n \to \infty$,

$P\left(\max_{\theta \in V_s}|\lambda_n(\theta) - \lambda(\theta)| \ge \epsilon/s\right) \le c_9\exp\left(Cs\log(Cs/\epsilon) - c_7 s(\log s)b_n\epsilon\right) \to 0$.

Therefore,

(A.24)  $M_n \equiv \max_{\theta \in V_s}|\lambda_n(\theta) - \lambda(\theta)| = o_p(\epsilon/s)$.

Any $\theta \in K_s$ falls into some cube and thus can be written as a convex combination of that cube's vertices $\{\theta_i\}$ in $V_s$; that is, $\theta = \sum_i \alpha_i\theta_i$ with $\alpha_i \in [0, 1]$ and $\sum_i \alpha_i = 1$. Then by the convexity of $\lambda_n(\theta)$ and (A.24),

$\lambda_n(\theta) \le \sum_i \alpha_i\lambda_n(\theta_i) \le \sum_i \alpha_i\left\{|\lambda_n(\theta_i) - \lambda(\theta_i)| + |\lambda(\theta_i) - \lambda(\theta)| + \lambda(\theta)\right\} \le \max_i|\lambda_n(\theta_i) - \lambda(\theta_i)| + \epsilon/s + \lambda(\theta) \le M_n + \epsilon/s + \lambda(\theta)$.

Thus,

(A.25)  $P\left(\sup_{\theta \in K_s}\left(\lambda_n(\theta) - \lambda(\theta)\right) > 2\epsilon/s\right) \to 0$.
Next we prove the lower bound. Each $\theta$ in $K_s$ lies within a $\delta$-cube with a vertex $\theta_0 \in V_s$; thus $\theta = \theta_0 + \sum_{i=1}^s \delta_i e_i$ with $|\delta_i| < \delta$, where $e_1, \ldots, e_s$ denote the $s$ coordinate directions. Without loss of generality, suppose $0 \le \delta_i < \delta$ for each $i$. Define $\theta_i$ to be the vertex $\theta_0 - \delta e_i \in V_s$. Then $\theta_0$ can be written as a convex combination of $\theta$ and the $\theta_i$:

$\theta_0 = \frac{\delta}{\delta + \sum_j\delta_j}\,\theta + \sum_i \frac{\delta_i}{\delta + \sum_j\delta_j}\,\theta_i$.

Denote $a = \frac{\delta}{\delta + \sum_j\delta_j}$ and $a_i = \frac{\delta_i}{\delta + \sum_j\delta_j}$. Since $0 \le \delta_j < \delta$, the coefficient of $\theta$ can be bounded as $a \ge \frac{1}{1+s}$. Since $\lambda_n$ is convex, by (A.24) we obtain that

$a\lambda_n(\theta) \ge \lambda_n(\theta_0) - \sum_i a_i\lambda_n(\theta_i) \ge \lambda(\theta_0) - \sum_i a_i\lambda(\theta_i) - 2M_n \ge \lambda(\theta) - \frac{\epsilon}{s} - \sum_i a_i\left(\lambda(\theta) + \frac{\epsilon}{s}\right) - 2M_n \ge a\lambda(\theta) - \frac{2\epsilon}{s} - 2M_n$.

Since $a \ge (1+s)^{-1}$, it follows from the above inequality and (A.24) that

$\lambda_n(\theta) \ge \lambda(\theta) - \left(\frac{2\epsilon}{s} + 2M_n\right)(s+1) \ge \lambda(\theta) - \frac{2\epsilon(s+1)}{s}\left(1 + o_p(1)\right)$.

Therefore,

(A.26)  $P\left(\inf_{\theta \in K_s}\left(\lambda_n(\theta) - \lambda(\theta)\right) < -\frac{3(1+s)\epsilon}{s}\right) \to 0$.

Since we can choose $\epsilon$ arbitrarily small, the uniform convergence result follows by combining (A.25) with (A.26).

A.4. Proof of Proposition 1. Since $s$ is finite, the sum of the probabilities below is of the same order as their maximum. The distribution result (3.1) in the main paper entails that

$P\left(\|n^{-1}S^T\varepsilon\|_\infty > z\right) = P\left(\max_{1\le j\le s}|x_j^T\varepsilon| > nz\right) \le \sum_{j=1}^s P\left(|x_j^T\varepsilon| > Cnz\right) \le \sum_{j=1}^s \|x_j\|_\alpha^\alpha\, c_\alpha (Cnz)^{-\alpha}$,
where $C > 0$ is some generic constant. For any sequence $b_n$ such that $b_n \to \infty$, by letting $z = n^{-1}b_n\tilde b_n$ with $\tilde b_n = \left(\sum_{j=1}^s\|x_j\|_\alpha^\alpha\right)^{1/\alpha}$, we have that $P\left(\|n^{-1}S^T\varepsilon\|_\infty > n^{-1}b_n\tilde b_n\right) \to 0$. In other words, $\|n^{-1}S^T\varepsilon\|_\infty = O_p(n^{-1}b_n\tilde b_n)$. Hence, for the condition

$\widehat\beta_M + \lambda_n\,\mathrm{sgn}(\widehat\beta_M) = \beta_M^* + n^{-1}S^T\varepsilon$

to have a solution $\widehat\beta_M = (\widehat\beta_1, \ldots, \widehat\beta_s)^T$ with $\widehat\beta_j > 0$ for all $j = 1, \ldots, s$, the necessary conditions are $\beta_0 \ge n^{-1}b_n\tilde b_n$ and $\lambda_n < \beta_0$. Combining these conditions, we have

(A.27)  $\lambda_n < \beta_0 - n^{-1}b_n\tilde b_n$,

with some diverging sequence $b_n$. We next check whether the condition $\|Q^T\varepsilon\|_\infty \le n\lambda_n$ is satisfied. Combining (3.1) and (A.27) ensures that for $j > s$,

$P\left(|x_j^T\varepsilon| > n\lambda_n\right) \ge P\left(|x_j^T\varepsilon| > b_n\tilde b_n\right) \ge C\|x_j\|_\alpha^\alpha\, c_\alpha\,(b_n\tilde b_n)^{-\alpha}$.

Since we assumed $\mathrm{supp}(x_j) \cap \mathrm{supp}(x_i) = \emptyset$ for any $i \ne j \in \{s+1, \ldots, p\}$, it follows that $Q^T\varepsilon$ is a vector of independent random variables with components $x_j^T\varepsilon$. Then,

$P\left(\|Q^T\varepsilon\|_\infty > n\lambda_n\right) = 1 - P\left(\|Q^T\varepsilon\|_\infty \le n\lambda_n\right) = 1 - \prod_{j>s}\left(1 - P\left(|x_j^T\varepsilon| > n\lambda_n\right)\right) \ge 1 - \exp\left(\sum_{j>s}\log\left(1 - C\|x_j\|_\alpha^\alpha(b_n\tilde b_n)^{-\alpha}\right)\right)$.

Since $\log(1+x) \le x$ for all $x > -1$, we have that

$\sum_{j>s}\log\left(1 - C\|x_j\|_\alpha^\alpha(b_n\tilde b_n)^{-\alpha}\right) \le -C\sum_{j>s}\|x_j\|_\alpha^\alpha(b_n\tilde b_n)^{-\alpha}$.

Combining the above two inequalities, if $\sum_{j>s}\|x_j\|_\alpha^\alpha(b_n\tilde b_n)^{-\alpha} \ge c_0/C$ for some $c_0 \in (0, \infty]$, then $P\left(\|Q^T\varepsilon\|_\infty > n\lambda_n\right) \ge 1 - e^{-c_0}$. That is, with probability at least $1 - e^{-c_0}$, the condition $\|Q^T\varepsilon\|_\infty \le n\lambda_n$ fails to hold and Lasso does not have the model selection oracle property. In fact, if $b_n \le O\left(\left(\sum_{j>s}\|x_j\|_\alpha^\alpha\right)^{1/\alpha}\tilde b_n^{-1}\right)$, or equivalently by (A.27),

$\beta_0 \le O\left(n^{-1}\left(\sum_{j>s}\|x_j\|_\alpha^\alpha\right)^{1/\alpha}\right)$,
then $c_0/C \in (0, \infty]$. In other words, unless we have

(A.28)  $n\beta_0 \gg \left(\sum_{j>s}\|x_j\|_\alpha^\alpha\right)^{1/\alpha}$,

Lasso is not able to recover the true model and the correct sign. Finally, since $\|x_j\|_2^2 = n$, $|\mathrm{supp}(x_j)| = O(n^{1/2})$ and $\max_{ij}|x_{ij}| = O(n^{1/4})$, the nonzero components of $Q$ are all of the same order $O(n^{1/4})$. Consequently, the $\|x_j\|_\alpha^\alpha$ are all of the same order $O(n^{(2+\alpha)/4})$. Then $n^{-1}\left(\sum_{j>s}\|x_j\|_\alpha^\alpha\right)^{1/\alpha} = O\left((p-s)^{1/\alpha}n^{(2-3\alpha)/(4\alpha)}\right)$, and the condition (A.28) above becomes $\beta_0 \gg (p-s)^{1/\alpha}n^{(2-3\alpha)/(4\alpha)}$.

APPENDIX B: REAL DATA EXAMPLE

In this section, we use expression quantitative trait locus (eQTL) mapping to illustrate the performance of R-Lasso and AR-Lasso. eQTL studies aim at finding the genotype variations in a certain part of a chromosome that are associated with gene expression levels. In this study, we conducted a cis-eQTL mapping for the gene CHRNA6 (cholinergic receptor, nicotinic, alpha 6). CHRNA6 is located on the 8th chromosome, at the cytogenetic location 8p11. CHRNA6 is thought to be related to the activation of dopamine-releasing neurons by nicotine (Thorgeirsson et al., 2010). Therefore, CHRNA6 has been the subject of many nicotine addiction studies on people with western European heritage (Saccone et al., 2009; Thorgeirsson et al., 2010).

The data are from 90 individuals from the international HapMap project (The International HapMap Consortium, 2005), all with western European ancestry. The data are available at ftp://ftp.sanger.ac.uk/pub/genevar/. The normalized expression data were generated with an Illumina Sentrix Human-6 Expression BeadChip (Stranger et al., 2007). The SNPs under investigation are located within 1 megabase upstream and downstream of the transcription start site (TSS) of CHRNA6; in this range, there were 554 SNPs. The additive coding for SNPs was employed, with 0, 1 and 2 representing the major, heterozygous, and minor populations, respectively.
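As a concrete illustration of the additive coding just described, the helper below maps a two-allele genotype to its 0/1/2 code by counting minor alleles (a hypothetical sketch; the function name and the genotype-string format are assumptions, not part of the original study):

```python
def additive_code(genotype, major_allele):
    """Additive SNP coding: count non-major (minor) alleles in a
    two-character genotype string, giving 0 (homozygous major),
    1 (heterozygous) or 2 (homozygous minor)."""
    assert len(genotype) == 2
    return sum(1 for allele in genotype if allele != major_allele)

# For a SNP with major allele 'A' and minor allele 'G':
codes = [additive_code(g, "A") for g in ("AA", "AG", "GA", "GG")]
print(codes)  # -> [0, 1, 1, 2]
```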
We further screened the SNPs using a variation of the sure independence screening (SIS) method of Fan and Lv (2008). We kept the 100 SNPs with the highest marginal correlation with the gene expression levels. Finally, we applied Lasso, SCAD, R-Lasso and AR-Lasso to the screened variables. The quantile parameter $\tau$ was set to 0.5 for R-Lasso and AR-Lasso, corresponding to median regression. The tuning parameter for all methods was chosen using five-fold cross-validation. The selected SNPs, as well as their regression coefficients and distances from the transcription start site, are given in Table 1.
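For reference, the criterion fitted here has the weighted $L_1$-penalized quantile regression form used throughout Appendix A (this display is a reader's aid in the notation of the main paper; R-Lasso corresponds to unit weights and AR-Lasso to adaptive weights computed from an initial estimate):

```latex
% R-Lasso: d_j = 1; AR-Lasso: d_j = p'_{\lambda_n}(|\hat\beta_j^{ini}|).
\[
\widehat\beta \;=\; \operatorname*{arg\,min}_{\beta}\;
  \sum_{i=1}^n \rho_\tau\!\bigl(y_i - x_i^T\beta\bigr)
  \;+\; n\lambda_n \sum_{j=1}^p d_j\,|\beta_j|,
\qquad
\rho_\tau(u) = u\bigl(\tau - 1\{u < 0\}\bigr),
\]
```

with $\tau = 0.5$ in this study, so the loss term is median (least absolute deviation) regression.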
Table 1: Selected SNPs for the eQTL study. Columns: SNP identifier, estimated coefficients under Lasso, SCAD, R-Lasso and AR-Lasso, and distance from the TSS in kb. [The table entries, including the SNP rs-identifiers, were not recoverable from the extracted text.]

It is seen that the robust regression methods R-Lasso and AR-Lasso found more of the variables to be significant. R-Lasso and AR-Lasso selected 21 and 15 variables, respectively, whereas Lasso and SCAD found only 15 and 5 variables to be significant. Only four SNPs (rs…, rs…, rs…, rs…) were included in all of the models. Furthermore, none of these SNPs were covered in the previous study (Saccone et al., 2009). We speculate that the difference is due to the fact that the previous studies focused on SNPs only 50 kb upstream and downstream of the TSS. Additionally, these studies did not consider multiple regression, which makes significant use of the correlation structure in the data. In addition, among all the SNPs
that are chosen by these four methods, only one (rs…) appears in the paper by Saccone et al. (2009), and only R-Lasso and AR-Lasso found this SNP to be important. As observed in the finite-sample simulations, SCAD and AR-Lasso consistently chose a smaller set of variables than their counterparts. Furthermore, almost two thirds of the selected SNPs lie to the left of the transcription site. We also note that the residuals from the fitted regressions had very heavy right tails. This suggests that, at least for this particular eQTL study, it is much more reasonable to use methods based on quantile regression. The QQ-plots of the residuals from the different regression methods are shown in Figure 1.

[Figure 1: QQ-plots of residuals for different methods in the eQTL study: (a) Lasso; (b) SCAD; (c) R-Lasso; (d) AR-Lasso. Each panel plots quantiles of the residuals against standard normal quantiles.]
REFERENCES

Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space (with discussion). J. Roy. Statist. Soc. Ser. B 70, 849–911.

The International HapMap Consortium (2005). A haplotype map of the human genome. Nature 437, 1299–1320.

Nolan, J. P. Stable Distributions: Models for Heavy Tailed Data. Birkhäuser. In progress; Chapter 1 online at academic2.american.edu/~jpnolan.

Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory 7, 186–199.

Saccone, N. L., Saccone, S. F., Hinrichs, A. L., Stitzel, J. A., Duan, W., Pergadia, M. L., Agrawal, A., Breslau, N., Grucza, R. A., Hatsukami, D., Johnson, E. O., Madden, P. A. F., Swan, G. E., Wang, J. C., Goate, A. M., Rice, J. P. and Bierut, L. J. (2009). Multiple distinct risk loci for nicotine dependence identified by dense coverage of the complete family of nicotinic receptor subunit (CHRN) genes. Am. J. Med. Genet. Part B 150B.

Stranger, B., Forrest, S., Dunning, M., Ingle, C., Beazley, C., Thorne, N., Redon, R., Bird, C., de Grassi, A., Lee, C., Tyler-Smith, C., Carter, N., Scherer, S., Tavaré, S., Deloukas, P., Hurles, M. and Dermitzakis, E. (2007). Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315, 848–853.

Thorgeirsson, T. E. et al. (2010). Sequence variants at CHRNB3–CHRNA6 and CYP2A6 affect smoking behavior. Nat. Genet. 42, 448–453.

J. Fan
Department of Operations Research and Financial Engineering
Princeton University
Princeton, New Jersey, USA
E-mail: jqfan@princeton.edu

Y. Fan
Data Sciences and Operations Department
Marshall School of Business
University of Southern California
Los Angeles, California, USA
E-mail: fanyingy@marshall.usc.edu

E. Barut
IBM T.J. Watson Research Center
Yorktown Heights, New York, USA
E-mail: abarut@us.ibm.com
More informationBy Jianqing Fan 1, Yingying Fan 2 and Emre Barut Princeton University, University of Southern California and IBM T. J. Watson Research Center
The Annals of Statistics 204, Vol. 42, No., 324 35 DOI: 0.24/3-AOS9 c Institute of Mathematical Statistics, 204 ADAPTIVE ROBUST VARIABLE SELECTION arxiv:205.4795v3 [math.st] 2 Mar 204 By Jianqing Fan,
Extended Bayesian Information Criteria for Model Selection with Large Model Spaces Jiahua Chen, University of British Columbia Zehua Chen, National University of Singapore (Biometrika, 2008) 1 / 18 Variable
More informationOptimization. Benjamin Recht University of California, Berkeley Stephen Wright University of Wisconsin-Madison
Optimization Benjamin Recht University of California, Berkeley Stephen Wright University of Wisconsin-Madison optimization () cost constraints might be too much to cover in 3 hours optimization (for big
More informationPart IB Complex Analysis
Part IB Complex Analysis Theorems Based on lectures by I. Smith Notes taken by Dexter Chua Lent 2016 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after
More informationConcentration behavior of the penalized least squares estimator
Concentration behavior of the penalized least squares estimator Penalized least squares behavior arxiv:1511.08698v2 [math.st] 19 Oct 2016 Alan Muro and Sara van de Geer {muro,geer}@stat.math.ethz.ch Seminar
More informationSPECTRAL GAP FOR ZERO-RANGE DYNAMICS. By C. Landim, S. Sethuraman and S. Varadhan 1 IMPA and CNRS, Courant Institute and Courant Institute
The Annals of Probability 996, Vol. 24, No. 4, 87 902 SPECTRAL GAP FOR ZERO-RANGE DYNAMICS By C. Landim, S. Sethuraman and S. Varadhan IMPA and CNRS, Courant Institute and Courant Institute We give a lower
More informationAn asymptotic ratio characterization of input-to-state stability
1 An asymptotic ratio characterization of input-to-state stability Daniel Liberzon and Hyungbo Shim Abstract For continuous-time nonlinear systems with inputs, we introduce the notion of an asymptotic
More informationWeak convergence and Brownian Motion. (telegram style notes) P.J.C. Spreij
Weak convergence and Brownian Motion (telegram style notes) P.J.C. Spreij this version: December 8, 2006 1 The space C[0, ) In this section we summarize some facts concerning the space C[0, ) of real
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2016 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Raied Aljadaany, Shi Zong, Chenchen Zhu Disclaimer: A large
More information19.1 Maximum Likelihood estimator and risk upper bound
ECE598: Information-theoretic methods in high-dimensional statistics Spring 016 Lecture 19: Denoising sparse vectors - Ris upper bound Lecturer: Yihong Wu Scribe: Ravi Kiran Raman, Apr 1, 016 This lecture
More informationAsymptotic properties for combined L 1 and concave regularization
Biometrika (2014), 101,1,pp. 57 70 doi: 10.1093/biomet/ast047 Printed in Great Britain Advance Access publication 13 December 2013 Asymptotic properties for combined L 1 and concave regularization BY YINGYING
More informationON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS
The Annals of Applied Probability 1998, Vol. 8, No. 4, 1291 1302 ON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS By Gareth O. Roberts 1 and Jeffrey S. Rosenthal 2 University of Cambridge
More informationAnalysis Finite and Infinite Sets The Real Numbers The Cantor Set
Analysis Finite and Infinite Sets Definition. An initial segment is {n N n n 0 }. Definition. A finite set can be put into one-to-one correspondence with an initial segment. The empty set is also considered
More informationMinimum Power Dominating Sets of Random Cubic Graphs
Minimum Power Dominating Sets of Random Cubic Graphs Liying Kang Dept. of Mathematics, Shanghai University N. Wormald Dept. of Combinatorics & Optimization, University of Waterloo 1 1 55 Outline 1 Introduction
More informationModel-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate
Model-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate Lucas Janson, Stanford Department of Statistics WADAPT Workshop, NIPS, December 2016 Collaborators: Emmanuel
More informationPart III. 10 Topological Space Basics. Topological Spaces
Part III 10 Topological Space Basics Topological Spaces Using the metric space results above as motivation we will axiomatize the notion of being an open set to more general settings. Definition 10.1.
More informationAn Empirical Characteristic Function Approach to Selecting a Transformation to Normality
Communications for Statistical Applications and Methods 014, Vol. 1, No. 3, 13 4 DOI: http://dx.doi.org/10.5351/csam.014.1.3.13 ISSN 87-7843 An Empirical Characteristic Function Approach to Selecting a
More informationMATH 722, COMPLEX ANALYSIS, SPRING 2009 PART 5
MATH 722, COMPLEX ANALYSIS, SPRING 2009 PART 5.. The Arzela-Ascoli Theorem.. The Riemann mapping theorem Let X be a metric space, and let F be a family of continuous complex-valued functions on X. We have
More informationSub-Gaussian Estimators of the Mean of a Random Matrix with Entries Possessing Only Two Moments
Sub-Gaussian Estimators of the Mean of a Random Matrix with Entries Possessing Only Two Moments Stas Minsker University of Southern California July 21, 2016 ICERM Workshop Simple question: how to estimate
More informationASYMPTOTIC PROPERTIES OF BRIDGE ESTIMATORS IN SPARSE HIGH-DIMENSIONAL REGRESSION MODELS
The Annals of Statistics 2008, Vol. 36, No. 2, 587 613 DOI: 10.1214/009053607000000875 Institute of Mathematical Statistics, 2008 ASYMPTOTIC PROPERTIES OF BRIDGE ESTIMATORS IN SPARSE HIGH-DIMENSIONAL REGRESSION
More informationNotes on Complex Analysis
Michael Papadimitrakis Notes on Complex Analysis Department of Mathematics University of Crete Contents The complex plane.. The complex plane...................................2 Argument and polar representation.........................
More informationAn introduction to some aspects of functional analysis
An introduction to some aspects of functional analysis Stephen Semmes Rice University Abstract These informal notes deal with some very basic objects in functional analysis, including norms and seminorms
More informationThe Australian National University and The University of Sydney. Supplementary Material
Statistica Sinica: Supplement HIERARCHICAL SELECTION OF FIXED AND RANDOM EFFECTS IN GENERALIZED LINEAR MIXED MODELS The Australian National University and The University of Sydney Supplementary Material
More informationA New Combined Approach for Inference in High-Dimensional Regression Models with Correlated Variables
A New Combined Approach for Inference in High-Dimensional Regression Models with Correlated Variables Niharika Gauraha and Swapan Parui Indian Statistical Institute Abstract. We consider the problem of
More informationFall, 2007 Nonlinear Econometrics. Theory: Consistency for Extremum Estimators. Modeling: Probit, Logit, and Other Links.
14.385 Fall, 2007 Nonlinear Econometrics Lecture 2. Theory: Consistency for Extremum Estimators Modeling: Probit, Logit, and Other Links. 1 Example: Binary Choice Models. The latent outcome is defined
More informationSPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS
SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS G. RAMESH Contents Introduction 1 1. Bounded Operators 1 1.3. Examples 3 2. Compact Operators 5 2.1. Properties 6 3. The Spectral Theorem 9 3.3. Self-adjoint
More informationSome Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model
Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model 1. Introduction Varying-coefficient partially linear model (Zhang, Lee, and Song, 2002; Xia, Zhang, and Tong, 2004;
More informationConfidence Intervals for Low-dimensional Parameters with High-dimensional Data
Confidence Intervals for Low-dimensional Parameters with High-dimensional Data Cun-Hui Zhang and Stephanie S. Zhang Rutgers University and Columbia University September 14, 2012 Outline Introduction Methodology
More informationBi-level feature selection with applications to genetic association
Bi-level feature selection with applications to genetic association studies October 15, 2008 Motivation In many applications, biological features possess a grouping structure Categorical variables may
More informationOblique derivative problems for elliptic and parabolic equations, Lecture II
of the for elliptic and parabolic equations, Lecture II Iowa State University July 22, 2011 of the 1 2 of the of the As a preliminary step in our further we now look at a special situation for elliptic.
More informationlarge number of i.i.d. observations from P. For concreteness, suppose
1 Subsampling Suppose X i, i = 1,..., n is an i.i.d. sequence of random variables with distribution P. Let θ(p ) be some real-valued parameter of interest, and let ˆθ n = ˆθ n (X 1,..., X n ) be some estimate
More information(Part 1) High-dimensional statistics May / 41
Theory for the Lasso Recall the linear model Y i = p j=1 β j X (j) i + ɛ i, i = 1,..., n, or, in matrix notation, Y = Xβ + ɛ, To simplify, we assume that the design X is fixed, and that ɛ is N (0, σ 2
More informationGoal. Robust A Posteriori Error Estimates for Stabilized Finite Element Discretizations of Non-Stationary Convection-Diffusion Problems.
Robust A Posteriori Error Estimates for Stabilized Finite Element s of Non-Stationary Convection-Diffusion Problems L. Tobiska and R. Verfürth Universität Magdeburg Ruhr-Universität Bochum www.ruhr-uni-bochum.de/num
More informationarxiv: v1 [stat.me] 25 Aug 2010
Variable-width confidence intervals in Gaussian regression and penalized maximum likelihood estimators arxiv:1008.4203v1 [stat.me] 25 Aug 2010 Davide Farchione and Paul Kabaila Department of Mathematics
More informationOracle Estimation of a Change Point in High Dimensional Quantile Regression
Oracle Estimation of a Change Point in High Dimensional Quantile Regression Sokbae Lee, Yuan Liao, Myung Hwan Seo, and Youngki Shin arxiv:1603.00235v2 [stat.me] 16 Dec 2016 15 November 2016 Abstract In
More informationHigh-dimensional covariance estimation based on Gaussian graphical models
High-dimensional covariance estimation based on Gaussian graphical models Shuheng Zhou Department of Statistics, The University of Michigan, Ann Arbor IMA workshop on High Dimensional Phenomena Sept. 26,
More informationAn Asymptotic Property of Schachermayer s Space under Renorming
Journal of Mathematical Analysis and Applications 50, 670 680 000) doi:10.1006/jmaa.000.7104, available online at http://www.idealibrary.com on An Asymptotic Property of Schachermayer s Space under Renorming
More informationSTAT 7032 Probability Spring Wlodek Bryc
STAT 7032 Probability Spring 2018 Wlodek Bryc Created: Friday, Jan 2, 2014 Revised for Spring 2018 Printed: January 9, 2018 File: Grad-Prob-2018.TEX Department of Mathematical Sciences, University of Cincinnati,
More informationMODEL SELECTION FOR CORRELATED DATA WITH DIVERGING NUMBER OF PARAMETERS
Statistica Sinica 23 (2013), 901-927 doi:http://dx.doi.org/10.5705/ss.2011.058 MODEL SELECTION FOR CORRELATED DATA WITH DIVERGING NUMBER OF PARAMETERS Hyunkeun Cho and Annie Qu University of Illinois at
More informationClassical regularity conditions
Chapter 3 Classical regularity conditions Preliminary draft. Please do not distribute. The results from classical asymptotic theory typically require assumptions of pointwise differentiability of a criterion
More informationarxiv: v2 [math.st] 31 Mar 2015
Are Discoveries Spurious? Distributions of Maximum Spurious Correlations and Their Applications arxiv:1502.04237v2 [math.st] 31 Mar 2015 Jianqing Fan 1,2, Qi-Man Shao 3, Wen-Xin Zhou 1,4 1 Princeton University,
More informationFeature Selection for Varying Coefficient Models With Ultrahigh Dimensional Covariates
Feature Selection for Varying Coefficient Models With Ultrahigh Dimensional Covariates Jingyuan Liu, Runze Li and Rongling Wu Abstract This paper is concerned with feature screening and variable selection
More informationDaniel M. Oberlin Department of Mathematics, Florida State University. January 2005
PACKING SPHERES AND FRACTAL STRICHARTZ ESTIMATES IN R d FOR d 3 Daniel M. Oberlin Department of Mathematics, Florida State University January 005 Fix a dimension d and for x R d and r > 0, let Sx, r) stand
More informationBinomial Mixture Model-based Association Tests under Genetic Heterogeneity
Binomial Mixture Model-based Association Tests under Genetic Heterogeneity Hui Zhou, Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 April 30,
More informationApproximation Theoretical Questions for SVMs
Ingo Steinwart LA-UR 07-7056 October 20, 2007 Statistical Learning Theory: an Overview Support Vector Machines Informal Description of the Learning Goal X space of input samples Y space of labels, usually
More informationA Perron-type theorem on the principal eigenvalue of nonsymmetric elliptic operators
A Perron-type theorem on the principal eigenvalue of nonsymmetric elliptic operators Lei Ni And I cherish more than anything else the Analogies, my most trustworthy masters. They know all the secrets of
More informationTowards stability and optimality in stochastic gradient descent
Towards stability and optimality in stochastic gradient descent Panos Toulis, Dustin Tran and Edoardo M. Airoldi August 26, 2016 Discussion by Ikenna Odinaka Duke University Outline Introduction 1 Introduction
More informationLower Tail Probabilities and Normal Comparison Inequalities. In Memory of Wenbo V. Li s Contributions
Lower Tail Probabilities and Normal Comparison Inequalities In Memory of Wenbo V. Li s Contributions Qi-Man Shao The Chinese University of Hong Kong Lower Tail Probabilities and Normal Comparison Inequalities
More informationStatistical Properties of Numerical Derivatives
Statistical Properties of Numerical Derivatives Han Hong, Aprajit Mahajan, and Denis Nekipelov Stanford University and UC Berkeley November 2010 1 / 63 Motivation Introduction Many models have objective
More informationOn rational approximation of algebraic functions. Julius Borcea. Rikard Bøgvad & Boris Shapiro
On rational approximation of algebraic functions http://arxiv.org/abs/math.ca/0409353 Julius Borcea joint work with Rikard Bøgvad & Boris Shapiro 1. Padé approximation: short overview 2. A scheme of rational
More informationInference for High Dimensional Robust Regression
Department of Statistics UC Berkeley Stanford-Berkeley Joint Colloquium, 2015 Table of Contents 1 Background 2 Main Results 3 OLS: A Motivating Example Table of Contents 1 Background 2 Main Results 3 OLS:
More informationHomogeneity Pursuit. Jianqing Fan
Jianqing Fan Princeton University with Tracy Ke and Yichao Wu http://www.princeton.edu/ jqfan June 5, 2014 Get my own profile - Help Amazing Follow this author Grace Wahba 9 Followers Follow new articles
More informationLARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011
LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS S. G. Bobkov and F. L. Nazarov September 25, 20 Abstract We study large deviations of linear functionals on an isotropic
More informationA UNIFIED APPROACH TO MODEL SELECTION AND SPARS. REGULARIZED LEAST SQUARES by Jinchi Lv and Yingying Fan The annals of Statistics (2009)
A UNIFIED APPROACH TO MODEL SELECTION AND SPARSE RECOVERY USING REGULARIZED LEAST SQUARES by Jinchi Lv and Yingying Fan The annals of Statistics (2009) Mar. 19. 2010 Outline 1 2 Sideline information Notations
More information