Quasi-regression for heritability


Art B. Owen
Stanford University
March 2012

Abstract

We show in an idealized model that the narrow sense (linear) heritability from $d$ autosomal SNPs can be estimated without bias from $n$ independently sampled individuals, by a method of moments strategy based on quasi-regression. The variance of this estimator is below $C_1/n + C_2 d^2/n^3$ for constants $C_i < \infty$, uniformly in $n$ and $d \ge 1$. In particular $d \gg n$ is allowed, and heritability can be consistently estimated in some limits where the effects of individual SNPs cannot be consistently estimated due to non-sparsity. Furthermore, when the $C_2 d^2/n^3$ term dominates, then doubling the sample size reduces the variance by almost eight fold. The idealization is that the model assumes complete linkage equilibrium, although it does allow for an arbitrary pattern of regression coefficients. In particular the coefficients need not be sparse or Gaussianly distributed, nor independent of the minor allele frequency. The method and the rate of convergence extend to additive heritability. For a full quadratic model encompassing pairwise interactions with $\bar d = 2d + d(d-1)/2$ coefficients the error variance is $O(d^4/n^3)$.

1 Introduction

A missing heritability problem has emerged in genome wide association studies (Manolio et al., 2009). A phenotype known to be largely heritable, for example height, may be found to have only a few significantly predictive SNPs, and those few may explain in total only a small portion of the known heritability. A typical data set has measurements of $d$ SNPs on $n$ subjects where frequently $d \gg n$. Multiple linear regression of the phenotype on SNPs is a useful way to study their effects. Even when $d \gg n$, it is still possible to estimate the needed regression coefficients, if the true coefficient vector is sparse. See for example Zhao and Yu (2006). When that coefficient vector is not sparse, then it is not possible to estimate it accurately (Candès and Davenport, 2011).

2 Notation and models

In this paper, the meaning of $a(n,d) = O(b(n,d))$ is that there is some constant $C < \infty$ for which $a(n,d) \le C\,b(n,d)$ holds for all $n \ge 1$ and all $d \ge 1$. That is, we use non-asymptotic order bounds.

Let $y_i$ be a phenotype for subject $i$. We assume that the population average of this phenotype has been subtracted from $y_i$ so that $E(y_i) = 0$. We will suppose that the phenotype is bounded, $|y_i| \le c < \infty$. Boundedness is not necessary, but we prefer to write $c^3$ instead of, for example, $E(y^{12})^{1/4}$. Conversely, we find $E(y^2)$ more informative than $c^2$, so we do not use the bound everywhere.

Let $\tilde x_{ij}$ be a genetic measure for SNP $j$ on subject $i$. For an autosomal SNP, $\tilde x_{ij}$ might be $1$ if subject $i$ has one or more copies of the minor allele and zero otherwise. Alternatively, $\tilde x_{ij} \in \{0, 1, 2\}$ might be the number of copies of the minor allele in subject $i$. Let $x_{ij} = (\tilde x_{ij} - E(\tilde x_{ij}))/\mathrm{Var}(\tilde x_{ij})^{1/2}$ so that $E(x_{ij}) = 0$ and $\mathrm{Var}(x_{ij}) = 1$ in the relevant population. We assume additionally that the components $x_{i1}, \dots, x_{id}$ of $x_i$ are independent of each other. This neglects linkage disequilibrium (LD). LD only affects a small fraction of the $O(d^2)$ SNP pairs, but we expect it could be a significant consideration in aggregate because there are so many such pairs.

The $k$'th moment of $x_j$ is $\mu_{kj} = E(x_j^k)$. We will need $\mu_{4j}$ for each SNP. If SNP $j$ has minor allele frequency $\epsilon_j > 0$, then $\mu_{4j} = O(\epsilon_j^{-1})$. The variable $x_j$ is bounded. In the motivating examples, $|x_j| \le c_j$ where $c_j = O(\epsilon_j^{-1/2})$. In most work, we assume that $\epsilon_j \ge \epsilon_0$ holds for some $\epsilon_0 > 0$. Then the $\mu_{4j}$ are uniformly bounded. Some intermediate expressions use the average $\bar\mu_4 = (1/d)\sum_{j=1}^d \mu_{4j}$. These allow us to consider a limit in which alleles with smaller MAF become eligible for use as $n$ increases. For example, if $\epsilon_j \ge n^{-1/2}$ then $\bar\mu_4 = O(n^{1/2})$, and if we only assume $\epsilon_j \ge m/n$ (perhaps the smallest reasonable MAFs to use) for some $m > 0$, then $\bar\mu_4 = O(n)$.

The Euclidean norm of $x_i$ is denoted $\|x_i\|$. We easily find that $E(\|x\|^2) = d$ and $E(\|x\|^4) = d^2 + d(\bar\mu_4 - 1)$, so $\mathrm{Var}(\|x\|^2) = d(\bar\mu_4 - 1)$. As a result $\|x\|^2/d = 1 + O_p(d^{-1/2}\bar\mu_4^{1/2})$. In particular

$$E\Bigl(\frac1n\sum_{i=1}^n\Bigl(\frac{\|x_i\|^2}{d} - 1\Bigr)\Bigr)^2 = \frac{\bar\mu_4 - 1}{nd},$$

and so if $d \gg n$, then $\|x_i\|^2 \approx d$ is a good approximation and one that underlies some of our methods.

We consider several models for heritability. The linear model is

$$y_i = \sum_{j=1}^d x_{ij}\beta_j + \varepsilon_i, \qquad i = 1, \dots, n \tag{1}$$

where $\varepsilon_i$ is a random error uncorrelated with $x_i = (x_{i1}, \dots, x_{id})$ having mean $0$ and variance $\sigma^2$. Effects of the environment, interactions among the SNPs, and interactions between SNPs and the environment are subsumed into $\varepsilon_i$. The pairs $(x_i, y_i)$ are independent.
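As a concrete illustration of this setup, the following minimal sketch (Python with NumPy; the simulation settings and all variable names are illustrative, not from the paper) standardizes simulated genotype counts, computes a sample version of $\bar\mu_4$, checks $E(\|x\|^2) = d$, and draws one non-sparse phenotype from the linear model (1):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 2000                         # d >> n is allowed
eps = rng.uniform(0.05, 0.5, size=d)     # MAFs bounded away from 0

# Genotype counts under Hardy-Weinberg equilibrium, independent SNPs (no LD)
G = rng.binomial(2, eps, size=(n, d))
X = (G - 2 * eps) / np.sqrt(2 * eps * (1 - eps))  # E(x_ij) = 0, Var(x_ij) = 1

mu4bar = (X ** 4).mean()                 # sample estimate of bar(mu)_4
print((X ** 2).sum(axis=1).mean() / d)   # ~1, since E(||x||^2) = d

# One non-sparse draw from the linear model (1), with sigma2_L = 0.4
beta = rng.normal(0.0, np.sqrt(0.4 / d), size=d)
y = X @ beta + rng.normal(0.0, np.sqrt(0.6), size=n)
```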

Interest naturally centers on the vector $\beta = (\beta_1, \dots, \beta_d)$. Estimation of $\beta$ is difficult because usually $n \ll d$. If $\beta$ is sparse, then it can still be estimated using compressed sensing methods. But $\beta$ might not be sparse and then there are lower bounds (Candès and Davenport, 2011) on its estimation error.

To study linear heritability we are interested in $\sigma^2_L = \sum_{j=1}^d \beta_j^2 = \|\beta\|^2$. If $\beta$ is not sparse enough to allow an accurate estimate $\hat\beta$ of $\beta$, then we cannot rely on $\|\hat\beta\|^2$ to be a good estimate of $\sigma^2_L$. Accordingly, we turn our attention to strategies for estimating $\|\beta\|^2$ directly without requiring an accurate estimate of $\beta$. Our starting point is the bias corrected quasi-regression estimator of $\sigma^2_L$ from Owen (2000) given below. That algorithm is able to estimate $\sigma^2_L$ in an asymptotic regime where $n/d^{2/3} \to \infty$, without assuming sparsity of $\beta$. That is, the sample size $n$ can be sublinear in $d$. We choose an asymptotic regime with $n$ and $d$ growing together not because more SNPs will be found, but because the usual framework with fixed $d$ and $n \to \infty$ is not a good description for a finite data set with $n \ll d$.

In addition to the linear model, we may also be interested in an additive model which contains $x_j^2$ terms. If $\tilde x_j$ only takes $2$ levels, then the centered value $x_j^2 - 1$ is linearly dependent on $x_j$, and we gain nothing from incorporating it into the linear model. When SNPs are recorded at $3$ levels, this additive model allows one to investigate dominance effects. Let

$$z_j = \frac{x_j^2 - \mu_{3j}x_j - 1}{\sqrt{\mu_{4j} - \mu_{3j}^2 - 1}}. \tag{2}$$

Then $E(z_j) = 0$, $E(z_j^2) = 1$ and $E(x_jz_j) = 0$. The additive model for heritability is

$$y_i = \sum_{j=1}^d x_{ij}\beta_j + \sum_{j=1}^d z_{ij}\gamma_j + \varepsilon_i, \qquad i = 1, \dots, n, \tag{3}$$

where $\varepsilon_i$ is an error uncorrelated with the $x_{ij}$ and the $z_{ij}$, and does not ordinarily take the same numerical value in models (1), (3) and others that we consider. We are also interested in the squares model

$$y_i = \sum_{j=1}^d z_{ij}\gamma_j + \varepsilon_i, \qquad i = 1, \dots, n, \tag{4}$$

the pure interaction model

$$y_i = \sum_{j=1}^{d-1}\sum_{k=j+1}^d x_{ij}x_{ik}\beta_{jk} + \varepsilon_i, \qquad i = 1, \dots, n, \tag{5}$$

and the quadratic model

$$y_i = \sum_{j=1}^d x_{ij}\beta_j + \sum_{j=1}^d z_{ij}\gamma_j + \sum_{j=1}^{d-1}\sum_{k=j+1}^d x_{ij}x_{ik}\beta_{jk} + \varepsilon_i, \qquad i = 1, \dots, n. \tag{6}$$
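The transformation (2) is easy to implement and to verify numerically. A minimal sketch (sample moments standing in for the population $\mu_{3j}$ and $\mu_{4j}$; the helper name is ours), with the orthogonality checks holding up to sampling error:

```python
import numpy as np

def squares_features(X):
    """Build z_j of equation (2) from a standardized SNP matrix X,
    using sample moments in place of the population mu_{3j}, mu_{4j}."""
    mu3 = (X ** 3).mean(axis=0)
    mu4 = (X ** 4).mean(axis=0)
    return (X ** 2 - mu3 * X - 1.0) / np.sqrt(mu4 - mu3 ** 2 - 1.0)

# Sanity check on 3-level genotypes: E(z) = 0, E(z^2) = 1, E(xz) = 0
rng = np.random.default_rng(1)
G = rng.binomial(2, 0.3, size=(100_000, 3))
X = (G - 0.6) / np.sqrt(0.42)            # standardize Binomial(2, 0.3)
Z = squares_features(X)
print(Z.mean(axis=0))                    # ~0
print((Z ** 2).mean(axis=0))             # ~1
print((X * Z).mean(axis=0))              # ~0
```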

Some of these are of direct biological interest and others play a role in our proofs. Corresponding to the model parameters above are the sums of squares $\sigma^2_L = \sum_j \beta_j^2$, $\sigma^2_S = \sum_j \gamma_j^2$, and $\sigma^2_I = \sum_{j<k} \beta_{jk}^2$, along with the combinations $\sigma^2_A = \sigma^2_L + \sigma^2_S$ and $\sigma^2_Q = \sigma^2_A + \sigma^2_I$. These sums of squares take the same value in any model for which all their parts are defined. The following inequality

$$\max(\sigma^2_L, \sigma^2_S, \sigma^2_I) \le \sigma^2_Q \le E(y^2) \tag{7}$$

will be useful in bounding some error terms. For instance, even though $\sigma^2_I$ is the sum of $d(d-1)/2$ nonnegative contributions, that sum is $O(1)$. Also, introducing artificial phenotypes like $y^2 - E(y^2)$, and applying a version of (7) for them, yields further bounds that we use.

3 Quasi-regression for linear heritability

We begin with estimation of $\sigma^2_L$, for linear heritability. For $d \gg n$ it is infeasible to estimate $\beta$ by least squares. The quasi-regression estimator of $\beta_j$ is

$$\tilde\beta_j = \frac1n\sum_{i=1}^n x_{ij}y_i.$$

We easily find that $E(\tilde\beta_j) = \beta_j$. The naive estimator of $\sigma^2_L$ is

$$\hat\sigma^2_L = \sum_{j=1}^d \tilde\beta_j^2.$$

Proposition 1 gives the bias of $\hat\sigma^2_L$.

Proposition 1.
$$E(\hat\sigma^2_L) = \frac{n-1}{n}\sigma^2_L + \frac1n E(\|x\|^2y^2).$$

Proof. First

$$nE(\tilde\beta_j^2) = \frac1n E\Bigl(\sum_i\sum_{i'} x_{ij}x_{i'j}y_iy_{i'}\Bigr) = E(x_j^2y^2) + (n-1)E(x_jy)^2 = E(x_j^2y^2) + (n-1)\beta_j^2.$$

Summing over $j$ yields the result. $\square$
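A minimal sketch of the naive estimator, illustrating the bias of Proposition 1 on simulated two-level SNPs (all settings illustrative; with $\pm 1$ coding, $E(\|x\|^2y^2)/n = dE(y^2)/n$ exactly):

```python
import numpy as np

def naive_sigma2_L(X, y):
    """Naive quasi-regression estimate of sigma2_L = ||beta||^2."""
    n = len(y)
    beta_tilde = X.T @ y / n          # tilde beta_j = (1/n) sum_i x_ij y_i
    return np.sum(beta_tilde ** 2)

rng = np.random.default_rng(2)
n, d = 200, 1000
X = rng.choice([-1.0, 1.0], size=(n, d))          # standardized 2-level SNPs
beta = rng.normal(0.0, np.sqrt(0.5 / d), size=d)  # dense: ||beta||^2 ~ 0.5
y = X @ beta + rng.normal(0.0, np.sqrt(0.5), size=n)

# Proposition 1: upward bias ~ E(||x||^2 y^2)/n = d E(y^2)/n = 5 here
print(naive_sigma2_L(X, y))    # roughly 5.5, far above sigma2_L ~ 0.5
```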

Because $E(\|x\|^2) = d$, the bias in $\hat\sigma^2_L$ is of order $d/n$, which is severe when $d \gg n$. But this bias is easily removed. Let

$$B_L = \frac1{n^2}\sum_{i=1}^n \|x_i\|^2y_i^2. \tag{8}$$

We can obtain a bias corrected estimate

$$\hat\sigma^2_{L,\mathrm{BC}} = \frac{n}{n-1}\bigl(\hat\sigma^2_L - B_L\bigr), \tag{9}$$

because

$$E(\hat\sigma^2_{L,\mathrm{BC}}) = \frac{n}{n-1}\Bigl(\frac{n-1}{n}\sigma^2_L + \frac1nE(\|x\|^2y^2) - \frac1nE(\|x\|^2y^2)\Bigr) = \sigma^2_L.$$

A simple approximation to the bias adjustment can be obtained via the concentration of measure approximation $\|x_i\|^2 \approx d$. Letting

$$\tilde B_L = \frac{d}{n^2}\sum_{i=1}^n y_i^2,$$

the resulting estimate is

$$\hat\sigma^2_{L,\mathrm{BC,CM}} = \frac{n}{n-1}\bigl(\hat\sigma^2_L - \tilde B_L\bigr). \tag{10}$$

Proposition 2. Suppose that the phenotype $y$ satisfies the bound $|y| \le c$. Then

$$E\bigl((B_L - \tilde B_L)^2\bigr) \le \frac{c^4\bar\mu_4 d}{n^2}.$$

Proof. First

$$\tilde B_L - B_L = \frac1{n^2}\sum_{i=1}^n (d - \|x_i\|^2)y_i^2.$$

Therefore

$$E\bigl((\tilde B_L - B_L)^2\bigr) = \frac1{n^3}E\bigl((d - \|x\|^2)^2y^4\bigr) + \frac{n-1}{n^3}E\bigl((d - \|x\|^2)y^2\bigr)^2 \le \frac{c^4\bar\mu_4 d}{n^3} + \frac{c^4(n-1)}{n^3}d\bar\mu_4 \le \frac{c^4\bar\mu_4 d}{n^2}. \square$$

Proposition 2 shows that the concentration of measure approximation changes our estimate of $\sigma^2_L$ by an amount with root mean square (RMS) $O(d^{1/2}n^{-1}\bar\mu_4^{1/2})$, where $c$ is assumed to be constant in $n$ and $d$ for the phenotype of interest. In an asymptotic regime with $\bar\mu_4$ bounded, we find that the concentration of measure approximation makes a vanishing RMS effect on our estimate if $d = o(n^2)$. Simulations in Owen (2000) showed very little difference between approximations with and without the concentration of measure approximation.
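Both bias corrected estimates (9) and (10) are short computations given the design matrix. An illustrative sketch (function name ours; simulated data as before), where both versions land near $\sigma^2_L$ up to the sampling noise quantified by Theorem 1 below:

```python
import numpy as np

def sigma2_L_BC(X, y, concentration=False):
    """Bias corrected estimate (9); with concentration=True, the
    concentration of measure shortcut (10)."""
    n, d = X.shape
    beta_tilde = X.T @ y / n
    naive = np.sum(beta_tilde ** 2)
    if concentration:
        B = d * np.sum(y ** 2) / n ** 2                      # tilde B_L
    else:
        B = np.sum((X ** 2).sum(axis=1) * y ** 2) / n ** 2   # B_L of (8)
    return n / (n - 1) * (naive - B)

rng = np.random.default_rng(3)
n, d = 200, 1000
eps = rng.uniform(0.05, 0.5, size=d)
G = rng.binomial(2, eps, size=(n, d))
X = (G - 2 * eps) / np.sqrt(2 * eps * (1 - eps))
beta = rng.normal(0.0, np.sqrt(0.5 / d), size=d)
y = X @ beta + rng.normal(0.0, np.sqrt(0.5), size=n)
print(sigma2_L_BC(X, y), sigma2_L_BC(X, y, concentration=True))
```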

Theorem 1. Let $\hat\sigma^2_L$ be the naive quasi-regression variance estimate and assume that the phenotype satisfies the bound $|y| \le c$. Then

$$E\bigl((\hat\sigma^2_{L,\mathrm{BC}} - \sigma^2_L)^2\bigr) = \mathrm{Var}(\hat\sigma^2_{L,\mathrm{BC}}) = O\Bigl(\frac1n + \frac{d^2}{n^3}\Bigr).$$

Proof. It suffices to show that $\mathrm{Var}(\hat\sigma^2_L - B_L) = O(n^{-1} + d^2n^{-3})$. A lengthy derivation (see Lemma 3 of the appendix) yields

$$\mathrm{Var}(\hat\sigma^2_L) \le \frac4nc^2\sigma^2_L + \frac1{n^2}\Bigl(4\sigma_Lc^3(d^2 + d\bar\mu_4)^{1/2} + 2(d\bar\mu_4 + 2)E(y^4)\Bigr) + \frac1{n^3}c^4(d^2 + d\bar\mu_4) = O(n^{-1} + dn^{-2} + d^2n^{-3}) = O(n^{-1} + d^2n^{-3}).$$

Next

$$\mathrm{Var}(B_L) = \frac1{n^3}\Bigl(E(\|x\|^4y^4) - E(\|x\|^2y^2)^2\Bigr) \le \frac{c^4}{n^3}(d^2 + d\bar\mu_4).$$

Thus $\mathrm{Var}(B_L) = O(d^2n^{-3})$ and $\mathrm{Var}(\hat\sigma^2_L) = O(n^{-1} + d^2n^{-3})$, so

$$\mathrm{Var}(\hat\sigma^2_L - B_L) \le \mathrm{Var}(\hat\sigma^2_L) + \mathrm{Var}(B_L) + 2\bigl(\mathrm{Var}(\hat\sigma^2_L)\mathrm{Var}(B_L)\bigr)^{1/2} = O(n^{-1} + d^2n^{-3}). \square$$

When $d \gg n$, the dominant term in the bound for $\mathrm{Var}(\hat\sigma^2_{L,\mathrm{BC}})$ is $4c^4d^2/n^3$. In that case, doubling $n$ reduces the variance by nearly a factor of eight. The concentration of measure approximation is accurate to within $O(d^{1/2}n^{-1})$, while the standard deviation of $\hat\sigma^2_{L,\mathrm{BC}}$ is $O(n^{-1/2} + dn^{-3/2})$. As a result, the bias introduced by concentration of measure is of order $(\sqrt{n/d} + \sqrt{d/n}\,)^{-1}$ standard errors. It is thus negligible for the values of $n$ and $d$ used in many genome-wide association studies. Both $\hat\sigma^2_{L,\mathrm{BC}}$ and $\hat\sigma^2_{L,\mathrm{BC,CM}}$ require $O(nd)$ computation, but the latter has a smaller implicit constant.
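A small Monte Carlo experiment (illustrative settings, not from the paper) can exhibit the $d^2/n^3$ regime and the near eight fold variance drop when $n$ doubles:

```python
import numpy as np

def mc_variance(n, d, sigma2_L=0.5, reps=300, seed=0):
    """Monte Carlo variance of the bias corrected estimator, to
    illustrate the O(1/n + d^2/n^3) bound of Theorem 1."""
    rng = np.random.default_rng(seed)
    beta = rng.normal(0.0, np.sqrt(sigma2_L / d), size=d)  # fixed truth
    ests = []
    for _ in range(reps):
        X = rng.choice([-1.0, 1.0], size=(n, d))
        y = X @ beta + rng.normal(0.0, np.sqrt(1.0 - sigma2_L), size=n)
        bt = X.T @ y / n
        B = np.sum((X ** 2).sum(axis=1) * y ** 2) / n ** 2
        ests.append(n / (n - 1) * (np.sum(bt ** 2) - B))
    return np.var(ests)

# With d >> n the d^2/n^3 term dominates, so doubling n should cut
# the variance by roughly a factor of eight:
print(mc_variance(100, 2000), mc_variance(200, 2000))
```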

4 Additive and quadratic heritability

The additive model predicts $y$ by a linear combination of $x_j$ and $z_j$ for $j = 1, \dots, d$. Because $x_1, \dots, x_d$ are independent, it follows that the entire list $x_1, \dots, x_d, z_1, \dots, z_d$ consists of $2d$ mutually uncorrelated predictors. The analysis of the additive model is similar to that of the linear model, except that there are twice as many predictors and the fourth moments of $z_j$ are larger than those of $x_j$. The quadratic model incorporates predictors $x_jx_k$ for $j < k$. For distinct pairs $j < k$ and $j' < k'$ with $j \ne j'$ or $k \ne k'$, the predictors $x_jx_k$ and $x_{j'}x_{k'}$ are uncorrelated. Similarly $x_jx_k$ is uncorrelated with all of the $x_l$ and $z_l$, whether or not $l = j$ or $l = k$. The big difference in the quadratic model is that there are now more than $d^2/2$ coefficients to estimate instead of $O(d)$ coefficients.

For the additive model, we estimate

$$\tilde\beta_j = \frac1n\sum_{i=1}^n x_{ij}y_i \quad\text{and}\quad \tilde\gamma_j = \frac1n\sum_{i=1}^n z_{ij}y_i,$$

and the naive estimate of $\sigma^2_A$ is

$$\hat\sigma^2_A = \sum_{j=1}^d (\tilde\beta_j^2 + \tilde\gamma_j^2).$$

Proposition 3.
$$E(\hat\sigma^2_A) = \frac{n-1}{n}\sigma^2_A + \frac1n E\bigl((\|x\|^2 + \|z\|^2)y^2\bigr).$$

Proof. The same argument used in Proposition 1 applies here. $\square$

Letting

$$B_A = \frac1{n^2}\sum_{i=1}^n (\|x_i\|^2 + \|z_i\|^2)y_i^2 \quad\text{and}\quad \tilde B_A = \frac{2d}{n^2}\sum_{i=1}^n y_i^2,$$

we obtain the estimates

$$\hat\sigma^2_{A,\mathrm{BC}} = \frac{n}{n-1}\bigl(\hat\sigma^2_A - B_A\bigr) \quad\text{and}\quad \hat\sigma^2_{A,\mathrm{BC,CM}} = \frac{n}{n-1}\bigl(\hat\sigma^2_A - \tilde B_A\bigr).$$

As in the linear case we find that $E(\hat\sigma^2_{A,\mathrm{BC}}) = \sigma^2_A$. Also, from the decomposition $A = L + S$ we get

$$E\bigl((B_A - \tilde B_A)^2\bigr) \le \frac{c^4\bigl(\bar\mu_4(x) + \bar\mu_4(z) + 2(\bar\mu_4(x)\bar\mu_4(z))^{1/2}\bigr)d}{n^2},$$

where $\bar\mu_4(x)$ and $\bar\mu_4(z)$ are the average fourth moments of the $x_j$ and $z_j$ respectively.

Theorem 2. Let $\hat\sigma^2_A$ be the naive quasi-regression variance estimate of $\sigma^2_A$ and assume that the phenotype satisfies the bound $|y| \le c$. Then

$$\mathrm{Var}(\hat\sigma^2_{A,\mathrm{BC}}) = O\Bigl(\frac1n + \frac{d^2}{n^3}\Bigr).$$

Proof. The additive variance explained is $\sigma^2_A = \sigma^2_L + \sigma^2_S$, and the estimate partitions accordingly, as $\hat\sigma^2_{A,\mathrm{BC}} = \hat\sigma^2_{L,\mathrm{BC}} + \hat\sigma^2_{S,\mathrm{BC}}$. We can apply Theorem 1 to both parts, letting $z_j$ take the role of $x_j$ in the second application. $\square$
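The additive estimator reuses the linear machinery on the concatenated predictors. A sketch (assuming Z was built with the squares_features helper sketched after equation (2)):

```python
import numpy as np

def sigma2_A_BC(X, Z, y):
    """Bias corrected additive estimate of sigma2_A = sigma2_L + sigma2_S.
    Z holds the z_j of equation (2); the 2d columns of (X, Z) are
    mutually uncorrelated, so the linear machinery applies unchanged."""
    n = len(y)
    bt = X.T @ y / n                       # tilde beta_j
    gt = Z.T @ y / n                       # tilde gamma_j
    naive = np.sum(bt ** 2) + np.sum(gt ** 2)
    B_A = np.sum(((X ** 2).sum(axis=1) + (Z ** 2).sum(axis=1)) * y ** 2) / n ** 2
    return n / (n - 1) * (naive - B_A)     # unbiased for sigma2_A
```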

For the quadratic model, we incorporate the estimates

$$\tilde\beta_{jk} = \frac1n\sum_{i=1}^n x_{ij}x_{ik}y_i$$

for $1 \le j < k \le d$. The naive estimate of $\sigma^2_Q$ is

$$\hat\sigma^2_Q = \sum_{j=1}^d(\tilde\beta_j^2 + \tilde\gamma_j^2) + \sum_{j<k}\tilde\beta_{jk}^2.$$

Below we write $\bar x_i = (x_{i1}x_{i2}, x_{i1}x_{i3}, \dots, x_{i,d-1}x_{id}) \in \mathbb{R}^{d(d-1)/2}$ for the vector of pairwise products, so that $\|\bar x_i\|^2 = \sum_{j<k}x_{ij}^2x_{ik}^2 = (\|x_i\|^4 - \sum_jx_{ij}^4)/2$.

Proposition 4.
$$E(\hat\sigma^2_Q) = \frac{n-1}{n}\sigma^2_Q + \frac1n E\bigl((\|x\|^2 + \|z\|^2 + \|\bar x\|^2)y^2\bigr).$$

Proof. Recall that $\sigma^2_Q = \sigma^2_L + \sigma^2_S + \sigma^2_I$. To take account of the $\sigma^2_I$ component, write for $j < k$,

$$nE(\tilde\beta_{jk}^2) = \frac1n\sum_i\sum_{i'}E\bigl(x_{ij}x_{ik}x_{i'j}x_{i'k}y_iy_{i'}\bigr) = E(x_j^2x_k^2y^2) + (n-1)\beta_{jk}^2.$$

Summing over $1 \le j < k \le d$ yields

$$\sum_{j<k}E(\tilde\beta_{jk}^2) = \frac1nE(\|\bar x\|^2y^2) + \frac{n-1}{n}\sigma^2_I. \square$$

Once again, we get a bias correction and a concentration of measure approximation:

$$B_Q = \frac1{n^2}\sum_{i=1}^n\bigl(\|x_i\|^2 + \|z_i\|^2 + \|\bar x_i\|^2\bigr)y_i^2 \quad\text{and}\quad \tilde B_Q = \Bigl(2d + \frac{d^2 + d\bar\mu_4}{2}\Bigr)\frac1{n^2}\sum_{i=1}^n y_i^2.$$

The bias is now of the much larger order $d^2$, consistent with there being more than $d^2/2$ parameters to estimate. Using these expressions, we obtain the estimates

$$\hat\sigma^2_{Q,\mathrm{BC}} = \frac{n}{n-1}\bigl(\hat\sigma^2_Q - B_Q\bigr) \quad\text{and}\quad \hat\sigma^2_{Q,\mathrm{BC,CM}} = \frac{n}{n-1}\bigl(\hat\sigma^2_Q - \tilde B_Q\bigr).$$

Here $E(\hat\sigma^2_{Q,\mathrm{BC}}) = \sigma^2_Q$.
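An illustrative implementation of the interaction part (assuming the reconstruction of $B_Q$ above): rather than materializing all $d(d-1)/2$ products, the $\tilde\beta_{jk}$ can be collected in the $d \times d$ matrix $M = X^\mathsf{T}\mathrm{diag}(y)X/n$, and $\|\bar x_i\|^2$ computed from the square-of-sums identity above. The bias corrected quadratic estimate is then the sum of the additive and interaction parts, $\hat\sigma^2_{Q,\mathrm{BC}} = \hat\sigma^2_{A,\mathrm{BC}} + \hat\sigma^2_{I,\mathrm{BC}}$:

```python
import numpy as np

def sigma2_I_BC(X, y):
    """Bias corrected estimate of the interaction variance sigma2_I.
    M[j,k] = (1/n) sum_i x_ij x_ik y_i collects every tilde beta_{jk};
    forming M costs O(n d^2) time and O(d^2) memory."""
    n, d = X.shape
    M = (X * y[:, None]).T @ X / n
    naive = (np.sum(M ** 2) - np.sum(np.diag(M) ** 2)) / 2  # sum over j < k
    # ||xbar_i||^2 = sum_{j<k} x_ij^2 x_ik^2 via the square-of-sums identity
    xbar_sq = ((X ** 2).sum(axis=1) ** 2 - (X ** 4).sum(axis=1)) / 2
    B_I = np.sum(xbar_sq * y ** 2) / n ** 2
    return n / (n - 1) * (naive - B_I)
```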

In the model equation $Q = L + S + I$ we have already seen how the linear and square parts have an accurate concentration of measure approximation. It remains to consider $B_I - \tilde B_I$.

Proposition 5. For the interaction model, the bias adjustments satisfy

$$E\bigl((B_I - \tilde B_I)^2\bigr) = O\Bigl(\frac{d^3}{n^2}\Bigr).$$

Proof. Here $B_I = (1/n^2)\sum_i\|\bar x_i\|^2y_i^2$ and $\tilde B_I = ((d^2 + d\bar\mu_4)/(2n^2))\sum_iy_i^2$, with $2\|\bar x_i\|^2 = \|x_i\|^4 - \sum_jx_{ij}^4$. Therefore

$$E\bigl((B_I - \tilde B_I)^2\bigr) = \frac1{4n^4}E\Bigl(\Bigl(\sum_{i=1}^n\bigl((d^2 + d\bar\mu_4) - 2\|\bar x_i\|^2\bigr)y_i^2\Bigr)^2\Bigr)$$
$$= \frac1{4n^3}E\bigl(\bigl((d^2 + d\bar\mu_4) - 2\|\bar x\|^2\bigr)^2y^4\bigr) + \frac{n-1}{4n^3}E\bigl(\bigl((d^2 + d\bar\mu_4) - 2\|\bar x\|^2\bigr)y^2\bigr)^2 \le \frac{c^4}{n^2}\bigl(\mathrm{Var}(\|x\|^4) + O(d^2)\bigr) = O\Bigl(\frac{d^3}{n^2}\Bigr). \square$$

Theorem 3. Let $\hat\sigma^2_Q$ be the naive quasi-regression variance estimate of $\sigma^2_Q$ and assume that the phenotype satisfies the bound $|y| \le c$. Then

$$\mathrm{Var}(\hat\sigma^2_{Q,\mathrm{BC}}) = O\Bigl(\frac1n + \frac{d^4}{n^3}\Bigr).$$

Proof. The error from the additive parts is already seen to be $O(1/n + d^2/n^3)$. Therefore it suffices to show that $\mathrm{Var}(\hat\sigma^2_{I,\mathrm{BC}}) = O(1/n + d^4/n^3)$. This is exactly what we would expect from a linear model with $d' = d(d-1)/2$ independent predictors. But the predictors in the interaction model, while uncorrelated, are not independent. We can follow the argument for the linear case using the $d'$ predictors $x_jx_k$ for $j < k$ in Lemma 3, replacing each $x_j$ there by $x_rx_s$ for $r < s$, each $x_k$ by $x_jx_k$ for $j < k$, and the vector $x$ by $\bar x = (x_1x_2, x_1x_3, \dots, x_{d-1}x_d) \in \mathbb{R}^{d'}$. We then use $E(x_jx_ky) = \beta_{jk}$ and replace $\varepsilon_L$ by $\varepsilon_I = y - \sum_{j<k}x_jx_k\beta_{jk}$. The analogy holds in a straightforward way except for the term

$$\frac{n-1}{n^3}\sum_{j<k}\sum_{r<s}E\bigl(x_jx_kx_rx_sy^2\bigr)^2$$

which now involves four way sums over components of $x$. We need this sum of squares to be $O(d^2)$. If we use the artificial phenotype $y^2 - E(y^2)$ and consider a pure quartic model with $d(d-1)(d-2)(d-3)/24$ predictors $x_jx_kx_rx_s$ with $j$, $k$, $r$, and $s$ all distinct, then the terms with no ties among $j$, $k$, $r$, $s$ sum to at most a bounded multiple of $E(y^4)$. What remains are the terms where $\{j, k\} = \{r, s\}$. There are $O(d^2)$ such terms and they are uniformly bounded. Thus while it is possible to estimate the interaction inheritance, and hence the quadratic inheritance, with $n \ll d^2$, we still need $n$ of larger order than $d^{4/3}$. $\square$

Practically, this implies that we may be able to investigate interactions among strategically chosen subsets of SNPs, but perhaps not the entire set of interactions at practical sample sizes.

The concentration of measure approximation for the interaction inheritance is $\|x\|^4 \approx d^2 + d\bar\mu_4$. The root mean square change from this shortcut is $O(d^{3/2}/n)$, while the standard deviation of the estimate is $O(n^{-1/2} + d^2n^{-3/2})$. Suppose that $n$ is just barely large enough to allow estimation of $\sigma^2_I$, perhaps $n = d^{4/3+\epsilon}$. Then the RMS difference between $\hat\sigma^2_{I,\mathrm{BC}}$ and $\hat\sigma^2_{I,\mathrm{BC,CM}}$ is of order $d^{3/2}n^{-1} = d^{1/6-\epsilon}$, which diverges for small $\epsilon$. As a result, the concentration of measure shortcut is not recommended for the interaction or quadratic model.
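For example, restricting the estimator to a candidate subset (reusing the sigma2_I_BC sketch above; the subset indices and simulation settings are hypothetical), where with $|S| = m$ SNPs there are only $m(m-1)/2$ pairs and consistency needs $n \gg m^{4/3}$ rather than $n \gg d^{4/3}$:

```python
import numpy as np

rng = np.random.default_rng(5)
n, d = 2000, 500
X = rng.choice([-1.0, 1.0], size=(n, d))
y = 0.5 * X[:, 10] * X[:, 42] + rng.normal(0.0, 1.0, size=n)  # one real pair

S = np.array([10, 42, 77, 105])          # hypothetical candidate SNPs
print(sigma2_I_BC(X[:, S], y))           # ~0.25 = beta_{10,42}^2
```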

Acknowledgments

I thank Or Zuk for helpful discussions. This work was supported by the NSF under grant DMS.

References

Candès, E. and Davenport, M. A. (2011). How well can we estimate a sparse vector? Technical report, Stanford University.

Manolio, T. A., Collins, F. S., Cox, N. J., Goldstein, D. B., Hindorff, L. A., Hunter, D. J., McCarthy, M. I., Ramos, E. M., Cardon, L. R., Chakravarti, A., Cho, J. H., Guttmacher, A. E., Kong, A., Kruglyak, L., Mardis, E., Rotimi, C. N., Slatkin, M., Valle, D., Whittemore, A. S., Boehnke, M., Clark, A. G., Eichler, E. E., Gibson, G., Haines, J. L., Mackay, T. F. C., McCarroll, S. A., and Visscher, P. M. (2009). Finding the missing heritability of complex diseases. Nature, 461:747–753.

Owen, A. B. (2000). Assessing linearity in high dimensions. The Annals of Statistics, 28(1):1–19.

Zhao, P. and Yu, B. (2006). On model selection consistency of the lasso. Journal of Machine Learning Research, 7:2541–2563.

Appendix

Here we prove some basic Lemmas used in the main Theorem. The following basic properties of $\|x\|$ are used below:

$$E(\|x\|^2) = d, \qquad E(\|x\|^4) = d^2 + d(\bar\mu_4 - 1), \qquad E\bigl((\|x\|^2 - d)^2\bigr) = d(\bar\mu_4 - 1),$$

and $E\bigl((\|x\|^2 - d)^r\bigr) = O(d^{r-1})$ for integer $r \ge 3$.
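These moment identities are easy to spot check numerically (illustrative settings; three-level genotypes are used because two-level $\pm 1$ coding makes $\|x\|^2 = d$ exactly, which would make the check degenerate):

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 200_000, 50
G = rng.binomial(2, 0.3, size=(n, d))
X = (G - 0.6) / np.sqrt(0.42)
s = (X ** 2).sum(axis=1)
mu4bar = (X ** 4).mean()
print(s.mean(), d)                                   # E(||x||^2) = d
print((s ** 2).mean(), d ** 2 + d * (mu4bar - 1))    # E(||x||^4)
print(((s - d) ** 2).mean(), d * (mu4bar - 1))       # Var(||x||^2)
```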

Lemma 1 uses the artificial phenotype $y^2 - E(y^2)$ to bound a quantity which appears in $\mathrm{Var}(\hat\sigma^2_L)$.

Lemma 1.
$$\sum_{j=1}^d\sum_{k=1}^d E(x_jx_ky^2)^2 \le (d\bar\mu_4 + 2)E(y^4).$$

Proof. We write

$$\sum_{j,k}E(x_jx_ky^2)^2 = \sum_{j\ne k}E(x_jx_ky^2)^2 + \sum_jE(x_j^2y^2)^2.$$

For the first term, consider the artificial phenotype $y^2 - E(y^2)$. Then

$$\sum_{j\ne k}E(x_jx_ky^2)^2 = 2\sum_{j<k}E\bigl(x_jx_k(y^2 - E(y^2))\bigr)^2 \le 2E\bigl((y^2 - E(y^2))^2\bigr) \le 2E(y^4),$$

because of inequality (7) applied to the interaction model for this artificial phenotype. Next

$$\sum_jE(x_j^2y^2)^2 \le \sum_jE(x_j^4)E(y^4) \le E(y^4)\,d\bar\mu_4. \square$$

Lemma 2.
$$E\bigl((y - \varepsilon_L)\|x\|^2y^3\bigr) \le \sigma_Lc^3(d^2 + d\bar\mu_4)^{1/2}.$$

Proof. Using Cauchy–Schwarz, boundedness of $y$, and the above results on $\|x\|$:

$$E\bigl((y - \varepsilon_L)\|x\|^2y^3\bigr) \le E\bigl((y - \varepsilon_L)^2\bigr)^{1/2}E\bigl(\|x\|^4y^6\bigr)^{1/2} \le \sigma_Lc^3(d^2 + d\bar\mu_4)^{1/2}. \square$$

Lemma 3 is the main lemma.

Lemma 3. Under the conditions of Theorem 1,

$$\mathrm{Var}(\hat\sigma^2_L) \le \frac4nc^2\sigma^2_L + \frac1{n^2}\Bigl(4\sigma_Lc^3(d^2 + d\bar\mu_4)^{1/2} + 2(d\bar\mu_4 + 2)E(y^4)\Bigr) + \frac1{n^3}c^4(d^2 + d\bar\mu_4).$$

Proof. First,

$$\mathrm{Var}(\hat\sigma^2_L) = E\Bigl(\Bigl(\sum_j\tilde\beta_j^2\Bigr)^2\Bigr) - \Bigl(E\Bigl(\sum_j\tilde\beta_j^2\Bigr)\Bigr)^2.$$

We simplify some expressions below using $E(x_jy) = \beta_j$ and $\varepsilon_L = y - x^\mathsf{T}\beta$. For each pair $j, k \in \{1, 2, \dots, d\}$,

$$E(\tilde\beta_j^2\tilde\beta_k^2) = E\Bigl(\Bigl(\frac1n\sum_ix_{ij}y_i\Bigr)^2\Bigl(\frac1n\sum_ix_{ik}y_i\Bigr)^2\Bigr) = \frac1{n^4}E\Bigl(\sum_{i_1}\sum_{i_2}\sum_{i_3}\sum_{i_4}x_{i_1j}x_{i_2j}x_{i_3k}x_{i_4k}y_{i_1}y_{i_2}y_{i_3}y_{i_4}\Bigr)$$
$$= \frac{(n-1)(n-2)(n-3)}{n^3}E(x_jy)^2E(x_ky)^2$$
$$+ \frac{(n-1)(n-2)}{n^3}\Bigl(E(x_j^2y^2)E(x_ky)^2 + E(x_k^2y^2)E(x_jy)^2 + 4E(x_jx_ky^2)E(x_jy)E(x_ky)\Bigr)$$
$$+ \frac{n-1}{n^3}\Bigl(2E(x_jy)E(x_jx_k^2y^3) + 2E(x_ky)E(x_j^2x_ky^3) + E(x_j^2y^2)E(x_k^2y^2) + 2E(x_jx_ky^2)^2\Bigr)$$
$$+ \frac1{n^3}E(x_j^2x_k^2y^4).$$

Summing over $j$ and $k$,

$$\sum_{j,k}E(\tilde\beta_j^2\tilde\beta_k^2) = \frac{(n-1)(n-2)(n-3)}{n^3}\sigma_L^4 + \frac{(n-1)(n-2)}{n^3}\Bigl(2\sigma^2_LE(\|x\|^2y^2) + 4E\bigl((y - \varepsilon_L)^2y^2\bigr)\Bigr)$$
$$+ \frac{n-1}{n^3}\Bigl(4E\bigl((y - \varepsilon_L)\|x\|^2y^3\bigr) + E(\|x\|^2y^2)^2 + 2\sum_{j,k}E(x_jx_ky^2)^2\Bigr) + \frac1{n^3}E(\|x\|^4y^4).$$

Next $E(\tilde\beta_j^2) = E(x_j^2y^2)/n + (n-1)\beta_j^2/n$. Therefore

$$\Bigl(E\Bigl(\sum_j\tilde\beta_j^2\Bigr)\Bigr)^2 = \frac1{n^2}E(\|x\|^2y^2)^2 + \frac{2(n-1)}{n^2}E(\|x\|^2y^2)\sigma^2_L + \frac{(n-1)^2}{n^2}\sigma_L^4.$$

Subtracting,

$$\mathrm{Var}(\hat\sigma^2_L) = \Bigl(\frac{(n-1)(n-2)(n-3)}{n^3} - \frac{(n-1)^2}{n^2}\Bigr)\sigma_L^4 + \Bigl(\frac{2(n-1)(n-2)}{n^3} - \frac{2(n-1)}{n^2}\Bigr)E(\|x\|^2y^2)\sigma^2_L$$
$$+ \Bigl(\frac{n-1}{n^3} - \frac1{n^2}\Bigr)E(\|x\|^2y^2)^2 + \frac{4(n-1)(n-2)}{n^3}E\bigl((y - \varepsilon_L)^2y^2\bigr)$$
$$+ \frac{n-1}{n^3}\Bigl(4E\bigl((y - \varepsilon_L)\|x\|^2y^3\bigr) + 2\sum_{j,k}E(x_jx_ky^2)^2\Bigr) + \frac1{n^3}E(\|x\|^4y^4).$$

The first three terms above are less than or equal to zero. Therefore

$$\mathrm{Var}(\hat\sigma^2_L) \le \frac{4(n-1)(n-2)}{n^3}E\bigl((y - \varepsilon_L)^2y^2\bigr) + \frac{n-1}{n^3}\Bigl(4E\bigl((y - \varepsilon_L)\|x\|^2y^3\bigr) + 2\sum_{j,k}E(x_jx_ky^2)^2\Bigr) + \frac1{n^3}E(\|x\|^4y^4).$$

We can simplify this bound using $E\bigl((y - \varepsilon_L)^2y^2\bigr) \le c^2\sigma^2_L$, Lemmas 1 and 2, and $E(\|x\|^4y^4) \le c^4(d^2 + d\bar\mu_4)$, yielding

$$\mathrm{Var}(\hat\sigma^2_L) \le \frac{4(n-1)(n-2)}{n^3}c^2\sigma^2_L + \frac{n-1}{n^3}\Bigl(4\sigma_Lc^3(d^2 + d\bar\mu_4)^{1/2} + 2(d\bar\mu_4 + 2)E(y^4)\Bigr) + \frac1{n^3}c^4(d^2 + d\bar\mu_4)$$
$$\le \frac4nc^2\sigma^2_L + \frac1{n^2}\Bigl(4\sigma_Lc^3(d^2 + d\bar\mu_4)^{1/2} + 2(d\bar\mu_4 + 2)E(y^4)\Bigr) + \frac1{n^3}c^4(d^2 + d\bar\mu_4). \square$$
