False Discovery Rate


1 False Discovery Rate

Peng Zhao
Department of Statistics, Florida State University
December 3, 2018

2 Outline

1. Multiple Comparison and FWER
2. False Discovery Rate
3. FDR Control through Benjamini-Hochberg
4. FDR Control with Knockoffs
5. FDR Control through SLOPE

3 Multiple Comparison and FWER: The Problem

Recall the definition of a p-value: the probability of observing the given results, or more extreme ones, under the null hypothesis. For high-dimensional data, the number $p$ of hypotheses that must be tested simultaneously is large. One can further show that if we use a chi-square statistic to test the overall effect, the power is almost zero when the true alternative is sparse (as, for example, in screening problems). We need a more appropriate criterion for multiple comparison; the goal is to make fewer mistakes while still constructing a powerful test.

4 Multiple Comparison and FWER: FWER Control

A straightforward approach is to control the Family-Wise Error Rate (FWER): the probability of rejecting any true null hypothesis.

Bonferroni correction: test each individual hypothesis at level $\alpha/m$.

Holm's procedure: let the ordered p-values be $p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}$. Reject $H_{(i)}$ if
$$p_{(j)} \le \frac{\alpha}{m - j + 1} \quad \text{for } j = 1, 2, \dots, i.$$
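As a small illustration (a sketch, not part of the original slides; the function names are mine and NumPy is assumed), the two procedures can be written in a few lines:

```python
import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Reject H_i whenever p_i <= alpha / m."""
    pvals = np.asarray(pvals)
    return pvals <= alpha / len(pvals)

def holm(pvals, alpha=0.05):
    """Holm's step-down procedure: with ordered p-values
    p_(1) <= ... <= p_(m), reject H_(i) if p_(j) <= alpha/(m - j + 1)
    for all j = 1, ..., i."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)
    reject = np.zeros(m, dtype=bool)
    for i, idx in enumerate(order):        # i corresponds to j - 1
        if pvals[idx] <= alpha / (m - i):  # threshold alpha / (m - j + 1)
            reject[idx] = True
        else:
            break                          # step down: stop at first failure
    return reject
```

Holm rejects everything Bonferroni rejects, and possibly more, while still controlling the FWER.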

5 Multiple Comparison and FWER: Problems with FWER Control

Sometimes control of the FWER is not really needed. For example, when comparing a treatment group and a control group by testing various aspects of the effect, even if some of the null hypotheses are falsely rejected, the overall conclusion that the treatment is better need not be wrong. Another example is the screening problem (e.g., candidates for drug development), where we want to obtain as many discoveries as possible, but too large a fraction of false discoveries would burden the second-phase analysis. Thus it is hard to choose a suitable significance level.

6 False Discovery Rate: False Discovery Rates

Consider testing $m$ null hypotheses simultaneously, of which $m_0$ are true. The results can be summarized as:

Truth \ Decision    Not rejected    Rejected    Total
True null           U               V           m_0
False null          T               S           m - m_0
Total               m - R           R           m

Notice that FWER = $P(V \ge 1)$.

7 False Discovery Rate: False Discovery Rates

Let $R$ be the number of hypotheses rejected. Define the false discovery proportion (FDP) as $V/\max(R, 1)$, and then define the False Discovery Rate (FDR) as
$$\mathrm{FDR} = E(\mathrm{FDP}) = E\left[\frac{V}{\max(V + S, 1)}\right] = E\left[\frac{V}{\max(R, 1)}\right].$$
Notice that although we observe $R$, we do not observe $V$, so the FDP is an unobserved random variable; what we control is its expectation. The meaning of controlling the FDR: if we repeat the experiment many times, then on average we control the FDP, but the FDP in any single run may be very bad. This is different from the FWER, which is controlled for each single experiment.

8 False Discovery Rate: FDR and FWER

If all null hypotheses are true, then the FDR is equivalent to the FWER: all rejections are false rejections, so if $V = 0$ then $\mathrm{FDP} = 0$, and if $V \ge 1$ then $\mathrm{FDP} = 1$; thus $\mathrm{FDR} = E(\mathrm{FDP}) = P(V \ge 1) = \mathrm{FWER}$.

In general, $\mathrm{FDR} \le \mathrm{FWER}$: if $V = 0$ then $\mathrm{FDP} = 0 \le \mathbb{1}\{V \ge 1\}$; if $V \ge 1$ then $\mathrm{FDP} \le 1 = \mathbb{1}\{V \ge 1\}$. Taking expectations gives $\mathrm{FDR} \le P(V \ge 1) = \mathrm{FWER}$, so controlling the FWER also implies controlling the FDR. Consequently, procedures that control only the FDR can be more powerful.

9 FDR Control through Benjamini-Hochberg: Benjamini-Hochberg

Let the ordered p-values be $p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}$. The Benjamini-Hochberg (BH) procedure rejects $H_{(i)}$ if
$$p_{(j)} \le \frac{jq}{m} \quad \text{for some } j \ge i;$$
equivalently, let $k = \max\{i : p_{(i)} \le iq/m\}$ and reject $H_{(1)}, \dots, H_{(k)}$. This means that if we know the number of rejections $R$, then the p-value threshold is $\max(R, 1)\,q/m$.
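A direct transcription of the BH step-up rule into Python (an illustrative sketch, not from the slides; NumPy assumed):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """BH step-up: find the largest k with p_(k) <= k*q/m and
    reject the hypotheses with the k smallest p-values."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= np.arange(1, m + 1) * q / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max() + 1   # number of rejections R
        reject[order[:k]] = True
    return reject
```

Note that raising $q$ can only enlarge the rejection set, mirroring the step-up thresholds $jq/m$.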

10 FDR Control through Benjamini-Hochberg: Benjamini-Hochberg Controls FDR

Theorem. For independent test statistics and for any configuration of false null hypotheses, the above procedure controls the FDR at level $q$.

Proof sketch: given any threshold value $t$, reject all hypotheses with p-value at most $t$; then $V(t)$ and $R(t)$ are stochastic processes, and the following facts can be verified:
- $\mathcal{F}_t = \sigma\{V(s), R(s), \; t \le s \le 1\}$ is a backward filtration;
- the BH threshold $\tau$, which satisfies $\tau = q\max(R(\tau), 1)/m$, is a stopping time with respect to the backward filtration $\mathcal{F}_t$.

11 FDR Control through Benjamini-Hochberg: Benjamini-Hochberg Controls FDR

$V(t)/t$ is a martingale running backward in time: for $t \le s$,
$$E\left[\frac{V(t)}{t} \,\Big|\, \mathcal{F}_s\right] = \frac{1}{t}E[V(t) \mid \mathcal{F}_s] = \frac{1}{t}\cdot\frac{t}{s}V(s) = \frac{V(s)}{s},$$
since under $\mathcal{F}_s$, $V(s) = \#\{p_i^0 : p_i^0 \le s\}$, where the null p-values $p_i^0$ are independent and uniformly distributed on $[0, s]$. By the Optional Stopping Theorem,
$$\mathrm{FDR} = E\left[\frac{V(\tau)}{\max(R(\tau), 1)}\right] = \frac{q}{m}E\left[\frac{V(\tau)}{\tau}\right] = \frac{q}{m}E[V(1)] = \frac{q m_0}{m}.$$

12 FDR Control through Benjamini-Hochberg: Simulation Settings

Consider the following simulation settings:
- We run 50 independent hypothesis tests, of which 35 have a true null hypothesis, so we generate their p-values from the uniform distribution on $[0, 1]$.
- Let the other 15 null hypotheses be false. We generate these 15 p-values as follows: first sample one random number from $N(2.5, 1)$, then compute the p-value of the two-sided z-test based on this number; finally, repeat the process 15 times independently.
- We choose the threshold value $\alpha$ or $q$ to be 0.05, 0.1, or 0.2; the decision results are summarized in the following table.
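The setting above can be reproduced with a short script (a sketch under the stated settings; the random seed and the use of SciPy for the normal tail are my choices):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
m, m0 = 50, 35
p_null = rng.uniform(size=m0)          # 35 true nulls: Uniform[0, 1] p-values
z = rng.normal(2.5, 1.0, size=m - m0)  # 15 false nulls: one draw from N(2.5, 1) each
p_alt = 2 * norm.sf(np.abs(z))         # two-sided z-test p-value
pvals = np.concatenate([p_null, p_alt])

q = 0.05
raw = pvals <= q                       # no correction
bonf = pvals <= q / m                  # Bonferroni
sp = np.sort(pvals)                    # BH step-up
below = sp <= np.arange(1, m + 1) * q / m
k = below.nonzero()[0].max() + 1 if below.any() else 0
bh = pvals <= (sp[k - 1] if k else -1.0)

for name, rej in [("no correction", raw), ("Bonferroni", bonf), ("BH", bh)]:
    V, R = rej[:m0].sum(), rej.sum()   # V = false discoveries among true nulls
    print(f"{name}: R = {R}, V = {V}")
```

Bonferroni is the most conservative, no correction the most liberal, and BH sits in between.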

13 FDR Control through Benjamini-Hochberg: Simulation Results

[Table: counts of decisions (NS = not significant, S = significant), split by truth ($H_0$ true vs. false), for BH, Bonferroni, Holm, and no correction at each of the three thresholds.]

14 FDR Control through Benjamini-Hochberg: BH Visualization

The black, red, and green dashed lines are the threshold curves for $q = 0.05$, $0.1$, and $0.2$, respectively.

15 FDR Control through Benjamini-Hochberg: Comparison Visualization

The black, red, green, and blue dashed lines are the threshold curves for Bonferroni, Holm, BH, and no correction, respectively, at $\alpha = 0.2$ or $q = 0.2$.

16 FDR Control with Knockoffs: Knockoffs

For the regression problem $y = X\beta + w$, BH cannot control the FDR in general (the positive regression dependency condition does not hold). Knockoffs $\tilde X$ act as a negative control group for the predictors $X$ and should have the following characteristics:
- Conditional independence: $\tilde X \perp y \mid X$; this holds when $\tilde X$ is constructed from $X$ alone.
- Exchangeability: for any subset $S \subseteq \{1, \dots, p\}$, $(X, \tilde X)_{\mathrm{swap}(S)} \overset{d}{=} (X, \tilde X)$, where $\mathrm{swap}(S)$ swaps the entries $X_j$ and $\tilde X_j$ for every $j \in S$. For example, when $p = 3$ and $S = \{2, 3\}$:
$$(X_1, \tilde X_2, \tilde X_3, \tilde X_1, X_2, X_3) \overset{d}{=} (X_1, X_2, X_3, \tilde X_1, \tilde X_2, \tilde X_3).$$
By construction, the null hypotheses for all knockoffs are true.

17 FDR Control with Knockoffs: Knockoffs, Examples

Suppose $X \sim N(0, \Sigma)$ with $\Sigma$ positive definite. Then we can set
$$(X, \tilde X) \sim N(0, G), \qquad G = \begin{bmatrix} \Sigma & \Sigma - \operatorname{diag}\{s\} \\ \Sigma - \operatorname{diag}\{s\} & \Sigma \end{bmatrix},$$
where $s$ is chosen to make sure $G$ is positive definite. This $G$ is invariant under the swap operation. The knockoffs can then be sampled from
$$\tilde X \mid X \sim N(\mu, V), \qquad \mu = X - X\Sigma^{-1}\operatorname{diag}\{s\}, \qquad V = 2\operatorname{diag}\{s\} - \operatorname{diag}\{s\}\Sigma^{-1}\operatorname{diag}\{s\}.$$
The model can then be fitted for $y$ versus $(X, \tilde X)$, e.g. with the lasso
$$\min_{b \in \mathbb{R}^{2p}} \; \tfrac12\|y - [X, \tilde X]b\|^2 + \lambda\|b\|_1.$$
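Gaussian knockoffs can be sampled directly from the conditional distribution above. The sketch below is my construction for an equicorrelated $\Sigma$; the particular choice of $s$ is one standard option and is an assumption, not from the slides:

```python
import numpy as np

def gaussian_knockoffs(X, Sigma, s, rng):
    """Sample knockoffs for rows X_i ~ N(0, Sigma) from
    Xtilde | X ~ N(mu, V), with mu = X - X Sigma^{-1} diag{s}
    and V = 2 diag{s} - diag{s} Sigma^{-1} diag{s}."""
    S = np.diag(s)
    Sigma_inv_S = np.linalg.solve(Sigma, S)  # Sigma^{-1} diag{s}
    mu = X - X @ Sigma_inv_S
    V = 2 * S - S @ Sigma_inv_S
    L = np.linalg.cholesky(V)                # requires V positive definite
    return mu + rng.standard_normal(X.shape) @ L.T

# Equicorrelated example: Sigma_jk = rho for j != k, 1 on the diagonal.
p, n, rho = 5, 200, 0.3
Sigma = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
rng = np.random.default_rng(1)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
s = np.full(p, min(1.0, 2 * (1 - rho)))      # keeps G positive definite here
Xk = gaussian_knockoffs(X, Sigma, s, rng)
```

For this $\Sigma$, $2(1-\rho)$ is twice the smallest eigenvalue, so the capped value of $s$ keeps both $G$ and $V$ positive definite.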

18 FDR Control with Knockoffs: FDR Control through Knockoffs

Let the adjusted score be $W_j = |\hat b_j| - |\hat b_{j+p}|$. Then $W_j$ satisfies:
- under the null hypothesis, $W_j$ is symmetrically distributed;
- conditional on $|W_j|$, under the null hypothesis, the sign of $W_j$ is an i.i.d. coin flip;
- a large $W_j$ provides evidence against the null hypothesis.
Define $S^+(t) = \{j : W_j \ge t\}$ and $S^-(t) = \{j : W_j \le -t\}$; the threshold value is then
$$\tau = \min\left\{t : \frac{1 + |S^-(t)|}{\max(|S^+(t)|, 1)} \le q\right\}.$$
Since $S^+(t)$ is the selection set, $S^-(t)$ can be viewed as a mirror reflection of the false selection set.
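The threshold $\tau$ can be computed by scanning the candidate values $|W_j|$ (a sketch; the "+1" in the numerator matches the definition above):

```python
import numpy as np

def knockoff_threshold(W, q=0.1):
    """Smallest t with (1 + #{j : W_j <= -t}) / max(#{j : W_j >= t}, 1) <= q;
    returns inf when no such t exists (nothing is selected)."""
    W = np.asarray(W)
    for t in np.sort(np.abs(W[W != 0])):   # candidate thresholds
        if (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1) <= q:
            return t
    return np.inf

W = np.array([3.0, -0.5, 2.1, 1.7, -0.2, 4.2, 0.9, -1.1, 2.8, 3.5])
tau = knockoff_threshold(W, q=0.2)
selected = np.nonzero(W >= tau)[0]         # the selected set S_hat
```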

19 FDR Control with Knockoffs: FDR Control through Knockoffs

Theorem. Let $\hat S = \{j : W_j \ge \tau\}$. Then the FDR is controlled:
$$E\left[\frac{\#\{j \in \hat S \cap H_0\}}{\max(|\hat S|, 1)}\right] \le q.$$

Proof sketch: define $V^+(t) = \#\{j \in H_0 : j \in S^+(t)\}$ and $V^-(t) = \#\{j \in H_0 : j \in S^-(t)\}$. The following facts can be verified:
- $\mathcal{F}_t = \sigma\{V^+(s), V^-(s), \; 0 \le s \le t\}$ is a filtration;
- $\tau$ is a stopping time with respect to the filtration $\mathcal{F}_t$.

20 FDR Control with Knockoffs: FDR Control through Knockoffs

Using the definition of $\tau$,
$$\mathrm{FDP}(\tau) = \frac{V^+(\tau)}{\max(|\hat S|, 1)} \le \frac{1 + V^-(\tau)}{\max(|\hat S|, 1)}\cdot\frac{V^+(\tau)}{1 + V^-(\tau)} \le q\,\frac{V^+(\tau)}{1 + V^-(\tau)}.$$
The process $V^+(t)/(1 + V^-(t))$ is a super-martingale in forward time: for $s \le t$,
$$E\left[\frac{V^+(s)}{1 + V^-(s)} \,\Big|\, V^{\pm}(t)\right] \ge \frac{V^+(t)}{1 + V^-(t)},$$
since conditional on $V^+(s) + V^-(s)$, $V^+(s)$ is hypergeometric. By the Optional Stopping Theorem and $V^+(0) \sim \mathrm{Bin}(\#H_0, \tfrac12)$:
$$\mathrm{FDR} \le q\,E\left[\frac{V^+(\tau)}{1 + V^-(\tau)}\right] \le q\,E\left[\frac{V^+(0)}{1 + V^-(0)}\right] \le q.$$

21 FDR Control with Knockoffs: Problems with Knockoffs

- Fixed-design knockoffs work well when $n \ge p$; however, for $n < p$, generating $\tilde X$ is hard.
- For random-design knockoffs (model-X knockoffs), we need to assume full knowledge of the feature distribution $P_X$.
- We need a tractable sampling technique when a parametric assumption on the feature distribution $P_X$ cannot be satisfied.

22 FDR Control through SLOPE: Sorted $\ell_1$ Penalized Estimation (SLOPE)

Consider the following optimization problem:
$$\min_b \; \tfrac12\|y - Xb\|^2 + \lambda_1 |b|_{(1)} + \lambda_2 |b|_{(2)} + \dots + \lambda_p |b|_{(p)},$$
where $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0$ and $|b|_{(1)} \ge |b|_{(2)} \ge \dots \ge |b|_{(p)}$; the penalty $J_\lambda(b) = \lambda_1 |b|_{(1)} + \lambda_2 |b|_{(2)} + \dots + \lambda_p |b|_{(p)}$ is called the sorted-$\ell_1$ norm. To prove that $J_\lambda(b)$ is convex, reformulate it as
$$J_\lambda(b) = \sum_{i=1}^{p} (\lambda_i - \lambda_{i+1})\, f_i(b), \qquad f_i(b) = \sum_{j \le i} |b|_{(j)},$$
where $\lambda_{p+1} = 0$; it then suffices to show that each $f_i(b)$ is convex.
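Evaluating the sorted-$\ell_1$ norm is a one-liner (a sketch; the weight sequence must be nonincreasing for the norm interpretation):

```python
import numpy as np

def sorted_l1_norm(b, lam):
    """J_lambda(b) = sum_i lambda_i * |b|_(i), where |b|_(1) >= |b|_(2) >= ...
    and lambda_1 >= lambda_2 >= ... >= 0."""
    return float(np.sort(np.abs(b))[::-1] @ np.asarray(lam, dtype=float))
```

With all $\lambda_i$ equal, this reduces to the ordinary $\ell_1$ norm scaled by $\lambda_1$; with only $\lambda_1$ nonzero, to $\lambda_1 \|b\|_\infty$.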

23 FDR Control through SLOPE: Sorted $\ell_1$ Penalized Estimation (SLOPE)

Consider the diagonal matrix $B = \operatorname{Diag}(|b|)$. By the min-max theorem for eigenvalues of $B$,
$$f_i(b) = \sum_{j \le i} |b|_{(j)} = \sup_{\dim(V) = i} \operatorname{tr}(B P_V),$$
where $P_V$ is the projection onto $V$. Hence $f_i(b)$ is convex, and so is $J_\lambda(b)$.

For computation, we can use proximal gradient descent, so we need to evaluate the proximity operator
$$\operatorname{prox}_\lambda(y) = \arg\min_{x \in \mathbb{R}^n} \; \tfrac12\|y - x\|_{\ell_2}^2 + \sum_{i=1}^{n} \lambda_i |x|_{(i)},$$
which is a strongly convex problem.

24 FDR Control through SLOPE: SLOPE Computation

- The sign of the solution $x_i$ matches that of $y_i$.
- Applying any permutation $P$ to $y$ yields the solution $Px$.
So we may first assume $y_1 \ge y_2 \ge \dots \ge y_n \ge 0$, then restore the signs and the permutation in a post-processing step. The solution must then satisfy $x_1 \ge x_2 \ge \dots \ge x_n \ge 0$: otherwise, suppose $x_i < x_j$ for some $i < j$, and let $\tilde x$ be $x$ with entries $i$ and $j$ exchanged; then
$$f(x) - f(\tilde x) = x_j y_i - x_i y_i + x_i y_j - x_j y_j = (x_j - x_i)(y_i - y_j) > 0.$$
Thus the evaluation of the proximity operator reduces to the quadratic program
$$\text{minimize } \tfrac12\|y - x\|_{\ell_2}^2 + \sum_{i=1}^{n} \lambda_i x_i \quad \text{subject to } x_1 \ge x_2 \ge \dots \ge x_n \ge 0.$$
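This quadratic program can be solved exactly after sorting via pool-adjacent-violators: the solution is the nonincreasing isotonic fit of $|y|_{(i)} - \lambda_i$, clipped at zero, with signs and ordering restored afterward. A sketch of that recipe (my implementation, not code from the slides):

```python
import numpy as np

def prox_sorted_l1(y, lam):
    """Prox of the sorted-l1 norm: argmin_x 0.5*||y - x||^2 + sum_i lam_i |x|_(i).
    Steps: sort |y| decreasingly, fit |y|_(i) - lam_i by a nonincreasing
    isotonic regression (PAVA), clip at 0, restore signs and order."""
    y = np.asarray(y, dtype=float)
    order = np.argsort(-np.abs(y))
    v = np.abs(y)[order] - np.asarray(lam, dtype=float)

    blocks = []                                  # stack of (block mean, size)
    for vi in v:
        blocks.append((vi, 1))
        # merge while the fit would fail to be nonincreasing
        while len(blocks) > 1 and blocks[-2][0] <= blocks[-1][0]:
            (m1, c1), (m2, c2) = blocks[-2], blocks[-1]
            blocks[-2:] = [((m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2)]
    x_sorted = np.concatenate([np.full(c, max(m, 0.0)) for m, c in blocks])

    x = np.empty_like(y)
    x[order] = np.sign(y)[order] * x_sorted      # undo the sort, restore signs
    return x
```

With all $\lambda_i = \lambda$, this reduces to ordinary soft-thresholding at level $\lambda$.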

25 FDR Control through SLOPE: SLOPE Controls FDR

Theorem. Assume an orthogonal design with i.i.d. $N(0, 1)$ errors, and set $\lambda_i = \lambda_{BH}(i) = \Phi^{-1}(1 - iq/(2n))$. Then the FDR of the SLOPE procedure obeys
$$\mathrm{FDR} = E\left[\frac{V}{\max(R, 1)}\right] \le \frac{q\,n_0}{n}.$$

Proof sketch:
$$\mathrm{FDR} = E\left[\frac{V}{\max(R, 1)}\right] = \sum_{r=1}^{n}\frac{1}{r}\,E\big[V\,\mathbb{1}\{R = r\}\big] = \sum_{r=1}^{n}\frac{1}{r}\sum_{i:\,H_i \text{ true}} P(H_i \text{ rejected and } R = r).$$
From the optimality of the solution, we can obtain the following characterization:

26 FDR Control through SLOPE: SLOPE Controls FDR

$$\{y : H_i \text{ rejected and } R = r\} = \{y : |y_i| > \lambda_r \text{ and } R = r\}.$$
Apply the SLOPE procedure to $\tilde y = (y_1, \dots, y_{i-1}, y_{i+1}, \dots, y_n)$ with $\tilde\lambda = (\lambda_2, \dots, \lambda_n)$, and let $\tilde R$ be its number of rejections; then
$$\{y : |y_i| > \lambda_r \text{ and } R = r\} \subseteq \{y : |y_i| > \lambda_r \text{ and } \tilde R = r - 1\}.$$
Thus, by the normality of $y$ and the independence of $y_i$ and $\tilde y$ under the orthogonal design,
$$P(H_i \text{ rejected and } R = r) \le P(|y_i| \ge \lambda_r)\,P(\tilde R = r - 1) = \frac{qr}{n}\,P(\tilde R = r - 1).$$
Then
$$\mathrm{FDR} \le \frac{q n_0}{n}\sum_{r \ge 1} P(\tilde R = r - 1) \le \frac{q n_0}{n}.$$
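The $\lambda_{BH}$ sequence in the theorem is easy to generate (a sketch; SciPy's `norm.ppf` plays the role of $\Phi^{-1}$):

```python
import numpy as np
from scipy.stats import norm

def lambda_bh(n, q):
    """SLOPE weights lambda_i = Phi^{-1}(1 - i*q/(2n)), i = 1, ..., n."""
    i = np.arange(1, n + 1)
    return norm.ppf(1 - i * q / (2 * n))

lam = lambda_bh(10, 0.1)
```

The sequence is strictly decreasing, and its first entry is the two-sided cutoff $\Phi^{-1}(1 - q/(2n))$, so later coefficients face progressively milder penalties, mirroring the BH step-up thresholds.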

27 FDR Control through SLOPE: Problems with SLOPE

- The main theorem only shows that SLOPE controls the FDR under an orthogonal design; for general designs, only simulation results are provided.
- When the variance $\sigma$ is unknown, its estimate may be inflated by undetected weak signals, which can cause a loss of power.
- It is important to have a data-driven procedure for choosing the regularizing sequence $\{\lambda_i\}$.

28 References I

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57(1), pp. 289-300.

Bogdan, M., van den Berg, E., Sabatti, C., Su, W. and Candès, E.J. (2015). SLOPE: adaptive variable selection via convex optimization. The Annals of Applied Statistics, 9(3), pp. 1103-1140.

Barber, R.F. and Candès, E.J. (2015). Controlling the false discovery rate via knockoffs. The Annals of Statistics, 43(5), pp. 2055-2085.

Verhoeven, K.J., Simonsen, K.L. and McIntyre, L.M. (2005). Implementing false discovery rate control: increasing your power. Oikos, 108(3), pp. 643-647.

29 References II

Candès, E., Fan, Y., Janson, L. and Lv, J. (2016). Panning for gold: model-free knockoffs for high-dimensional controlled variable selection. arXiv preprint.

Candès, E. Stats 300C: Theory of Statistics, lecture notes. candes/stats300c/lectures.html

30 Thank you for your time!


More information

Doing Cosmology with Balls and Envelopes

Doing Cosmology with Balls and Envelopes Doing Cosmology with Balls and Envelopes Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry Wasserman Department of Statistics Carnegie

More information

Consistent high-dimensional Bayesian variable selection via penalized credible regions

Consistent high-dimensional Bayesian variable selection via penalized credible regions Consistent high-dimensional Bayesian variable selection via penalized credible regions Howard Bondell bondell@stat.ncsu.edu Joint work with Brian Reich Howard Bondell p. 1 Outline High-Dimensional Variable

More information

Selective Sequential Model Selection

Selective Sequential Model Selection Selective Sequential Model Selection William Fithian, Jonathan Taylor, Robert Tibshirani, and Ryan J. Tibshirani August 8, 2017 Abstract Many model selection algorithms produce a path of fits specifying

More information

Post-Selection Inference

Post-Selection Inference Classical Inference start end start Post-Selection Inference selected end model data inference data selection model data inference Post-Selection Inference Todd Kuffner Washington University in St. Louis

More information

arxiv: v3 [stat.me] 9 Nov 2015

arxiv: v3 [stat.me] 9 Nov 2015 Familywise Error Rate Control via Knockoffs Lucas Janson Weijie Su Department of Statistics, Stanford University, Stanford, CA 94305, USA arxiv:1505.06549v3 [stat.me] 9 Nov 2015 May 2015 Abstract We present

More information

A significance test for the lasso

A significance test for the lasso 1 First part: Joint work with Richard Lockhart (SFU), Jonathan Taylor (Stanford), and Ryan Tibshirani (Carnegie-Mellon Univ.) Second part: Joint work with Max Grazier G Sell, Stefan Wager and Alexandra

More information

The knockoff filter for FDR control in group-sparse and multitask regression

The knockoff filter for FDR control in group-sparse and multitask regression The knockoff filter for FDR control in group-sparse and multitask regression Ran Dai Department of Statistics, University of Chicago, Chicago IL 6637 USA Rina Foygel Barber Department of Statistics, University

More information

New Procedures for False Discovery Control

New Procedures for False Discovery Control New Procedures for False Discovery Control Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Elisha Merriam Department of Neuroscience University

More information

Panning for Gold: Model-X Knockoffs for High-dimensional Controlled Variable Selection

Panning for Gold: Model-X Knockoffs for High-dimensional Controlled Variable Selection Panning for Gold: Model-X Knockoffs for High-dimensional Controlled Variable Selection Emmanuel Candès 1, Yingying Fan 2, Lucas Janson 1, and Jinchi Lv 2 1 Department of Statistics, Stanford University

More information

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons: STAT 263/363: Experimental Design Winter 206/7 Lecture January 9 Lecturer: Minyong Lee Scribe: Zachary del Rosario. Design of Experiments Why perform Design of Experiments (DOE)? There are at least two

More information

Adaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses

Adaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses Adaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses arxiv:1610.03330v1 [stat.me] 11 Oct 2016 Jingshu Wang, Chiara Sabatti, Art B. Owen Department of Statistics, Stanford University

More information

Linear Methods for Regression. Lijun Zhang

Linear Methods for Regression. Lijun Zhang Linear Methods for Regression Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Linear Regression Models and Least Squares Subset Selection Shrinkage Methods Methods Using Derived

More information

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo PROCEDURES CONTROLLING THE k-fdr USING BIVARIATE DISTRIBUTIONS OF THE NULL p-values Sanat K. Sarkar and Wenge Guo Temple University and National Institute of Environmental Health Sciences Abstract: Procedures

More information

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

More information

A General Framework for High-Dimensional Inference and Multiple Testing

A General Framework for High-Dimensional Inference and Multiple Testing A General Framework for High-Dimensional Inference and Multiple Testing Yang Ning Department of Statistical Science Joint work with Han Liu 1 Overview Goal: Control false scientific discoveries in high-dimensional

More information

Week 5 Video 1 Relationship Mining Correlation Mining

Week 5 Video 1 Relationship Mining Correlation Mining Week 5 Video 1 Relationship Mining Correlation Mining Relationship Mining Discover relationships between variables in a data set with many variables Many types of relationship mining Correlation Mining

More information

Multiple Dependent Hypothesis Tests in Geographically Weighted Regression

Multiple Dependent Hypothesis Tests in Geographically Weighted Regression Multiple Dependent Hypothesis Tests in Geographically Weighted Regression Graeme Byrne 1, Martin Charlton 2, and Stewart Fotheringham 3 1 La Trobe University, Bendigo, Victoria Austrlaia Telephone: +61

More information

Optimization methods

Optimization methods Optimization methods Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda /8/016 Introduction Aim: Overview of optimization methods that Tend to

More information

Stat 602 Exam 1 Spring 2017 (corrected version)

Stat 602 Exam 1 Spring 2017 (corrected version) Stat 602 Exam Spring 207 (corrected version) I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed This is a very long Exam. You surely won't be able to

More information

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided Let us first identify some classes of hypotheses. simple versus simple H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided H 0 : θ θ 0 versus H 1 : θ > θ 0. (2) two-sided; null on extremes H 0 : θ θ 1 or

More information

Panning for gold: model-x knockoffs for high dimensional controlled variable selection

Panning for gold: model-x knockoffs for high dimensional controlled variable selection J. R. Statist. Soc. B (2018) 80, Part 3, pp. 551 577 Panning for gold: model-x knockoffs for high dimensional controlled variable selection Emmanuel Candès, Stanford University, USA Yingying Fan, University

More information

Two simple sufficient conditions for FDR control

Two simple sufficient conditions for FDR control Electronic Journal of Statistics Vol. 2 (2008) 963 992 ISSN: 1935-7524 DOI: 10.1214/08-EJS180 Two simple sufficient conditions for FDR control Gilles Blanchard, Fraunhofer-Institut FIRST Kekuléstrasse

More information

DISCUSSION OF A SIGNIFICANCE TEST FOR THE LASSO. By Peter Bühlmann, Lukas Meier and Sara van de Geer ETH Zürich

DISCUSSION OF A SIGNIFICANCE TEST FOR THE LASSO. By Peter Bühlmann, Lukas Meier and Sara van de Geer ETH Zürich Submitted to the Annals of Statistics DISCUSSION OF A SIGNIFICANCE TEST FOR THE LASSO By Peter Bühlmann, Lukas Meier and Sara van de Geer ETH Zürich We congratulate Richard Lockhart, Jonathan Taylor, Ryan

More information

RANK: Large-Scale Inference with Graphical Nonlinear Knockoffs

RANK: Large-Scale Inference with Graphical Nonlinear Knockoffs RANK: Large-Scale Inference with Graphical Nonlinear Knockoffs Yingying Fan 1, Emre Demirkaya 1, Gaorong Li 2 and Jinchi Lv 1 University of Southern California 1 and Beiing University of Technology 2 October

More information

STAT 200C: High-dimensional Statistics

STAT 200C: High-dimensional Statistics STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 57 Table of Contents 1 Sparse linear models Basis Pursuit and restricted null space property Sufficient conditions for RNS 2 / 57

More information

The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR

The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR CONTROLLING THE FALSE DISCOVERY RATE A Dissertation in Statistics by Scott Roths c 2011

More information

Weighted Adaptive Multiple Decision Functions for False Discovery Rate Control

Weighted Adaptive Multiple Decision Functions for False Discovery Rate Control Weighted Adaptive Multiple Decision Functions for False Discovery Rate Control Joshua D. Habiger Oklahoma State University jhabige@okstate.edu Nov. 8, 2013 Outline 1 : Motivation and FDR Research Areas

More information

Technical Report 1004 Dept. of Biostatistics. Some Exact and Approximations for the Distribution of the Realized False Discovery Rate

Technical Report 1004 Dept. of Biostatistics. Some Exact and Approximations for the Distribution of the Realized False Discovery Rate Technical Report 14 Dept. of Biostatistics Some Exact and Approximations for the Distribution of the Realized False Discovery Rate David Gold ab, Jeffrey C. Miecznikowski ab1 a Department of Biostatistics,

More information

arxiv: v1 [math.st] 31 Mar 2009

arxiv: v1 [math.st] 31 Mar 2009 The Annals of Statistics 2009, Vol. 37, No. 2, 619 629 DOI: 10.1214/07-AOS586 c Institute of Mathematical Statistics, 2009 arxiv:0903.5373v1 [math.st] 31 Mar 2009 AN ADAPTIVE STEP-DOWN PROCEDURE WITH PROVEN

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1 2004 Article 13 Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates Sandrine Dudoit Mark

More information

Gradient descent. Barnabas Poczos & Ryan Tibshirani Convex Optimization /36-725

Gradient descent. Barnabas Poczos & Ryan Tibshirani Convex Optimization /36-725 Gradient descent Barnabas Poczos & Ryan Tibshirani Convex Optimization 10-725/36-725 1 Gradient descent First consider unconstrained minimization of f : R n R, convex and differentiable. We want to solve

More information

Department of Statistics University of Central Florida. Technical Report TR APR2007 Revised 25NOV2007

Department of Statistics University of Central Florida. Technical Report TR APR2007 Revised 25NOV2007 Department of Statistics University of Central Florida Technical Report TR-2007-01 25APR2007 Revised 25NOV2007 Controlling the Number of False Positives Using the Benjamini- Hochberg FDR Procedure Paul

More information

Homework 5. Convex Optimization /36-725

Homework 5. Convex Optimization /36-725 Homework 5 Convex Optimization 10-725/36-725 Due Tuesday November 22 at 5:30pm submitted to Christoph Dann in Gates 8013 (Remember to a submit separate writeup for each problem, with your name at the top)

More information

arxiv: v2 [stat.me] 14 Mar 2011

arxiv: v2 [stat.me] 14 Mar 2011 Submission Journal de la Société Française de Statistique arxiv: 1012.4078 arxiv:1012.4078v2 [stat.me] 14 Mar 2011 Type I error rate control for testing many hypotheses: a survey with proofs Titre: Une

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Linear Models for Regression CS534

Linear Models for Regression CS534 Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict

More information

Aliaksandr Hubin University of Oslo Aliaksandr Hubin (UIO) Bayesian FDR / 25

Aliaksandr Hubin University of Oslo Aliaksandr Hubin (UIO) Bayesian FDR / 25 Presentation of The Paper: The Positive False Discovery Rate: A Bayesian Interpretation and the q-value, J.D. Storey, The Annals of Statistics, Vol. 31 No.6 (Dec. 2003), pp 2013-2035 Aliaksandr Hubin University

More information

False Discovery Control in Spatial Multiple Testing

False Discovery Control in Spatial Multiple Testing False Discovery Control in Spatial Multiple Testing WSun 1,BReich 2,TCai 3, M Guindani 4, and A. Schwartzman 2 WNAR, June, 2012 1 University of Southern California 2 North Carolina State University 3 University

More information

Data Mining. CS57300 Purdue University. March 22, 2018

Data Mining. CS57300 Purdue University. March 22, 2018 Data Mining CS57300 Purdue University March 22, 2018 1 Hypothesis Testing Select 50% users to see headline A Unlimited Clean Energy: Cold Fusion has Arrived Select 50% users to see headline B Wedding War

More information

Linear Models for Regression CS534

Linear Models for Regression CS534 Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict

More information

Symmetries in Experimental Design and Group Lasso Kentaro Tanaka and Masami Miyakawa

Symmetries in Experimental Design and Group Lasso Kentaro Tanaka and Masami Miyakawa Symmetries in Experimental Design and Group Lasso Kentaro Tanaka and Masami Miyakawa Workshop on computational and algebraic methods in statistics March 3-5, Sanjo Conference Hall, Hongo Campus, University

More information

Proximal Gradient Descent and Acceleration. Ryan Tibshirani Convex Optimization /36-725

Proximal Gradient Descent and Acceleration. Ryan Tibshirani Convex Optimization /36-725 Proximal Gradient Descent and Acceleration Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: subgradient method Consider the problem min f(x) with f convex, and dom(f) = R n. Subgradient method:

More information