A Power and Prediction Analysis for Knockoffs with Lasso Statistics

Size: px
Start display at page:

Download "A Power and Prediction Analysis for Knockoffs with Lasso Statistics"

Transcription

1 A Power and Prediction Analysis for Knockoffs with Lasso Statistics Asaf Weinstein a, Rina Barber b, and mmanuel Candès a a Stanford University b University of Chicago Abstract Knockoffs is a new framework for controlling the false discovery rate (FDR) in multile hyothesis testing roblems involving comlex statistical models. While there has been great emhasis on Tye- I error control, Tye-II errors have been far less studied. In this aer we analyze the false negative rate or, equivalently, the ower of a knockoff rocedure associated with the Lasso solution ath under an i.i.d. Gaussian design, and find that knockoffs asymtotically achieve close to otimal ower with resect to an omniscient oracle. Furthermore, we demonstrate that for sarse signals, erforming model selection via knockoff filtering achieves nearly ideal rediction errors as comared to a Lasso oracle equied with full knowledge of the distribution of the unknown regression coefficients. The i.i.d. Gaussian design is adoted to leverage results concerning the emirical distribution of the Lasso estimates, which makes ower calculation ossible for both knockoff and oracle rocedures. 1 Introduction Suose that for each of n subjects we have measured a set of features which we want to use in a linear model to exlain a resonse, and lan to use the Lasso to select relevant features. A desirable rocedure controls the (exected) roortion of false variables entering the exlanatory model; that is, the ratio between the number of null coefficients estimated as nonzero by the Lasso rocedure and the total number of coefficients estimated as nonzero. Concetually, then, we imagine roceeding along the Lasso ath by decreasing the enalty λ, and need to decide when to sto and reject the hyotheses corresonding to nonzero estimates, in such a way that the false discovery rate (FDR) does not exceed a reset level q. The setu above differs from many classical situations, where we know that the null statistics the statistics for which the associated null hyothesis is true are (aroximately) indeendent draws from a distribution that is (aroximately) known; in such situations, it is relatively easily to estimate the roortion of null hyotheses that are rejected. In stark contrast, in our setting we do not know the samling distribution of β j (λ), and, to further comlicate matters, this distribution generally deends on the entire vector β (this is unlike the distribution of the least-squares estimator, where β OLS j deends only on β j ). Barber and Candès (2015) have roosed a method that utilizes artificial knockoff features, and used it to circumvent the difficulties described above. The general idea behind the method is to introduce a set of fake variables (i.e., variables that are known to not belong in the model), in such a way that there is some form of exchangeability under swaing a true null variable and a knockoff counterart. This exchangeability essentially imlies that a true null variable and a corresonding knockoff variable are indistinguishable (in the sense that the sufficient statistic is insensitive to swaing the two), hence given the event that one of the two was selected, each is equally robable to be the selected. We can then use the number of knockoff selections to estimate the number of true nulls being selected.

2 In the original aer, Barber and Candès (2015) considered a Gaussian linear regression model with a fixed number of fixed (nonrandom) redictors, and n. Since then, extensions and variations on knockoffs have been roosed, which made the method alicable to a wide range of roblems. Barber and Candès (2016) considered the high dimensional (n < ) setting, where control over tye III error (sign) is also studied. Candès et al. (2016) constructed robabilistic model-x knockoffs for a nonarametric setting with random covariates. A grou knockoff filter has been roosed by Dai and Barber (2016) for detecting relevant grous of variables in a linear model. The focus in these works is on constructing knockoff variables with the necessary roerties to enable control over the false discovery rate. On the other hand, the Tye II error rate has been less carefully examined. Power calculations are in general hard, esecially when sohisticated test statistics, such as the Lasso estimates, are used. There is, however, a secific setting that enables us to exactly characterize (in some articular sense) the asymtotic distribution of the Lasso estimator relying on the theory of Aroximate Message-Passing (AMP). More secifically, we work in the setu of Bayati and Montanari (2012) which entails an n Gaussian design matrix with n, and n/ δ > 0; and the regression coefficients are i.i.d. coies from a mixture distribution Π = (1 ɛ)δ 0 ɛπ, where Π is the distribution of the nonzero comonents 1 (i.e., has no oint mass at zero). We consider a rocedure that enters variables into the model according to their order of (first) aearance on the Lasso ath. Aealing to the results in Bayati and Montanari (2012), we identify an oracle rocedure that, with the knowledge of the distribution Π (and the noise level σ 2 ), is able to asymtotically attain maximum true ositive rate subject to controlling the FDR. We show that a rocedure which uses knockoffs for calibration, and hence a different solution ath, is able to strictly control the FDR and at the same time asymtotically achieve ower very close to that of the oracle adatively in the unknown Π. Figure 1 shows ower of (essentially) the level-q knockoff rocedure roosed in Section 3 versus that of the level-q oracle rocedure, as q varies and for two different distributions Π. The to two lots are theoretical asymtotic redictions, and the bottom lots show simulation counterarts. As imlied above, the oracle gets to use the knowledge of Π and ɛ in choosing the value of the Lasso enalty arameter λ, so that the asymtotic false discovery roortion is q; by contrast, the knockoff rocedure is alied in exactly the same way in both situations, that is, it does not utilize any knowledge of Π or ɛ. Yet the to two lots in Figure 1 show that the knockoff rocedure adats well to the unknown true distribution of the coefficients, at least for FDR levels in the range that would tyically be of interest, in the sense that it blindly achieves close to otimal asymtotic ower. Figures 4 and 5 of Section 4 give another reresentation of these findings, which are the main results of the aer. While the main focus is on the testing roblem, we address also the issue of model selection and rediction. Motivated by the work of Abramovich et al. (2006), which formally connected FDR control and estimation in the sequence model, our aim is to investigate whether favorable roerties associated with FDR control carry over to the linear model. As Abramovich et al. (2006) remark at the outset, in the secial case of the sequence model, a full decision-theoretic analysis of the redictive risk is ossible. For the linear model case, Bogdan et al. (2015) roosed a enalized least squares estimator, SLOP, motivated from the ersective of multile testing and FDR control, and Su et al. (2016) even roved that this estimator is asymtotically minimax under the same design as that considered in the current aer, although in a different sarsity regime, namely ɛ 0. In truth, the linear model is substantially more difficult to analyze than the sequence model. Fortunately, however, the secific working assumtions we adot allow to analyze the (asymtotic) redictive risk, again relying on the elegant results of Bayati and Montanari (2012) for the Lasso estimates. While our results lack the rigor of those in, e.g., Abramovich et al. (2006) and Su et al. (2016), we think that the findings reorted in Section 5 are valuable in investigating a roblem for which 1 With some abuse of notation, in this aer we use Π and Π to refer to either a distribution or a random variable that has the corresonding distribution. 2

3 Theoretical: Π * = δ 50 Theoretical: Π * = ex(1) TPP (knockoff) q=0.05 q=0.1 q=0.125 TPP (knockoff) q=0.05 q=0.3 q= TPP (oracle) TPP (oracle) Simulation: Π * = δ 50 Simulation: Π * = ex(1) TPP (knockoff) TPP (knockoff) TPP (oracle) TPP (oracle) Figure 1: Asymtotic ower of the knockoff rocedure as comared to the oracle, for two different distributions Π : oint mass (left) and exonential (right). ɛ = 0.2, δ = 1, σ = 0.5. To: Theoretical rediction. ach of the oints marked on the to grahs corresonds to a articular value of q, and was generated as follows. (i) Fix q; (ii) find λ and t s.t. FDP (t) = q and FDP aug(t ) = q, where FDP and FDP aug are given in (5) and (31); (iii) lot the oint ( TPP (t), TPP aug(t ) ), where TPP and TPP aug are given in (6) and (29). The broken line is the 45 degrees line. The knockoff rocedure adats to an unknown Π : in both situations the asymtotic ower (TPP) of the knockoffs rocedure is very close to that of the oracle rocedure. Bottom: Simulation results, ɛ = 0.2; n = = 5000; σ = 0.5. ach of the two grahs is meant to be an analogue of the asymtotic limits above. 3

4 conclusive answers are currently not available. The rest of the aer is organized as follows. Section 2 reviews some relevant results from Su et al. (2017) and resents the oracle false discovery rate-ower tradeoff diagram for the Lasso. In Section 3 we resent a simle testing rocedure which uses knockoffs for FDR calibration, and rove some theoretical guarantees. A ower analysis, which comares the knockoff rocedure to the oracle rocedure, is carried out in Section 4. Section 5 is concerned with using knockoffs for rediction, and Section 6 concludes with a short discussion. 2 An oracle tradeoff diagram Adoting the setu from Su et al. (2017), we consider the linear model y = Xβ z (1) in which X R n has i.i.d. N (0, 1/n) entries and the errors z i are i.i.d. N (0, σ 2 ) with σ 0 fixed and arbitrary. We assume that the regression coefficients β j, j = 1,..., are i.i.d. coies of a random variable Π with (Π 2 ) < and P(Π 0) = ɛ (0, 1) for a fixed constant ɛ. In this section we will be interested in the case where n, with n/ δ for a ositive constant δ. Note that Π is assumed to not deend on, and so the exected number of nonzero elements β j is equal to ɛ. Let β(λ) be the Lasso solution, 1 β(λ) = argmin b R 2 y Xb 2 λ b 1, (2) and denote V (λ) = {j : βj (λ) 0, β j = 0}, T (λ) = {j : βj (λ) 0, β j 0}, R(λ) = {j : β j (λ) 0} and k = {j : β j 0}. Hence V (λ) is regarded as the number of false discoveries made by the Lasso; T (λ) is the number of true discoveries; R(λ) is the total number of discoveries, and k is the number of true signals. We would like to remark here that β j = 0 imlies that X j (the j-th variable) is indeendent of y marginally and conditionally on any subset of X j := {X 1,..., X j 1, X j1,..., X }; hence the interretation of rejecting the hyothesis β j = 0 as a false discovery is clear and unambiguous. The false discovery roortion (FDP) is defined as usual as FDP(λ) = and the true ositive roortion (TPP) is defined as V (λ) 1 R(λ) TPP(λ) = T (λ) 1 k. Su et al. (2017) build on the results in Bayati and Montanari (2012) to devise a tradeoff diagram between TPP and FDP in the linear sarsity regime. Lemma A.1 in Su et al. (2017), which is adoted from Bogdan et al. (2013) and is a consequence of Theorem 1.5 in Bayati and Montanari (2012), redicts the limits of FDP and TPP at a fixed value of λ. Throughout, let η t ( ) be the soft-thresholding oerator, given by η t (x) = sgn(x)( x t). Also, let Π denote a random variable distributed according to the conditional distribution of Π given Π 0; that is, { Π, w.. ɛ, Π = (3) 0, w.. 1 ɛ. Finally, denote by α 0 the unique root of the equation (1 t 2 )Φ( t) tφ(t) = δ/2. 4

5 Lemma 2.1 (Lemma A.1 in Su et al., 2017; Theorem 1 in Bogdan et al., 2013; see also Theorem 1.5 in Bayati and Montanari, 2012). The Lasso solution with a fixed λ > 0 obeys V (λ) T (λ) P 2(1 ɛ)φ( α), P P( Π τw > ατ, Π 0) = ɛp( Π τw > ατ), where W N (0, 1) indeendently of Π, and τ > 0, α > max{α 0, 0} is the unique solution to τ 2 = σ 2 1 δ (η ατ (Π τw ) Π) 2 λ = (1 1δ ) P( Π τw > ατ) ατ. (4) As exlained in Su et al., 2017, Lemma 2.1 is a consequence of the fact that, under the working assumtions, (β, β(λ)) is in some (limited) sense asymtotically distributed as (β, η ατ (β τw)), where W N (0, I) indeendently of β; here, the soft-thresholding oeration acts on each comonent of a vector. It follows immediately from Lemma 2.1 that for a fixed λ > 0, the limits of FDP and TPP are and FDP(λ) P 2(1 ɛ)φ( α) 2(1 ɛ)φ( α) ɛp( Π τw > ατ) FDP (λ) (5) TPP(λ) P P( Π τw > ατ) TPP (λ). (6) The arametric curve (FDP (λ), TPP (λ)) imlicity defines the FDP level attainable at a secific TPP level for the Lasso solution with a fixed λ as q Π (TPP (λ); ɛ, δ, σ) = FDP (λ). (7) Interreted in the oosite direction, for known Π and ɛ, we have an analytical exression for the asymtotic ower of a rocedure that, for a given q, chooses λ in (2) in such a way that FDP (λ) = q. We describe how to comute the tradeoff curve q Π ( ; ɛ, δ, σ) for a given Π and ɛ in the Aendix. Because in ractice Π and ɛ would usually be unknown, we refer to this rocedure as the oracle rocedure. The oracle rocedure is not a legal multile testing rocedure, as it requires knowledge of ɛ, Π and σ 2, which is of course realistically unavailable. Still, it serves as a reference for evaluating the erformance of valid testing rocedures. Cometing with this oracle is the subject of the current aer, and the articular working assumtions above were chosen simly because it is ossible under these assumtions to obtain a tractable exression for the emirical distribution of the Lasso solution. Before roceeding, we would like to mention another oracle rocedure. Hence, consider the rocedure which for λ uses the value λ = inf{λ : FDP(λ) q}. (8) This can be thought of as an oracle that is allowed to see the actual β but is still committed to the Lasso ath, and stos the last time the FDP is smaller than q. From now on, we think of λ as decreasing in time, and regard a variable j as entering before variable j if su{λ : β j (λ) 0} su{λ : β j (λ) 0}. Although this may aear as a stronger oracle in some sense, it is asymtotically equivalent to the fixed-λ oracle discussed earlier: by Theorem 3 in Su et al. (2017), the event { } FDP(λ) q Π (TPP(λ) 0.001; ɛ, δ, σ) (9) λ>0.01 has robability tending to one, hence the lower bound on FDP rovided by q Π ( ; ɛ, δ, σ) holds uniformly in λ. 5

6 3 Calibration using knockoffs If we limit ourselves to rocedures that use the Lasso with a fixed or even data-deendent value of λ, from the revious section we see that the achievable TPP can never asymtotically exceed that of the oracle rocedure. Hence the asymtotic ower of the oracle rocedure serves as an asymtotic benchmark. In this section we roose a cometitor to the oracle, which utilizes knockoffs for FDR calibration. While the rocedure resented below is sequential and formally does not belong to the class of Lasso estimates with a data-deendent choice of λ, it will be almost equivalent to the corresonding rocedure belonging to that class. We discuss this further later on. 3.1 A knockoff filter for the i.i.d. design Let the model be given by (1) as before. In this section, unless otherwise secified, it suffices to assume that the entries of X are i.i.d. from an arbitrary known continuous distribution G (not necessarily normal). Furthermore, unless otherwise indicated, in this section we treat n, as fixed and condition throughout on β. The definitions in the current section override the revious definitions, if there is any conflict. Let X R n r be a matrix with i.i.d. entries drawn from G indeendently of X and of all other variables, where r is a fixed ositive integer. Next, let X := [X X] R n (r) denote the augmented design matrix, and consider the Lasso solution for the augmented design, 1 β(λ) = argmin b R r 2 y Xb 2 λ b 1. (10) For simlicity we kee the notation β(λ) for the solution corresonding to the augmented design, although it is of course different from (2) (for one thing, (10) has r comonents versus for (2)). Let H = {1,..., }, H 0 = {j H : β j = 0}, K 0 = { 1,..., r} be the sets of indices corresonding to the original variables, the null variables and the knockoff variables, resectively. Note that, because we are conditioning on β, the set H 0 is nonrandom. Now define the statistics T j = su{λ : β j (λ) 0}, j = 1,..., r. (11) Informally, T j measures how early the jth variable enters the Lasso ath (the larger, the earlier). Let V 0 (λ) = {j H 0 : T j λ}, V 1 (λ) = {j K 0 : T j λ} be the number of true null and fake null variables, resectively, which enter the Lasso ath before time λ. Also, let R(λ) = {j H : T j λ} be the total number of original variables entering the Lasso ath before time λ. A visual reresentation of the quantities defined above is given in Figure 2. 6

7 nonnull: true null: fake null: R(t) V 0 (t)=6, R(s) V 0 (s)=7 V 0 (t)=3, V 0 (s)=5 V 1 (t)=2, V 1 (s)=3 0 s t T Figure 2: Reresentation of test statistics. ach T j is reresented by a marker: round markers corresond to original variables (H), triangles reresent fakes (knockoffs, K 0 ); black corresonds to nonnull (H \ H 0 ), red reresents null (true or fake, H 0 K 0 ). The class of rocedures we roose (henceforth, knockoff rocedures) reject the null hyotheses corresonding to {j H : T j λ}, where λ = inf λ Λ : (1 V 1(λ)) R(λ) H π 0 1 K 0 q (12) for an estimate π 0 of π 0 := H 0 / H and for a set Λ associated with π 0. Note that conditionally on β, π 0 is a (nonrandom) arameter, and in this section we seak of estimating π 0 rather than 1 ɛ. Hence the tests in this class are indexed by the estimate of π 0 (and by Λ). Denote by X j, j = 1,..., r, the columns of the augmented matrix X. To see why the rocedure makes sense, observe that, conditionally on β, the distribution of (X, y) is invariant under ermutations of the variables in the set {H 0 K 0 } (just by symmetry). Therefore, conditionally on {j H 0 K 0 : T j λ} = k, any subset of size k of {H 0 K 0 } is equally robable to be the selected set {j H 0 K 0 : T j λ}. It follows that Therefore, V 0 (λ)/ H 0 V 1 (λ)/ K 0. V 0 (λ) V 1 (λ) H π 0 K 0 V 1 (λ) H π 0 K 0 and hence (12) can be interreted as the smallest λ such that an estimate of FDP(λ) V 0(λ) 1 R(λ) is below q. The fact that the rocedure defined above, with π 0 = 1, controls the FDR, is a consequence of the more general result below. Theorem 3.1 (FDR control for knockoff rocedure with π 0 = 1). Let T j, j = 1,..., r be test statistics with a joint distribution that is invariant to ermutations in the set H 0 K 0 (that is, reordering the statistics with indices in the extended null set, leaves the joint distribution unchanged). Set V 0 (t) = {j H 0 : T j 7

8 t}, V 1 (t) = {j K 0 : T j t}, and R(t) = {j H : T j t}. Then the rocedure that rejects H j 0 if T j τ, where τ = inf t R : (1 V H 1(t)) 1 K 0 q (13) 1 R(t) controls the FDR at level q. In fact, [ ] V0 (τ) q H 0 1 R(τ) H. Proof. We will use the following lemma, which is roved in the Aendix. Lemma 3.2. Let X Hyer(n 0, n 1 ; m) be a hyergeometric random variable, i.e., ( n0 )( n1 ) P(X = k) = k m k ) with m > 0 and set Y = X/(1 m X). Then 2 In articular, for any value of m. Y = n 0 1 n 1 ( 1 ( n0 n 1 m Y n 0 1 n 1 ( n0 1 ( m n0 n 1 m We roceed to rove the theorem. Denote m 0 = H 0 and recall that H = and K 0 = r. Recall also that we are conditioning throughout on β (so that, in articular, H 0 is nonrandom). We will show that FDR control holds conditionally on β (it therefore also holds unconditionally). With the notation above, We have FDR = We will show that Consider the filtration [ ] V0 (τ) = 1 R(τ) { τ = inf t : (1 V 1(t)) 1 R(t) [ (1 V1 (τ)) 1r 1 R(τ) 1r ) ) ) V 0 (τ) (1 V 1 (τ)). } q. 1r ] [ V 0 (τ) q (1 V 1 (τ)) = q 1 r 1r [ V0 (τ) 1 V 1 (τ) [ ] V0 (τ) m 0 1 V 1 (τ) 1 r. (14) F t = σ ({V 0 (s)} 0 s t, {V 1 (s)} 0 s t, {R(s)} 0 s t ). That is, at all times s t, we have knowledge about the number of true null, fake null and nonnull variables with statistics larger than s. We claim that V 0 (t) 1 V 1 (t) 2 Here and below we adot the convention that ( 0 0) = 1, and ( n k) = 0 if n < 0 or if either k < 0 or k > n. 8 ] ].

9 is a suermartingale w.r.t. {F t }, and τ is a stoing time. Indeed, let t > s. We need to show that [ ] V0 (t) 1 V 1 (t) F s V 0(s) 1 V 1 (s). (15) The crucial observation is that, conditional on V 0 (s), V 1 (s) and V 0 (t) V 1 (t) = m, V 0 (t) Hyer(V 0 (s), V 1 (s); m). This follows from the exchangeability of the test statistics for the true and fake nulls (in other words, conditional on everything else, the order of the red markers in Figure 2 is random). Now, Lemma 3.2 assures us that [ V0 (t) 1 V 1 (t) V 0(s), V 1 (s), V 0 (t) V 1 (t) = m ] V 0(s) 1 V 1 (s) regardless of the value of m. Hence, the suermartingale roerty (15) holds (and it is also clear that τ is a stoing time). By the otional stoing theorem for suermartingales, [ V0 (t) 1 V 1 (t) ] [ V0 (0) 1 V 1 (0) ] = ( (16) [ ]) V0 (0) 1 V 1 (0) V 0(0) V 1 (0). (17) But conditional on V 0 (0) V 1 (0) = m we have that V 0 (0) Hyer(m 0, r; m), and so, invoking Lemma 3.2 once again, we have [ ] V0 (0) 1 V 1 (0) V 0(0) V 1 (0) = m m 0 1 r no matter the value of m. 3.2 stimating the roortion of nulls From our analysis, we see that FDR control would continue to hold if we relaced the number of hyotheses = H with the number of nulls m 0 = H 0 in the definition of τ. Of course, this quantity is unobservable, or, equivalently, π 0 = H 0 / H is unobservable. We can, however, estimate π 0 using a similar aroach to that of Storey (2002), excet that we relace calculations under the theoretical null with the counting of knockoff variables at the bottom of the Lasso ath: secifically, for any λ 0 we have {j H 0 : T j λ 0 } H 0 again due to the exchangeability of null features. It follows that {j K 0 : T j λ 0 }, K 0 H 0 K 0 {j H 0 : T j λ 0 } {j K 0 : T j λ 0 } ( K 0 1) 1 {j H 0 : T j λ 0 } {j K 0 : T j λ 0 } ( K 0 1) 1 {j H : T j λ 0 }. {j K 0 : T j λ 0 } For small λ 0, we exect only few non-nulls among {j H : T j λ 0 } so that {j H 0 : T j λ 0 } {j H : T j λ 0 }. Hence, we roose to estimate π 0 by π 0 = ( K 0 1) H We again state our result more generally. 1 {j H : T j λ 0 }. (18) {j K 0 : T j λ 0 } 9

10 Theorem 3.3 (FDR control for knockoff rocedure with an estimate of π 0 ). Set π 0 = ( K 0 1) H 1 {j H : T j t 0 } {j K 0 : T j t 0 } (19) ( π 0 is set to if the denominator vanishes). In the setting of Theorem 3.1, the rocedure that rejects H j 0 if T j τ, where τ = inf t t 0 : (1 V H π 1(t)) 0 1 K 0 q (20) 1 R(t) obeys FDR q. Proof. We use the same notation as in the roof of Theorem 3.1. As before, we have that [ ] V 0 (τ) FDR q (1 V 1 (τ)) π. 0 1r Again, the rocess {V 0 (t)/(1 V 1 (t))} t t0 starting at t 0, is a suermartingale w.r.t. {F t } t t0 and τ is a stoing time. Then the otional stoing theorem gives [ ] [ ] V 0 (τ) V 0 (t 0 ) (1 V 1 (τ)) π 0 1r (1 V 1 (t 0 )) π 0 1r [ ] V0 (t 0 ) = 1 V 1 (t 0 ) r V 1 (t 0 ) (21) 1 R(t 0 ) [ ] V0 (t 0 ) 1 V 1 (t 0 ) r V 1 (t 0 ). 1 m 0 V 0 (t 0 ) As before, V 0 (t 0 ) Hyer(m 0, r; m) conditional on V 0 (t 0 ) V 1 (t 0 ) = m. In the Aendix we rove that if a random variable X follows this distribution, then [ ] ( m0 )( ) r X 1 m X r X m m = 1 0 m m m 0 m ) (22) 1 m 0 X ( m0 r m In articular, for all m the result is no more than 1, thereby comleting the roof of the theorem. In ractice, we of course suggest to use the minimum between one and (18) as an estimator for π 0. While we cannot obtain a result similar to Theorem 3.3 when the truncated estimate of π 0 is used instead of (18), we can aeal to results from aroximate message assing, in the sirit of the calculations from Section 4, to obtain an aroximate asymtotic result. Thus, let π 0 = 1 ( ( K0 1) H 1 {j H : T j t 0 } {j K 0 : T j t 0 } and adot the working assumtions stated at the beginning of Section 4. As in the roof of the theorem above, [ ] V 0 (τ) FDR q (1 V 1 (τ)) π, 0 1r 10 ).

11 and by the same reasoning as above, [ ] [ ] V 0 (τ) V 0 (t 0 ) (1 V 1 (τ)) π 0 1r (1 V 1 (t 0 )) π 0 1r [ ( V0 (t 0 ) r = 1 V 1 (t 0 ) max V1 (t 0 ) 1 R(t 0 ), 1 r )] [ ( V0 (t 0 ) r 1 V 1 (t 0 ) max V1 (t 0 ) 1 m 0 V 0 (t 0 ), 1 r )]. Now, as in equation (26) below, we have (23) V 0 (t 0 ) = {j H 0 : T j t 0 } {j H 0 : β j (t 0 ) 0, β j = 0} 2(1 ɛ )Φ( α 1), where ɛ = ɛ/(1 ρ) and where α 1 is the same as in equation (31), when relacing λ 0 with t 0. Also, using equation (27), we have V 1 (t 0 ) r = {j K 0 : T j t 0 } r {j K 0 : β j (t 0 ) 0} r 2Φ( α 1). Treating the aroximations above as equalities (in Section 4 we argue that these should be good aroximations), we have / V 0 (t 0 ) 1 V 1 (t 0 ) r 2(1 ɛ )Φ( α 1 ) 2Φ( α 1 ) = 1 ɛ and Hence, / r V 1 (t 0 ) r 1 m 0 V 0 (t 0 ) [1 2Φ( α 1 )] (1 ɛ )[1 2Φ( α 1 )] = 1 / 1 ɛ, 1 r r 1. ( V 0 (t 0 ) r 1 V 1 (t 0 ) max V1 (t 0 ) 1 m 0 V 0 (t 0 ), 1 r ) We conclude that, u to the aroximations above, 4 Power Analysis lim su FDR q. n, (1 ɛ 1 ) max(, 1) = 1. 1 ɛ The main observation in this aer is that we can use the consequences of Theorem 1.5 in Bayati and Montanari (2012) to obtain an asymtotic TPP-FDP tradeoff diagram for the Lasso solution associated with the knockoff setu; that is, the Lasso solution corresonding to the augmented design X. Indeed, the working assumtions required for alying Lemma 2 continue to hold, only with a slightly different set of arameters. Recall that X R n (r) has i.i.d. N (0, 1/n) entries. Taking r = ρ for a constant ρ > 0, we have that n/( r) δ := δ/(1 ρ) as n,. Furthermore, because Y is related to X through Y = X where 0 = (0,..., 0) T R r, we have that the emirical distribution of the coefficients corresonding to the augmented design converges exactly to { Π Π, w.. ɛ, = 0, w.. 1 ɛ (24), 11 [ β 0 ],

12 where ɛ := ɛ/(1 ρ) = lim{ɛ/( ρ )}. Fortunately, it is only the emirical distribution of the coefficient vector that needs to have a limit of the form (24), for Lemma 2.1 to follow from Theorem 1.5 of Bayati and Montanari (2012); see also the remarks in Section 2 of Su et al. (2017). The working hyothesis of Section 2 therefore holds with δ and ɛ relaced by δ and ɛ. Recalling the definitions of H, H 0 and K 0 from Section 3, a counterart of Lemma 2.1 asserts that {j H : β j (λ) 0, β j = 0} {j H : β j (λ) 0, β j 0} 2(1 ɛ)φ( α ), (25) ɛp( Π τ W > α τ ) (26) and {j K 0 : β j (λ) 0} r 2Φ( α ), (27) where β j (λ) is the solution to (10); W is a N (0, 1) variable indeendent of Π; and τ > 0, α > max{α 0, 0} is the unique solution to (4) when δ is relaced by δ and ɛ is relaced by ɛ. Hence, FDP(λ) = {j H 0 : T j λ} 1 {j H : T j λ} = {j H 0 : β j (λ ) 0 for some λ λ} 1 {j H : β j (λ ) 0 for some λ λ} {j H 0 : β j (λ) 0} 1 {j H : β j (λ) 0} 2(1 ɛ)φ( α ) 2(1 ɛ)φ( α ) ɛp( Π τ W > α τ ) FDP aug(t). (28) Above, we allowed ourselves to aroximate {j H : β j (λ ) 0 for some λ λ} with {j H : β j (λ) 0}, and likewise for H 0. If least-angle regression (fron et al., 2004) were used instead of the Lasso, the two quantities would have been equal; but because the Lasso solution ath is considered here, we cannot comletely rule out the event in which a variable that entered the ath dros out, and the former quantity is in general larger. Still, we exect the difference to be very small, at least for large enough values of λ (corresonding to small FDP). This is verified by the simulation of Figure 3: the green circles, comuted using the statistics T j, are very close to the red circles, comuted when rejections corresond to variables with β j (λ) 0. All three simulation examles were comuted for a single realization on the original n design (not the augmented design). Similar results were obtained when we reeated the exeriment (i.e., for a different realization of the data). Similarly, TPP aug (λ) {j H \ H 0 : T j λ} 1 {j H : β j 0} {j H \ H 0 : β j (λ) 0} 1 {j H : β j 0} P( Π τ W > α τ ) TPP aug(λ) (29) 12

13 ε= TPP ε= FDP 0.4 FDP FDP ε= TPP TPP Figure 3: Discreancy between testing rocedures for three different values of. In all three examles n = = 5000, Π ex(1), and σ = 0.5; and TPP is lotted against FDP. Green circles corresond to the rocedure that uses the statistics Tj in (11). Red circles corresond to the rocedure that rejects when βbj (λ) 6= 0. Black circles are theoretical redictions from Section 4 with δ = 1. ach curve is is comuted from a single realization of the data, and uses an n design matrix (i.e., not the augmented design). and [ aug (λ) FDP (1 {j K0 : Tj λ} ) H b π0 1 K0 {j H : Tj λ} (1 {j K0 : βbj (λ) 6= 0} ) 2Φ( α0 ) 2(1 )Φ( α0 ) P( Π τ 0 W > α0 τ 0 ) H b π0 1 K0 {j H : βbj (λ) 6= 0} (30) lim π b0, where W, τ 0 > 0, α0 > max{α00, 0} are as before, and where π b0 is any estimate of π0 that has a limit. Taking π b0 to be the minimum between one and (19), we have that the limit in (30) can be aroximated by 2Φ( α0 ) 2(1 )Φ( α0 ) P( Π τ 0 W > α0 τ 0 ) [P( Π τ20 W > α20 τ20 ) P( Π τ10 W > α10 τ10 )] [ aug (λ), 1 1 FDP 0 0 2[Φ( α2 ) Φ( α1 )] (31) where τ10 > 0, α10 > max{α00, 0} is the unique solution to (4) when λ = λ0 and when δ is relaced by δ 0 and is relaced by 0 ; and where τ20 > 0, α20 > max{α00, 0} is the limit of the unique solution to (4) when λ 0 also with δ 0 (res. 0 ) in lieu of δ (res. ). Indeed, {j H : 0 < Tj < λ0 } {j H : Tj > 0} {j H : Tj λ0 } = {j H : βbj (0 ) 6= 0} {j H : βbj (λ0 ) 6= 0} 0 2(1 )Φ( α2 ) P( Π τ20 W > α20 τ20 ) (32) 2(1 )Φ( α10 ) P( Π τ10 W > α10 τ10 ) and, by a similar calculation, {j K0 : 0 < Tj λ0 } 2Φ( α20 ) 2Φ( α10 ); K0 13 (33)

14 δ=2, ε=0.2 δ=1, ε=0.2 FDP q=0.3 q=0.1 q=0.5 FDP q=0.1 q=0.3 q= TPP TPP δ=2, ε=0.4 δ=1, ε=0.4 FDP q=0.1 q=0.3 q=0.5 FDP q=0.1 q=0.3 q= TPP TPP Figure 4: Tradeoff diagrams for Π ex(1). σ = 0.5; ρ = 1. Different anels corresond to different combinations of δ and ɛ, as aears in the title of each of the lots. The black (res. light blue) curve corresonds to the oracle (res. knockoffs). Markers denote the air (t, fd) of asymtotic ower and tye I error, attained for three examle values of q: black for oracle, light blue for knockoff. The knockoff rocedure loses a little bit of ower due to the estimate of 1 ɛ (roortion of true nulls). 14

15 Theoretical: Π * = δ 50 Theoretical: Π * = ex(1) FDP q=0.1 q=0.5 q=0.3 FDP q=0.1 q=0.3 q= TPP TPP Theoretical: Π * = 0.7N(0,1)0.3N(2,1) Theoretical: Π * = 0.5δ δ 50 FDP q=0.1 q=0.3 q=0.5 FDP q=0.5 q=0.3 q= TPP TPP Figure 5: Tradeoff diagrams for different Π, with ɛ = 0.2, δ = 1 and σ = 0.5. using (32) and (33) to comute the limit of (19), we obtain (31). As λ > 0 varies, a arametric curve (FDP aug(λ), TPP aug(λ)) is traced which secifies the FDP level attainable at a secific TPP level for the Lasso solution corresonding to the augmented design, as q Π aug(tpp aug(λ); ɛ, δ, σ) = FDP aug(λ). (34) Figure 4 shows this curve against the analogous curve (FDP (λ), TPP (λ)), associated with the original design X and the oracle rocedure, when Π is exonential with mean 1. The different anels corresond to different combinations of δ and ɛ. The roortion ρ was taken to be 1 in all scenarios, in other words, the number of knockoff variables r was taken to be equal to the number of original variables ; we observed only very slight differences in the curves when ρ was varied between 0.1 and 1. Figure 5 shows the two curves, (FDP (λ), TPP (λ)) and (FDP aug(λ), TPP aug(λ)), for different distributions Π. In all situations deicted in Figures 4 and 5, the light blue curve, corresonding to the augmented setu, lies very close to the black curve, which corresonds to the original design. Hence, augmenting the design with knockoffs seems to have little effect on the tradeoff diagram, or, the achievable ower for a given FDP level, at least for FDP levels small enough to be of interest. For examle, if we set q = 0.05, then there exist values λ and t such that FDP (λ) = FDP aug(λ ) = 0.05 and TPP (λ) TPP aug(λ ). Of course, 15

16 the knockoff rocedure achieves this without knowledge of ɛ and Π, while the oracle relies on knowing these quantities in choosing λ. Note, still, that for a given inut q, the knockoff rocedure essentially (asymtotically) chooses λ such that FDP aug(λ ), not FDP aug(λ ), is equal to q. Because the former uses a conservative estimate of 1 ɛ, FDP aug(λ ) is usually larger than FDP aug(λ ) and, consequently, the ower attained by the knockoff rocedure is a little lower. This ga is deicted by the markers in the lots, each air of blue and black markers corresonding to a articular value of q. For examle, for δ = 1 and ɛ = 0.2, oracle ower is and knockoffs ower is 0.18 at q = 0.1. The value of t 0 used is 0.1. Figure 1 gives an alternative reresentation (see cation). 5 Prediction Armed with a rocedure that controls the FDR for the model (1), we now exlore utilizing knockoffs for estimation uroses (the rediction error and the estimation error are equivalent here because we have uncorrelated redictors). Our ursuit is to a large degree insired by the work of Abramovich et al. (2006), which for the first time rigorously linked FDR control and estimation error. At the same time, the working assumtions we adot in the current aer allow us to take further advantage of the results of Bayati and Montanari (2012), namely, to asymtotically analyze not only the ower or the knockoff rocedure but also the estimation error associated with it. As in the receding analysis of testing errors, the (normalized) risk of the Lasso estimator at a fixed λ tends in robability to a simle limit. Secifically, by Corollary 1.6 of Bayati and Montanari (2012), we have that, for a given air of ɛ and Π, and for fixed λ, lim 1 [ β(λ) β 2 (η ατ (Π τw ) Π) 2] R (λ; Π, ɛ), (35) where W N (0, 1) indeendently of Π, and τ > 0, α > max{α 0, 0} is the unique solution to (4). Because we have a tractable form for the limiting risk, we can minimize it over λ to obtain an oracle choice for λ, λ OL(Π, ɛ) = argmin R (λ; Π, ɛ). (36) λ The oracle choice of λ deends on Π and on ɛ and therefore does not roduce a legitimate estimator for β, but it sets a benchmark for evaluating the erformance of a rocedure that lugs in any (ossibly datadeendent) value for λ. Consider now a fixed q and the knockoff rocedure from Section 3.2, which selects the j-th variable if T j τ, where τ is given by (12) and π 0 by (19) and Λ = [t 0, ). This is a legal variable selection rocedure (it does not deend on Π or on ɛ) with the FDR control guarantees of Theorem 3.3. In the limit as n,, it corresonds to a choice of λ defined by FDP aug(λ) = q where FDP aug(λ) is given in (31). Call this the knockoff choice for λ and denote it by λ KO (q). Then the quantity { R (λ KO ν(q) su (q); } Π, ɛ) Π R (λ OL (Π, ɛ); Π = su, ɛ) Π { R (λ KO (q); } Π, ɛ) inf λ R (λ; Π, ɛ) is the worst-case (in Π ) inflation of the asymtotic risk due to using λ KO (q) instead of the otimal choice for λ (our use of the term risk inflation alludes to the similarity to the criterion roosed in Foster and George, 1994). It is not clear that ν(q) is at all finite; but if it were true that there exists a choice of q such that for all ɛ, δ, σ, ν(q) is bounded by 1 for small, it would be rather reassuring. ven for fixed ɛ, δ and σ, comuting (37) is not trivial because we need to maximize the ratio over all distributions Π ; neither were we able to bound (37) in general. The roblem is much simler in the case 16 (37)

17 where Π is a oint mass at some nonzero (ositive) value λ, because in this case the risk inflation is a univariate function. In Figure 6 we lot the risk inflation as a function of λ for ɛ = 0.1 and δ = 1, σ = 0.5. The maximum risk inflation is and attained at t = 1.9, which is quite interesting as this is a case of neither very low nor very high signal to noise ratio. Changing ɛ from 0.1 to 0.05 yielded maximum risk inflation of only risk inflation ε= t Figure 6: Risk inflation for oint mass at λ, as a function of λ. ɛ = 0.1, δ = 1, σ = 0.5. The maximum is attained at t = 1.9. We continue our investigation with an emirical study. Hence, we chose a large number of members of a flexible arametric family of distributions, and evaluate the risk inflation on each. Secifically, we took δ = 1, σ = 1 and ɛ = 0.1, and consider the family of mixtures of Gamma distributions { 8 } 8 w i Gam(α i ) : 0 w i 1, w i = 1 i=1 where α = (0.1, 0.8, 1.5, 2.2, 2.9, 3.6, 4.3, 5.0) and where Gam(α) is the Gamma distribution with shae arameter α and rate one. We then consider a restriction of the original family, namely, the subset { 8 (u i / } u j )Gam(α i ) : u i {0, 1, 2, 3, 4} i=1 which includes 383,809 unique members. A few of these distributions are lotted in color in Figure 7 (the black curves corresond to distributions which belong to the original mixture family but not the restricted one). Next, for each of these 383,809 distributions Π we comute the ratio R (λ KO (q); Π, ɛ) inf λ R (λ; Π, ɛ) numerically. For q = 0.7, this ratio was always below Figure 8 shows a histogram of the 383,809 values: the distribution with the maximum risk inflation aears in Figure 7 in red; the runner-us look similar. We also comuted the ratio for some distributions that belong to the original mixture family but not the restricted one, and these are shown in black in Figure 7; the ratio for both was less than To summarize, for q = 0.7 the risk inflation did not exceed 17% in any of our examles. i=1 17

18 density Figure 7: Gamma mixture distributions. The colored lines reresent distributions in the restricted family of 383,809 distributions. Black lines are distributions that belong to the original mixture family, but not the restricted one. The red curve corresonds to the distribution that attained the maximum ratio among the 383,809 distributions. Numbers in legend are risk inflation values corresonding to the different distributions in the lot. Frequency ratio Figure 8: Histogram of the 383,809 comuted risk ratios. 6 Discussion The Lasso is nowadays frequently used for selecting variables in a linear model. Here we have considered the controlled variable selection roblem, where a valid rocedure must satisfy that the exected roortion 18

19 of incorrectly selected variables (the FDR) is ket below a re-secified level q. This connects our work to that of Tibshirani et al. (2016) and Fithian et al. (2015), who consider hyothesis testing for rocedures which sequentially select variables into a model. However, the null hyothesis tested in these aers is different, corresonding to an incremental test versus a full-model test; see the elaborate and illuminating discussion in Fithian et al. (2015). We have roosed an adatation of the Lasso-based knockoff rocedure of Barber and Candès (2015), which has rovable FDR guarantees. Under the working assumtions, which include a Gaussian i.i.d. design, it is ossible to analyze the asymtotic tradeoff between the false discovery roortion and ower of the roosed rocedure. Also, while the i.i.d. N (0, 1) design is erhas not realistic, our aim is to comare knockoffs against the best ossible mechanism for selection over the lasso ath, and such a mechanism has been develoed exlicitly only for the i.i.d. N (0, 1) setting. We have demonstrated that the roosed rocedure comes close in erformance to an oracle rocedure, which uses knowledge about the underlying signal to choose the Lasso tuning arameter in such a way that FDP is controlled. Moving away from the testing roblem, we have also demonstrated that FDR calibration with knockoffs has otential in achieving good rediction erformance, adatively in the sarsity ɛ and in Π. In this article the focus was on rocedures tied to the Lasso ath. There is certainly room for investigating the use of other test statistics to measure feature imortance. Using the Lasso statistics T j in (11) for the knockoff rocedure might be subotimal, as can be seen in Figure 5 and as was imlied in Su et al. (2017): in the Lasso ath false discoveries are intersersed between true ones. As a matter of fact, an otimal oracle rocedure exists for the setu considered in the current aer. Namely, the rocedure which rejects the hyotheses with T j τ where T j = P(β j = 0 y, X) is the local false discovery rate at (y, X) and where τ is selected so that the rocedure has FDR control at q. Candès et al. (2016) have already roosed to use similar Bayesian statistics. Note that Tj defined above deends on Π and ɛ, hence we associate it with an oracle. It would be interesting to think if there is a way to mimic this oracle with a comutationally feasible emirical Bayes rocedure. ven if we commit to Lasso statistics, it might be ossible to increase ower by ordering the variables differently. For examle, Candès et al. (2016) suggested to use the absolute value of the Lasso coefficient, evaluated at the cross-validated λ, to measure feature imortance. We leave for future work the investigation of the otential gain in ower in using alternative knockoff statistics. Acknowledgements. C. was artially suorted by the Office of Naval Research under grant N , by the National Science Foundation via DMS , and by the Math X Award from the Simons Foundation. A. W. is suorted by the same Math X Award. A Proofs Proof of Lemma 3.2. By definition, Y = 1 k m Assuming n 0, n 1 > 0, for 1 k m we have ( ) ( ) n0 n0 1 k = n 0, k k 1 ( n0 k 1 m k 1 1 m k 19 )( ) n1 k m k ) ( n0 n 1 m ( ) n1 = 1 ( ) n1 1. m k n 1 1 m k 1

20 Therefore, Y = n 0 1 n 1 1 ( n0 n 1 m ) 1 k m ( ) ( ) ( n0 1 n1 1 = n 0 1 k 1 m k 1 1 n 1 ( n0 1 ( m n0 n 1 m If n 0 = 0, then X = Y = 0. If n 1 = 0, then X = Y = m. In both cases the claimed result still holds. ) ) ). Proof of identity (22). Assuming m 0, r > 0, we have ( = = X 1 m X m m 0 ) r X m = 1 m 0 X m 0! (k 1)!(m 0 k 1)! k=1 (m r1) m m 0 ( m0 )( r k 1 m k1 ( m0 r k=1 (m r1) m ) ) = 1 m m 0 k=1 (m r1) k 1 m k r k m 1 m 0 k ( m0 k )( r ) m k ) ( m0 r m r! (m k 1)!(r m k 1)! m!(m 0 r m)! (m 0 r)! )( ) r ( m0 m 0 m ( m0 r m m m 0 m If m 0 = 0, then X = 0 and the identity still holds. If r = 0, then X = m and the identity again continues to holds. ) B Comuting the curve q Π To comute q Π (TPP (λ); ɛ, δ, σ) for secific Π and ɛ, we need to be able to comute the air (FDP (λ), TPP (λ)) for each λ > 0. This, in turn, requires to find for each λ the air (α, τ) given by the system of equations (4). Hence, the functionals and f(π, ɛ) := (η ατ (Π τw ) Π) 2 = (1 ɛ)(η ατ (τw )) 2 ɛ(η ατ (Π τw ) Π ) 2 g(π, ɛ) := P( Π τw > ατ) = 2(1 ɛ)φ( α) ɛp( Π τw > ατ), which aear in (4), are evaluated numerically for a given CDF F Π of Π. For a fixed value of λ, our code takes as an inut ɛ and a CDF F Π of Π (as well as δ and σ 2 ) and returns a solution for (α, τ) in (4). Thus, for given values of α and τ, we need to be able to comute f(π, ɛ) and g(π, ɛ) defined in Section 2, as functionals of F Π. We exlain how we imlement this. Let R(µ) := (η ατ (µ τw ) µ) 2 and Q(µ) := P ( µ τw > ατ). Writing f(π, ɛ) as a Riemann- Stieltjes integral and using integration by arts, we have f(π, ɛ) := (η ατ (Π τw ) Π) 2 = = R( )F Π ( ) R( )F Π ( ) = R( )F Π ( ) R( )F Π ( ) = 1 α 2 τ 2 R(µ) df Π F Π (µ) dr(µ) F Π (µ)r (µ)dµ F Π (µ)[2µ (Φ (α µ/τ)) (Φ ( α µ/τ))]dµ 20

21 where we substituted a closed-form exression for the derivative of the risk of soft thresholding (see, e.g., Johnstone, 2011, Ch. 2.7). The last exression is evaluated using numerical integration. Similarly, g(π, ɛ) := P( Π τw > ατ) = Q(µ) df Π = Q( )F Π ( ) Q( )F Π ( ) = Q( )F Π ( ) Q( )F Π ( ) = 1 1/τ F Π (µ) dq(µ) F Π (µ)q (µ) dµ F Π (µ)[φ(α µ/τ) φ( α µ/τ)] dµ, and, again, the last exression is evaluated using numerical integration. References Felix Abramovich, Yoav Benjamini, David L Donoho, and Iain M Johnstone. Secial invited lecture: adating to unknown sarsity by controlling the false discovery rate. The Annals of Statistics, ages , Rina Foygel Barber and mmanuel J Candès. Controlling the false discovery rate via knockoffs. The Annals of Statistics, 43(5): , Rina Foygel Barber and mmanuel J Candès. A knockoff filter for high-dimensional selective inference. arxiv rerint arxiv: , Mohsen Bayati and Andrea Montanari. The lasso risk for gaussian matrices. I Transactions on Information Theory, 58(4): , Małgorzata Bogdan, wout van den Berg, Weijie Su, and mmanuel J Candès. Sulementary materials for statistical estimation and testing via the sorted l1 norm Małgorzata Bogdan, wout van den Berg, Chiara Sabatti, Weijie Su, and mmanuel J Candès. Sloe adative variable selection via convex otimization. The annals of alied statistics, 9(3):1103, mmanuel Candès, Yingying Fan, Lucas Janson, and Jinchi Lv. Panning for gold: Model-free knockoffs for high-dimensional controlled variable selection. arxiv rerint arxiv: , Ran Dai and Rina Foygel Barber. The knockoff filter for fdr control in grou-sarse and multitask regression. arxiv rerint arxiv: , Bradley fron, Trevor Hastie, Iain Johnstone, Robert Tibshirani, et al. Least angle regression. The Annals of statistics, 32(2): , William Fithian, Jonathan Taylor, Robert Tibshirani, and Ryan Tibshirani. Selective sequential model selection. arxiv rerint arxiv: , Dean P Foster and dward I George. The risk inflation criterion for multile regression. The Annals of Statistics, ages ,

22 Iain M Johnstone. Gaussian estimation: Sequence and wavelet models. Unublished manuscrit, John D Storey. A direct aroach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3): , Weijie Su, mmanuel Candès, et al. Sloe is adative to unknown sarsity and asymtotically minimax. The Annals of Statistics, 44(3): , Weijie Su, Małgorzata Bogdan, mmanuel Candès, et al. False discoveries occur early on the lasso ath. The Annals of Statistics, 45(5): , Ryan J Tibshirani, Jonathan Taylor, Richard Lockhart, and Robert Tibshirani. xact ost-selection inference for sequential regression rocedures. Journal of the American Statistical Association, 111(514): ,

Elementary Analysis in Q p

Elementary Analysis in Q p Elementary Analysis in Q Hannah Hutter, May Szedlák, Phili Wirth November 17, 2011 This reort follows very closely the book of Svetlana Katok 1. 1 Sequences and Series In this section we will see some

More information

Supplementary Materials for Robust Estimation of the False Discovery Rate

Supplementary Materials for Robust Estimation of the False Discovery Rate Sulementary Materials for Robust Estimation of the False Discovery Rate Stan Pounds and Cheng Cheng This sulemental contains roofs regarding theoretical roerties of the roosed method (Section S1), rovides

More information

4. Score normalization technical details We now discuss the technical details of the score normalization method.

4. Score normalization technical details We now discuss the technical details of the score normalization method. SMT SCORING SYSTEM This document describes the scoring system for the Stanford Math Tournament We begin by giving an overview of the changes to scoring and a non-technical descrition of the scoring rules

More information

Estimation of the large covariance matrix with two-step monotone missing data

Estimation of the large covariance matrix with two-step monotone missing data Estimation of the large covariance matrix with two-ste monotone missing data Masashi Hyodo, Nobumichi Shutoh 2, Takashi Seo, and Tatjana Pavlenko 3 Deartment of Mathematical Information Science, Tokyo

More information

Notes on Instrumental Variables Methods

Notes on Instrumental Variables Methods Notes on Instrumental Variables Methods Michele Pellizzari IGIER-Bocconi, IZA and frdb 1 The Instrumental Variable Estimator Instrumental variable estimation is the classical solution to the roblem of

More information

Hotelling s Two- Sample T 2

Hotelling s Two- Sample T 2 Chater 600 Hotelling s Two- Samle T Introduction This module calculates ower for the Hotelling s two-grou, T-squared (T) test statistic. Hotelling s T is an extension of the univariate two-samle t-test

More information

General Linear Model Introduction, Classes of Linear models and Estimation

General Linear Model Introduction, Classes of Linear models and Estimation Stat 740 General Linear Model Introduction, Classes of Linear models and Estimation An aim of scientific enquiry: To describe or to discover relationshis among events (variables) in the controlled (laboratory)

More information

Monte Carlo Studies. Monte Carlo Studies. Sampling Distribution

Monte Carlo Studies. Monte Carlo Studies. Sampling Distribution Monte Carlo Studies Do not let yourself be intimidated by the material in this lecture This lecture involves more theory but is meant to imrove your understanding of: Samling distributions and tests of

More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell February 10, 2010 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

Feedback-error control

Feedback-error control Chater 4 Feedback-error control 4.1 Introduction This chater exlains the feedback-error (FBE) control scheme originally described by Kawato [, 87, 8]. FBE is a widely used neural network based controller

More information

Information collection on a graph

Information collection on a graph Information collection on a grah Ilya O. Ryzhov Warren Powell October 25, 2009 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management

More information

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)]

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)] LECTURE 7 NOTES 1. Convergence of random variables. Before delving into the large samle roerties of the MLE, we review some concets from large samle theory. 1. Convergence in robability: x n x if, for

More information

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split A Bound on the Error of Cross Validation Using the Aroximation and Estimation Rates, with Consequences for the Training-Test Slit Michael Kearns AT&T Bell Laboratories Murray Hill, NJ 7974 mkearns@research.att.com

More information

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test)

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test) Chater 225 Tests for Two Proortions in a Stratified Design (Cochran/Mantel-Haenszel Test) Introduction In a stratified design, the subects are selected from two or more strata which are formed from imortant

More information

Approximating min-max k-clustering

Approximating min-max k-clustering Aroximating min-max k-clustering Asaf Levin July 24, 2007 Abstract We consider the roblems of set artitioning into k clusters with minimum total cost and minimum of the maximum cost of a cluster. The cost

More information

On split sample and randomized confidence intervals for binomial proportions

On split sample and randomized confidence intervals for binomial proportions On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have

More information

MATH 2710: NOTES FOR ANALYSIS

MATH 2710: NOTES FOR ANALYSIS MATH 270: NOTES FOR ANALYSIS The main ideas we will learn from analysis center around the idea of a limit. Limits occurs in several settings. We will start with finite limits of sequences, then cover infinite

More information

A Note on Guaranteed Sparse Recovery via l 1 -Minimization

A Note on Guaranteed Sparse Recovery via l 1 -Minimization A Note on Guaranteed Sarse Recovery via l -Minimization Simon Foucart, Université Pierre et Marie Curie Abstract It is roved that every s-sarse vector x C N can be recovered from the measurement vector

More information

Asymptotically Optimal Simulation Allocation under Dependent Sampling

Asymptotically Optimal Simulation Allocation under Dependent Sampling Asymtotically Otimal Simulation Allocation under Deendent Samling Xiaoing Xiong The Robert H. Smith School of Business, University of Maryland, College Park, MD 20742-1815, USA, xiaoingx@yahoo.com Sandee

More information

State Estimation with ARMarkov Models

State Estimation with ARMarkov Models Deartment of Mechanical and Aerosace Engineering Technical Reort No. 3046, October 1998. Princeton University, Princeton, NJ. State Estimation with ARMarkov Models Ryoung K. Lim 1 Columbia University,

More information

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO) Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment

More information

A Qualitative Event-based Approach to Multiple Fault Diagnosis in Continuous Systems using Structural Model Decomposition

A Qualitative Event-based Approach to Multiple Fault Diagnosis in Continuous Systems using Structural Model Decomposition A Qualitative Event-based Aroach to Multile Fault Diagnosis in Continuous Systems using Structural Model Decomosition Matthew J. Daigle a,,, Anibal Bregon b,, Xenofon Koutsoukos c, Gautam Biswas c, Belarmino

More information

Distributed Rule-Based Inference in the Presence of Redundant Information

Distributed Rule-Based Inference in the Presence of Redundant Information istribution Statement : roved for ublic release; distribution is unlimited. istributed Rule-ased Inference in the Presence of Redundant Information June 8, 004 William J. Farrell III Lockheed Martin dvanced

More information

Convex Optimization methods for Computing Channel Capacity

Convex Optimization methods for Computing Channel Capacity Convex Otimization methods for Comuting Channel Caacity Abhishek Sinha Laboratory for Information and Decision Systems (LIDS), MIT sinhaa@mit.edu May 15, 2014 We consider a classical comutational roblem

More information

arxiv: v2 [stat.me] 3 Nov 2014

arxiv: v2 [stat.me] 3 Nov 2014 onarametric Stein-tye Shrinkage Covariance Matrix Estimators in High-Dimensional Settings Anestis Touloumis Cancer Research UK Cambridge Institute University of Cambridge Cambridge CB2 0RE, U.K. Anestis.Touloumis@cruk.cam.ac.uk

More information

Estimation of Separable Representations in Psychophysical Experiments

Estimation of Separable Representations in Psychophysical Experiments Estimation of Searable Reresentations in Psychohysical Exeriments Michele Bernasconi (mbernasconi@eco.uninsubria.it) Christine Choirat (cchoirat@eco.uninsubria.it) Raffaello Seri (rseri@eco.uninsubria.it)

More information

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests

System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests 009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract

More information

Robustness of classifiers to uniform l p and Gaussian noise Supplementary material

Robustness of classifiers to uniform l p and Gaussian noise Supplementary material Robustness of classifiers to uniform l and Gaussian noise Sulementary material Jean-Yves Franceschi Ecole Normale Suérieure de Lyon LIP UMR 5668 Omar Fawzi Ecole Normale Suérieure de Lyon LIP UMR 5668

More information

Lecture 6. 2 Recurrence/transience, harmonic functions and martingales

Lecture 6. 2 Recurrence/transience, harmonic functions and martingales Lecture 6 Classification of states We have shown that all states of an irreducible countable state Markov chain must of the same tye. This gives rise to the following classification. Definition. [Classification

More information

An Analysis of Reliable Classifiers through ROC Isometrics

An Analysis of Reliable Classifiers through ROC Isometrics An Analysis of Reliable Classifiers through ROC Isometrics Stijn Vanderlooy s.vanderlooy@cs.unimaas.nl Ida G. Srinkhuizen-Kuyer kuyer@cs.unimaas.nl Evgueni N. Smirnov smirnov@cs.unimaas.nl MICC-IKAT, Universiteit

More information

Analysis of some entrance probabilities for killed birth-death processes

Analysis of some entrance probabilities for killed birth-death processes Analysis of some entrance robabilities for killed birth-death rocesses Master s Thesis O.J.G. van der Velde Suervisor: Dr. F.M. Sieksma July 5, 207 Mathematical Institute, Leiden University Contents Introduction

More information

Morten Frydenberg Section for Biostatistics Version :Friday, 05 September 2014

Morten Frydenberg Section for Biostatistics Version :Friday, 05 September 2014 Morten Frydenberg Section for Biostatistics Version :Friday, 05 Setember 204 All models are aroximations! The best model does not exist! Comlicated models needs a lot of data. lower your ambitions or get

More information

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Journal of Modern Alied Statistical Methods Volume Issue Article 7 --03 A Comarison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Ghadban Khalaf King Khalid University, Saudi

More information

COMMUNICATION BETWEEN SHAREHOLDERS 1

COMMUNICATION BETWEEN SHAREHOLDERS 1 COMMUNICATION BTWN SHARHOLDRS 1 A B. O A : A D Lemma B.1. U to µ Z r 2 σ2 Z + σ2 X 2r ω 2 an additive constant that does not deend on a or θ, the agents ayoffs can be written as: 2r rθa ω2 + θ µ Y rcov

More information

Cryptanalysis of Pseudorandom Generators

Cryptanalysis of Pseudorandom Generators CSE 206A: Lattice Algorithms and Alications Fall 2017 Crytanalysis of Pseudorandom Generators Instructor: Daniele Micciancio UCSD CSE As a motivating alication for the study of lattice in crytograhy we

More information

Introduction to Probability and Statistics

Introduction to Probability and Statistics Introduction to Probability and Statistics Chater 8 Ammar M. Sarhan, asarhan@mathstat.dal.ca Deartment of Mathematics and Statistics, Dalhousie University Fall Semester 28 Chater 8 Tests of Hyotheses Based

More information

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III AI*IA 23 Fusion of Multile Pattern Classifiers PART III AI*IA 23 Tutorial on Fusion of Multile Pattern Classifiers by F. Roli 49 Methods for fusing multile classifiers Methods for fusing multile classifiers

More information

On Wald-Type Optimal Stopping for Brownian Motion

On Wald-Type Optimal Stopping for Brownian Motion J Al Probab Vol 34, No 1, 1997, (66-73) Prerint Ser No 1, 1994, Math Inst Aarhus On Wald-Tye Otimal Stoing for Brownian Motion S RAVRSN and PSKIR The solution is resented to all otimal stoing roblems of

More information

On the Toppling of a Sand Pile

On the Toppling of a Sand Pile Discrete Mathematics and Theoretical Comuter Science Proceedings AA (DM-CCG), 2001, 275 286 On the Toling of a Sand Pile Jean-Christohe Novelli 1 and Dominique Rossin 2 1 CNRS, LIFL, Bâtiment M3, Université

More information

Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process

Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process Using the Divergence Information Criterion for the Determination of the Order of an Autoregressive Process P. Mantalos a1, K. Mattheou b, A. Karagrigoriou b a.deartment of Statistics University of Lund

More information

Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models

Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Ketan N. Patel, Igor L. Markov and John P. Hayes University of Michigan, Ann Arbor 48109-2122 {knatel,imarkov,jhayes}@eecs.umich.edu

More information

High-dimensional Ordinary Least-squares Projection for Screening Variables

High-dimensional Ordinary Least-squares Projection for Screening Variables High-dimensional Ordinary Least-squares Projection for Screening Variables Xiangyu Wang and Chenlei Leng arxiv:1506.01782v1 [stat.me] 5 Jun 2015 Abstract Variable selection is a challenging issue in statistical

More information

Statics and dynamics: some elementary concepts

Statics and dynamics: some elementary concepts 1 Statics and dynamics: some elementary concets Dynamics is the study of the movement through time of variables such as heartbeat, temerature, secies oulation, voltage, roduction, emloyment, rices and

More information

The non-stochastic multi-armed bandit problem

The non-stochastic multi-armed bandit problem Submitted for journal ublication. The non-stochastic multi-armed bandit roblem Peter Auer Institute for Theoretical Comuter Science Graz University of Technology A-8010 Graz (Austria) auer@igi.tu-graz.ac.at

More information

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES

RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES AARON ZWIEBACH Abstract. In this aer we will analyze research that has been recently done in the field of discrete

More information

Yixi Shi. Jose Blanchet. IEOR Department Columbia University New York, NY 10027, USA. IEOR Department Columbia University New York, NY 10027, USA

Yixi Shi. Jose Blanchet. IEOR Department Columbia University New York, NY 10027, USA. IEOR Department Columbia University New York, NY 10027, USA Proceedings of the 2011 Winter Simulation Conference S. Jain, R. R. Creasey, J. Himmelsach, K. P. White, and M. Fu, eds. EFFICIENT RARE EVENT SIMULATION FOR HEAVY-TAILED SYSTEMS VIA CROSS ENTROPY Jose

More information

STA 250: Statistics. Notes 7. Bayesian Approach to Statistics. Book chapters: 7.2

STA 250: Statistics. Notes 7. Bayesian Approach to Statistics. Book chapters: 7.2 STA 25: Statistics Notes 7. Bayesian Aroach to Statistics Book chaters: 7.2 1 From calibrating a rocedure to quantifying uncertainty We saw that the central idea of classical testing is to rovide a rigorous

More information

Sums of independent random variables

Sums of independent random variables 3 Sums of indeendent random variables This lecture collects a number of estimates for sums of indeendent random variables with values in a Banach sace E. We concentrate on sums of the form N γ nx n, where

More information

Chapter 3. GMM: Selected Topics

Chapter 3. GMM: Selected Topics Chater 3. GMM: Selected oics Contents Otimal Instruments. he issue of interest..............................2 Otimal Instruments under the i:i:d: assumtion..............2. he basic result............................2.2

More information

Partial Identification in Triangular Systems of Equations with Binary Dependent Variables

Partial Identification in Triangular Systems of Equations with Binary Dependent Variables Partial Identification in Triangular Systems of Equations with Binary Deendent Variables Azeem M. Shaikh Deartment of Economics University of Chicago amshaikh@uchicago.edu Edward J. Vytlacil Deartment

More information

Universal Finite Memory Coding of Binary Sequences

Universal Finite Memory Coding of Binary Sequences Deartment of Electrical Engineering Systems Universal Finite Memory Coding of Binary Sequences Thesis submitted towards the degree of Master of Science in Electrical and Electronic Engineering in Tel-Aviv

More information

LOGISTIC REGRESSION. VINAYANAND KANDALA M.Sc. (Agricultural Statistics), Roll No I.A.S.R.I, Library Avenue, New Delhi

LOGISTIC REGRESSION. VINAYANAND KANDALA M.Sc. (Agricultural Statistics), Roll No I.A.S.R.I, Library Avenue, New Delhi LOGISTIC REGRESSION VINAANAND KANDALA M.Sc. (Agricultural Statistics), Roll No. 444 I.A.S.R.I, Library Avenue, New Delhi- Chairerson: Dr. Ranjana Agarwal Abstract: Logistic regression is widely used when

More information

t 0 Xt sup X t p c p inf t 0

t 0 Xt sup X t p c p inf t 0 SHARP MAXIMAL L -ESTIMATES FOR MARTINGALES RODRIGO BAÑUELOS AND ADAM OSȨKOWSKI ABSTRACT. Let X be a suermartingale starting from 0 which has only nonnegative jums. For each 0 < < we determine the best

More information

Stochastic integration II: the Itô integral

Stochastic integration II: the Itô integral 13 Stochastic integration II: the Itô integral We have seen in Lecture 6 how to integrate functions Φ : (, ) L (H, E) with resect to an H-cylindrical Brownian motion W H. In this lecture we address the

More information

Chapter 13 Variable Selection and Model Building

Chapter 13 Variable Selection and Model Building Chater 3 Variable Selection and Model Building The comlete regsion analysis deends on the exlanatory variables ent in the model. It is understood in the regsion analysis that only correct and imortant

More information

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit Chater 5 Statistical Inference 69 CHAPTER 5 STATISTICAL INFERENCE.0 Hyothesis Testing.0 Decision Errors 3.0 How a Hyothesis is Tested 4.0 Test for Goodness of Fit 5.0 Inferences about Two Means It ain't

More information

Towards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK

Towards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK Towards understanding the Lorenz curve using the Uniform distribution Chris J. Stehens Newcastle City Council, Newcastle uon Tyne, UK (For the Gini-Lorenz Conference, University of Siena, Italy, May 2005)

More information

Topic 7: Using identity types

Topic 7: Using identity types Toic 7: Using identity tyes June 10, 2014 Now we would like to learn how to use identity tyes and how to do some actual mathematics with them. By now we have essentially introduced all inference rules

More information

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP

CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP Submitted to the Annals of Statistics arxiv: arxiv:1706.07237 CONVOLVED SUBSAMPLING ESTIMATION WITH APPLICATIONS TO BLOCK BOOTSTRAP By Johannes Tewes, Dimitris N. Politis and Daniel J. Nordman Ruhr-Universität

More information

Bayesian Spatially Varying Coefficient Models in the Presence of Collinearity

Bayesian Spatially Varying Coefficient Models in the Presence of Collinearity Bayesian Satially Varying Coefficient Models in the Presence of Collinearity David C. Wheeler 1, Catherine A. Calder 1 he Ohio State University 1 Abstract he belief that relationshis between exlanatory

More information

ECE 534 Information Theory - Midterm 2

ECE 534 Information Theory - Midterm 2 ECE 534 Information Theory - Midterm Nov.4, 009. 3:30-4:45 in LH03. You will be given the full class time: 75 minutes. Use it wisely! Many of the roblems have short answers; try to find shortcuts. You

More information

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables Imroved Bounds on Bell Numbers and on Moments of Sums of Random Variables Daniel Berend Tamir Tassa Abstract We rovide bounds for moments of sums of sequences of indeendent random variables. Concentrating

More information

Positive decomposition of transfer functions with multiple poles

Positive decomposition of transfer functions with multiple poles Positive decomosition of transfer functions with multile oles Béla Nagy 1, Máté Matolcsi 2, and Márta Szilvási 1 Deartment of Analysis, Technical University of Budaest (BME), H-1111, Budaest, Egry J. u.

More information

Maxisets for μ-thresholding rules

Maxisets for μ-thresholding rules Test 008 7: 33 349 DOI 0.007/s749-006-0035-5 ORIGINAL PAPER Maxisets for μ-thresholding rules Florent Autin Received: 3 January 005 / Acceted: 8 June 006 / Published online: March 007 Sociedad de Estadística

More information

Shadow Computing: An Energy-Aware Fault Tolerant Computing Model

Shadow Computing: An Energy-Aware Fault Tolerant Computing Model Shadow Comuting: An Energy-Aware Fault Tolerant Comuting Model Bryan Mills, Taieb Znati, Rami Melhem Deartment of Comuter Science University of Pittsburgh (bmills, znati, melhem)@cs.itt.edu Index Terms

More information

Adaptive estimation with change detection for streaming data

Adaptive estimation with change detection for streaming data Adative estimation with change detection for streaming data A thesis resented for the degree of Doctor of Philosohy of the University of London and the Diloma of Imerial College by Dean Adam Bodenham Deartment

More information

Estimating function analysis for a class of Tweedie regression models

Estimating function analysis for a class of Tweedie regression models Title Estimating function analysis for a class of Tweedie regression models Author Wagner Hugo Bonat Deartamento de Estatística - DEST, Laboratório de Estatística e Geoinformação - LEG, Universidade Federal

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

Estimating Time-Series Models

Estimating Time-Series Models Estimating ime-series Models he Box-Jenkins methodology for tting a model to a scalar time series fx t g consists of ve stes:. Decide on the order of di erencing d that is needed to roduce a stationary

More information

Bayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis

Bayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis HIPAD LAB: HIGH PERFORMANCE SYSTEMS LABORATORY DEPARTMENT OF CIVIL AND ENVIRONMENTAL ENGINEERING AND EARTH SCIENCES Bayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis Why use metamodeling

More information

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK Comuter Modelling and ew Technologies, 5, Vol.9, o., 3-39 Transort and Telecommunication Institute, Lomonosov, LV-9, Riga, Latvia MATHEMATICAL MODELLIG OF THE WIRELESS COMMUICATIO ETWORK M. KOPEETSK Deartment

More information

On a Markov Game with Incomplete Information

On a Markov Game with Incomplete Information On a Markov Game with Incomlete Information Johannes Hörner, Dinah Rosenberg y, Eilon Solan z and Nicolas Vieille x{ January 24, 26 Abstract We consider an examle of a Markov game with lack of information

More information

Topic: Lower Bounds on Randomized Algorithms Date: September 22, 2004 Scribe: Srinath Sridhar

Topic: Lower Bounds on Randomized Algorithms Date: September 22, 2004 Scribe: Srinath Sridhar 15-859(M): Randomized Algorithms Lecturer: Anuam Guta Toic: Lower Bounds on Randomized Algorithms Date: Setember 22, 2004 Scribe: Srinath Sridhar 4.1 Introduction In this lecture, we will first consider

More information

Variable Selection and Model Building

Variable Selection and Model Building LINEAR REGRESSION ANALYSIS MODULE XIII Lecture - 38 Variable Selection and Model Building Dr. Shalabh Deartment of Mathematics and Statistics Indian Institute of Technology Kanur Evaluation of subset regression

More information

2 Asymptotic density and Dirichlet density

2 Asymptotic density and Dirichlet density 8.785: Analytic Number Theory, MIT, sring 2007 (K.S. Kedlaya) Primes in arithmetic rogressions In this unit, we first rove Dirichlet s theorem on rimes in arithmetic rogressions. We then rove the rime

More information

Robust hamiltonicity of random directed graphs

Robust hamiltonicity of random directed graphs Robust hamiltonicity of random directed grahs Asaf Ferber Rajko Nenadov Andreas Noever asaf.ferber@inf.ethz.ch rnenadov@inf.ethz.ch anoever@inf.ethz.ch Ueli Peter ueter@inf.ethz.ch Nemanja Škorić nskoric@inf.ethz.ch

More information

2 Asymptotic density and Dirichlet density

2 Asymptotic density and Dirichlet density 8.785: Analytic Number Theory, MIT, sring 2007 (K.S. Kedlaya) Primes in arithmetic rogressions In this unit, we first rove Dirichlet s theorem on rimes in arithmetic rogressions. We then rove the rime

More information

An Ant Colony Optimization Approach to the Probabilistic Traveling Salesman Problem

An Ant Colony Optimization Approach to the Probabilistic Traveling Salesman Problem An Ant Colony Otimization Aroach to the Probabilistic Traveling Salesman Problem Leonora Bianchi 1, Luca Maria Gambardella 1, and Marco Dorigo 2 1 IDSIA, Strada Cantonale Galleria 2, CH-6928 Manno, Switzerland

More information

where x i is the ith coordinate of x R N. 1. Show that the following upper bound holds for the growth function of H:

where x i is the ith coordinate of x R N. 1. Show that the following upper bound holds for the growth function of H: Mehryar Mohri Foundations of Machine Learning Courant Institute of Mathematical Sciences Homework assignment 2 October 25, 2017 Due: November 08, 2017 A. Growth function Growth function of stum functions.

More information

ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS

ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 000-9939XX)0000-0 ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS WILLIAM D. BANKS AND ASMA HARCHARRAS

More information

Brownian Motion and Random Prime Factorization

Brownian Motion and Random Prime Factorization Brownian Motion and Random Prime Factorization Kendrick Tang June 4, 202 Contents Introduction 2 2 Brownian Motion 2 2. Develoing Brownian Motion.................... 2 2.. Measure Saces and Borel Sigma-Algebras.........

More information

Developing A Deterioration Probabilistic Model for Rail Wear

Developing A Deterioration Probabilistic Model for Rail Wear International Journal of Traffic and Transortation Engineering 2012, 1(2): 13-18 DOI: 10.5923/j.ijtte.20120102.02 Develoing A Deterioration Probabilistic Model for Rail Wear Jabbar-Ali Zakeri *, Shahrbanoo

More information

Adaptive Estimation of the Regression Discontinuity Model

Adaptive Estimation of the Regression Discontinuity Model Adative Estimation of the Regression Discontinuity Model Yixiao Sun Deartment of Economics Univeristy of California, San Diego La Jolla, CA 9293-58 Feburary 25 Email: yisun@ucsd.edu; Tel: 858-534-4692

More information

8 STOCHASTIC PROCESSES

8 STOCHASTIC PROCESSES 8 STOCHASTIC PROCESSES The word stochastic is derived from the Greek στoχαστικoς, meaning to aim at a target. Stochastic rocesses involve state which changes in a random way. A Markov rocess is a articular

More information

Location of solutions for quasi-linear elliptic equations with general gradient dependence

Location of solutions for quasi-linear elliptic equations with general gradient dependence Electronic Journal of Qualitative Theory of Differential Equations 217, No. 87, 1 1; htts://doi.org/1.14232/ejqtde.217.1.87 www.math.u-szeged.hu/ejqtde/ Location of solutions for quasi-linear ellitic equations

More information

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V.

Deriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deriving ndicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deutsch Centre for Comutational Geostatistics Deartment of Civil &

More information

1-way quantum finite automata: strengths, weaknesses and generalizations

1-way quantum finite automata: strengths, weaknesses and generalizations 1-way quantum finite automata: strengths, weaknesses and generalizations arxiv:quant-h/9802062v3 30 Se 1998 Andris Ambainis UC Berkeley Abstract Rūsiņš Freivalds University of Latvia We study 1-way quantum

More information

7.2 Inference for comparing means of two populations where the samples are independent

7.2 Inference for comparing means of two populations where the samples are independent Objectives 7.2 Inference for comaring means of two oulations where the samles are indeendent Two-samle t significance test (we give three examles) Two-samle t confidence interval htt://onlinestatbook.com/2/tests_of_means/difference_means.ht

More information

Inference for Empirical Wasserstein Distances on Finite Spaces: Supplementary Material

Inference for Empirical Wasserstein Distances on Finite Spaces: Supplementary Material Inference for Emirical Wasserstein Distances on Finite Saces: Sulementary Material Max Sommerfeld Axel Munk Keywords: otimal transort, Wasserstein distance, central limit theorem, directional Hadamard

More information

1. INTRODUCTION. Fn 2 = F j F j+1 (1.1)

1. INTRODUCTION. Fn 2 = F j F j+1 (1.1) CERTAIN CLASSES OF FINITE SUMS THAT INVOLVE GENERALIZED FIBONACCI AND LUCAS NUMBERS The beautiful identity R.S. Melham Deartment of Mathematical Sciences, University of Technology, Sydney PO Box 23, Broadway,

More information

False Discovery Rate

False Discovery Rate False Discovery Rate Peng Zhao Department of Statistics Florida State University December 3, 2018 Peng Zhao False Discovery Rate 1/30 Outline 1 Multiple Comparison and FWER 2 False Discovery Rate 3 FDR

More information

arxiv: v1 [cs.gt] 2 Nov 2018

arxiv: v1 [cs.gt] 2 Nov 2018 Tight Aroximation Ratio of Anonymous Pricing Yaonan Jin Pinyan Lu Qi Qi Zhihao Gavin Tang Tao Xiao arxiv:8.763v [cs.gt] 2 Nov 28 Abstract We consider two canonical Bayesian mechanism design settings. In

More information

A note on the random greedy triangle-packing algorithm

A note on the random greedy triangle-packing algorithm A note on the random greedy triangle-acking algorithm Tom Bohman Alan Frieze Eyal Lubetzky Abstract The random greedy algorithm for constructing a large artial Steiner-Trile-System is defined as follows.

More information

DETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS

DETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS Proceedings of DETC 03 ASME 003 Design Engineering Technical Conferences and Comuters and Information in Engineering Conference Chicago, Illinois USA, Setember -6, 003 DETC003/DAC-48760 AN EFFICIENT ALGORITHM

More information

CERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education

CERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education CERIAS Tech Reort 2010-01 The eriod of the Bell numbers modulo a rime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education and Research Information Assurance and Security Purdue University,

More information

VIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES

VIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES Journal of Sound and Vibration (998) 22(5), 78 85 VIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES Acoustics and Dynamics Laboratory, Deartment of Mechanical Engineering, The

More information

Unobservable Selection and Coefficient Stability: Theory and Evidence

Unobservable Selection and Coefficient Stability: Theory and Evidence Unobservable Selection and Coefficient Stability: Theory and Evidence Emily Oster Brown University and NBER August 9, 016 Abstract A common aroach to evaluating robustness to omitted variable bias is to

More information

The Noise Power Ratio - Theory and ADC Testing

The Noise Power Ratio - Theory and ADC Testing The Noise Power Ratio - Theory and ADC Testing FH Irons, KJ Riley, and DM Hummels Abstract This aer develos theory behind the noise ower ratio (NPR) testing of ADCs. A mid-riser formulation is used for

More information