Recovering Block-structured Activations Using Compressive Measurements


Sivaraman Balakrishnan, Mladen Kolar, Alessandro Rinaldo, and Aarti Singh*

Abstract

We consider the problems of detection and support recovery of a contiguous block of weak activation in a large matrix, from a small number of noisy, possibly adaptive, compressive (linear) measurements. We characterize the tradeoffs between the various problem dimensions, the signal strength and the number of measurements required to reliably detect and recover the support of the signal. In each case sufficient conditions are complemented with information-theoretic lower bounds. This is closely related to the problem of (noisy) compressed sensing, where the analogous task is to detect or recover the support of a sparse vector using a small number of linear measurements. In compressed sensing, it has been shown that, at least in a minimax sense, for both detection and support recovery, adaptivity and contiguous structure only reduce signal strength requirements by logarithmic factors. By contrast, we show that while for detection neither adaptivity nor structure reduces the signal strength requirement, for support recovery the signal strength requirement is strongly influenced by both structure and the ability to choose measurements adaptively.

Keywords: Adaptive procedures, compressive measurements, large average submatrix, signal detection, signal localization, structured Normal means

1 Introduction

In this paper, we consider the problems of detecting the presence of, and estimating the support of, a small contiguous block of positive signal that is embedded in a large data matrix, given access to noisy compressive measurements, that is, linear combinations of the matrix entries corrupted with noise. Statistical inference leveraging large structured matrices is one of the central themes in several applications such as matrix completion, community detection and multivariate linear regression. Data matrices with sparse, contiguous block-structured signal components form an idealized model for signals arising in several real-world applications. Consider, for instance, the expression pattern resulting from genes, grouped by pathways, expressed under the influence of drugs, grouped by similarity. The block structure in this case arises from the fact that groups of genes (belonging to common pathways, say) are co-expressed under

*Department of Statistics, UC Berkeley; e-mail: sbalakri@cs.cmu.edu. Machine Learning Department, Carnegie Mellon University; e-mail: mladenk@cs.cmu.edu. Department of Statistics, Carnegie Mellon University; e-mail: arinaldo@stat.cmu.edu. Machine Learning Department, Carnegie Mellon University; e-mail: aarti@cs.cmu.edu.

the influence of sets of similar drugs (Yoon et al., 2005). Similarly, the automated detection of particular artificial objects in natural satellite images is an important task in computer vision (Caron et al., 2002). The block-structured model can be seen as an idealization of a model for vehicles or buildings in natural images. The block structure in this case is a consequence of the geometry of the artificial objects. More generally, the task of detecting and localizing groups of anomalies in networks has a variety of applications in environmental monitoring using sensor networks, as detailed in the work of Arias-Castro et al. (2011a). The block-structured model we consider is an idealization of the so-called thick cluster model considered in that work, and it can be expected that several of our results and techniques can be extended to handle this more flexible notion of structure. Indeed, in work that followed up on the initial posting of this work, Krishnamurthy et al. (2013) extend our results to the more general setting of graph-structured signals, although at the expense of the optimal results in the setting we consider.

In several applications, it is often difficult to measure, compute or store all the entries of the data matrix. For example, acquiring and storing high-resolution images is expensive, while measuring expression levels of all genes under all possible drugs is time-consuming and costly. However, if we have access to linear combinations of matrix entries (that is, compressive measurements), such as those obtained from compressive imaging cameras (Wakin et al., 2006) or from the combined expression of multiple genes under the influence of multiple drugs (see, for example, Kainkaryam et al., 2010, Kainkaryam and Woolf, 2009), then we might need to make and store only a few such measurements, while still being able to infer the existence or estimate the position of the block of signal in the data matrix. In each of these cases the goal is to detect or recover the signal block, corresponding to a set of co-expressed genes and drugs or to the vehicles/buildings, using only a few compressive measurements of the data matrix.

In this work we consider both passive (non-adaptive) and active (adaptive) compressive measurements. The non-adaptive measurements are random or pre-specified linear combinations of matrix entries. The active measurement schemes are adaptive and use feedback from previously taken measurements to sequentially design linear combinations that are more informative.

Related work. In the last few years, there has been a flurry of interest in the use of compressive measurements, primarily for inferring sparse vectors. Specifically, several papers, including Candès and Tao (2006), Donoho (2006), Candès and Tao (2007), Candès and Wakin (2008), have shown that it is possible to recover, in an ℓ_2 sense, a k-sparse vector in n dimensions using only O(k log n) incoherent compressive measurements, instead of measuring all of the n coordinates. Along with ℓ_2 recovery, researchers have also considered the problems of detection (Duarte et al., 2006, Haupt and Nowak, 2007, Arias-Castro, 2012, Arias-Castro et al., 2011b, Ingster et al., 2010) and support recovery (Wainwright, 2009a,b, Aeron et al., 2010, Reeves and Gastpar, 2012) of the non-zero entries of a sparse vector from non-adaptive compressive measurements. More recently, researchers have analyzed the possibility of taking adaptive compressive measurements, that is, where subsequent measurements are designed based on past observations (Candès and Davenport, 2013, Arias-Castro et al., 2013, Haupt et al., 2009).
For example, Haupt et al. (2009) demonstrate upper bounds on the ℓ_2 error for recovering a sparse vector using an adaptive procedure called compressive distilled sensing.

Malloy and Nowak (2012b) show that a compressive version of standard binary search achieves minimax performance for the localization of a one-sparse vector, and Malloy and Nowak (2012a) extend the binary search procedure to identifying the support of k-sparse vectors. In Arias-Castro et al. (2013), the authors show that the improvements adaptive compressive schemes offer over passive schemes are, in terms of ℓ_2 recovery and support recovery, limited to a log factor. Arias-Castro (2012) further extends these results to the problem of detection from adaptive compressive measurements.

In order to compare with the results we present in this work, Table 1 summarizes some results for the detection and exact support recovery of a sparse vector using passive and active compressive measurements. In order to avoid clutter, in each case we only give sufficient conditions. The corresponding references detail all the assumptions on the measurement matrices (for instance, the result of Wainwright (2009b) applies to Gaussian ensembles). In the table, we follow the standardization used throughout this paper (and various others on this topic): the length of the vector is n and the number of non-zero coordinates in the vector is k. The number of measurements is m, and each measurement is assumed to have unit ℓ_2 norm (in expectation, if the measurement is random).

Table 1: Summary of known results for the sparse vector case. µ_min represents the minimum (absolute) signal amplitude over all non-zero coordinates, and σ represents the standard deviation of the noise. C is a universal constant independent of the problem parameters.

  Detection, passive:     µ_min ≥ Cσ √(n/(m k²)) for positive signals;
                          µ_min ≥ Cσ √(n/(m k)) for arbitrary signals.   Arias-Castro (2012)
  Detection, active:      µ_min ≥ Cσ √(n/(m k²)).                        Arias-Castro (2012)
  Localization, passive:  m ≥ C k log(n/k) and µ_min ≥ Cσ √(n log n / m). Wainwright (2009b)
  Localization, active:   m ≥ C k log(n/k) and µ_min ≥ Cσ √(n log k / m). Arias-Castro et al. (2013); Malloy and Nowak (2012a)

Almost all of the prior work described so far has focused on inference for sparse unstructured data vectors from (passive or adaptive) compressed measurements. In this paper, we focus on the relatively unexplored problems of detection and support recovery for structured data matrices from compressive measurements. We are concerned with signals that are both sparse and highly structured, taking the form of a contiguous sub-matrix within a larger noisy matrix. If the signal components are unstructured, the treatment of data matrices is exactly equivalent to the treatment of data vectors. However, in the structured case, it is clear that some notions of structure are more natural when considered in the context of data matrices rather than data vectors. Specifically, the contiguous block signals considered in this work can (somewhat unnaturally) be viewed as arising from a particular collection of structured signals on vectors. The works of Baraniuk et al. (2010) and Huang et al. (2011) have analyzed different forms of structured sparsity in the vector setting, for example, when the non-zero locations in a data vector form non-overlapping or partially-overlapping groups. These works do not, however, provide sharp results when specialized to the setting we consider, and do not address the adaptive setting. Castro (2012) and Tánczos and Castro (2013) study support recovery of unstructured and structured signals from adaptive measurements when the signal is observed directly.

Soni and Haupt (2011) consider the case of adaptive sensing of tree-structured signals. The class of tree-structured signals is quite different from the class of signals we consider, and the gains from adaptive sensing for tree-structured signals are relatively limited (to a log factor, as in the unstructured setting).

Also related is the recent work on the recovery of low-rank matrices (see, for example, Negahban and Wainwright, 2011, Koltchinskii et al., 2011). Indeed, our analysis for support recovery in the passive setting (see Theorem 3) is inspired by the restricted strong convexity analysis of Negahban and Wainwright (2011). The signals under consideration in this paper are a special case of low-rank and sparse matrices. Richard et al. (2012) study the recovery of low-rank and sparse matrices from direct observations, while Golbabaee and Vandergheynst (2012) study recovery from compressive measurements. Each of these results focuses on recovery in Frobenius norm from passive measurements, while our focus is on detection and signal support recovery from possibly active measurements. Furthermore, the additional structure of block-structured signals is crucially leveraged in our analysis and by our estimators, and specializing these earlier results does not give sharp results even in the passive setting. Results applicable to the sparse and low-rank setting do, however, extend to the setup where there is a non-contiguous sub-matrix or block, also considered in Sun and Nobel (2013), Butucea and Ingster (2013), Butucea et al. (2013), Bhamidi et al. (2012) and Kolar et al. (2011) in the case of direct measurements. This setting is beyond the scope of our paper.

Summary of our contributions. We establish lower bounds on the minimum number of compressive measurements and the minimum (non-zero) signal amplitude needed to detect the presence of a block of positive signal, as well as to recover the support of the signal block, using both non-adaptive and adaptive measurements. We also establish matching upper bounds for procedures that guarantee consistent detection and support recovery of weak block-structured signals using few non-adaptive and adaptive compressive measurements. Our results indicate that adaptivity and structure play a key role and that significant improvements are possible over non-adaptive and unstructured settings. This is unlike the vector case where, at least in the settings analogous to the ones we consider here, contiguous structure and adaptivity have been shown to provide minor, if any, improvement.

In our setting we take compressive measurements of a data matrix of size (n_1 × n_2); the block containing the signal is of size (k_1 × k_2), with minimum signal amplitude µ_min in each coordinate of the (k_1 × k_2) block. The measurements are corrupted with Gaussian noise with standard deviation σ, and we have a budget of m compressive measurements, with each measurement matrix constrained to have unit Frobenius norm. Table 2 describes our main findings. In this table we describe both necessary and sufficient conditions. To reduce clutter in the expressions we use n_1 = n_2 =: n and k_1 = k_2 =: k.

For detection, akin to the vector setting, structure and adaptivity play no role. The structured data matrix setting requires the amplitude of the signal to scale as σ n/(k² √m) for both the non-adaptive and adaptive cases, which is the same as the amplitude needed to detect a k²-sparse non-negative vector of length n², as demonstrated in Arias-Castro (2012).
As in the vector case, the structure of the signal components as well as the power of adaptivity offer no advantage in the detection problem. For support recovery of the signal block, first notice that the necessary and sufficient conditions are tight in the scalings of µ_min/σ up to log factors (which we ignore for this discussion). We use C for possibly different universal constants.

Table 2: Summary of the main results of our paper. We use C for possibly different universal constants.

  Detection (passive and active):
    µ_min ≤ Cσ √(n²/(m k⁴)), necessary. Theorem 1.
    µ_min ≥ Cσ √(n²/(m k⁴)), sufficient. Arias-Castro (2012).
  Localization, passive:
    µ_min ≤ Cσ √((n²/(m k)) max(1, (log n)/k)), necessary. Theorem 2.
    m ≥ C log n and µ_min ≥ Cσ √((n² log n)/(m k)), sufficient. Theorem 3.
  Localization, active:
    µ_min ≤ Cσ max( √(n²/(m k⁴)), √(k/m) ), necessary. Theorem 4.
    m ≥ C log n and µ_min ≥ Cσ √(log(k) log log(k)) · max( √(n²/(m k⁴)), √(k/m) ), sufficient. Theorem 5.

The structured data matrix setting requires µ_min/σ to scale as C n/√(m k) when using non-adaptive compressive measurements. In contrast, the unstructured setting requires a higher amplitude of µ_min ≥ Cσ n/√m (the necessary conditions appear in Wainwright (2009b), Aeron et al. (2010), and Reeves and Gastpar (2012)). Structure without adaptivity already yields a factor of √k reduction in the smallest µ_min that still allows for reliable support recovery. Adaptivity in the compressive measurement design yields further improvements: with adaptive measurements, identifying the signal block requires µ_min ≥ Cσ max( n/(k² √m), √(k/m) ). In contrast, for the sparse vector case, Arias-Castro et al. (2013) showed that adaptive compressive measurements cannot recover the locations of the non-zero components if the amplitude of the signal is smaller than Cσ n/√m. It is clear that the potential gains from an adaptive measurement scheme are considerable: either a k² factor or an n/√k factor. Exploiting the structure of the signal components and designing adaptive linear measurements can both yield significant gains if the signal corresponds to a contiguous block in a data matrix.

The rest of this paper is organized as follows. We describe the problem setup and notation in Section 2. We study the detection problem in Section 3. Section 4 is devoted to non-adaptive support recovery, while Section 5 is focused on adaptive support recovery. Finally, in Section 6 we present and discuss some simulations that support our findings. The proofs are given in the Appendix.

2 Preliminaries

We begin by defining some notation that will be used frequently in the paper.

Notation: We denote by [n] the set {1, ..., n}. I{·} denotes the indicator function. For a vector a ∈ R^n, we let supp(a) = {j : a_j ≠ 0} be the support set (with an analogous definition for matrices A ∈ R^{n_1 × n_2}).

For q ∈ [1, ∞), ‖a‖_q denotes the ℓ_q-norm, defined as ‖a‖_q = (Σ_{i∈[n]} |a_i|^q)^{1/q}, with the usual extensions for q ∈ {0, ∞}, that is, ‖a‖_0 = |supp(a)| and ‖a‖_∞ = max_{i∈[n]} |a_i|. For a matrix A ∈ R^{n_1 × n_2}, we use the notation vec(A) to denote the vector in R^{n_1 n_2} formed by stacking the columns of A. We denote the Frobenius norm of A by ‖A‖_fro, with ‖A‖²_fro := Σ_{i∈[n_1], j∈[n_2]} a²_ij. For two matrices A and B of identical dimensions, ⟨A, B⟩ denotes the trace inner product, ⟨A, B⟩ = trace(AᵀB). For a matrix A ∈ R^{n_1 × n_2}, define n := min(n_1, n_2), and denote the ordered singular values of A by σ_1(A) ≥ ... ≥ σ_n(A). Denote the operator norm of A by ‖A‖_op := σ_1(A). For two sequences of numbers {a_n}_{n≥1} and {b_n}_{n≥1}, we use a_n = O(b_n) to denote that a_n ≤ C b_n for some finite positive constant C and all n large enough. If a_n = O(b_n) and b_n = O(a_n), we use the notation a_n ≍ b_n. The notation a_n = o(b_n) is used to denote that a_n / b_n → 0.

Let A ∈ R^{n_1 × n_2} be a signal matrix with unknown entries. We are interested in a structured setting where the support set of the matrix A, that is, the set of non-zero elements, is a contiguous block of size (k_1 × k_2), and the elements in the support set are equal to or larger than the level µ_min > 0. Although we focus on the case when k_1 and k_2 are known, we also discuss the cases in which one can adapt to the unknown size of the support. In order to simplify the presentation of our results we will assume that k_j ≤ n_j / 2 for j = 1, 2. The following collection contains the locations of all contiguous blocks of size k_1 × k_2:

B = { I_r × I_c : I_r and I_c are contiguous subsets of [n_1] and [n_2], |I_r| = k_1, |I_c| = k_2 }.    (2.1)

With this, the signal matrix A = (a_ij) can be represented as a_ij = µ_ij I{(i,j) ∈ B*} with µ_ij ≥ µ_min > 0 for some (unknown) B* ∈ B. We consider the following observation model, under which we collect m noisy linear measurements {y_1, ..., y_m} of A:

y_i = ⟨A, X_i⟩ + ε_i,   i = 1, ..., m,    (2.2)

where ε_1, ..., ε_m are iid N(0, σ²), with σ > 0 known. (The measurement scheme could equivalently be written in vector form as y_i = vec(A)ᵀ vec(X_i) + ε_i; to emphasize that the support of the signal is a contiguous block, we prefer the representation of Eq. (2.2).) We use the term amplitude to refer to µ_min. The sensing matrices (X_i)_{i∈[m]} ⊂ R^{n_1 × n_2} are normalized to satisfy one of the following two conditions: (1) ‖X_i‖²_fro = 1, or (2) E‖X_i‖²_fro = 1, for i ∈ [m]. This normalization ensures that every measurement is made with the same amount of energy (possibly in expectation). The first condition is used when the measurements are non-random, see, for example, the detection problem described in Section 3, while the second condition is used for random measurements.

Under the observation model in Eq. (2.2), we study two tasks: (1) detecting whether a contiguous block of positive signal exists in A, and (2) identifying the support of the signal, B*. (We deal only with positive signals in the main text and briefly discuss how the methods should be changed to address signals that have both positive and negative entries.) Both of these tasks are formally defined below. We develop efficient algorithms for these two tasks that provably require the smallest number of measurements (up to logarithmic factors). The algorithms are designed for one of two measurement schemes, described next.

In the first scheme, the measurements can be taken in an adaptive or sequential fashion, by letting each X_i be a (possibly randomized) function of (y_j, X_j)_{j∈[i−1]}; in the second, the measurement matrices are chosen all at once, ignoring the outcomes of previous measurements. We call the first scheme active, while the second scheme is referred to as passive.

Detection. The detection problem consists in deciding whether a (positive) contiguous block of signal exists in A. As we will show later, we can detect the presence of a contiguous block with a much smaller number of measurements than is required for identifying its support. Formally, detection is a hypothesis testing problem, with a composite alternative, of the form

H_0: A = 0_{n_1 × n_2}    (2.3)
H_1: A ∈ A(µ_min),

where the notation 0_{n_1 × n_2} is used for the all-zero matrix in R^{n_1 × n_2} and

A(µ_min) = { A : A has entries a_ij = µ_ij I{(i,j) ∈ B*}, µ_ij ≥ µ_min, B* ∈ B }.

A test T is a function that takes (y_i, X_i)_{i∈[m]} as input and outputs 1 if the null hypothesis is rejected, and 0 otherwise. For any test T, we define its risk as

R_det(T) := P_0[ T((y_i, X_i)_{i∈[m]}) = 1 ] + sup_{A∈A(µ_min)} P_A[ T((y_i, X_i)_{i∈[m]}) = 0 ],

where P_0 and P_A denote the joint probability distributions of (y_i, X_i)_{i∈[m]} under the null hypothesis and the alternative hypothesis, respectively. The risk R_det(T) measures the sum of the type I and maximal type II errors over the set of alternatives. The overall difficulty of the detection problem is quantified by the minimax detection risk R*_det := inf_T R_det(T), where the infimum is taken over all tests. When the amplitude of the signal is sufficiently small, as measured by µ_min, the minimax risk can remain bounded away from zero, which implies that no test can reliably (that is, with a vanishing probability of error) distinguish H_0 from H_1. In Section 3 we characterize necessary and sufficient conditions for distinguishing H_0 and H_1 in terms of µ_min.

Identifying the support of the signal A. The problem of identifying the support of the signal A (or support recovery) is that of determining the exact location of the non-zero elements of A. Let Ψ be an estimator of B*, that is, a function that takes (y_i, X_i)_{i∈[m]} as input and outputs an element of B, the estimated location of the support. We define the risk of any such estimator as

R_sup(Ψ) := sup_{A∈A(µ_min)} P_A[ Ψ((y_i, X_i)_{i∈[m]}) ≠ supp(A) ],

while the minimax support recovery risk is

R*_sup := inf_Ψ R_sup(Ψ),

where the infimum is taken over all such estimators Ψ. As in the detection task, the minimax risk specifies the minimal risk of any support identification procedure. We focus in this paper on exact support recovery.

An interesting extension of the work in this paper, which we do not address, would be to understand the minimax risk of support recovery in the Hamming metric (for instance).

Below we provide upper and lower bounds on the minimax risk for the problems of detection and support recovery as functions of the tuple (n_1, n_2, k_1, k_2, m, µ_min, σ), for both the active and passive sampling schemes. Our results identify (up to logarithmic factors) necessary and sufficient conditions on the minimal amplitude of the signal µ_min that can be detected, or whose support can be recovered, given a budget of m possibly adaptive measurements.

3 Detection of contiguous blocks

In this section, we give necessary and sufficient conditions on the problem parameters (n_1, n_2, k_1, k_2, m, µ_min, σ) for distinguishing between H_0 and H_1 defined in Eq. (2.3). We start with the model (2.2) and assume that each measurement matrix X_i can be chosen adaptively. We have the following lower bound on the magnitude of the signal.

Theorem 1. Fix any 0 < α < 1. Based on m (possibly adaptive) measurements, if

µ_min ≤ σ √( 8(1−α) n_1 n_2 / (m k_1² k_2²) ),

then the minimax detection risk R*_det ≥ α.

The lower bound given in Theorem 1 is established by analyzing the risk of the (optimal) likelihood ratio test under a uniform prior over the alternatives. Careful modifications of standard arguments are necessary to account for adaptivity. We closely follow the approach of Arias-Castro (2012), who established the analogue of Theorem 1 in the vector setting. We make minor modifications (detailed in Appendix A.1) to deal with the particular form of the block-structured signals we consider.

Next, we describe a test for distinguishing H_0 and H_1, proposed in Arias-Castro (2012). The test is given as

T((y_i)_{i∈[m]}) = I{ m^{−1/2} Σ_{i∈[m]} y_i > σ √(2 log(α^{−1})) }    (3.1)

with the measurement matrices chosen passively as X_i = (n_1 n_2)^{−1/2} 1_{n_1} 1_{n_2}ᵀ (i = 1, ..., m), where 1_n is used to denote the vector of ones in R^n. From Proposition 1 in Arias-Castro (2012), if

µ_min ≥ C σ √( n_1 n_2 log(α^{−1}) / (m k_1² k_2²) )    (3.2)

for some universal constant C > 0, then R_det(T) ≤ α. The lower bound given in Theorem 1 matches the upper bound given in Eq. (3.2) up to constants. Taken together they establish that the minimax rate for detection under the model in Eq. (2.2) is

µ_min ≍ (σ / (k_1 k_2)) √( n_1 n_2 / m ).

It is worth pointing out that the structure of the activation pattern does not play any role in the minimax detection problem. Indeed, the rate established matches the known bounds for detection in the unstructured vector case (Arias-Castro, 2012). We will contrast this with the problem of support recovery below. Furthermore, the procedure that achieves the adaptive lower bound (up to constants) is non-adaptive, indicating that adaptivity cannot help much in the detection problem.
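For concreteness, the test in Eq. (3.1) takes only a few lines to simulate. The sketch below draws measurements from the model in Eq. (2.2) with the all-ones sensing matrix and applies the threshold test; the problem sizes and helper names are illustrative choices, not values from the paper.

```python
import numpy as np

def detection_test(n1, n2, m, A, sigma, alpha, rng):
    """Sketch of the test in Eq. (3.1): every measurement uses the
    normalized all-ones sensing matrix X_i = (n1*n2)**(-1/2) * 1 1^T."""
    X = np.ones((n1, n2)) / np.sqrt(n1 * n2)            # unit Frobenius norm
    y = np.array([np.sum(A * X) + sigma * rng.standard_normal()
                  for _ in range(m)])                   # model of Eq. (2.2)
    # Reject H0 when the scaled sum of measurements exceeds the threshold.
    return np.sum(y) / np.sqrt(m) > sigma * np.sqrt(2 * np.log(1 / alpha))

rng = np.random.default_rng(0)
n1 = n2 = 100; k1 = k2 = 10; m = 50; sigma = 1.0; mu_min = 2.0
A = np.zeros((n1, n2)); A[:k1, :k2] = mu_min            # one contiguous block
print(detection_test(n1, n2, m, A, sigma, alpha=0.05, rng=rng))
```

Note that the statistic depends on the signal only through its sum, which is why structure plays no role in detection.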

4 Support recovery from passive measurements

In this section, we address the problem of estimating the support set B* of the signal A from noisy linear measurements as in Eq. (2.2). We give a lower bound for the case when the measurement matrices (X_i)_{i∈[m]} are independent with jointly Gaussian entries, with variance 1/(n_1 n_2). The variance of the entries is set so that the energy of each measurement is E‖X_i‖²_fro = 1. For the upper bound we consider the case when the signal is block constant, that is, µ_ij = µ_min for (i,j) ∈ B*. We consider the more general case of measurement matrices which have possibly correlated sub-Gaussian entries. In particular, denote by X the (m × n_1 n_2) matrix whose i-th row is vec(X_i)ᵀ. We give upper bounds for the case when X = ΨΣ^{1/2}, where Ψ ∈ R^{m × n_1 n_2} has independent sub-Gaussian entries with variance 1/(n_1 n_2), and Σ ∈ R^{n_1 n_2 × n_1 n_2} satisfies 0 < c ≤ λ_min(Σ) ≤ λ_max(Σ) ≤ C for universal constants c, C. We state our results in terms of λ_max(Σ) and λ_min(Σ) and track the dependence on them in our proofs. The energy of each measurement is E‖X_i‖²_fro = tr(Σ)/(n_1 n_2). In order to compare results across different measurement schemes we prefer to treat λ_min(Σ) and λ_max(Σ) as universal constants, so that the energy of each measurement is also at most a universal constant. In this case we say the measurement matrices are drawn from the Σ sub-Gaussian ensemble. It is easy to see that this generalizes the case of an independent Gaussian ensemble (by taking λ_min(Σ) = λ_max(Σ) = 1).

4.1 Lower bound

The following theorem gives a lower bound on the amplitude of the signal, below which no procedure can reliably identify the support set B*.

Theorem 2. Consider sensing with measurement matrices X_i for which each entry is independent Gaussian with variance 1/(n_1 n_2). There exist positive constants C, α > 0 independent of the problem parameters (k_1, k_2, n_1, n_2, m), such that if

µ_min ≤ C σ √( (n_1 n_2 / m) max( 1/min(k_1, k_2), log(n_1 n_2)/(k_1 k_2) ) ),

then the minimax support recovery risk R*_sup ≥ α > 0.

The proof is based on a standard technique described in Chapter 2.6 of Tsybakov (2009). We start by identifying a subset of matrices that are hard to distinguish. Once a suitable finite set is identified, tools for establishing lower bounds on the error in multiple-hypothesis testing can be directly applied. These tools only require computing the Kullback-Leibler (KL) divergence between the induced distributions, which in our case are two multivariate normal distributions. Notice that the bound in Theorem 2 has two terms. The first term characterizes the difficulty of distinguishing two signals whose supports overlap considerably. The second term characterizes the difficulty of identifying the correct support when the signal is one of a large family of signals whose supports do not overlap. These constructions and calculations are described in detail in Appendix A.2.
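To make the key computation concrete: conditionally on a fixed design (X_1, ..., X_m), the measurement vectors under two candidate signals A and A′ are Gaussian with common variance σ², so the KL divergence has the standard closed form (stated here as a reminder, not as a quotation of the appendix calculation):

```latex
\mathrm{KL}\bigl(\mathbb{P}_{A} \,\|\, \mathbb{P}_{A'}\bigr)
  = \sum_{i=1}^{m} \frac{\langle A - A',\, X_i \rangle^{2}}{2\sigma^{2}} .
```

Averaging over the Gaussian ensemble with entry variance 1/(n_1 n_2) gives E[KL] = m ‖A − A′‖²_fro / (2σ² n_1 n_2), so candidate blocks whose difference has small Frobenius norm — precisely the heavily overlapping supports behind the first term of Theorem 2 — are the hardest to distinguish.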

4.2 Upper bound

We investigate a procedure that searches over all contiguous blocks of size (k_1 × k_2) as defined in Eq. (2.1). For any block B ∈ B, let A_B ∈ R^{n_1 × n_2} denote the indicator matrix I{B}, whose entries are 1 for indices in B and 0 otherwise. Our procedure is to solve the following program:

B̂ = argmin_{B∈B} min_{µ∈R_+} (1/2) Σ_{i∈[m]} ( y_i − µ ⟨A_B, X_i⟩ )².    (4.1)

This program can be solved in polynomial time by a brute-force search over the outer minimization. We show the following result for the output of this procedure.

Theorem 3. Consider sensing with measurement matrices X_i that are drawn from the Σ sub-Gaussian ensemble. For block-constant signals, for

m ≥ C (λ_max(Σ)/λ_min(Σ))² log( 2 n_1 n_2 / α ),

for a universal constant C, if

µ_min ≥ 2 (λ_max(Σ)/λ_min(Σ)) σ √( 2 n_1 n_2 log(2 n_1 n_2 / α) / (m min(k_1, k_2)) ),

the procedure described above outputs a block B̂ such that R_sup(B̂) ≤ α.

The proof follows the general recipe outlined in Negahban and Wainwright (2011) of establishing the restricted strong convexity of the measurement ensemble for matrices that satisfy a certain cone condition. However, a direct application of those results leads to much weaker bounds in our setting, and several careful modifications are needed in order to obtain sharp results.

We present two other analyses in the Appendix. In Appendix A.4, we study a procedure that outputs the block B for which

Ψ(B) := Σ_{(a,b)∈B} Σ_{i∈[m]} y_i X_{i,ab}    (4.2)

is maximized. We show that this procedure achieves the same rate for uncorrelated designs but without the assumption of a block-constant signal. We also show, in Appendix B, that a modification of the procedure in Eq. (4.1) is able to estimate the support of a signal that has both positive and negative components, without the assumption of a block-constant signal, albeit at a slower rate.

Comparing to the lower bound in Theorem 2, we observe that the procedure outlined in this section achieves the lower bound up to a log(n_1 n_2) factor. It is also worth noting that the amplitude of the signal needed for correct support recovery is larger than the bound we saw earlier for passive detection.

The block structure of the activation allows us, even in the passive setting, to localize much weaker signals.

A straightforward adaptation of results on the LASSO (Wainwright, 2009a) suggests that if the non-zero entries are spread out (say at random), then we would require µ_min ≥ C σ √( n_1 n_2 log(n_1 n_2) / m ) to correctly identify the block B*. Exploiting the structure in the signal collection allows us to gain a √(min(k_1, k_2)) factor. As we will see in Section 5, even more substantial gains are possible when we choose measurements adaptively.

We also note that one can use a similar procedure to adapt to the unknown size of the activation block. In particular, one can perform the exhaustive search procedure over all possible sizes of activation blocks. Small modifications to the proof of Theorem 3 can be used to show that this procedure adapts to the unknown block size while still achieving the same risk. We omit the details but note that similar modifications have been detailed in Butucea and Ingster (2013) for a related problem.

5 Support recovery from active measurements

In this section, we study identification of B* using adaptive procedures, that is, the measurement matrix X_i may be a function of (y_j, X_j)_{j∈[i−1]}.

5.1 Lower bound

The following theorem gives a lower bound on the signal amplitude needed for any active procedure to estimate the support set B*.

Theorem 4. Fix any 0 < α < 1. There exists a constant C > 0 independent of the problem parameters (k_1, k_2, n_1, n_2, m), such that, given m adaptively chosen measurements, if

µ_min < C(1−α) σ max( √( n_1 n_2 / (m k_1² k_2²) ), √( min(k_1, k_2) / m ) ),

then R*_sup ≥ α.

The proof is similar to the proof of Theorem 2 in that we construct specific pairs of hypotheses that are hard to distinguish. The two terms in the lower bound reflect the hardness of estimating the support of the signal in two important cases. The first term reflects the difficulty of approximately estimating the support B*. This term grows at the same rate as the detection lower bound, and its proof is similar. Given a coarse estimate of the support, we still need to identify the exact support of the signal. The hardness of this problem gives rise to the second term in the lower bound. This term is independent of n_1 and n_2 but has a considerably worse dependence on k_1 and k_2.

5.2 Upper bound

The upper bound is established by analyzing the procedures described in Algorithms 1 and 2 for approximate and exact recovery of the support of the signal. Algorithm 1 is used to approximately estimate the support B*, that is, it locates an 8k_1 × 8k_2 block that contains the support B* with high probability.

Algorithm 1 Approximate support recovery

input: Measurement budget m ≥ log_2 p, a collection D of p blocks of size (2k_1 × 2k_2). (Recall that p = n_1 n_2 / (4 k_1 k_2); we assume p is dyadic to simplify our presentation of the algorithm.)
Initialize: J_0^(1) := {1, ..., p}, s_0 := log_2 p.
For each s in 1, ..., log_2 p:
  1. Allocate: m_s ∝ m s 2^{−s} measurements to this round (normalized so that Σ_s m_s ≤ m).
  2. Split: Partition J_0^(s) into two collections of blocks of equal size: J_1^(s) contains the first p/2^s blocks and J_2^(s) the remainder.
  3. Sensing matrix: X_s = 1/√(2^{s_0−s+1} · 4 k_1 k_2) on the blocks in J_1^(s), X_s = −1/√(2^{s_0−s+1} · 4 k_1 k_2) on the blocks in J_2^(s), and 0 otherwise.
  4. Measure: y_i^(s) = ⟨A, X_s⟩ + z_i^(s) for i = 1, ..., m_s.
  5. Update support: J_0^(s+1) := J_1^(s) if Σ_{i ≤ m_s} y_i^(s) > 0, and J_0^(s+1) := J_2^(s) otherwise.
output: The single block in J_0^(s_0+1).

Algorithm 2 Exact recovery (of columns)

input: Measurement budget m, a sub-matrix B ∈ R^{4k_1 × 4k_2}, success probability α.
  1. Measure: y_i^c = (4k_1)^{−1/2} Σ_{j=1}^{4k_1} B_{jc} + z_i^c for i = 1, ..., m/5 and c ∈ {1, k_2+1, 2k_2+1, 3k_2+1}.
  2. Let l := argmax_c Σ_{i ≤ m/5} y_i^c, r := l + k_2, b := 16 log_2 k_2.
  3. While r − l > 1:
     (a) Let c = ⌈(r + l)/2⌉.
     (b) Measure y_i^c = (4k_1)^{−1/2} Σ_{j=1}^{4k_1} B_{jc} + z_i^c for i = 1, ..., b.
     (c) If (1/b) Σ_{i ≤ b} y_i^c exceeds a threshold of order σ √( log(2 log_2(k_2)/α) / b ) (the exact constants appear in the proof of Theorem 5), then l := c; otherwise r := c.
output: Set of columns {l − k_2 + 1, ..., l}.
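To make the halving loop of Algorithm 1 concrete, here is a minimal Python sketch. It splits the measurement budget equally across rounds instead of using the m_s allocation above, and the function and variable names are illustrative, not from the paper:

```python
import numpy as np

def compressive_binary_search(A, blocks, m, sigma, rng):
    """Sketch of Algorithm 1: locate the one block (from a disjoint
    collection `blocks` of (row_slice, col_slice) pairs, dyadic length)
    that contains the positive signal, using about m measurements."""
    cand = list(blocks)
    n_steps = int(np.log2(len(cand)))
    m_s = max(m // n_steps, 1)               # equal budget per round
    for _ in range(n_steps):
        half = len(cand) // 2
        X = np.zeros_like(A, dtype=float)
        for rows, cols in cand[:half]:
            X[rows, cols] = 1.0              # + on the first half
        for rows, cols in cand[half:]:
            X[rows, cols] = -1.0             # - on the second half
        X /= np.linalg.norm(X)               # unit Frobenius norm
        y = np.inner(A.ravel(), X.ravel()) + sigma * rng.standard_normal(m_s)
        cand = cand[:half] if y.sum() > 0 else cand[half:]
    return cand[0]
```

One run of this sketch per collection mirrors how Algorithm 1 is applied to each of the four collections D_1, ..., D_4 described below.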

The algorithm essentially performs compressive binary search (Davenport and Arias-Castro, 2012) on a collection of non-overlapping blocks that partition the matrix. Specifically, we create 4 collections D_1, D_2, D_3 and D_4, defined as follows (for simplicity, we assume n_1 is a multiple of 2k_1 and n_2 of 2k_2):

D_1 := { B_{1,1} := [1, ..., 2k_1] × [1, ..., 2k_2], B_{1,2} := [2k_1+1, ..., 4k_1] × [1, ..., 2k_2], ..., B_{1, n_1 n_2/4k_1k_2} := [n_1−2k_1+1, ..., n_1] × [n_2−2k_2+1, ..., n_2] },

D_2 := { B_{2,1} := [k_1+1, ..., 3k_1] × [k_2+1, ..., 3k_2], B_{2,2} := [3k_1+1, ..., 5k_1] × [k_2+1, ..., 3k_2], ..., B_{2, n_1 n_2/4k_1k_2} := [n_1−k_1+1, ..., n_1, 1, ..., k_1] × [n_2−k_2+1, ..., n_2, 1, ..., k_2] },

D_3 := { B_{3,1} := [k_1+1, ..., 3k_1] × [1, ..., 2k_2], B_{3,2} := [3k_1+1, ..., 5k_1] × [1, ..., 2k_2], ..., B_{3, n_1 n_2/4k_1k_2} := [n_1−k_1+1, ..., n_1, 1, ..., k_1] × [n_2−2k_2+1, ..., n_2] },

D_4 := { B_{4,1} := [1, ..., 2k_1] × [k_2+1, ..., 3k_2], B_{4,2} := [2k_1+1, ..., 4k_1] × [k_2+1, ..., 3k_2], ..., B_{4, n_1 n_2/4k_1k_2} := [n_1−2k_1+1, ..., n_1] × [n_2−k_2+1, ..., n_2, 1, ..., k_2] },

each of which contains p := n_1 n_2 / (4 k_1 k_2) non-overlapping blocks of the matrix A. D_1 is a partition of the matrix into disjoint blocks of size (2k_1 × 2k_2); D_3 is a similar partition shifted down by k_1 rows; D_4 is shifted to the right by k_2 columns; and D_2 is both shifted down by k_1 rows and to the right by k_2 columns. Figure 1 illustrates this. Notice that one of these collections (one of D_1 through D_4) must include a block that completely contains the support B*.

Algorithm 1 is run separately on each of the collections, and it outputs 4 blocks. As we will show in our analysis of this procedure, one of these 4 blocks contains the support B* with high probability. Algorithm 1 is a modification of compressive binary search (Davenport and Arias-Castro, 2012), adapted to perform the compressive binary search in a matrix setting. We focus our attention on the collection that completely contains B*. The algorithm proceeds in steps. Starting with a collection of disjoint blocks, at each step the collection is split into two halves and a test is performed to determine which half is more likely to contain the support of B*. This process continues for log_2(p) steps until only one block remains.

Algorithm 2 is used next to precisely identify B* within one of the four blocks identified by Algorithm 1. Algorithm 2 itself works in several stages: in the first stage the procedure measures repeatedly a small number of columns, exactly one of which is contained in B*, to identify an active column with high probability. The next stage finds the first non-active column to the left and right by testing columns using a binary search procedure. In this way, all the active columns are located. Finally, Algorithm 2 is repeated on the rows to identify the active rows.
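The column-wise binary search at the heart of Algorithm 2 can likewise be sketched in a few lines. The sketch below uses a plain zero threshold in place of the calibrated threshold from step 3(c), assumes the active columns form a contiguous run ending at some boundary, and uses illustrative names:

```python
import numpy as np

def last_active_column(B, l0, sigma, b, rng):
    """Sketch of step 3 of Algorithm 2: starting from a known active
    column l0, binary-search for the right-most active column. Each
    test takes b noisy measurements of one normalized column."""
    n_rows, n_cols = B.shape
    lo, hi = l0, n_cols - 1          # invariant: column `lo` is active
    while lo < hi:
        c = (lo + hi + 1) // 2       # bias upward so the loop terminates
        y = B[:, c].sum() / np.sqrt(n_rows) + sigma * rng.standard_normal(b)
        if y.mean() > 0:             # column c looks active
            lo = c
        else:                        # column c looks inactive
            hi = c - 1
    return lo
```

Running the same routine leftwards, and then on the transposed sub-matrix for the rows, recovers the full support, as in the description above.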

Figure 1: The collection of blocks D_1 is shown in solid lines and the collection D_2 is shown in dashed lines. The collections D_3 and D_4 overlap with these and are not shown. The (k_1 × k_2) block of activation is shown in red.

The following theorem states that Algorithm 1 and Algorithm 2 succeed in estimating the support B* with high probability if the amplitude of the signal is large enough.

Theorem 5. If

µ_min ≥ C σ √( log(max(k_1, k_2)) log log(k_1 k_2) log(1/α) ) · max( √( n_1 n_2 / (m k_1² k_2²) ), √( min(k_1, k_2) / m ) )

and m ≥ C log(n_1 n_2), then R_sup(B̂) ≤ α, where B̂ is the block output by running Algorithm 1 for approximate localization followed by Algorithm 2 on the subset of rows and columns returned by Algorithm 1.

Note that the upper bound given in Theorem 5 matches the lower bound up to a O(√(log(max(k_1, k_2)) log log(k_1 k_2))) factor. It is worth noting that for small k_1 and k_2, when the first term dominates, our active support recovery procedure achieves the detection limits. This is the best result we could hope for. When the support is larger, the lower bound indicates that no procedure can achieve the detection rate. The active procedure still remains significantly more efficient than the passive one, and is able to recover the support of signals that are weaker by as much as a √(n_1 n_2) factor. The great potential for gains from adaptive measurements is clearly seen in our model, which captures the fundamental interplay between structure and adaptivity.

6 Experiments

In this section, we perform a set of simulation studies to illustrate the finite sample performance of the proposed procedures. We let n_1 = n_2 = n and k_1 = k_2 = k. Theorem 8 and Theorem 5 characterize the signal amplitude needed for the passive and active identification of a contiguous block, respectively. We demonstrate that the scalings predicted by these theorems are sharp by plotting the probability of successful recovery against the appropriately rescaled signal amplitude and showing that the curves for different values of n and k line up.

15 Experient. Figure 2 shows the probability of successful localization of B using B defined in Eq. (A.5) plotted against n k (µ in /), where the nuber of easureents = 00. Each plot in Figure 2 represents different relationship between k and n; in the first plot, k = Θ(log n), in the second k = Θ( n), while in the third plot k = Θ(n). The dashed vertical line denotes the threshold position for the scaled aplitude at which the probability of success is larger than We observe that irrespective of the proble size and the relationship between n and k, Theore 8 tightly characterizes the iniu signal aplitude needed for successful identification. Probability of success n k SNR (n, k) = ( 50, 4) (n, k) = (00, 5) (n, k) = (50, 6) (n, k) = (200, 7) n k SNR (n, k) = ( 50, 7) (n, k) = (00, 0) (n, k) = (50, 2) (n, k) = (200, 4) n k SNR (n, k) = ( 50, 0) (n, k) = (00, 20) (n, k) = (50, 30) (n, k) = (200, 40) Figure 2: Probability of success with passive easureents (averaged over 00 siulation runs). Experient 2. Figure 3 shows the probability of successful localization of B using the procedure outlined in Section 5.2., with = 500 adaptively chosen easureents, plotted against the scaled signal aplitude. (µ in /) is scaled by n k 2 in the first two plots where k = Θ(log n) and k = Θ( n) respectively, while in the third plot the aplitude is scaled by k/ log k as k = Θ(n). The dashed vertical line denotes the threshold position for the scaled signal aplitude at which the probability of success is larger than We observe that Theore 5 sharply characterizes the iniu signal aplitude needed for successful identification. Probability of success (n, k) = (280, 0) (n, k) = (286, ) (n, k) = (644, 2) n k 2 SNR (n, k) = ( 8960, 70) (n, k) = (0240, 80) (n, k) = (520, 90) n k 2 SNR Probability of success (n, k) = ( 4000, 000) (n, k) = (6000, 2000) (n, k) = (24000, 3000) k/log(k) SNR Figure 3: Probability of success with adaptively chosen easureents (averaged over 00 siulation runs). 5

7 Discussion

In this paper, we establish the fundamental limits of the problem of detecting and localizing a contiguous block of weak activation in a data matrix from either adaptive or non-adaptive compressive measurements. Our bounds characterize the tradeoffs between the signal amplitude, the size of the matrix, the size of the non-zero block and the number of measurements. We also demonstrate constructive, computationally efficient estimators that achieve these bounds. Some interesting direct extensions of the work in this paper would be to address the case of unknown signal support size, to address the more general problem of recovering a signal block within a randomly permuted data matrix, commonly known as biclustering, and finally to understand localization beyond exact recovery.

The contiguous block model we consider is one useful form of the more general class of structured sparse signals, where adaptivity helps considerably. Techniques developed in this work have since been applied to the problems of adaptive compressive sensing of tree-sparse and graph-structured signals in the works of Soni and Haupt (2013) and Krishnamurthy et al. (2013). We expect that a full characterization of structured signals for which adaptive sensing can be useful will be a fruitful path for future investigation.

Acknowledgements

We would like to thank Larry Wasserman for his ideas, indispensable advice and wise guidance. SB would also like to thank Martin Wainwright for helpful discussions related to the proof of Theorem 3. This research is supported in part by AFOSR (grant FA-), NSF (grant IIS-1116458) and an NSF CAREER grant (DMS-).

References

S. Aeron, V. Saligrama, and M. Zhao. Information theoretic bounds for compressed sensing. IEEE Trans. Inf. Theory, 56(10):5111–5130, 2010.

E. Arias-Castro. Detecting a vector based on linear measurements. Electron. J. Stat., 6, 2012.

E. Arias-Castro, E. J. Candès, and A. Durand. Detection of an anomalous cluster in a network. Ann. Stat., 39(1), 2011a.

E. Arias-Castro, E. J. Candès, and Y. Plan. Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism. Ann. Stat., 39(5), 2011b.

E. Arias-Castro, E. J. Candès, and M. A. Davenport. On the fundamental limits of adaptive sensing. IEEE Trans. Inf. Theory, 59(1):472–481, 2013.

R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde. Model-based compressive sensing. IEEE Trans. Inf. Theory, 56(4), 2010.

S. Bhamidi, P. S. Dey, and A. B. Nobel. Energy landscape for large average submatrix detection problems in Gaussian random matrices. arXiv preprint arXiv:1211.2284, 2012.

L. Birgé. An alternative point of view on Lepski's method. In State of the Art in Probability and Statistics (Leiden, 1999), volume 36 of IMS Lecture Notes Monogr. Ser. Inst. Math. Statist., Beachwood, OH, 2001.

C. Butucea and Y. I. Ingster. Detection of a sparse submatrix of a high-dimensional noisy matrix. Bernoulli, 19(5B), 2013.

C. Butucea, Y. I. Ingster, and I. Suslina. Sharp variable selection of a sparse submatrix in a high-dimensional noisy matrix. arXiv preprint, 2013.

E. J. Candès and M. A. Davenport. How well can we estimate a sparse vector? Appl. Comput. Harmon. Anal., 34(2):317–323, 2013.

E. J. Candès and T. Tao. The Dantzig selector: Statistical estimation when p is much larger than n. Ann. Stat., 35(6), 2007.

E. J. Candès and T. Tao. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans. Inf. Theory, 52(12), 2006.

E. J. Candès and M. B. Wakin. An introduction to compressive sampling. IEEE Signal Process. Mag., 25(2):21–30, 2008.

Y. Caron, P. Makris, and N. Vincent. A method for detecting artificial objects in natural environments. In 16th Int. Conf. Pattern Recogn., volume 1. IEEE, 2002.

R. M. Castro. Adaptive sensing performance lower bounds for sparse signal detection and support estimation. arXiv preprint, 2012.

M. A. Davenport and E. Arias-Castro. Compressive binary search. arXiv preprint, 2012.

D. L. Donoho. Compressed sensing. IEEE Trans. Inf. Theory, 52(4), 2006.

M. F. Duarte, M. A. Davenport, M. J. Wainwright, and R. G. Baraniuk. Sparse signal detection from incoherent projections. In Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, pages III-305–III-308, 2006.

M. Golbabaee and P. Vandergheynst. Compressed sensing of simultaneous low-rank and joint-sparse matrices. arXiv preprint arXiv:1211.5058, 2012.

J. D. Haupt and R. D. Nowak. Compressive sampling for signal detection. In Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, volume 3, pages III-509–III-512. IEEE, 2007.

J. D. Haupt, R. G. Baraniuk, R. M. Castro, and R. D. Nowak. Compressive distilled sensing: Sparse recovery using adaptivity in compressive measurements. In Proc. 43rd Asilomar Conf. Signals, Systems and Computers. IEEE, 2009.

J. Huang, T. Zhang, and D. Metaxas. Learning with structured sparsity. J. Mach. Learn. Res., 12, 2011.

Y. I. Ingster, A. B. Tsybakov, and N. Verzelen. Detection boundary in sparse regression. Electron. J. Stat., 4, 2010.

I. M. Johnstone and A. Y. Lu. On consistency and sparsity for principal components analysis in high dimensions. J. Am. Stat. Assoc., 104(486), 2009.

R. M. Kainkaryam and P. J. Woolf. Pooling in high-throughput drug screening. Current Opinion in Drug Discovery & Development, 12(3), 2009.

R. M. Kainkaryam, A. Bruex, A. C. Gilbert, J. Schiefelbein, and P. J. Woolf. poolMC: Smart pooling of mRNA samples in microarray experiments. Bioinformatics, 11:299, 2010.

M. Kolar, S. Balakrishnan, A. Rinaldo, and A. Singh. Minimax localization of structural information in large noisy matrices. In J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. C. N. Pereira, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 24, 2011.

V. Koltchinskii, K. Lounici, and A. B. Tsybakov. Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Ann. Stat., 39(5), 2011.

A. Krishnamurthy, J. Sharpnack, and A. Singh. Recovering graph-structured activations using adaptive compressive measurements. arXiv preprint, 2013.

B. Laurent and P. Massart. Adaptive estimation of a quadratic functional by model selection. Ann. Stat., 28(5), 2000.

M. L. Malloy and R. D. Nowak. Near-optimal adaptive compressed sensing. In Proc. 46th Asilomar Conf. Signals, Systems and Computers. IEEE, 2012a.

M. L. Malloy and R. D. Nowak. Near-optimal compressive binary search. arXiv preprint, 2012b.

S. Negahban and M. J. Wainwright. Estimation of (near) low-rank matrices with noise and high-dimensional scaling. Ann. Stat., 39(2), 2011.

G. Reeves and M. Gastpar. The sampling rate-distortion tradeoff for sparsity pattern recovery in compressed sensing. IEEE Trans. Inf. Theory, 58(5), 2012.

E. Richard, P.-A. Savalle, and N. Vayatis. Estimation of simultaneously sparse and low rank matrices. In Proc. 29th Int. Conf. Mach. Learn., 2012.

M. Rudelson and R. Vershynin. Non-asymptotic theory of random matrices: extreme singular values. arXiv preprint, 2010.

M. Rudelson and R. Vershynin. Hanson-Wright inequality and sub-Gaussian concentration. Electron. Commun. Probab., 18(10), 2013.

A. Soni and J. D. Haupt. Efficient adaptive compressive sensing using sparse hierarchical learned dictionaries. In Proc. 45th Conf. Signals, Systems and Computers. IEEE, 2011.

A. Soni and J. D. Haupt. On the fundamental limits of recovering tree sparse vectors from noisy linear measurements. arXiv preprint, 2013.

X. Sun and A. B. Nobel. On the maximal size of large-average and ANOVA-fit submatrices in a Gaussian random matrix. Bernoulli, 19(1), 2013.

E. Tánczos and R. M. Castro. Adaptive sensing for estimation of structured sparse signals. arXiv preprint arXiv:1311.7118, 2013.

A. B. Tsybakov. Introduction to Nonparametric Estimation. Springer Series in Statistics. Springer, New York, 2009.

M. J. Wainwright. Sharp thresholds for high-dimensional and noisy sparsity recovery using ℓ_1-constrained quadratic programming (Lasso). IEEE Trans. Inf. Theory, 55(5), 2009a.

M. J. Wainwright. Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting. IEEE Trans. Inf. Theory, 55(12), 2009b.

M. B. Wakin, J. N. Laska, M. F. Duarte, D. Baron, S. Sarvotham, D. Takhar, K. F. Kelly, and R. G. Baraniuk. An architecture for compressive imaging. In Proc. IEEE Int. Conf. Image Processing. IEEE, 2006.

S. Yoon, C. Nardini, L. Benini, and G. De Micheli. Discovering coherent biclusters from gene expression data using zero-suppressed binary decision diagrams. IEEE/ACM Trans. Comput. Biol. Bioinf., 2(4), 2005.

A Proofs of Main Results

In this appendix, we collect the proofs of the results stated in the paper. Throughout the proofs, we denote by c_1, c_2, ... positive constants that may change their value from line to line.

A.1 Proof of Theorem 1

We lower bound the Bayes risk of any test T. Recall the null and alternative hypotheses defined in Eq. (2.3),

H_0: A = 0_{n_1 × n_2}
H_1: A ∈ A(µ_min).

We will accomplish this by lower bounding the Bayes risk of any test T for a simpler hypothesis testing problem,

H_0: A = 0_{n_1 × n_2}
H_1: A = (a_ij) with a_ij = µ_min I{(i,j) ∈ B}, B ∈ B.

It is clear that the infimum of R_det(T) over tests for the original problem is lower bounded by the corresponding infimum for this sub-problem. To further bound the worst-case risk of T, we consider a uniform prior π over the alternatives, and bound the average risk

R_det^π(T) = P_0[T = 1] + E_{A∼π} P_A[T = 0],

which provides the desired lower bound. Under the prior π, the hypothesis testing problem becomes that of distinguishing

H_0: A = 0_{n_1 × n_2}
H_1: A = (a_ij) with a_ij = µ_min I{(i,j) ∈ B}, B ∼ π.

Both H_0 and H_1 are simple (the alternative being a mixture over B), and the likelihood ratio test is optimal by the Neyman-Pearson lemma. The likelihood ratio is

L = E_π P_A[(y_i, X_i)_{i∈[m]}] / P_0[(y_i, X_i)_{i∈[m]}] = E_π Π_{i∈[m]} P_A[y_i | X_i] / P_0[y_i | X_i],

where the second equality follows by decomposing the probabilities by the chain rule and observing that P_0[X_i | (y_j, X_j)_{j∈[i−1]}] = P_A[X_i | (y_j, X_j)_{j∈[i−1]}], since the sampling strategy (whether active or passive) is the same irrespective of the true hypothesis. The likelihood ratio can be further simplified as

L = E_π exp( Σ_{i∈[m]} ( 2 y_i ⟨A, X_i⟩ − ⟨A, X_i⟩² ) / (2σ²) ).

The average risk of the likelihood ratio test,

R_det^π(T*) = 1 − ‖E_π P_A − P_0‖_TV,

is determined by the total variation distance between the mixture of alternatives and the null. By Pinsker's inequality (Tsybakov, 2009),

‖E_π P_A − P_0‖_TV ≤ √( KL(P_0, E_π P_A) / 2 ),


More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

Highly Robust Error Correction by Convex Programming

Highly Robust Error Correction by Convex Programming Highly Robust Error Correction by Convex Prograing Eanuel J. Candès and Paige A. Randall Applied and Coputational Matheatics, Caltech, Pasadena, CA 9115 Noveber 6; Revised Noveber 7 Abstract This paper

More information

Polygonal Designs: Existence and Construction

Polygonal Designs: Existence and Construction Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G

More information

Exact tensor completion with sum-of-squares

Exact tensor completion with sum-of-squares Proceedings of Machine Learning Research vol 65:1 54, 2017 30th Annual Conference on Learning Theory Exact tensor copletion with su-of-squares Aaron Potechin Institute for Advanced Study, Princeton David

More information

Generalized Augmentation for Control of the k-familywise Error Rate

Generalized Augmentation for Control of the k-familywise Error Rate International Journal of Statistics in Medical Research, 2012, 1, 113-119 113 Generalized Augentation for Control of the k-failywise Error Rate Alessio Farcoeni* Departent of Public Health and Infectious

More information

On Constant Power Water-filling

On Constant Power Water-filling On Constant Power Water-filling Wei Yu and John M. Cioffi Electrical Engineering Departent Stanford University, Stanford, CA94305, U.S.A. eails: {weiyu,cioffi}@stanford.edu Abstract This paper derives

More information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub

More information

Pattern Recognition and Machine Learning. Artificial Neural networks

Pattern Recognition and Machine Learning. Artificial Neural networks Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lessons 7 20 Dec 2017 Outline Artificial Neural networks Notation...2 Introduction...3 Key Equations... 3 Artificial

More information

Recovering Data from Underdetermined Quadratic Measurements (CS 229a Project: Final Writeup)

Recovering Data from Underdetermined Quadratic Measurements (CS 229a Project: Final Writeup) Recovering Data fro Underdeterined Quadratic Measureents (CS 229a Project: Final Writeup) Mahdi Soltanolkotabi Deceber 16, 2011 1 Introduction Data that arises fro engineering applications often contains

More information

Fixed-to-Variable Length Distribution Matching

Fixed-to-Variable Length Distribution Matching Fixed-to-Variable Length Distribution Matching Rana Ali Ajad and Georg Böcherer Institute for Counications Engineering Technische Universität München, Gerany Eail: raa2463@gail.co,georg.boecherer@tu.de

More information

A Simple Homotopy Algorithm for Compressive Sensing

A Simple Homotopy Algorithm for Compressive Sensing A Siple Hootopy Algorith for Copressive Sensing Lijun Zhang Tianbao Yang Rong Jin Zhi-Hua Zhou National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China Departent of Coputer

More information

Hybrid System Identification: An SDP Approach

Hybrid System Identification: An SDP Approach 49th IEEE Conference on Decision and Control Deceber 15-17, 2010 Hilton Atlanta Hotel, Atlanta, GA, USA Hybrid Syste Identification: An SDP Approach C Feng, C M Lagoa, N Ozay and M Sznaier Abstract The

More information

A method to determine relative stroke detection efficiencies from multiplicity distributions

A method to determine relative stroke detection efficiencies from multiplicity distributions A ethod to deterine relative stroke detection eiciencies ro ultiplicity distributions Schulz W. and Cuins K. 2. Austrian Lightning Detection and Inoration Syste (ALDIS), Kahlenberger Str.2A, 90 Vienna,

More information

SPECTRUM sensing is a core concept of cognitive radio

SPECTRUM sensing is a core concept of cognitive radio World Acadey of Science, Engineering and Technology International Journal of Electronics and Counication Engineering Vol:6, o:2, 202 Efficient Detection Using Sequential Probability Ratio Test in Mobile

More information

Fast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials

Fast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials Fast Montgoery-like Square Root Coputation over GF( ) for All Trinoials Yin Li a, Yu Zhang a, a Departent of Coputer Science and Technology, Xinyang Noral University, Henan, P.R.China Abstract This letter

More information

The proofs of Theorem 1-3 are along the lines of Wied and Galeano (2013).

The proofs of Theorem 1-3 are along the lines of Wied and Galeano (2013). A Appendix: Proofs The proofs of Theore 1-3 are along the lines of Wied and Galeano (2013) Proof of Theore 1 Let D[d 1, d 2 ] be the space of càdlàg functions on the interval [d 1, d 2 ] equipped with

More information

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay A Low-Coplexity Congestion Control and Scheduling Algorith for Multihop Wireless Networks with Order-Optial Per-Flow Delay Po-Kai Huang, Xiaojun Lin, and Chih-Chun Wang School of Electrical and Coputer

More information

Bayes Decision Rule and Naïve Bayes Classifier

Bayes Decision Rule and Naïve Bayes Classifier Bayes Decision Rule and Naïve Bayes Classifier Le Song Machine Learning I CSE 6740, Fall 2013 Gaussian Mixture odel A density odel p(x) ay be ulti-odal: odel it as a ixture of uni-odal distributions (e.g.

More information

Multi-Dimensional Hegselmann-Krause Dynamics

Multi-Dimensional Hegselmann-Krause Dynamics Multi-Diensional Hegselann-Krause Dynaics A. Nedić Industrial and Enterprise Systes Engineering Dept. University of Illinois Urbana, IL 680 angelia@illinois.edu B. Touri Coordinated Science Laboratory

More information

Machine Learning Basics: Estimators, Bias and Variance

Machine Learning Basics: Estimators, Bias and Variance Machine Learning Basics: Estiators, Bias and Variance Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics in Basics

More information

On the theoretical analysis of cross validation in compressive sensing

On the theoretical analysis of cross validation in compressive sensing MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.erl.co On the theoretical analysis of cross validation in copressive sensing Zhang, J.; Chen, L.; Boufounos, P.T.; Gu, Y. TR2014-025 May 2014 Abstract

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

e-companion ONLY AVAILABLE IN ELECTRONIC FORM OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer

More information

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science A Better Algorith For an Ancient Scheduling Proble David R. Karger Steven J. Phillips Eric Torng Departent of Coputer Science Stanford University Stanford, CA 9435-4 Abstract One of the oldest and siplest

More information

Introduction to Machine Learning. Recitation 11

Introduction to Machine Learning. Recitation 11 Introduction to Machine Learning Lecturer: Regev Schweiger Recitation Fall Seester Scribe: Regev Schweiger. Kernel Ridge Regression We now take on the task of kernel-izing ridge regression. Let x,...,

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

Compressive Sensing Over Networks

Compressive Sensing Over Networks Forty-Eighth Annual Allerton Conference Allerton House, UIUC, Illinois, USA Septeber 29 - October, 200 Copressive Sensing Over Networks Soheil Feizi MIT Eail: sfeizi@it.edu Muriel Médard MIT Eail: edard@it.edu

More information

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks Intelligent Systes: Reasoning and Recognition Jaes L. Crowley MOSIG M1 Winter Seester 2018 Lesson 7 1 March 2018 Outline Artificial Neural Networks Notation...2 Introduction...3 Key Equations... 3 Artificial

More information

Kernel Methods and Support Vector Machines

Kernel Methods and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley ENSIAG 2 / osig 1 Second Seester 2012/2013 Lesson 20 2 ay 2013 Kernel ethods and Support Vector achines Contents Kernel Functions...2 Quadratic

More information

3.8 Three Types of Convergence

3.8 Three Types of Convergence 3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to

More information

Estimating Parameters for a Gaussian pdf

Estimating Parameters for a Gaussian pdf Pattern Recognition and achine Learning Jaes L. Crowley ENSIAG 3 IS First Seester 00/0 Lesson 5 7 Noveber 00 Contents Estiating Paraeters for a Gaussian pdf Notation... The Pattern Recognition Proble...3

More information

Reed-Muller codes for random erasures and errors

Reed-Muller codes for random erasures and errors Reed-Muller codes for rando erasures and errors Eanuel Abbe Air Shpilka Avi Wigderson Abstract This paper studies the paraeters for which Reed-Muller (RM) codes over GF (2) can correct rando erasures and

More information

Recovery of Sparsely Corrupted Signals

Recovery of Sparsely Corrupted Signals TO APPEAR IN IEEE TRANSACTIONS ON INFORMATION TEORY 1 Recovery of Sparsely Corrupted Signals Christoph Studer, Meber, IEEE, Patrick Kuppinger, Student Meber, IEEE, Graee Pope, Student Meber, IEEE, and

More information

Testing Properties of Collections of Distributions

Testing Properties of Collections of Distributions Testing Properties of Collections of Distributions Reut Levi Dana Ron Ronitt Rubinfeld April 9, 0 Abstract We propose a fraework for studying property testing of collections of distributions, where the

More information

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Proc. of the IEEE/OES Seventh Working Conference on Current Measureent Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Belinda Lipa Codar Ocean Sensors 15 La Sandra Way, Portola Valley, CA 98 blipa@pogo.co

More information

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis City University of New York (CUNY) CUNY Acadeic Works International Conference on Hydroinforatics 8-1-2014 Experiental Design For Model Discriination And Precise Paraeter Estiation In WDS Analysis Giovanna

More information

Ch 12: Variations on Backpropagation

Ch 12: Variations on Backpropagation Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith

More information

Boosting with log-loss

Boosting with log-loss Boosting with log-loss Marco Cusuano-Towner Septeber 2, 202 The proble Suppose we have data exaples {x i, y i ) i =... } for a two-class proble with y i {, }. Let F x) be the predictor function with the

More information

Bipartite subgraphs and the smallest eigenvalue

Bipartite subgraphs and the smallest eigenvalue Bipartite subgraphs and the sallest eigenvalue Noga Alon Benny Sudaov Abstract Two results dealing with the relation between the sallest eigenvalue of a graph and its bipartite subgraphs are obtained.

More information

In this chapter, we consider several graph-theoretic and probabilistic models

In this chapter, we consider several graph-theoretic and probabilistic models THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions

More information

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution Testing approxiate norality of an estiator using the estiated MSE and bias with an application to the shape paraeter of the generalized Pareto distribution J. Martin van Zyl Abstract In this work the norality

More information

An RIP-based approach to Σ quantization for compressed sensing

An RIP-based approach to Σ quantization for compressed sensing An RIP-based approach to Σ quantization for copressed sensing Joe-Mei Feng and Felix Kraher October, 203 Abstract In this paper, we provide new approach to estiating the error of reconstruction fro Σ quantized

More information

Kernel-Based Nonparametric Anomaly Detection

Kernel-Based Nonparametric Anomaly Detection Kernel-Based Nonparaetric Anoaly Detection Shaofeng Zou Dept of EECS Syracuse University Eail: szou@syr.edu Yingbin Liang Dept of EECS Syracuse University Eail: yliang6@syr.edu H. Vincent Poor Dept of

More information

Pattern Recognition and Machine Learning. Artificial Neural networks

Pattern Recognition and Machine Learning. Artificial Neural networks Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2016 Lessons 7 14 Dec 2016 Outline Artificial Neural networks Notation...2 1. Introduction...3... 3 The Artificial

More information

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition Upper bound on false alar rate for landine detection and classification using syntactic pattern recognition Ahed O. Nasif, Brian L. Mark, Kenneth J. Hintz, and Nathalia Peixoto Dept. of Electrical and

More information

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES S. E. Ahed, R. J. Tokins and A. I. Volodin Departent of Matheatics and Statistics University of Regina Regina,

More information

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential

More information

A Note on the Applied Use of MDL Approximations

A Note on the Applied Use of MDL Approximations A Note on the Applied Use of MDL Approxiations Daniel J. Navarro Departent of Psychology Ohio State University Abstract An applied proble is discussed in which two nested psychological odels of retention

More information

Bounds on the Minimax Rate for Estimating a Prior over a VC Class from Independent Learning Tasks

Bounds on the Minimax Rate for Estimating a Prior over a VC Class from Independent Learning Tasks Bounds on the Miniax Rate for Estiating a Prior over a VC Class fro Independent Learning Tasks Liu Yang Steve Hanneke Jaie Carbonell Deceber 01 CMU-ML-1-11 School of Coputer Science Carnegie Mellon University

More information

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models 2014 IEEE International Syposiu on Inforation Theory Copression and Predictive Distributions for Large Alphabet i.i.d and Markov odels Xiao Yang Departent of Statistics Yale University New Haven, CT, 06511

More information

Distributed Subgradient Methods for Multi-agent Optimization

Distributed Subgradient Methods for Multi-agent Optimization 1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions

More information

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians Using EM To Estiate A Probablity Density With A Mixture Of Gaussians Aaron A. D Souza adsouza@usc.edu Introduction The proble we are trying to address in this note is siple. Given a set of data points

More information

arxiv: v1 [cs.ds] 17 Mar 2016

arxiv: v1 [cs.ds] 17 Mar 2016 Tight Bounds for Single-Pass Streaing Coplexity of the Set Cover Proble Sepehr Assadi Sanjeev Khanna Yang Li Abstract arxiv:1603.05715v1 [cs.ds] 17 Mar 2016 We resolve the space coplexity of single-pass

More information

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition

More information

Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding

Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding IEEE TRANSACTIONS ON INFORMATION THEORY (SUBMITTED PAPER) 1 Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding Lai Wei, Student Meber, IEEE, David G. M. Mitchell, Meber, IEEE, Thoas

More information

Leonardo R. Bachega*, Student Member, IEEE, Srikanth Hariharan, Student Member, IEEE Charles A. Bouman, Fellow, IEEE, and Ness Shroff, Fellow, IEEE

Leonardo R. Bachega*, Student Member, IEEE, Srikanth Hariharan, Student Member, IEEE Charles A. Bouman, Fellow, IEEE, and Ness Shroff, Fellow, IEEE Distributed Signal Decorrelation and Detection in Sensor Networks Using the Vector Sparse Matrix Transfor Leonardo R Bachega*, Student Meber, I, Srikanth Hariharan, Student Meber, I Charles A Bouan, Fellow,

More information

The Simplex Method is Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

The Simplex Method is Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate The Siplex Method is Strongly Polynoial for the Markov Decision Proble with a Fixed Discount Rate Yinyu Ye April 20, 2010 Abstract In this note we prove that the classic siplex ethod with the ost-negativereduced-cost

More information

Foundations of Machine Learning Boosting. Mehryar Mohri Courant Institute and Google Research

Foundations of Machine Learning Boosting. Mehryar Mohri Courant Institute and Google Research Foundations of Machine Learning Boosting Mehryar Mohri Courant Institute and Google Research ohri@cis.nyu.edu Weak Learning Definition: concept class C is weakly PAC-learnable if there exists a (weak)

More information

Supplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data

Supplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data Suppleentary to Learning Discriinative Bayesian Networks fro High-diensional Continuous Neuroiaging Data Luping Zhou, Lei Wang, Lingqiao Liu, Philip Ogunbona, and Dinggang Shen Proposition. Given a sparse

More information

On Conditions for Linearity of Optimal Estimation

On Conditions for Linearity of Optimal Estimation On Conditions for Linearity of Optial Estiation Erah Akyol, Kuar Viswanatha and Kenneth Rose {eakyol, kuar, rose}@ece.ucsb.edu Departent of Electrical and Coputer Engineering University of California at

More information

A Theoretical Framework for Deep Transfer Learning

A Theoretical Framework for Deep Transfer Learning A Theoretical Fraewor for Deep Transfer Learning Toer Galanti The School of Coputer Science Tel Aviv University toer22g@gail.co Lior Wolf The School of Coputer Science Tel Aviv University wolf@cs.tau.ac.il

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher

More information

Some Proofs: This section provides proofs of some theoretical results in section 3.

Some Proofs: This section provides proofs of some theoretical results in section 3. Testing Jups via False Discovery Rate Control Yu-Min Yen. Institute of Econoics, Acadeia Sinica, Taipei, Taiwan. E-ail: YMYEN@econ.sinica.edu.tw. SUPPLEMENTARY MATERIALS Suppleentary Materials contain

More information

ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics

ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS A Thesis Presented to The Faculty of the Departent of Matheatics San Jose State University In Partial Fulfillent of the Requireents

More information

PAC-Bayes Analysis Of Maximum Entropy Learning

PAC-Bayes Analysis Of Maximum Entropy Learning PAC-Bayes Analysis Of Maxiu Entropy Learning John Shawe-Taylor and David R. Hardoon Centre for Coputational Statistics and Machine Learning Departent of Coputer Science University College London, UK, WC1E

More information

New Slack-Monotonic Schedulability Analysis of Real-Time Tasks on Multiprocessors

New Slack-Monotonic Schedulability Analysis of Real-Time Tasks on Multiprocessors New Slack-Monotonic Schedulability Analysis of Real-Tie Tasks on Multiprocessors Risat Mahud Pathan and Jan Jonsson Chalers University of Technology SE-41 96, Göteborg, Sweden {risat, janjo}@chalers.se

More information

Combining Classifiers

Combining Classifiers Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/

More information

Vulnerability of MRD-Code-Based Universal Secure Error-Correcting Network Codes under Time-Varying Jamming Links

Vulnerability of MRD-Code-Based Universal Secure Error-Correcting Network Codes under Time-Varying Jamming Links Vulnerability of MRD-Code-Based Universal Secure Error-Correcting Network Codes under Tie-Varying Jaing Links Jun Kurihara KDDI R&D Laboratories, Inc 2 5 Ohara, Fujiino, Saitaa, 356 8502 Japan Eail: kurihara@kddilabsjp

More information

Fairness via priority scheduling

Fairness via priority scheduling Fairness via priority scheduling Veeraruna Kavitha, N Heachandra and Debayan Das IEOR, IIT Bobay, Mubai, 400076, India vavitha,nh,debayan}@iitbacin Abstract In the context of ulti-agent resource allocation

More information

Ensemble Based on Data Envelopment Analysis

Ensemble Based on Data Envelopment Analysis Enseble Based on Data Envelopent Analysis So Young Sohn & Hong Choi Departent of Coputer Science & Industrial Systes Engineering, Yonsei University, Seoul, Korea Tel) 82-2-223-404, Fax) 82-2- 364-7807

More information