A Simple Homotopy Algorithm for Compressive Sensing

Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Department of Computer Science, the University of Iowa, Iowa City, USA
Department of Computer Science and Engineering, Michigan State University, East Lansing, USA
Institute of Data Science and Technologies at Alibaba Group, Seattle, USA
{zhanglj,

Appearing in Proceedings of the 18th International Conference on Artificial Intelligence and Statistics (AISTATS) 2015, San Diego, CA, USA. JMLR: W&CP volume 38. Copyright 2015 by the authors.

Abstract

In this paper, we consider the problem of recovering the s largest elements of an arbitrary vector from noisy measurements. Inspired by previous work, we develop a homotopy algorithm which solves the l_1-regularized least squares problem for a sequence of decreasing values of the regularization parameter. Compared to the previous method, our algorithm is more efficient in the sense that it only updates the solution once for each intermediate problem, and more practical in the sense that it has a simple stopping criterion that checks the sparsity of the intermediate solution. Theoretical analysis reveals that our method enjoys a linear convergence rate in reducing the recovery error. Furthermore, our guarantee for recovering the top s elements of the target vector is tighter than previous results, and that for recovering the target vector itself matches the state of the art in compressive sensing.

1 Introduction

Compressive Sensing (CS) is a new paradigm of data acquisition that enables reconstruction of sparse or compressible signals from a relatively small number of linear measurements (Candès and Tao, 2006; Donoho, 2006). The standard assumption is that one has access to linear measurements of the form

y = U^T x + e

where x ∈ R^d is the unknown target vector, U ∈ R^{d×m} is the sensing matrix, and e ∈ R^m is a vector of noise. Recently, substantial progress has been made in designing the encoder U and the associated decoder which recovers x from U and y (Davenport et al., 2012). One of the most famous decoders for CS is the l_1-regularized least squares (l_1-LS) formulation, known as Lasso in statistics (Tibshirani, 1996), given by

min_x (1/2)||y - U^T x||_2^2 + λ||x||_1.

The recovery performance of Lasso has been extensively studied and, generally speaking, it can recover x up to the noise level under appropriate assumptions (Daubechies et al., 2004; Tropp, 2006; Zhang, 2009). Under the assumption that λ is given beforehand, Xiao and Zhang (2012) propose a proximal-gradient homotopy method to improve the efficiency of l_1-LS. The key idea is to solve the l_1-LS problem for a sequence of decreasing values of the regularization parameter, and to use an approximate solution at the end of each stage to warm start the next stage. In this study, we take three steps further. We show that (i) it is sufficient to run composite gradient mapping only once for each regularization parameter, (ii) the target regularization parameter can be detected adaptively based on the sparsity of the current solution, and (iii) this simple algorithm delivers a tighter bound for recovering the s largest elements of x.

For a vector x, we denote by x_s the vector that contains the s largest elements of x. Let x̂ be the solution returned by our algorithm. Under the assumption that U is a sub-Gaussian random matrix, our algorithm is able to reduce the recovery error exponentially over iterations, and the final recovery error ||x̂ - x_s||_2 is smaller than

O( √(s log d / m) (||e||_2 + ||x - x_s||_2) + ||(x - x_s)^s||_2 )

where e is the vector of noise, and ||(x - x_s)^s||_2 is the l_2-norm of the s largest elements of x - x_s.

In contrast, previous analysis in CS (Davenport et al., 2012) can only upper bound the recovery error for x_s by O(||e||_2 + ||x - x_s||_2). Thus, our recovery guarantee could be significantly better than previous results if the l_2-norm of x - x_s is not concentrated on its s largest elements (i.e., ||(x - x_s)^s||_2 ≪ ||x - x_s||_2). Following the triangle inequality, we obtain the following bound for recovering x:

||x̂ - x||_2 ≤ O( ||x - x_s||_2 + √(s log d / m) ||e||_2 )

which matches the state of the art (DeVore et al., 2009).

2 Related Work

Existing algorithms in CS can be roughly categorized into convex optimization based approaches and greedy approaches (Davenport et al., 2012; Blumensath et al., 2012). Roughly speaking, convex approaches have better theoretical guarantees, while greedy approaches are more efficient.

In the noise-free setting, Candès and Tao (2005) pose the following l_1-minimization problem, denoted by ∆_1, for decoding:

min ||x||_1  s.t.  U^T x = y.

To analyze the recovery performance, they introduce the Restricted Isometry Property (RIP) for matrices. Define the isometry constant of U as the smallest number δ_s such that the following holds for all s-sparse vectors x ∈ R^d:

(1 - δ_s) ||x||_2^2 ≤ ||U^T x||_2^2 ≤ (1 + δ_s) ||x||_2^2.

It has been shown that if δ_s + δ_{2s} + δ_{3s} < 1 or δ_{2s} < √2 - 1, the decoder ∆_1 yields perfect recovery for all s-sparse vectors x ∈ R^d (Candès and Tao, 2005; Candès, 2008). If U is constructed as a random matrix with independent sub-Gaussian columns, a sufficient condition is to take m = Ω(s log d) measurements (Mendelson et al., 2008).

In the general case, Candès (2008) proposes the following convex formulation, denoted by ∆_1^ε, for decoding:

min ||x||_1  s.t.  ||y - U^T x||_2 ≤ ε

where ε is an upper bound on ||e||_2. Let ∆_1^ε(U^T x + e) be the solution returned by the above decoder. Suppose U satisfies the RIP of order 2s with δ_{2s} < √2 - 1; then

||∆_1^ε(U^T x + e) - x||_2 ≤ O( ||x - x_s||_1 / √s + ε )

for all x ∈ R^d. An obvious drawback of this approach is that we must have a good a priori estimate of ||e||_2. This limitation was soon addressed by Wojtaszczyk (2010), who shows that the decoder ∆_1 performs very well even in the noisy setting. In particular, if U is a Gaussian random matrix and m = Ω(s log d), then with an overwhelming probability,

||∆_1(U^T x + e) - x||_2 ≤ O( ||x - x_s||_1 / √s + ||e||_2 )

for all x ∈ R^d. Another possibility is to estimate the noise level under the Bayesian framework (Ji et al., 2008).

Notice that in the above inequalities, the l_2-norm of the recovery error is upper bounded by the l_1-norm of the corresponding error of the best s-term approximation. To make the result more consistent, it is natural to ask whether we could upper bound ||∆(U^T x + e) - x||_2 by ||x - x_s||_2 for some decoder ∆. Unfortunately, even in the noise-free setting, if we want the inequality

||∆(U^T x) - x||_2 ≤ O( ||x - x_s||_2 )

to hold for all x ∈ R^d, the number of measurements m needs to be on the order of d (Cohen et al., 2009). This difficulty motivates the study of instance optimality in probability, which means we are looking for a performance guarantee that holds with high probability for an arbitrary but fixed vector x. When U is a Gaussian random matrix and m = Ω(s log d), Wojtaszczyk (2010) shows that with an overwhelming probability,

||∆_1(U^T x + e) - x||_2 ≤ O( ||x - x_s||_2 + ||e||_2 )

for any fixed x ∈ R^d. This result is extended to more general families of matrices by DeVore et al. (2009).
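As a concrete illustration (our own sketch, not part of the original text), the ε-constrained decoder ∆_1^ε above can be prototyped with a generic convex solver such as cvxpy; the function name and arguments below are placeholder assumptions.

    import cvxpy as cp
    import numpy as np

    def l1_decoder(U, y, eps):
        """Solve min ||x||_1 s.t. ||y - U^T x||_2 <= eps (the epsilon-constrained l1 decoder)."""
        d = U.shape[0]
        x = cp.Variable(d)
        objective = cp.Minimize(cp.norm(x, 1))
        constraints = [cp.norm(y - U.T @ x, 2) <= eps]
        cp.Problem(objective, constraints).solve()
        return x.value

Setting eps = 0, or replacing the constraint with U.T @ x == y, recovers the noise-free decoder ∆_1.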
The typical greedy approaches include Matching Pursuit (MP) (Mallat and Zhang, 1993), Iterative Hard Thresholding (IHT) (Blumensath and Davies, 2008; Garg and Khandekar, 2009), and Compressive Sampling Matching Pursuit (CoSaMP) (Needell and Tropp, 2009). Due to space limitations, we only give a brief description of IHT. In each iteration, IHT first performs gradient descent with respect to ||U^T x - y||_2^2 and then applies hard-thresholding to keep the s largest elements, in contrast to the soft-thresholding in our method. Let x_t denote the solution of IHT in the t-th iteration. The updating rule is given by

x_{t+1} = [ x_t - η U(U^T x_t - y) ]_s

where η is the step size. Under certain RIP conditions, it has been shown that the recovery error of IHT can be upper bounded by

O( ||x - x_s||_2 + ||x - x_s||_1 / √s + ||e||_2 )

for all x ∈ R^d (Blumensath et al., 2012).
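For reference (again our own sketch, not from the paper), the IHT update above takes only a few lines of NumPy; the step size η and the number of iterations are placeholder choices.

    import numpy as np

    def hard_threshold(z, s):
        """Keep the s largest-magnitude entries of z and zero out the rest."""
        out = np.zeros_like(z)
        idx = np.argsort(np.abs(z))[-s:]
        out[idx] = z[idx]
        return out

    def iht(U, y, s, eta=1.0, n_iters=100):
        """Iterative Hard Thresholding: x_{t+1} = [x_t - eta * U (U^T x_t - y)]_s."""
        x = np.zeros(U.shape[0])
        for _ in range(n_iters):
            x = hard_threshold(x - eta * U @ (U.T @ x - y), s)
        return x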

3 A Simple Homotopy Algorithm for Compressive Sensing

We first introduce the proposed homotopy algorithm, and then present its theoretical guarantee.

3.1 The Algorithm

Algorithm 1 A Simple Homotopy Algorithm
Input: Sensing Matrix U ∈ R^{d×m}, Measurements y ∈ R^m, Shrinking Parameter γ, Sparsity s, Maximum Number of Iterations T
1: Initialize x_1 = 0, λ_1 = ||U y||_∞
2: for t = 1, 2, ..., T do
3:   x_{t+1} = P_{λ_t}( x_t - U(U^T x_t - y) )
4:   if ||x_{t+1}||_0 > 2s then
5:     Return x_t
6:   end if
7:   λ_{t+1} = γ λ_t
8: end for
9: Return x_{T+1}

In our algorithm, we solve a sequence of l_1-regularized least squares (l_1-LS) problems with decreasing regularization parameters. In particular, we set λ_1 = ||U y||_∞ and λ_{t+1} = γ λ_t for some γ < 1. For each intermediate l_1-LS problem, we use the solution from the previous iteration as the initial point, and perform composite gradient mapping (Nesterov, 2013) to update it once, i.e.,

x_{t+1} = argmin_{x ∈ R^d} ⟨x, U(U^T x_t - y)⟩ + (1/2)||x - x_t||_2^2 + λ_t ||x||_1
        = P_{λ_t}( x_t - U(U^T x_t - y) )

where P_{λ_t}(·) is the soft-thresholding operator (Donoho, 1995), applied element-wise and defined as

P_{λ_t}(α) = 0 if |α| ≤ λ_t;  sign(α)(|α| - λ_t) otherwise.

The algorithm stops when the sparsity of the intermediate solution exceeds the budget 2s. The above procedure is summarized in Algorithm 1.

Although the basic idea of our algorithm is motivated by the proximal-gradient homotopy (PGH) method (Xiao and Zhang, 2012), we make the following substantial extensions. Multiple iterations are needed by PGH to solve each intermediate optimization problem to some predefined precision; in contrast, our algorithm only updates once for each problem. While PGH assumes that a target regularization parameter is given beforehand, our algorithm is able to detect it adaptively based on the sparsity of the intermediate solution. Finally, Xiao and Zhang (2012) only analyze the recovery error of PGH for exactly sparse vectors; in comparison, we provide a recovery guarantee for an arbitrary vector. This difference is more fundamental in the context of compressive sensing.
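To make the procedure concrete, here is a minimal NumPy sketch of Algorithm 1 (our own illustration; the function and variable names are ours, and it assumes λ_1 = ||U y||_∞ as above).

    import numpy as np

    def soft_threshold(z, lam):
        """Element-wise soft-thresholding operator P_lambda."""
        return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

    def simple_homotopy(U, y, s, gamma, T):
        """Algorithm 1: one composite gradient update per regularization parameter."""
        x = np.zeros(U.shape[0])
        lam = np.max(np.abs(U @ y))              # lambda_1 = ||U y||_inf
        for _ in range(T):
            x_next = soft_threshold(x - U @ (U.T @ x - y), lam)
            if np.count_nonzero(x_next) > 2 * s: # stopping criterion on sparsity
                return x                         # return the previous iterate x_t
            x = x_next
            lam *= gamma                         # lambda_{t+1} = gamma * lambda_t
        return x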

3.2 Main Results

We first describe the assumptions about the sensing matrix

U = (1/√m) [u_1, ..., u_m] = (1/√m) [v_1, ..., v_d]^T ∈ R^{d×m}

where u_i ∈ R^d, i = 1, ..., m, and v_j ∈ R^m, j = 1, ..., d. We assume the following.

- Both the columns and rows of U are sub-Gaussian vectors and, for the sake of simplicity, their sub-Gaussian norm is smaller than 1. That is, for any x ∈ R^d and z ∈ R^m, ||⟨x, u_i⟩||_{ψ_2} ≤ ||x||_2, i = 1, ..., m, and ||⟨z, v_j⟩||_{ψ_2} ≤ ||z||_2, j = 1, ..., d, where ||·||_{ψ_2} is the sub-Gaussian norm of random variables (Vershynin, 2012).
- The u_i's are sampled independently from an isotropic distribution, that is, E[u_i u_i^T] = I. Examples of such random vectors are Gaussian vectors and Rademacher vectors.

Given a fixed vector x, we receive measurements y = U^T x + e, where e encodes potential noise. In this paper, e is unknown and we make no assumption about the noise model. Our goal is to recover the s largest elements of x from U and y. We have the following theorem to bound the recovery error. (As a reminder, x_s denotes the vector that contains the s largest elements of x.)

Theorem 1. Let x̂ be the solution output by our algorithm and let T be the maximum number of iterations allowed. Choosing γ such that (1 + √2)/3 ≤ γ < 1, with a probability at least 1 - 6Te^{-τ},

||x̂ - x_s||_2 ≤ max( 6Λ, 3λ_1 √s γ^T )

where

Λ = √( s(τ + log d)/m ) ||e||_2 + C ( ||(x - x_s)^s||_2 + √( (τ + s log d)/m ) ||x - x_s||_2 )    (1)

for some universal constant C, provided

m ≥ C' (τ + s log d)    (2)

for another universal constant C'.

Remark: The constants in the above theorem should not be treated literally, because we made no effort to optimize them. Generally speaking, a smaller γ leads to a faster convergence rate, but a larger constant (which is 6 now) in the recovery guarantee.

The above theorem implies that the recovery error reduces exponentially until it reaches O(Λ). Thus, with a sufficiently large T, the recovery error of x_s can be upper bounded by

O( √(s log d / m) (||e||_2 + ||x - x_s||_2) + ||(x - x_s)^s||_2 ).

In the noise-free setting, our analysis also implies exact recovery of s-sparse vectors, since Λ = 0 if ||e||_2 = 0 and ||x||_0 ≤ s. From the literature of CS (Davenport et al., 2012, Theorem 1.14), we find that previous analysis is able to upper bound the recovery error of x_s by c(||x - x_s||_2 + ||e||_2) for some constant c > 1. Thus, when m is large enough, our upper bound could be significantly smaller than the existing results. Finally, it is worth mentioning that if our goal is to recover x, then, following the triangle inequality, our analysis yields the same upper bound as previous studies (DeVore et al., 2009).

3.3 A Post-processing Step

Since x̂ is 2s-sparse and x_s is s-sparse, one may ask whether it is possible to find a good s-sparse vector to approximate x_s. The following theorem shows that we can simply select the s largest elements of x̂ to approximate x_s, and the recovery error is of the same order.

Theorem 2. Let y ∈ R^d be an s-sparse vector. Then,

||x_s - y||_2 ≤ 3 ||x - y||_2,  for all x ∈ R^d.

4 Analysis

We here present the proofs of the main theorems. The omitted proofs are provided in the supplementary material.

4.1 Proof of Theorem 1

Notice that starting our algorithm with λ_1 = ||U y||_∞ has the same effect as starting with ||U y||_∞ γ^k, k ∈ Z, which means we can set λ_1 as large as we need. Thus, without loss of generality, we can assume

λ_1 √s ≥ max( ||x_s||_2, 6Λ ).    (3)

We first state two theorems that are central to our analysis. Theorem 3 reveals that the recovery error of our algorithm will reduce by a constant factor until it reaches the optimal level. Then, Theorem 4 shows that the recovery error will remain small, as long as the sparsity of the solution does not exceed 2s. We denote by S and S_t the support sets of x_s and x_t, respectively.

Theorem 3. Assume |S_t \ S| ≤ s, ||x_t - x_s||_2 ≤ 3λ_t √s, and Λ ≤ (1/2) λ_t √s, where Λ is given in (1). Then, with a probability at least 1 - 6e^{-τ}, |S_{t+1} \ S| ≤ s and ||x_{t+1} - x_s||_2 ≤ 3λ_{t+1} √s, provided the condition in (2) is true.

Theorem 4. Assume |S_t| ≤ 2s, ||x_t - x_s||_2 ≤ 6Λ, and Λ > (1/2) λ_t √s, where Λ is given in (1). If |S_{t+1}| ≤ 2s, then with a probability at least 1 - 6e^{-τ}, ||x_{t+1} - x_s||_2 ≤ 2(1 + √3)Λ, provided the condition in (2) is true.

We continue the proof of Theorem 1 in the following. Let

k = min{ t : Λ > (1/2) λ_t √s };

by (3), k > 1. In the following, we show that the recovery error ||x_t - x_s||_2 first decreases exponentially as t approaches k, and then stays below 6Λ.

T < k: From (3), ||x_1 - x_s||_2 = ||x_s||_2 ≤ 3λ_1 √s. Since the condition Λ ≤ λ_t √s / 2 holds for t = 1, ..., T, we can apply Theorem 3 to bound the recovery error in each iteration. Thus, with a probability at least 1 - 6Te^{-τ},

||x̂ - x_s||_2 = ||x_{T+1} - x_s||_2 ≤ 3λ_{T+1} √s = 3λ_1 √s γ^T.

T ≥ k: From the above analysis, with a probability at least 1 - 6(k-1)e^{-τ}, ||x_k - x_s||_2 ≤ 3√s λ_k and |S_k \ S| ≤ s, which also means our algorithm arrives at the k-th iteration. In the k-th iteration, there are two cases: |S_{k+1}| > 2s and |S_{k+1}| ≤ 2s. In the first case, our algorithm terminates and returns x_k as the final solution, implying

||x̂ - x_s||_2 = ||x_k - x_s||_2 ≤ 3λ_k √s ≤ 6Λ.

In the second case, our algorithm keeps running, and we can bound the recovery error of x_{k+1} by Theorem 4. If T = k or |S_{k+2}| > 2s, our algorithm terminates and returns x_{k+1} as the final solution, which implies

||x̂ - x_s||_2 = ||x_{k+1} - x_s||_2 ≤ 2(1 + √3)Λ.

Otherwise, our algorithm keeps running. Since 2(1 + √3) ≤ 6, the conditions in Theorem 4 are satisfied, and thus it can be applied repeatedly to bound the recovery error for all the remaining iterations.

4.2 Proof of Theorem 3

We need the following theorem to analyze the behavior of the composite gradient descent.

Theorem 5. Suppose x_t - x_s is a 3s-sparse vector. With a probability at least 1 - 6e^{-τ},

||[ U(U^T x_t - y) - (x_t - x_s) ]^s||_2 ≤ Λ + C √( (τ + s log d)/m ) ||x_t - x_s||_2

where Λ is given in (1).

Given a set S ⊆ [d], x_S denotes the vector which coincides with x on S and has zero coordinates outside S. We denote the sub-gradient of ||·||_1 by ∂||·||_1. Using the fact that

x_{t+1} = argmin_{x ∈ R^d} (1/2)||x - x_t + U(U^T x_t - y)||_2^2 + λ_t ||x||_1,

we have

0 ∈ x_{t+1} - x_t + U(U^T x_t - y) + λ_t ∂||x_{t+1}||_1,

and therefore

0 ≤ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_s - x_{t+1}⟩ + λ_t ||x_s||_1 - λ_t ||x_{t+1}||_1
  ≤ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_s - x_{t+1}⟩ + λ_t ||x_s||_1 - λ_t ||[x_{t+1}]_S||_1
  ≤ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_s - x_{t+1}⟩ + λ_t ||[x_{t+1} - x_s]_S||_1
  ≤ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_s - x_{t+1}⟩ + λ_t √s ||x_{t+1} - x_s||_2,

and thus

λ_t √s ||x_{t+1} - x_s||_2 ≥ ⟨x_{t+1} - x_t + U(U^T x_t - y), x_{t+1} - x_s⟩
  = ||x_{t+1} - x_s||_2^2 + ⟨x_{t+1} - x_s, U(U^T x_t - y) - (x_t - x_s)⟩.    (4)

According to Theorem 5, with a probability at least 1 - 6e^{-τ},

||[ U(U^T x_t - y) - (x_t - x_s) ]^s||_2 ≤ Λ + (1/6)||x_t - x_s||_2 ≤ λ_t √s,

where the first inequality follows from (2). The above inequality implies that the magnitude of the smallest d - s elements of x_t - U(U^T x_t - y) - x_s is smaller than λ_t. Combining this with the fact that x_{t+1} = P_{λ_t}(x_t - U(U^T x_t - y)), it is easy to verify that |S_{t+1} \ S| ≤ s. Furthermore,

⟨x_{t+1} - x_s, U(U^T x_t - y) - (x_t - x_s)⟩ ≤ ( ||[x_{t+1} - x_s]_S||_2 + ||[x_{t+1} - x_s]_{S_{t+1}\S}||_2 ) λ_t √s ≤ λ_t √(2s) ||x_{t+1} - x_s||_2.

Substituting the above inequality into (4), with a probability at least 1 - 6e^{-τ},

||x_{t+1} - x_s||_2^2 ≤ ( λ_t √s + λ_t √(2s) ) ||x_{t+1} - x_s||_2

and thus

||x_{t+1} - x_s||_2 ≤ (1 + √2) λ_t √s ≤ 3λ_{t+1} √s,

where the last step follows from the choice of γ.

4.3 Proof of Theorem 4

We need to reuse (4), which appears in the analysis of Theorem 3. According to Theorem 5, with a probability at least 1 - 6e^{-τ},

||[ U(U^T x_t - y) - (x_t - x_s) ]^s||_2 ≤ Λ + (1/6)||x_t - x_s||_2 ≤ 2Λ,

where the first inequality follows from (2). Notice that x_{t+1} - x_s is 3s-sparse in this case, and it is easy to verify that

⟨x_{t+1} - x_s, U(U^T x_t - y) - (x_t - x_s)⟩ ≤ 2√3 Λ ||x_{t+1} - x_s||_2.

Substituting the above inequality into (4),

||x_{t+1} - x_s||_2^2 ≤ ( λ_t √s + 2√3 Λ ) ||x_{t+1} - x_s||_2

and thus, since λ_t √s < 2Λ,

||x_{t+1} - x_s||_2 ≤ 2(1 + √3)Λ.

4.4 Proof of Theorem 5

In the analysis, we need to bound ||(UU^T z)^s||_2 for a fixed vector z ∈ R^d, and ||[(UU^T - I)z]^s||_2 for all 3s-sparse vectors z ∈ R^d. Thus, we build the following two theorems.

Theorem 6. For a fixed z ∈ R^d, with a probability at least 1 - 2e^{-τ},

||(UU^T z)^s||_2 ≤ C √( (τ + s log d)/m ) ||z||_2 + ||z^s||_2

for some constant C > 0.

Theorem 7. With a probability at least 1 - 2e^{-τ}, for all z ∈ R^d with ||z||_0 ≤ 3s,

||[ (UU^T - I) z ]^s||_2 ≤ C √( (τ + s log d)/m ) ||z||_2

for some constant C > 0.

We rewrite U(U^T x_t - y) - (x_t - x_s) as

U(U^T x_t - y) - (x_t - x_s) = U(U^T x_t - U^T x - e) - (x_t - x_s)
  = UU^T(x_s - x) + (UU^T - I)(x_t - x_s) - Ue

where we define w_a := UU^T(x_s - x), w_b := (UU^T - I)(x_t - x_s), and w_c := Ue. Then,

||[ U(U^T x_t - y) - (x_t - x_s) ]^s||_2 ≤ ||w_a^s||_2 + ||w_b^s||_2 + ||w_c^s||_2.

Bounding ||w_a^s||_2: According to Theorem 6, with a probability at least 1 - 2e^{-τ},

||w_a^s||_2 = ||[ UU^T(x_s - x) ]^s||_2 ≤ C √( (τ + s log d)/m ) ||x - x_s||_2 + ||(x - x_s)^s||_2

for some constant C > 0.

Bounding ||w_b^s||_2: Notice that x_t - x_s is a 3s-sparse vector. According to Theorem 7, with a probability at least 1 - 2e^{-τ},

||w_b^s||_2 = ||[ (UU^T - I)(x_t - x_s) ]^s||_2 ≤ C √( (τ + s log d)/m ) ||x_t - x_s||_2

for some constant C > 0.

Bounding ||w_c^s||_2: Since U = (1/√m)[v_1, ..., v_d]^T and we assume each v_j is a sub-Gaussian vector, we have ||v_j^T e||_{ψ_2} ≤ ||e||_2, j = 1, ..., d. Using the property of the Orlicz norm (Koltchinskii, 2009, 2011), with a probability at least 1 - 2e^{-τ},

|v_j^T e| ≤ ||v_j^T e||_{ψ_2} √τ ≤ ||e||_2 √τ.

By taking the union bound, with a probability at least 1 - 2e^{-τ},

||Ue||_∞ = (1/√m) max_j |v_j^T e| ≤ ||e||_2 √( (τ + log d)/m ),

implying

||w_c^s||_2 = ||(Ue)^s||_2 ≤ √( s(τ + log d)/m ) ||e||_2.

We complete the proof by combining the bounds for ||w_a^s||_2, ||w_b^s||_2, and ||w_c^s||_2.

4.5 Proof of Theorem 6

We define the set of s-sparse vectors with length smaller than 1 as

K_{d,s} = { x ∈ R^d : ||x||_2 ≤ 1, ||x||_0 ≤ s }.

Then, it is easy to check that

E_s(z) := max_{w ∈ K_{d,s}} w^T UU^T z = ||(UU^T z)^s||_2.

Let K_{d,s}(ε) be a proper ε-net for K_{d,s} with the smallest cardinality, and let N(K_{d,s}, ε) = |K_{d,s}(ε)| be the covering number of K_{d,s}. We have the following lemma for bounding N(K_{d,s}, ε) (Plan and Vershynin, 2013, Lemma 3.3).

Lemma 1. For ε ∈ (0, 1) and s ≤ d,

log N(K_{d,s}, ε) ≤ s log( 9d/(εs) ).

Using the ε-net K_{d,s}(ε), we define a discretized version of E_s(z) as

E_s(z, ε) = max_{w ∈ K_{d,s}(ε)} w^T UU^T z.

The following lemma relates E_s(z) to E_s(z, ε).

Lemma 2. For ε ∈ (0, 1/√2),

E_s(z) ≤ E_s(z, ε) / (1 - √2 ε).

Based on the conclusion of Lemma 2, it is sufficient to bound E_s(z, ε). To this end, we need the following lemma to bound the difference between w^T UU^T z and w^T z for a fixed w.

Lemma 3. For fixed w and z with ||w||_2 ≤ 1, with a probability at least 1 - 2e^{-τ},

| w^T UU^T z - w^T z | ≤ C √(τ/m) ||z||_2

for some constant C > 0.

By taking the union bound, with a probability at least 1 - 2e^{-τ},

max_{w ∈ K_{d,s}(ε)} | w^T UU^T z - w^T z | ≤ C √( (τ + s log(9d/[εs]))/m ) ||z||_2.

Since

max_{w ∈ K_{d,s}(ε)} w^T z ≤ max_{w ∈ K_{d,s}} w^T z = ||z^s||_2,

we have

E_s(z, ε) = max_{w ∈ K_{d,s}(ε)} w^T UU^T z
  ≤ max_{w ∈ K_{d,s}(ε)} | w^T UU^T z - w^T z | + max_{w ∈ K_{d,s}(ε)} w^T z
  ≤ C √( (τ + s log(9d/[εs]))/m ) ||z||_2 + ||z^s||_2.

We complete the proof by using Lemma 2 with ε = 1/2.

4.6 Proof of Theorem 7

Recall the definitions of K_{d,s} and K_{d,s}(ε) in the proof of Theorem 6. Evidently, for any z with ||z||_0 ≤ 3s, we have

||[ (UU^T - I) z ]^s||_2 ≤ G_{3s} ||z||_2,  where  G_{3s} = max_{r ∈ K_{d,3s}} ||[ (UU^T - I) r ]^s||_2.

Thus, what we need is an upper bound for G_{3s}. Define

F_s(r) = ||[ (UU^T - I) r ]^s||_2 = max_{w ∈ K_{d,s}} w^T (UU^T - I) r.

We first provide an upper bound for F_s(r), and then for G_{3s}. From the analysis of Theorem 6, with a probability at least 1 - 2e^{-τ},

F_s(r, ε) = max_{w ∈ K_{d,s}(ε)} w^T (UU^T - I) r ≤ C √( (τ + s log(9d/[εs]))/m ) ||r||_2

for some constant C > 0. Following the proof of Lemma 2, it is straightforward to show that

F_s(r) ≤ F_s(r, ε) / (1 - √2 ε),  for all ε ∈ (0, 1/√2).

Similar to the proof of Theorem 6, we can set ε = 1/2. Thus, for a fixed r, with a probability at least 1 - 2e^{-τ},

F_s(r) ≤ C √( (τ + s log(d/s))/m ) ||r||_2

for some constant C > 0. Next, we repeat the above argument to bound G_{3s}. We define a discretized version of G_{3s} as

G_{3s}(ε) = max_{r ∈ K_{d,3s}(ε)} F_s(r).

By taking the union bound, with a probability at least 1 - 2e^{-τ},

G_{3s}(ε) ≤ C √( (τ + 3s log(9d/[3εs]))/m )

for some constant C > 0. We complete the proof by using

G_{3s} ≤ G_{3s}(ε) / (1 - √2 ε),  for all ε ∈ (0, 1/√2).

5 Empirical Study

In this section, we perform several experiments to examine the recovery performance of our method. We first examine the ability of exact recovery in the noise-free setting. The target x is a 20-sparse vector that is generated randomly and normalized to unit length, and we construct the sensing matrix U as a Gaussian random matrix. Fig. 1 shows how the recovery error of our algorithm decreases with different choices of γ. It is clear that our algorithm achieves a linear convergence rate, and a smaller γ leads to a faster convergence rate.
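For readers who want to reproduce this kind of experiment, the following sketch (ours, with placeholder dimensions d = 1000 and m = 400) builds a unit-norm 20-sparse target and a Gaussian sensing matrix scaled by 1/√m, consistent with the assumptions in Section 3.2, and runs the homotopy sketch given earlier (simple_homotopy).

    import numpy as np

    rng = np.random.default_rng(0)
    d, m, s = 1000, 400, 20          # placeholder dimensions; s = 20 as in the text

    # 20-sparse target with random support and values, normalized to unit length
    x_true = np.zeros(d)
    support = rng.choice(d, size=s, replace=False)
    x_true[support] = rng.standard_normal(s)
    x_true /= np.linalg.norm(x_true)

    # Gaussian sensing matrix with isotropic columns, scaled by 1/sqrt(m)
    U = rng.standard_normal((d, m)) / np.sqrt(m)

    # noise-free measurements y = U^T x
    y = U.T @ x_true

    x_hat = simple_homotopy(U, y, s=s, gamma=0.92, T=500)
    print("recovery error:", np.linalg.norm(x_hat - x_true))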

Figure 1: x is sparse. Figure 2: x is dense. (Both figures plot the recovery error against the number of iterations for γ = 0.8, 0.92, 0.97, 0.99.)

Figure 3: x is dense and y contains noise. (Recovery error against the number of iterations for γ = 0.8, 0.92, 0.97, 0.99.) Figure 4: SHA versus PGH. (Recovery error against the number of updates for SHA with γ = 0.87, 0.89, 0.9, and for PGH.)

We then generate a unit-length vector x that has a power-law decay. Specifically, the i-th largest element of x is proportional to i^{-1}. Fig. 2 shows the recovery error for x_{20} without noise, and Fig. 3 shows the recovery error when the measurements are contaminated by Gaussian noise with ||e||_2 = 0.5. We observe that the recovery error still decreases rapidly until it reaches a certain precision, at which point the sparsity of the current solution exceeds 40 and our method terminates.

Finally, we compare our simple homotopy algorithm (SHA) with the proximal-gradient homotopy method (PGH). Following the setting in Xiao and Zhang (2012), we construct U whose entries are sampled independently from U(-√3, √3), i.e., the uniform distribution over [-√3, √3]; x ∈ R^5000 where ||x||_0 = 100 and the non-zero entries are sampled from U(-1, 1); and e ∈ R^1000 where the entries are sampled from U(-0.01, 0.01). Fig. 4 shows that our algorithm converges faster.

6 Conclusion and Future Work

In this paper, we provide a simple homotopy algorithm for CS that is more efficient and practical than its counterpart. Theoretical analysis shows that our algorithm has a linear convergence rate in reducing the recovery error, and that the recovery guarantee for x_s can be much tighter than previous results under appropriate conditions.

Notice that existing studies in CS provide matching lower and upper bounds for recovering the whole vector x (Donoho, 2006; Cohen et al., 2009). It is unclear to us what the lower bound would be if our goal is to recover x_s. We will investigate this issue in the future. We will also study how to apply our algorithm to one-bit compressive sensing (Boufounos and Baraniuk, 2008; Plan and Vershynin, 2013; Zhang et al., 2014), which is a new setting of CS where each measurement is quantized to a single bit.

Acknowledgements

This research was supported by the National Science Foundation of China, the Collaborative Innovation Center of Novel Software Technology and Industrialization, NSF IIS, and ONR Award N

References

T. Blumensath and M. E. Davies. Iterative thresholding for sparse approximations. Journal of Fourier Analysis and Applications, 14(5-6), 2008.
T. Blumensath, M. E. Davies, and G. Rilling. Greedy algorithms for compressed sensing. In Compressed Sensing: Theory and Applications, chapter 8. Cambridge University Press, 2012.
P. T. Boufounos and R. G. Baraniuk. 1-bit compressive sensing. In Proceedings of the 42nd Annual Conference on Information Sciences and Systems, pages 16-21, 2008.
E. J. Candès. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9-10), 2008.
E. J. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12), 2005.
E. J. Candès and T. Tao. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Transactions on Information Theory, 52(12), 2006.
A. Cohen, W. Dahmen, and R. DeVore. Compressed sensing and best k-term approximation. Journal of the American Mathematical Society, 22(1):211-231, 2009.
I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11), 2004.
M. A. Davenport, M. F. Duarte, Y. C. Eldar, and G. Kutyniok. Introduction to compressed sensing. In Compressed Sensing: Theory and Applications, chapter 1. Cambridge University Press, 2012.
R. DeVore, G. Petrova, and P. Wojtaszczyk. Instance-optimality in probability with an l_1-minimization decoder. Applied and Computational Harmonic Analysis, 27(3), 2009.
D. L. Donoho. De-noising by soft-thresholding. IEEE Transactions on Information Theory, 41(3), 1995.
D. L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4), 2006.
R. Garg and R. Khandekar. Gradient descent with sparsification: An iterative algorithm for sparse recovery with restricted isometry property. In Proceedings of the 26th Annual International Conference on Machine Learning, 2009.
S. Ji, Y. Xue, and L. Carin. Bayesian compressive sensing. IEEE Transactions on Signal Processing, 56(6), 2008.
V. Koltchinskii. The Dantzig selector and sparsity oracle inequalities. Bernoulli, 15(3), 2009.
V. Koltchinskii. Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems. Springer, 2011.
S. G. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41(12), 1993.
S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Uniform uncertainty principle for Bernoulli and subgaussian ensembles. Constructive Approximation, 28(3), 2008.
D. Needell and J. A. Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26(3), 2009.
Y. Nesterov. Gradient methods for minimizing composite functions. Mathematical Programming, 140(1), 2013.
Y. Plan and R. Vershynin. One-bit compressed sensing by linear programming. Communications on Pure and Applied Mathematics, 66(8), 2013.
R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 58(1), 1996.
J. A. Tropp. Just relax: convex programming methods for identifying sparse signals in noise. IEEE Transactions on Information Theory, 52(3), 2006.
R. Vershynin. Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing: Theory and Applications, chapter 5. Cambridge University Press, 2012.
P. Wojtaszczyk. Stability and instance optimality for Gaussian measurements in compressed sensing. Foundations of Computational Mathematics, 10(1), 2010.
L. Xiao and T. Zhang. A proximal-gradient homotopy method for the l_1-regularized least-squares problem. In Proceedings of the 29th International Conference on Machine Learning, 2012.
L. Zhang, J. Yi, and R. Jin. Efficient algorithms for robust one-bit compressive sensing. In Proceedings of the 31st International Conference on Machine Learning, 2014.
T. Zhang. Some sharp performance bounds for least squares regression with l_1 regularization. The Annals of Statistics, 37(5A), 2009.
