Hybrid Approximate Message Passing for Generalized Group Sparsity


Alyson K. Fletcher (a) and Sundeep Rangan (b)
(a) UC Santa Cruz, Santa Cruz, CA, USA; (b) NYU-Poly, Brooklyn, NY, USA

Further author information: (Send correspondence to S. Rangan.) A. K. Fletcher: afletcher@ucsc.edu. S. Rangan: srangan@poly.edu.

ABSTRACT

We consider the problem of estimating a group sparse vector x ∈ R^n under a generalized linear measurement model. Group sparsity of x means that the activity of different components of the vector occurs in groups, a feature common in estimation problems in image processing, simultaneous sparse approximation and feature selection with grouped variables. Unfortunately, many current group sparse estimation methods require that the groups are non-overlapping. This work considers problems with what we call generalized group sparsity, where the activities of the different components of x are modeled as functions of a small number of boolean latent variables. We show that this model can incorporate a large class of overlapping group sparse problems, including problems in sparse multivariable polynomial regression and gene expression analysis. To estimate vectors with such group sparse structures, the paper proposes to use a recently developed hybrid generalized approximate message passing (HyGAMP) method. Approximate message passing (AMP) refers to a class of algorithms based on Gaussian and quadratic approximations of loopy belief propagation for estimation of random vectors under linear measurements. The HyGAMP method extends the AMP framework to incorporate priors on x described by graphical models, of which generalized group sparsity is a special case. We show that the HyGAMP algorithm is computationally efficient, general, and offers superior performance in certain synthetic data test cases.

Keywords: Compressed sensing, group sparsity, message passing, graphical models, approximate message passing

1 INTRODUCTION

Sparsity-based estimation methods have become popular in a number of areas including inverse problems in image processing, statistical feature selection, dimensionality reduction and, most recently, compressed sensing [1-3]. One of the basic problems in sparse signal processing is to estimate a sparse vector x ∈ R^n from noisy linear measurements of the form

    y = z + w,   z = A x,                                              (1)

where A ∈ R^{m×n} is a known transform matrix and w is additive noise. More generally, one may also be interested in so-called generalized linear models [4, 5], where the output mapping from z to y is described by a general (possibly nonlinear) probabilistic transfer function P(y_i | z_i), so that

    y_i ~ P(y_i | z_i),   z = A x.                                     (2)

In either case, x being sparse means that the vector has few non-zero components, which reduces the effective degrees of freedom. The goal of sparse estimation is to exploit this property for improved estimation of x from the observations y.

Bayesian formulations of sparse estimation [6-8] typically model sparsity by imposing a prior on the vector x such that the marginal distributions P(x_j) are sparse, meaning that the components have a high probability of being zero or close to zero. In the simplest Bayesian models, the components x_j would be modeled as independent. However, in many practical problems, the sparsity patterns of the components have dependencies that impose further constraints on the signal, and these constraints can potentially be exploited in signal recovery.
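To make the measurement models (1) and (2) concrete, the following minimal Python sketch draws a sparse vector and passes it through a linear-Gaussian channel and a logistic channel. All dimensions, sparsity levels and noise values here are illustrative choices, not parameters taken from the paper.

```python
import numpy as np

# Illustrative sizes and noise level (not taken from the paper).
rng = np.random.default_rng(0)
n, m = 100, 50
A = rng.standard_normal((m, n)) / np.sqrt(m)                       # known transform matrix
x = np.where(rng.random(n) < 0.1, rng.standard_normal(n), 0.0)     # a sparse vector
z = A @ x

# Linear-Gaussian channel as in (1): y = z + w.
y_awgn = z + 0.1 * rng.standard_normal(m)

# Generalized linear channel as in (2): a logistic (binary) output, y_i ~ P(y_i | z_i).
p = 1.0 / (1.0 + np.exp(-z))
y_logistic = (rng.random(m) < p).astype(int)
```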

Recent years have thus seen considerable interest in finding suitable so-called structured sparse models that can capture a rich set of dependencies between components while enabling tractable estimation algorithms that can leverage that structure [9, 10].

One particularly simple model for structured sparsity is group sparsity (also sometimes called block sparsity) [11, 12]. In its simplest form, one is given K groups G_1, ..., G_K that form a disjoint partition of the indices {1, ..., n}, so that each component x_j belongs to exactly one of the K groups. In the group sparse model, all the components x_j within the same group G_k are zero (inactive) or non-zero (active) together. This group structure can be found in a number of applications including simultaneous sparse approximation [13, 14], model selection with grouped variables [11], canonical correlation analysis [15], image annotation [16] and image reconstruction [17].

Recovery of the vector x under a group sparse model is often performed with variants of traditional (i.e., non-group) sparse estimation methods. For example, group variants [11, 12] of the LASSO method [18] estimate x by solving a regularized least-squares problem with a mixed l_1-l_2 regularizer that promotes the group sparsity. Similar to the standard LASSO method, the resulting optimization problem is typically convex and can be solved via a number of fast methods, including [19-21]. Group variants [22, 23] of widely used matching pursuit methods [24] have also been successful. However, a key limitation of these approaches is that the groups typically must be disjoint, or non-overlapping. The treatment of overlapping groups generally requires some approximations and restrictions [14, 25-27]. We review some of these methods in more detail below.

In recent years, an alternate, and potentially more general, approach for handling structured sparse problems has been offered by so-called turbo and hybrid extensions of approximate message passing (AMP). AMP methods refer to a class of algorithms based on Gaussian and quadratic approximations of loopy belief propagation designed for the estimation of random vectors x from linear measurements [28-38]. AMP methods have attracted considerable recent attention in the context of compressed sensing due to their computational simplicity, generality and analytic tractability. Also, although traditional AMP methods generally require the vector x to have independent components, turbo and hybrid extensions [39-43] have been proposed that can incorporate priors on x described by arbitrary graphical models. These turbo and hybrid methods operate by combining AMP updates across the transform A with standard loopy belief propagation updates in the factor graph associated with the prior on the vector x. The methodology applies to a tremendous range of problems, and one particular version of these methods, called Hybrid Generalized Approximate Message Passing (HyGAMP) and described in [43], has been proposed for certain classes of group sparse problems.

The contribution of this paper is to extend and evaluate the HyGAMP methodology on a larger class of group sparsity problems that we call generalized group sparsity. In the proposed generalized group sparse model, the sparsity pattern on the n components of the vector x is modeled as a deterministic function of K independent boolean latent variables ξ_k, k = 1, ..., K, where K is typically smaller than n. The mapping between the latent variables ξ_k and the activities of the components x_j captures the correlations between components and can model a range of problems with overlapping groups.
We show that the proposed HyGAMP algorithm for generalized group sparsity offers a number of attractive features:

Generality: The model applies to structured sparse models with an arbitrary mapping between the latent variables and the activities of the components of x. For computational purposes, the only limitation is that the activity of each component should depend only on a small number of latent variables. We show that the model can incorporate a number of overlapping group sparse problems, including unions and intersections of groups, and applications including multivariable polynomial regression, sparse boolean regression and gene expression analysis (a short sketch of this generative model follows this list).

Support for generalized linear models: The HyGAMP framework is an extension of the Generalized Approximate Message Passing (GAMP) method in [37, 38, 44], which allows for generalized linear models (2). As a result, our methodology can support both group sparse classification (where the outputs y_i are discrete) as well as regression problems.

Computational simplicity: As described in [45], the GAMP method on which the HyGAMP algorithm is based is essentially a first-order algorithm with a per-iteration cost similar to the fastest known compressed sensing methods, such as inexact ADMM and iterative thresholding. In particular, each iteration of the approximate message passing updates requires only multiplications by A and A^T; no matrix inverses or vector-valued estimation updates are required. As we will see, the additional updates associated with the HyGAMP algorithm for group sparsity are typically small.

Performance: In Section 5, we test the algorithm on random instances of an overlapping group sparse problem. We show that the method outperforms a number of state-of-the-art techniques, including group variants of LASSO [11, 12].
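As a concrete illustration of the generative model just described, the sketch below draws the group activity indicators ξ_k, maps them through an activity function to the variable activity indicators u_j, and then draws x from a spike-and-slab conditional prior. The sizes, the choice of a logical-or activity function, and the N(0,1) slab are illustrative assumptions, not choices mandated by the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K, alpha, d = 12, 4, 0.3, 2          # illustrative sizes and activity level

# Each component j depends on a small subset gamma(j) of the K latent variables.
gamma = [rng.choice(K, size=d, replace=False) for _ in range(n)]

def f_or(xi_sub):
    # One possible activity function f_j: the union (logical or) of the selected groups.
    return int(np.any(xi_sub))

xi = (rng.random(K) < alpha).astype(int)                 # group activity indicators, eq. (5)
u = np.array([f_or(xi[gamma[j]]) for j in range(n)])     # variable activity indicators, eq. (4)

# Spike-and-slab conditional prior, eq. (3): x_j = 0 if u_j = 0, else x_j drawn from V_j
# (taken to be N(0,1) here for concreteness).
x = np.where(u == 1, rng.standard_normal(n), 0.0)
```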

Figure 1. Factor graph representation of the generalized group sparsity model. The problem is to estimate the components x_j of a random vector x observed through a known linear transform z = Ax followed by a componentwise, probabilistic measurement channel generating the observed variables y_i. The group sparse structure on the variables x_j is modeled through a set of boolean latent variables ξ_k, k = 1, ..., K. (Only the caption is reproduced here; the graph itself connects the factors f_j(ξ_γ(j)), P(x_j | u_j) and P(y_i | z_i) through the variables ξ_k, u_j, x_j, z_i and y_i.)

2 GENERALIZED GROUP SPARSITY

We consider the problem of estimating a vector x ∈ R^n from measurements y ∈ R^m under the factor graph model shown in Fig. 1; a general treatment of graphical models can be found in [46]. In the factor graph in Fig. 1, the variables x_j, j = 1, ..., n, are the components of the unknown random vector x ∈ R^n. To model the sparse structure of x, we assume that, corresponding to each component x_j, there is a boolean latent variable u_j ∈ {0, 1}, where u_j = 0 when x_j = 0 and u_j = 1 when x_j is potentially non-zero. When u_j = 1 we will say the component x_j is active, and we call the variables u_j the variable activity indicators. We assume that, given the vector u of variable activity indicators, the components of x are independent with the conditional distributions P(x_j | u_j) given by

    x_j = 0 if u_j = 0,   x_j ~ V_j if u_j = 1,                        (3)

where V_j is a random variable having the distribution of the component x_j in the event that the component is active. More general two-component mixture distributions can also easily be incorporated into this model.

As a simple Bayesian formulation of the standard (i.e., non-group) compressed sensing problem, one could assume that the variable activity indicators u_j are independent, with each variable having some small probability of being active (i.e., P(u_j = 1) is small). However, our goal in this work is to consider a more general class of group structured sparse problems. To this end, we assume that the variable activity indicators u_j are themselves functions of a second set of boolean latent variables ξ_k ∈ {0, 1}, k = 1, ..., K, where K is typically less than n. We assume that each variable activity indicator u_j is a deterministic function of the variables ξ_k of the form

    u_j = f_j(ξ_γ(j)),                                                 (4)

where γ(j) ⊆ {1, ..., K} is a subset of indices and ξ_γ(j) denotes the sub-vector of ξ with components ξ_k, k ∈ γ(j). We let G_k ⊆ {1, ..., n} be the group of indices j such that k ∈ γ(j). Thus G_k is the set of component indices j such that u_j is a function of ξ_k. The groups G_k may be overlapping. The group G_k will be called active or inactive depending on whether ξ_k = 1 or ξ_k = 0.

We will call the variables ξ_k the group activity indicators and model them as independent with

    P(ξ_k = 1) = 1 - P(ξ_k = 0) = α_k                                  (5)

for some activity level α_k ∈ (0, 1). We will see below that this latent variable model can incorporate a wide range of interesting overlapping group sparse structures.

Similar to the GAMP framework in [37, 38, 44], we assume a generalized linear measurement model (2), where the observed vector y is generated by first passing x through a linear transform z = Ax, followed by a separable, componentwise measurement channel with probability distribution functions P(y_i | z_i). The model is general and includes standard additive white Gaussian noise (AWGN) models such as (1), where w has independent Gaussian components, but can also incorporate nonlinearities or non-Gaussian randomness. In particular, the model can be used for classification problems where the observations are boolean (y_i ∈ {0, 1}) and the output transfer function P(y_i = 1 | z_i) is typically some sigmoidal function of z_i, such as a logistic or probit model [47]. In the context of this work, these output channels enable Bayesian approaches to the group sparse classification problems considered in [23, 48, 49]. Earlier work of ours has also used nonlinear outputs for Poisson spiking processes in neural recordings [50] and for quantized outputs [51].

Given the above description, the joint probability distribution function of the variables can be written as

    P(x, z, u, ξ, y) = 1_{z = Ax} \prod_{i=1}^m P(y_i | z_i) \prod_{j=1}^n P(x_j | u_j) \prod_{k=1}^K P(ξ_k) \prod_{j=1}^n 1_{u_j = f_j(ξ_γ(j))},    (6)

where the indicators 1_{z = Ax} and 1_{u_j = f_j(ξ_γ(j))} are used to constrain the random vectors to satisfy the constraints z = Ax in (2) and u_j = f_j(ξ_γ(j)) in (4).

Our goal in this work is to estimate the posterior marginals of the components of the transform inputs and outputs. That is, given a vector of observations y, we are interested in estimating the posterior distributions

    P(x_j | y),   P(z_i | y),                                          (7)

and possibly the posterior distributions on the activity indicators ξ_k and u_j as well. From these marginal distributions, one can compute a variety of quantities of interest, including the minimum mean squared error (MMSE) estimates E(x_j | y) and E(z_i | y), or optimal estimates with respect to any other loss function. In addition, given any estimate, the marginal distributions can be used to quantify the uncertainty through the distribution of the error.

Unfortunately, exact computation of the posterior marginals in (7) is in general intractable, since it involves marginalization of the joint distribution (6) across the group activity indicators ξ. Since there are 2^K possible values for ξ, the cost of computing the marginal distributions generally grows exponentially in the number of groups. The Hybrid-GAMP algorithm presented in Section 4 will provide a way to approximately compute these marginal distributions.
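The exponential cost of exact marginalization is easy to see even for the simplest quantity in the model. The sketch below computes the prior activity probability P(u_j = 1) for a single component by brute-force enumeration of all 2^K group configurations; the subset gamma_j and the or-type activity function are hypothetical examples chosen only to make the computation concrete.

```python
import itertools
import numpy as np

K, alpha = 4, 0.2                          # small, illustrative values
gamma_j = [0, 2]                           # a hypothetical subset gamma(j) for one component
f_j = lambda xi_sub: int(any(xi_sub))      # e.g. a union (logical or) activity function

# Exact prior activity probability P(u_j = 1) by enumerating all xi in {0,1}^K.
p_active = 0.0
for xi in itertools.product([0, 1], repeat=K):
    p_xi = np.prod([alpha if b else 1.0 - alpha for b in xi])   # independent Bernoulli prior, eq. (5)
    p_active += p_xi * f_j([xi[k] for k in gamma_j])
print(p_active)   # equals 1 - (1 - alpha)**len(gamma_j) for the or-function

# The loop visits 2**K configurations; posterior marginalization of (6) scales the same way,
# which is why exact inference quickly becomes intractable as K grows.
```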
3 GROUP SPARSE EXAMPLES

Before describing the HyGAMP algorithm, it is useful to illustrate the above generalized group sparse model with some motivating examples.

Group Sparsity with Disjoint Groups

We first consider the standard group sparse model with non-overlapping groups. This is the model used in most of the group or block sparse literature; see, for example, [11, 12]. In this problem, we are given groups G_1, ..., G_K that form a disjoint partition of the set {1, ..., n}, so that each component index j ∈ {1, ..., n} belongs to exactly one group γ(j). The variable x_j is non-zero (i.e., active) only when its group is active. Thus, all the components x_j with indices j belonging to the same group G_k are active or inactive together.

To model this scenario in the above formalism, we let u_j, j = 1, ..., n, represent the activity indicators for the variables x_j and let ξ_k, k = 1, ..., K, be the activity indicators for the groups G_k. Since each variable is active only when its group is active, we assume

    u_j = 1 when ξ_γ(j) = 1,   u_j = 0 when ξ_γ(j) = 0,

which clearly fits the form of (4). If we additionally assume that the group activity variables ξ_k are independent with probabilities of the form (5), and that the components of x are conditionally independent given u with the mixture distribution (3), we see that the non-overlapping group sparse structure on x can be modeled as a special case of the generalized group sparsity model in Section 2.

Unions and Intersections of Groups

The above example considers non-overlapping groups. A simple example of overlapping groups that can be easily handled in the generalized sparsity framework is an arbitrary union or intersection of groups. As before, suppose that there are K groups G_1, ..., G_K, with the activity of each component x_j depending on a subset γ(j) ⊆ {1, ..., K}. Now, suppose we take the activity functions u_j = f_j(·) to be the logical or function

    f_j(ξ_γ(j)) = f_or(ξ_γ(j)) = 1 if ξ_k = 1 for any k ∈ γ(j), and 0 otherwise,     (8)

or the logical and function

    f_j(ξ_γ(j)) = f_and(ξ_γ(j)) = 1 if ξ_k = 1 for all k ∈ γ(j), and 0 otherwise.     (9)

The or function corresponds to a union of groups, so that the component x_j is active if any of the groups it belongs to is active, while the and function corresponds to an intersection of groups, where x_j is active only if all the groups it belongs to are active.

One application of the union of groups model is gene pathway analysis [25, 52]. In this application, A ∈ R^{m×n} is a data matrix of m samples of expression levels on n genes. Each sample i is labeled with some target variable y_i, such as whether a cancer tested in that sample is metastatic or non-metastatic. The goal is to find a linear classification or regression model between the expression data and the target variables. Since only a few genes are likely to play a role, we expect that the regression coefficients are sparse. Moreover, it is known that genes typically operate together in functional groups, with each gene potentially belonging to multiple groups. Thus, it is desirable to explain the data with a minimal number of the functional groups. A union of groups model can be used to enforce precisely this form of sparsity.
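The sketch below makes the union and intersection activity functions (8) and (9) concrete for a small set of hypothetical, overlapping groups; the group memberships and the realization of ξ are invented for illustration only.

```python
import numpy as np

# Hypothetical (overlapping) group memberships G_k, given as index lists.
groups = {0: [0, 1, 2, 3], 1: [2, 3, 4, 5], 2: [5, 6, 7]}
n, K = 8, len(groups)
gamma = [[k for k, G in groups.items() if j in G] for j in range(n)]   # gamma(j), as in (4)

f_or  = lambda xi, g: int(any(xi[k] for k in g))    # union of groups, eq. (8)
f_and = lambda xi, g: int(all(xi[k] for k in g))    # intersection of groups, eq. (9)

xi = np.array([1, 0, 1])                            # one realization of the group indicators
u_union        = [f_or(xi, gamma[j]) for j in range(n)]
u_intersection = [f_and(xi, gamma[j]) for j in range(n)]

# With disjoint groups every gamma(j) is a single index, and both choices
# reduce to u_j = xi_{gamma(j)}, recovering the standard group sparse model.
```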

Sparse Multivariable Polynomial Regression

An example where intersecting groups arise is multivariable polynomial regression. Suppose we are given data {(y_i, v_i)}, i = 1, ..., m, where, for each data sample i, y_i is an (observed) target variable and v_i = (v_{i1}, ..., v_{iK}) is a vector of K covariates. Suppose that we wish to fit a multivariable polynomial model of the form

    y_i ~ P(y_i | z_i),   z_i = \sum_{j ∈ J(d)} w_j v_i^j,             (10)

where d = 0, 1, ... is the polynomial degree, J(d) is the set of generalized multivariable indices

    J(d) = { j : j_k ∈ {0, 1, ...}, k = 1, ..., K, \sum_{k=1}^K j_k ≤ d },

and v^j denotes the multivariable monomial term v^j := v_1^{j_1} ··· v_K^{j_K}. For example, when K = 2 and d = 2, the model (10) reduces to the quadratic polynomial

    z_i = w_{00} + w_{10} v_1 + w_{01} v_2 + w_{20} v_1^2 + w_{02} v_2^2 + w_{11} v_1 v_2.

In (10), P(y_i | z_i) is the conditional distribution of the target variable y_i given z_i. The observation model is general, so that we can incorporate both regression and classification problems. The problem is to estimate the regression coefficients w_j, j ∈ J(d). If we let w be the vector of regression coefficients w_j, then we can write z = Aw for a suitable data matrix A built out of the covariates v_i. The challenge is that the number of regression coefficients grows exponentially in the polynomial degree d: specifically, if n is the dimension of w, then n = O(K^d).

To reduce the dimensionality of the regression, it may be reasonable in some applications to assume that only a small number of the covariates v_{ik} have influence on the targets z_i. To model this assumption in the group sparse formalism described above, let ξ_k ∈ {0, 1} be a boolean variable indicating whether the k-th covariate v_{ik} has influence on the targets y_i, with ξ_k = 1 when the covariate is active and ξ_k = 0 when it is inactive. Similarly, for each generalized index j ∈ J(d), let u_j ∈ {0, 1} be the boolean variable with u_j = 0 when the coefficient w_j = 0 and u_j = 1 when the coefficient is possibly non-zero. Now, in the expansion (10), we assume that the coefficient w_j = 0 whenever any of the variables appearing with a non-zero exponent is inactive. We can write this as

    u_j = 1 if ξ_k = 1 for all k ∈ γ(j), and u_j = 0 if ξ_k = 0 for any k ∈ γ(j),   where γ(j) := { k : j_k > 0 }.

This mapping is precisely the logical and function in (9). If, as before, we assume that the group activity indicators ξ_k are independent and that the components of the vector x are conditionally independent given u, the problem follows the generalized group sparse model in Section 2. Note that the groups G_k in this example are the sets of generalized indices j such that j_k > 0. Thus, G_k is the set of monomial terms v^j = v_1^{j_1} ··· v_K^{j_K} with a non-zero dependence on the covariate v_k. These groups are, in general, overlapping, and thus this example provides a useful test case where the groups necessarily overlap.

Sparse Linear Boolean Regression

A closely related application is what we call sparse linear boolean regression. As above, suppose we are given data {(y_i, v_i)}, i = 1, ..., m, where for each data sample i, y_i is a target variable and v_i = (v_{i1}, ..., v_{iK}) is a vector of covariates. However, in this case, suppose that the covariates are boolean variables v_{ik} ∈ {0, 1}, and that we wish to fit a linear model of the form

    z_i = \sum_{j=1}^n w_j φ_j(v_{i,γ(j)}),                            (11)

where each function φ_j(v) is a boolean-valued function depending on a small subset of components γ(j) ⊆ {1, ..., K}. The weights w_j and the output z_i may be real-valued; thus, we are interested in fitting a real-valued function with discrete inputs. As one example, suppose that we take the functions φ_j(·), j = 1, ..., n, to be the set of all d-literal clauses over the boolean vector v. For example, when d = 3 and K = 5, the functions φ(v) would include three-literal conjunctions of (possibly negated) covariates such as v_1 ∧ v_4 ∧ v_5, v_2 ∧ v_3 ∧ v_4 and v_1 ∧ v_2 ∧ v_5, where ∧ denotes logical and and ¬ denotes logical negation. As in the previous example, the number of terms grows as n = O(K^d).
However, if we impose the sparsity constraint that only a small number of the boolean covariates v_{ik} can influence z_i, the number of non-zero coefficients w_j will be reduced, making the regression more tractable.
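As a concrete illustration of how the coefficient dimension and the overlapping groups arise in the polynomial regression example, the sketch below enumerates the multi-index set J(d), derives γ(j) from the non-zero exponents, and builds one row of the data matrix A from the monomials. The specific values of K, d and v are illustrative.

```python
import itertools
import numpy as np

K, d = 3, 2   # illustrative: three covariates, quadratic model

# Generalized multi-indices J(d): all j with nonnegative integer entries and j_1 + ... + j_K <= d.
J = [j for j in itertools.product(range(d + 1), repeat=K) if sum(j) <= d]

# gamma(j) = {k : j_k > 0}: the covariates the monomial v^j depends on.
# The group G_k is then the set of monomials with j_k > 0; these groups overlap.
gamma = [[k for k in range(K) if j[k] > 0] for j in J]

def design_row(v):
    """One row of the data matrix A: the monomials v^j = v_1^{j_1} ... v_K^{j_K}, j in J(d)."""
    v = np.asarray(v, dtype=float)
    return np.array([np.prod(v ** np.array(j)) for j in J])

v = [0.5, -1.0, 2.0]
print(len(J))          # n = O(K^d) regression coefficients; 10 monomials for K = 3, d = 2
print(design_row(v))
```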

4 HY-GAMP FOR GENERALIZED GROUP SPARSITY

Having presented some examples, we can now describe the HyGAMP algorithm [53] and its application to generalized group sparsity. Given the separable structure of the joint distribution (6), graphical-model methods [46] provide a natural approach to estimating marginal distributions such as (7). As we mentioned above, exact computation of the marginal distributions is generally intractable. Traditional graphical model techniques such as loopy belief propagation (loopy BP) attempt to reduce such inherently high-dimensional, vector-valued estimation problems on a factorizable distribution such as (6) to a sequence of lower-dimensional problems associated only with the variables in each factor. However, for standard loopy BP to be successful, each factor must depend only on a small number of the variables. Unfortunately, for the distribution (6), this property holds only when the constraint matrix A is sparse. While such sparsity occurs in problems such as low-density parity check codes [54], the transforms A arising in imaging and regression problems are often dense.

Approximate message passing (AMP) refers to a class of Gaussian and quadratic approximations of loopy BP that can be applied to dense A. AMP approximations of loopy BP originated in CDMA multiuser detection problems [28-30] and have received considerable recent attention in the context of compressed sensing [31-36, 38]. The Gaussian approximations used in AMP are also closely related to expectation propagation techniques [55, 56]. These algorithms have been particularly attractive since they are general, computationally extremely simple and, for certain large random problem instances, admit precise analyses with testable conditions for optimality, even when the problems are non-convex. In addition, the methods can be easily integrated with EM techniques when the distribution parameters are unknown [57-61].

The standard AMP algorithm considers estimation of a vector x with independent components. To model dependencies, the works [39-43] proposed various turbo and hybrid extensions of the AMP algorithms, analogous to the turbo procedures used in conjunction with LDPC codes. Specifically, correlations between components of the vector x are modeled through a prior P(x) described by a graphical model; the graphical model may contain other latent variables in the distribution. The turbo and hybrid AMP methods then use standard loopy belief propagation updates in the graphical model associated with the prior of x, while using approximate message passing across the transform z = Ax. One particular version of these algorithms is the so-called Hybrid Generalized Approximate Message Passing (HyGAMP) method of [43]. HyGAMP enables extensions of the AMP methods applicable to priors on x described by arbitrary graphical models, of which generalized group sparsity is a special case.

Algorithm 1 below shows the steps of the HyGAMP algorithm in the case of generalized group sparsity; it is very similar to a slightly more restricted group sparse method presented in [43]. A detailed derivation can be performed along the lines of the methods in [43]; here, we provide only a brief qualitative description. As in the HyGAMP group sparse algorithm in [43], Algorithm 1 is run in a sequence of iterations. Each iteration t of the main repeat-until loop has two parts. The first half-iteration is the GAMP update part, which generates quantities x̂_j(t), ẑ_i(t), τ_j^x(t) and τ_i^z(t) representing estimates of the posterior means and variances of the unknown variables x_j and z_i. This update is based on the standard GAMP algorithm in [37] and uses, as an input, the parameter ρ_j(t) representing the current estimate of the posterior probability P(u_j = 1 | y).
The second half-iteration is the sparsity update part, which updates the estimates ρ_j(t) of the posterior probabilities P(u_j = 1 | y) of the variable activity indicators.

The original GAMP paper [43] describes the equations for the GAMP update part of Algorithm 1 in more detail and derives the updates based on certain Gaussian and quadratic approximations of sum-product loopy BP. As in the standard GAMP method [38], the GAMP update part of Algorithm 1 is based on solving certain scalar AWGN estimation problems on the variables x_j and z_i.

Algorithm 1: Hybrid-GAMP for Generalized Group Sparsity
1: {Initialization}
2: t ← 0
3: initialize τ_j^r(t-1)
4: α_{j,k}(t) ← α_k
5: repeat
6:   {Basic GAMP update}
7:   ρ_j(t) ← P(f_j(ξ_γ(j)) = 1) with the ξ_k independent and P(ξ_k = 1) = α_{j,k}(t) for all k ∈ γ(j)
8:   x̂_j(t) ← E(x_j | r̂_j(t-1), τ_j^r(t-1), ρ_j = ρ_j(t))
9:   τ_j^x(t) ← var(x_j | r̂_j(t-1), τ_j^r(t-1), ρ_j = ρ_j(t))
10:  τ_i^p(t) ← Σ_j |A_ij|^2 τ_j^x(t)
11:  p̂_i(t) ← Σ_j A_ij x̂_j(t) - τ_i^p(t) ŝ_i(t-1)
12:  ẑ_i(t) ← E(z_i | p̂_i(t), τ_i^p(t))
13:  τ_i^z(t) ← var(z_i | p̂_i(t), τ_i^p(t))
14:  ŝ_i(t) ← (ẑ_i(t) - p̂_i(t)) / τ_i^p(t)
15:  τ_i^s(t) ← (1 - τ_i^z(t)/τ_i^p(t)) / τ_i^p(t)
16:  τ_j^r(t) ← 1 / ( Σ_i |A_ij|^2 τ_i^s(t) )
17:  r̂_j(t) ← x̂_j(t) + τ_j^r(t) Σ_i A_ij ŝ_i(t)
18:  {Sparsity update}
19:  ρ_{j→k}(t, 0) ← P(f_j(ξ_γ(j)) = 1 | ξ_k = 0) with the ξ_l independent and P(ξ_l = 1) = α_{j,l}(t) for all l ∈ γ(j), l ≠ k
20:  ρ_{j→k}(t, 1) ← P(f_j(ξ_γ(j)) = 1 | ξ_k = 1) with the ξ_l independent and P(ξ_l = 1) = α_{j,l}(t) for all l ∈ γ(j), l ≠ k
21:  LLR_{j→k}(t) ← log p(r̂_j(t); τ_j^r(t), ρ_{j→k}(t, 1)) - log p(r̂_j(t); τ_j^r(t), ρ_{j→k}(t, 0))
22:  LLR_{k→j}(t) ← log(α_k / (1 - α_k)) + Σ_{i ∈ G_k, i ≠ j} LLR_{i→k}(t)
23:  α_{j,k}(t+1) ← 1 / (1 + exp(-LLR_{k→j}(t)))
24:  t ← t + 1
25: until Terminate

Specifically, lines 8 and 9 compute the mean and variance estimates x̂_j(t) and τ_j^x(t). We use the notation E(x_j | r̂_j, τ_j^r, ρ_j) and var(x_j | r̂_j, τ_j^r, ρ_j) to denote the expectation and variance with respect to the distribution P(x_j | r̂_j, τ_j^r, ρ_j), defined as the posterior distribution of the scalar variable x_j with activity probability ρ_j, observed through an AWGN measurement r̂_j of the form

    r̂_j = x_j + w_j,   w_j ~ N(0, τ_j^r),   x_j = V_j with probability ρ_j, x_j = 0 with probability 1 - ρ_j.     (12)

This density also provides the estimate of the posterior marginal distribution, in that we take P(x_j | y) ≈ P(x_j | r̂_j = r̂_j(t), τ_j^r(t), ρ_j(t)). Similarly, lines 12 and 13 compute the output mean and variance estimates ẑ_i(t) and τ_i^z(t). We use E(z_i | p̂_i, τ_i^p) and var(z_i | p̂_i, τ_i^p) to denote the mean and variance with respect to a distribution P(z_i | y_i, p̂_i, τ_i^p), defined as the posterior distribution of a Gaussian variable z_i observed through the measurement y_i as

    y_i ~ P(y_i | z_i),   z_i ~ N(p̂_i, τ_i^p).                         (13)

This distribution also provides the estimate of the posterior marginal for z_i, in that we take P(z_i | y) ≈ P(z_i | y_i, p̂_i(t), τ_i^p(t)).
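To illustrate the scalar AWGN problems solved in lines 8 and 9, the sketch below evaluates the posterior mean and variance for the model (12) when the slab V_j is assumed Gaussian; the Gaussian slab and its variance are assumptions made here for concreteness, and other slab distributions would lead to different (possibly numerically evaluated) expressions.

```python
import numpy as np

def gauss_pdf(r, var):
    return np.exp(-r**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def bg_denoiser(r, tau_r, rho, sigma2_x=1.0):
    """Posterior mean and variance of x under the Bernoulli-Gaussian version of (12):
       x = 0 w.p. 1-rho, x ~ N(0, sigma2_x) w.p. rho, observed as r = x + N(0, tau_r).
       A sketch of the scalar problems in lines 8-9 of Algorithm 1 (Gaussian slab assumed)."""
    # Posterior probability that the component is active.
    num = rho * gauss_pdf(r, sigma2_x + tau_r)
    den = num + (1 - rho) * gauss_pdf(r, tau_r)
    pi = num / den
    # Conditional (active) posterior of x is Gaussian with these moments.
    m1 = r * sigma2_x / (sigma2_x + tau_r)
    v1 = sigma2_x * tau_r / (sigma2_x + tau_r)
    x_hat = pi * m1
    tau_x = pi * (v1 + m1**2) - x_hat**2
    return x_hat, tau_x

print(bg_denoiser(r=np.array([0.1, 2.0]), tau_r=0.05, rho=0.2))
```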

The second half of the iteration, labeled the sparsity update, updates the parameters ρ_j(t), which are the estimates of the posterior probabilities P(u_j = 1 | y) of the variable activity indicators. As described in [43], this part of the algorithm is the standard loopy belief propagation update applied to the portion of the factor graph to the left of the variables u_j in Fig. 1. Specifically, the quantities ρ_j(t), ρ_{j→k}(t, 0) and ρ_{j→k}(t, 1) can be interpreted, respectively, as estimates of the posterior probabilities

    ρ_j = Pr(u_j = 1 | y),   ρ_{j→k}(0) = Pr(u_j = 1 | y, ξ_k = 0),   ρ_{j→k}(1) = Pr(u_j = 1 | y, ξ_k = 1).

Similarly, the quantities α_{j,k}(t) are estimates of the posterior probabilities P(ξ_k = 1 | y) of the group activity indicators, and LLR_{j→k}(t) and LLR_{k→j}(t) represent estimates of the corresponding log-likelihood ratios

    LLR_k = log P(ξ_k = 1 | y) - log P(ξ_k = 0 | y).

Initially, in line 4, the algorithm sets α_{j,k}(t) to the prior probabilities α_k for all j. In each iteration, the posterior probabilities are then updated with the standard loopy BP procedure. In line 21, p(r̂_j; τ_j^r, ρ_j) is the probability density function of the scalar random variable r̂_j in (12), assuming a prior activity probability P(u_j = 1) = 1 - P(u_j = 0) = ρ_j.

It should be pointed out that the HyGAMP methodology of [53] also provides a systematic methodology for approximately computing the maximum a posteriori (MAP) estimate

    (x̂, ẑ) := arg max_{x, z} P(x, z | y).                              (14)

However, for space considerations, in this work we only consider the estimation of the marginal distributions.

Computational Complexity

One of the attractive features of the HyGAMP generalized group sparsity algorithm is its computational simplicity. The GAMP update part of each iteration involves evaluating n scalar AWGN problems associated with the variables x_j (lines 8 and 9); m scalar AWGN problems associated with the variables z_i (lines 12 and 13); and multiplications by A and by the matrix of squared entries |A_ij|^2, along with their transposes. In many cases, the scalar AWGN problems have closed-form solutions, even when P(x_j | u_j) or P(y_i | z_i) is non-Gaussian. When closed-form solutions are not available, the expectations and variances can be computed via one- or two-dimensional numerical integration. Thus the per-iteration cost of the scalar AWGN problems is O(m + n). Hence, the dominant per-iteration cost of the GAMP update is the multiplication by the matrices A and |A_ij|^2, which is O(mn) in the worst case. This per-iteration cost is similar to most first-order methods for compressed sensing, including iterative thresholding and ADMM.

The cost of the sparsity update part of the iteration depends on the complexity of the functions f_j(ξ_γ(j)), but it is often small. For example, suppose f_j(ξ_γ(j)) represents a logical and operation (9). Then lines 7, 19 and 20 reduce to

    ρ_j(t) = \prod_{k ∈ γ(j)} α_{j,k}(t),   ρ_{j→k}(t, 1) = ρ_j(t) / α_{j,k}(t),   ρ_{j→k}(t, 0) = 0.

Thus, if each subset γ(j) has cardinality d, all the terms ρ_{j→k}(t) can be computed in O(nd) operations. A similar expression is possible when f_j(ξ_γ(j)) represents a logical or operation (8). Of course, for general functions f_j(·), there may not be a simple expression for computing the ρ terms. However, even in the worst case, the updates of all the ρ_{j→k}(t) terms would require O(n 2^d) operations. Thus, as long as d is small (i.e., each of the variable indicators u_j depends on only a small number d of the K groups), the computation will be tractable.
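The sketch below renders the sparsity-update half-iteration for the logical-and case described above. The data structures (γ(j) as lists, per-edge messages α_{j,k} as dictionaries) and the Gaussian slab inside the marginal likelihood are choices made here for illustration; the paper itself does not prescribe an implementation.

```python
import numpy as np

def gauss_pdf(r, var):
    return np.exp(-r**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def marg_lik(r, tau_r, rho, sigma2_x=1.0):
    """p(r; tau_r, rho) in line 21: marginal density of r in (12) with activity probability rho,
       assuming a Gaussian slab V ~ N(0, sigma2_x)."""
    return rho * gauss_pdf(r, sigma2_x + tau_r) + (1 - rho) * gauss_pdf(r, tau_r)

def sparsity_update_and(gamma, G, alpha_msg, r_hat, tau_r, alpha_prior):
    """One sparsity-update half-iteration for a logical-and activity function (9),
       using the closed forms of lines 7, 19 and 20 given in the text (sketch)."""
    n, K = len(gamma), len(G)
    llr_in = {}                                           # LLR_{j->k}(t), line 21
    for j in range(n):
        rho_j = np.prod([alpha_msg[j][k] for k in gamma[j]])
        for k in gamma[j]:
            rho_1 = rho_j / alpha_msg[j][k]               # line 20 (and-case)
            rho_0 = 0.0                                   # line 19 (and-case)
            llr_in[(j, k)] = (np.log(marg_lik(r_hat[j], tau_r[j], rho_1))
                              - np.log(marg_lik(r_hat[j], tau_r[j], rho_0)))
    new_alpha = {j: dict(alpha_msg[j]) for j in range(n)}
    for k in range(K):
        for j in G[k]:
            llr_out = (np.log(alpha_prior[k] / (1 - alpha_prior[k]))
                       + sum(llr_in[(i, k)] for i in G[k] if i != j))   # line 22
            new_alpha[j][k] = 1.0 / (1.0 + np.exp(-llr_out))            # line 23
    return new_alpha
```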

5 NUMERICAL EXAMPLE

To evaluate the HyGAMP methodology, we measured the algorithm's performance on a large number of random instances of a sparsity recovery problem with overlapping groups. In each random instance of the problem, the vector x ∈ R^n was generated with i.i.d. Gauss-Bernoulli components

    x_j = 0 if u_j = 0,   x_j ~ N(0, 1) if u_j = 1.                    (15)

Figure 2. Comparison of the average performance (normalized MSE in dB versus the number of measurements m) of the LMMSE, LASSO, GLASSO, GLASSO-LAT, GAMP and HyGAMP estimators on random instances of an overlapping group sparsity problem with K = 20 groups for a vector of dimension n = 100. The groups are active with probability α = 0.1 and the activity of each of the n = 100 components depends on a random d = 2 of the K = 20 groups.

The vector dimension was set to n = 100 and we assumed K = 20 groups with i.i.d. group activity indicators ξ_k, k = 1, ..., K. For the variable activity indicators, the sets γ(j) were generated by randomly selecting d = 2 of the K groups for each component u_j. The activity function used a logical or operation, so that u_j = 1 if and only if ξ_k = 1 for some k ∈ γ(j). Recall that the estimator knows the sets γ(j) and all other statistics; only the particular realization of ξ and x is unknown.

We set each of the groups to be active with probability α = 0.1, so that the components x_j were active with probability 1 - (1 - α)^d ≈ 0.19. Thus, an estimator that does not exploit the group structure must identify, on average, 19 out of the 100 components of x. However, since the groups are active only with probability 0.1, using the group structure requires, on average, the identification of only 2 of the 20 groups. Hence, the example provides a test case where the correlated group structure can significantly reduce the effective degrees of freedom. For the measurement matrix, we used a zero-mean i.i.d. Gaussian matrix A ∈ R^{m×n}, varying the number of measurements m from 10 to 200. The measurement vector y was generated by an AWGN measurement channel (1) with SNR = 30 dB.

Fig. 2 compares the performance of the proposed HyGAMP methodology with several other common algorithms. For each method and number of measurements m, we generated 500 random Monte Carlo instances of the problem and measured the average normalized mean squared error

    Normalized MSE = 10 log_10 ( E ||x̂ - x||_2^2 / E ||x||_2^2 ),

where x̂ is the estimated vector and the expectations in the numerator and denominator are taken over the 500 Monte Carlo trials. The details of the algorithms shown in Fig. 2 are as follows.

LMMSE is the simple linear minimum mean squared error estimator. This method does not exploit any sparsity, and its performance, as expected, is poor.

The LASSO method [18] is a standard algorithm for sparse recovery and finds an estimate by solving the optimization

    x̂ = arg min_x (1/2) ||y - Ax||_2^2 + γ \sum_{j=1}^n |x_j|,          (16)

for some regularization parameter γ > 0. The regularization trades off the sparsity against the prediction error. To provide the most favorable case for the LASSO method, we used an oracle method for selecting γ: for each number of measurements m, various γ values were tested and we selected the value of γ that resulted in the lowest MSE. The LASSO method is well known to exploit the sparse structure of the signal well, but not the group sparse structure. It thus shows a marked improvement in performance over the simple LMMSE method, but still does not perform as well as the HyGAMP algorithm.
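The sketch below generates one random instance of the test problem described above and evaluates the normalized MSE metric. It follows the stated setup (n = 100, K = 20, d = 2 random groups per component, logical-or activity, α = 0.1, i.i.d. Gaussian A, 30 dB AWGN), but the column scaling of A and the empirical SNR convention used to set the noise variance are our assumptions; the paper only states a zero-mean i.i.d. Gaussian matrix. The paper also averages the numerator and denominator of the NMSE over the Monte Carlo trials before taking the ratio.

```python
import numpy as np

def gen_instance(m, n=100, K=20, d=2, alpha=0.1, snr_db=30, rng=None):
    """One random instance of the overlapping group-sparse test problem of Section 5
       (data generation only; a sketch under the stated assumptions)."""
    rng = rng or np.random.default_rng()
    gamma = [rng.choice(K, size=d, replace=False) for _ in range(n)]   # d random groups per component
    xi = (rng.random(K) < alpha).astype(int)                           # group activity indicators
    u = np.array([int(xi[g].any()) for g in gamma])                    # logical-or activity
    x = np.where(u == 1, rng.standard_normal(n), 0.0)                  # Gauss-Bernoulli components (15)
    A = rng.standard_normal((m, n)) / np.sqrt(m)                       # i.i.d. Gaussian A (scaling is our choice)
    z = A @ x
    noise_var = max(np.mean(z**2), 1e-12) * 10 ** (-snr_db / 10)       # one common SNR convention
    y = z + np.sqrt(noise_var) * rng.standard_normal(m)
    return A, y, x, gamma

def nmse_db(x_hat, x):
    """Per-trial normalized MSE in dB; the paper averages over 500 trials."""
    return 10 * np.log10(np.sum((x_hat - x) ** 2) / np.sum(x ** 2))
```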

To incorporate group structure, one often uses a group LASSO method [11, 12], where the estimate is given by the solution of the mixed l_1-l_2 optimization

    x̂ = arg min_x (1/2) ||y - Ax||_2^2 + γ \sum_{k=1}^K ||x_{G_k}||_2,   (17)

where x_{G_k} is the subvector of x with components in G_k. While this method works well for disjoint groups or intersecting groups, it is well known to be problematic for unions of overlapping groups [25]: the reason is that if any x_j is to be non-zero, all the groups k with j ∈ G_k must be made active. This behavior is undesirable when the activity of a variable requires only one of the groups in γ(j) to be active. The performance of the group LASSO estimate (17) is plotted in Fig. 2 in the curve labeled GLASSO. We see that, due to this overlap problem, group LASSO does not even outperform the standard (non-group) LASSO.

To improve the performance of group LASSO with union overlaps, [25] proposed to transform the problem so that the vector x is replaced with a new vector of latent variables. In the transformed domain, the dimension of the new vector is larger, but there are no overlapping groups. One can then apply the group LASSO estimator (17) to the transformed problem. The performance of this group LASSO method with latent variables is shown in Fig. 2 in the curve labeled GLASSO-LAT. We see that, as predicted in [25], the method offers a significant improvement over group LASSO, particularly when the data is undersampled (i.e., m < n). But, for most values of the measurement number, HyGAMP offers a significant improvement over the group LASSO with latent variables. The improved performance of the HyGAMP method is likely due to the inherent dimension expansion required by latent variable group LASSO that is not required by HyGAMP. In addition, since all the LASSO methods are based on an l_1 penalty, they introduce a small bias through the implicit soft thresholding: they are unable to match the oracle estimator even when the correct sparsity pattern is detected (see [62]). This bias may explain some of the gap with HyGAMP at large m. We did not test any de-biasing methods.

Finally, the curve labeled GAMP in Fig. 2 is the standard GAMP algorithm from [37] with i.i.d. priors. This algorithm does not exploit the group sparsity, and thus also performs worse than the HyGAMP method. It should be noted that we did not test the group OMP algorithms described in [22, 23], since there is no obvious way to incorporate overlapping groups into those methods.

In conclusion, we see that the HyGAMP method outperforms all the tested methods over a large range of measurement numbers. In some cases, the performance improvement is significant. Moreover, the algorithm closest to HyGAMP in performance, the latent variable group LASSO method of [25], is specifically constructed for overlapping groups where the variable activities are a logical or of the group activities; the HyGAMP method here is more general. Also, the GAMP and HyGAMP curves plotted in Fig. 2 were based on running the algorithm for only 20 iterations, which we observed to be sufficient to obtain convergence to within less than 0.1 dB. Thus, the HyGAMP method is also computationally extremely fast for this test case.
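As a concrete reference for the LASSO baseline (16) and the oracle selection of γ described above, the sketch below uses a plain ISTA solver and a simple oracle tuning loop. The paper does not state which solver was used; ISTA is chosen here only because it is short and self-contained.

```python
import numpy as np

def ista_lasso(A, y, gam, n_iter=500):
    """Plain ISTA for the LASSO problem (16); a simple reference baseline, not the
       solver used in the paper."""
    L = np.linalg.norm(A, 2) ** 2                      # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = x + A.T @ (y - A @ x) / L                  # gradient step on the quadratic term
        x = np.sign(g) * np.maximum(np.abs(g) - gam / L, 0.0)   # soft threshold
    return x

def oracle_gamma_lasso(A, y, x_true, gammas):
    """Oracle tuning as described in the text: try several gamma values and keep the one
       with the lowest MSE against the true vector (only possible in simulation)."""
    best = min((np.sum((ista_lasso(A, y, g) - x_true) ** 2), g) for g in gammas)
    return best[1]
```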
6 CONCLUSIONS

Turbo and hybrid extensions of approximate message passing methods provide a promising, systematic and general framework for a large class of structured sparsity problems. The techniques capture the modularity of graphical models along with the computational simplicity of approximate message passing. In this work, we have shown that the Hybrid Generalized Approximate Message Passing (HyGAMP) method of [53], in particular, can incorporate very general forms of group sparsity in a computationally efficient manner. On synthetic test cases, our simulations illustrate that the HyGAMP methodology can outperform state-of-the-art methods while being more general. Nevertheless, much remains to be understood about these methods. Most importantly, our current results are based entirely on simulations, since there are currently no results that quantitatively describe the behavior of AMP-like methods used in conjunction with turbo updates. However, recent work [59-61] has provided methods for analyzing the behavior of AMP combined with EM updates, and an interesting avenue of future work is to see whether these techniques extend to turbo and hybrid AMP methods.

In addition, even without the turbo and hybrid extensions, much of the analysis of AMP applies only to certain large random i.i.d. matrices, for which the algorithms exhibit extremely good performance; see, for example, the state evolution analyses in [28, 30-38]. However, many of the matrices arising in practical imaging and regression problems are not well modeled as realizations of such i.i.d. matrices. The behavior of AMP is not well understood in these cases, and it is known, in fact, that the algorithm may perform poorly or even diverge. A central research challenge for both the AMP methods and their turbo and hybrid extensions is to understand what modifications are necessary so that the benefits of these methods can be realized in a broader class of practical problems.

REFERENCES

[1] E. J. Candès, J. Romberg, and T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inform. Theory 52, Feb.
[2] D. L. Donoho, Compressed sensing, IEEE Trans. Inform. Theory 52, Apr.
[3] E. J. Candès and T. Tao, Near-optimal signal recovery from random projections: Universal encoding strategies?, IEEE Trans. Inform. Theory 52, Dec.
[4] J. A. Nelder and R. W. M. Wedderburn, Generalized linear models, J. Royal Stat. Soc. Series A 135.
[5] P. McCullagh and J. A. Nelder, Generalized Linear Models, Chapman & Hall, 2nd ed.
[6] D. Wipf and B. Rao, Sparse Bayesian learning for basis selection, IEEE Trans. Signal Process. 52, Aug.
[7] S. Ji, Y. Xue, and L. Carin, Bayesian compressive sensing, IEEE Trans. Signal Process. 56, June.
[8] V. Cevher, Learning with compressible priors, in Proc. NIPS, (Vancouver, BC), Dec.
[9] R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde, Model-based compressed sensing, IEEE Trans. Inform. Theory 56, Apr.
[10] M. Duarte and Y. Eldar, Structured compressed sensing: From theory to applications, IEEE Trans. Signal Process. 59(9).
[11] M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, J. Royal Statist. Soc. 68, pp. 49-67.
[12] P. Zhao, G. Rocha, and B. Yu, The composite absolute penalties family for grouped and hierarchical variable selection, Ann. Stat. 37(6).
[13] D. P. Wipf and B. Rao, An empirical Bayesian strategy for solving the simultaneous sparse approximation problem, IEEE Trans. Signal Process. 55, July.
[14] F. R. Bach, Consistency of the group lasso and multiple kernel learning, J. Machine Learn. Res. 9.
[15] S. Virtanen, A. Klami, and S. Kaski, Bayesian CCA via group sparsity, in Proc. ICML, June.
[16] S. Zhang, J. Huang, Y. Huang, Y. Yu, H. Li, and D. N. Metaxas, Automatic image annotation using group sparsity, in Proc. Conf. on Computer Vision and Pattern Recognition (CVPR).
[17] A. Majumdar and R. K. Ward, Compressive color imaging with group-sparsity on analysis prior, in Proc. Conf. on Image Processing.
[18] R. Tibshirani, Regression shrinkage and selection via the lasso, J. Royal Stat. Soc., Ser. B 58(1).
[19] M. Figueiredo, S. J. Wright, and R. D. Nowak, Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems, IEEE J. Sel. Topics Signal Process. 1, Dec.
[20] S. Kim, K. Koh, M. Lustig, S. Boyd, and D. Gorinvesky, An interior-point method for large-scale l_1-regularized least squares, IEEE J. Sel. Topics Signal Process. 1, Dec.
[21] S. J. Wright, R. D. Nowak, and M. Figueiredo, Sparse reconstruction by separable approximation, IEEE Trans. Signal Process. 57, July.
[22] A. C. Lozano, G. Świrszcz, and N. Abe, Group orthogonal matching pursuit for variable selection and prediction, in Proc. Neural Information Process. Syst., (Vancouver, Canada), Dec. 2008.

[23] A. C. Lozano, G. Świrszcz, and N. Abe, Group orthogonal matching pursuit for logistic regression, J. Machine Learning Res. 15.
[24] S. S. Chen, D. L. Donoho, and M. A. Saunders, Atomic decomposition by basis pursuit, SIAM Rev. 43(1).
[25] L. Jacob, G. Obozinski, and J.-P. Vert, Group lasso with overlap and graph lasso, in Proc. International Conf. Machine Learning (ICML).
[26] N. S. Rao, R. D. Nowak, S. J. Wright, and N. G. Kingsbury, Convex approaches to model wavelet sparsity patterns, arXiv preprint [cs.CV], Apr.
[27] G. Peyré and J. Fadili, Group sparsity with overlapping partition functions, in Proc. EUSIPCO 2011.
[28] J. Boutros and G. Caire, Iterative multiuser joint decoding: Unified framework and asymptotic analysis, IEEE Trans. Inform. Theory 48, July.
[29] T. Tanaka and M. Okada, Approximate belief propagation, density evolution, and neurodynamics for CDMA multiuser detection, IEEE Trans. Inform. Theory 51, Feb.
[30] D. Guo and C.-C. Wang, Asymptotic mean-square optimality of belief propagation for sparse linear systems, in Proc. IEEE Inform. Theory Workshop, (Chengdu, China), Oct.
[31] D. L. Donoho, A. Maleki, and A. Montanari, Message-passing algorithms for compressed sensing, Proc. Nat. Acad. Sci. 106, Nov.
[32] D. L. Donoho, A. Maleki, and A. Montanari, Message passing algorithms for compressed sensing I: motivation and construction, in Proc. Info. Theory Workshop, Jan.
[33] D. L. Donoho, A. Maleki, and A. Montanari, Message passing algorithms for compressed sensing II: analysis and validation, in Proc. Info. Theory Workshop, Jan.
[34] M. Bayati and A. Montanari, The dynamics of message passing on dense graphs, with applications to compressed sensing, IEEE Trans. Inform. Theory 57, Feb.
[35] S. Rangan, Estimation with random linear mixing, belief propagation and compressed sensing, in Proc. Conf. on Inform. Sci. & Sys., pp. 1-6, (Princeton, NJ), Mar.
[36] A. Montanari, Graphical model concepts in compressed sensing, in Compressed Sensing: Theory and Applications, Y. C. Eldar and G. Kutyniok, eds., Cambridge Univ. Press, June.
[37] S. Rangan, Generalized approximate message passing for estimation with random linear mixing, arXiv preprint [cs.IT], Oct.
[38] S. Rangan, Generalized approximate message passing for estimation with random linear mixing, in Proc. IEEE Int. Symp. Inform. Theory, (Saint Petersburg, Russia), July-Aug.
[39] P. Schniter, Turbo reconstruction of structured sparse signals, in Proc. Conf. on Inform. Sci. & Sys., (Princeton, NJ), Mar.
[40] J. Ziniel, L. C. Potter, and P. Schniter, Tracking and smoothing of time-varying sparse signals via approximate belief propagation, in Conf. Rec. 44th Asilomar Conf. Signals, Syst. & Comput., (Pacific Grove, CA), Nov.
[41] S. Som, L. C. Potter, and P. Schniter, Compressive imaging using approximate message passing and a Markov-tree prior, in Conf. Rec. 44th Asilomar Conf. Signals, Syst. & Comput., (Pacific Grove, CA), Nov.
[42] P. Schniter, A message-passing receiver for BICM-OFDM over unknown clustered-sparse channels, in Proc. IEEE Workshop Signal Process. Adv. Wireless Commun., (San Francisco, CA), June.
[43] S. Rangan, A. K. Fletcher, V. K. Goyal, and P. Schniter, Hybrid approximate message passing with applications to structured sparsity, arXiv preprint [cs.IT], Nov.
[44] A. Javanmard and A. Montanari, State evolution for general approximate message passing algorithms, with applications to spatial coupling, arXiv preprint [math.PR], Nov.
[45] S. Rangan, P. Schniter, E. Riegler, A. Fletcher, and V. Cevher, Fixed points of generalized approximate message passing with arbitrary matrices, arXiv preprint, Jan.
[46] M. J. Wainwright and M. I. Jordan, Graphical models, exponential families, and variational inference, Found. Trends Mach. Learn. 1, 2008.

[47] C. M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics, Springer, New York, NY.
[48] Y. Kim, J. Kim, and Y. Kim, Blockwise sparse regression, Statistica Sinica 16.
[49] L. Meier, S. van de Geer, and P. Bühlmann, The group lasso for logistic regression, J. Royal Statistical Society: Series B 70(1), pp. 53-71.
[50] A. K. Fletcher, S. Rangan, L. Varshney, and A. Bhargava, Neural reconstruction with approximate message passing (NeuRAMP), in Proc. Neural Information Process. Syst., (Granada, Spain), Dec.
[51] U. S. Kamilov, V. K. Goyal, and S. Rangan, Message-passing de-quantization with applications to compressed sensing, IEEE Trans. Signal Process. 60, Dec.
[52] A. Subramanian, P. Tamayo, V. K. Mootha, S. Mukherjee, B. L. Ebert, M. A. Gillette, A. Paulovich, S. L. Pomeroy, T. R. Golub, E. S. Lander, et al., Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles, Proceedings of the National Academy of Sciences 102(43).
[53] S. Rangan, A. K. Fletcher, V. K. Goyal, and P. Schniter, Hybrid generalized approximate message passing with applications to structured sparsity, in Proc. IEEE Int. Symp. Inform. Theory, (Cambridge, MA), July.
[54] T. J. Richardson and R. L. Urbanke, Modern Coding Theory, Cambridge Univ. Press, Cambridge, UK.
[55] T. P. Minka, A family of algorithms for approximate Bayesian inference, PhD thesis, Massachusetts Institute of Technology, Cambridge, MA.
[56] M. Seeger, Bayesian inference and optimal design for the sparse linear model, J. Machine Learning Research 9, Sept.
[57] J. P. Vila and P. Schniter, Expectation-maximization Bernoulli-Gaussian approximate message passing, in Conf. Rec. 45th Asilomar Conf. Signals, Syst. & Comput., (Pacific Grove, CA), Nov.
[58] J. P. Vila and P. Schniter, Expectation-maximization Gaussian-mixture approximate message passing, in Proc. Conf. on Inform. Sci. & Sys., (Princeton, NJ), Mar.
[59] F. Krzakala, M. Mézard, F. Sausset, Y. Sun, and L. Zdeborová, Statistical physics-based reconstruction in compressed sensing, arXiv preprint, Sept.
[60] F. Krzakala, M. Mézard, F. Sausset, Y. Sun, and L. Zdeborová, Probabilistic reconstruction in compressed sensing: Algorithms, phase diagrams, and threshold achieving matrices, arXiv preprint, June.
[61] U. S. Kamilov, S. Rangan, A. K. Fletcher, and M. Unser, Approximate message passing with consistent parameter estimation and applications to sparse learning, in Proc. NIPS, (Lake Tahoe, NV), Dec.
[62] R. Tibshirani, Regression shrinkage and selection via the lasso: a retrospective, J. Royal Statistical Society: Series B (Statistical Methodology) 73(3).
[63] S. Rangan, A. K. Fletcher, and V. K. Goyal, Extension of replica analysis to MAP estimation with applications to compressed sensing, in Proc. IEEE Int. Symp. Inform. Theory, (Austin, TX), June 2010.


More information

Machine Learning Techniques for Computer Vision

Machine Learning Techniques for Computer Vision Machine Learning Techniques for Computer Vision Part 2: Unsupervised Learning Microsoft Research Cambridge x 3 1 0.5 0.2 0 0.5 0.3 0 0.5 1 ECCV 2004, Prague x 2 x 1 Overview of Part 2 Mixture models EM

More information

Linear Regression with Strongly Correlated Designs Using Ordered Weigthed l 1

Linear Regression with Strongly Correlated Designs Using Ordered Weigthed l 1 Linear Regression with Strongly Correlated Designs Using Ordered Weigthed l 1 ( OWL ) Regularization Mário A. T. Figueiredo Instituto de Telecomunicações and Instituto Superior Técnico, Universidade de

More information

Approximate Inference Part 1 of 2

Approximate Inference Part 1 of 2 Approximate Inference Part 1 of 2 Tom Minka Microsoft Research, Cambridge, UK Machine Learning Summer School 2009 http://mlg.eng.cam.ac.uk/mlss09/ 1 Bayesian paradigm Consistent use of probability theory

More information

Reconstruction from Anisotropic Random Measurements

Reconstruction from Anisotropic Random Measurements Reconstruction from Anisotropic Random Measurements Mark Rudelson and Shuheng Zhou The University of Michigan, Ann Arbor Coding, Complexity, and Sparsity Workshop, 013 Ann Arbor, Michigan August 7, 013

More information

Statistical Image Recovery: A Message-Passing Perspective. Phil Schniter

Statistical Image Recovery: A Message-Passing Perspective. Phil Schniter Statistical Image Recovery: A Message-Passing Perspective Phil Schniter Collaborators: Sundeep Rangan (NYU) and Alyson Fletcher (UC Santa Cruz) Supported in part by NSF grants CCF-1018368 and NSF grant

More information

Fast Hard Thresholding with Nesterov s Gradient Method

Fast Hard Thresholding with Nesterov s Gradient Method Fast Hard Thresholding with Nesterov s Gradient Method Volkan Cevher Idiap Research Institute Ecole Polytechnique Federale de ausanne volkan.cevher@epfl.ch Sina Jafarpour Department of Computer Science

More information

Approximate Message Passing

Approximate Message Passing Approximate Message Passing Mohammad Emtiyaz Khan CS, UBC February 8, 2012 Abstract In this note, I summarize Sections 5.1 and 5.2 of Arian Maleki s PhD thesis. 1 Notation We denote scalars by small letters

More information

Message passing and approximate message passing

Message passing and approximate message passing Message passing and approximate message passing Arian Maleki Columbia University 1 / 47 What is the problem? Given pdf µ(x 1, x 2,..., x n ) we are interested in arg maxx1,x 2,...,x n µ(x 1, x 2,..., x

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

Expectation Propagation in Dynamical Systems

Expectation Propagation in Dynamical Systems Expectation Propagation in Dynamical Systems Marc Peter Deisenroth Joint Work with Shakir Mohamed (UBC) August 10, 2012 Marc Deisenroth (TU Darmstadt) EP in Dynamical Systems 1 Motivation Figure : Complex

More information

High-dimensional graphical model selection: Practical and information-theoretic limits

High-dimensional graphical model selection: Practical and information-theoretic limits 1 High-dimensional graphical model selection: Practical and information-theoretic limits Martin Wainwright Departments of Statistics, and EECS UC Berkeley, California, USA Based on joint work with: John

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

An iterative hard thresholding estimator for low rank matrix recovery

An iterative hard thresholding estimator for low rank matrix recovery An iterative hard thresholding estimator for low rank matrix recovery Alexandra Carpentier - based on a joint work with Arlene K.Y. Kim Statistical Laboratory, Department of Pure Mathematics and Mathematical

More information

Recent Advances in Bayesian Inference Techniques

Recent Advances in Bayesian Inference Techniques Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian

More information

13: Variational inference II

13: Variational inference II 10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational

More information

Signal Recovery from Permuted Observations

Signal Recovery from Permuted Observations EE381V Course Project Signal Recovery from Permuted Observations 1 Problem Shanshan Wu (sw33323) May 8th, 2015 We start with the following problem: let s R n be an unknown n-dimensional real-valued signal,

More information

Tractable Upper Bounds on the Restricted Isometry Constant

Tractable Upper Bounds on the Restricted Isometry Constant Tractable Upper Bounds on the Restricted Isometry Constant Alex d Aspremont, Francis Bach, Laurent El Ghaoui Princeton University, École Normale Supérieure, U.C. Berkeley. Support from NSF, DHS and Google.

More information

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 BASEL. Logistic Regression. Pattern Recognition 2016 Sandro Schönborn University of Basel

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 BASEL. Logistic Regression. Pattern Recognition 2016 Sandro Schönborn University of Basel Logistic Regression Pattern Recognition 2016 Sandro Schönborn University of Basel Two Worlds: Probabilistic & Algorithmic We have seen two conceptual approaches to classification: data class density estimation

More information

CO-OPERATION among multiple cognitive radio (CR)

CO-OPERATION among multiple cognitive radio (CR) 586 IEEE SIGNAL PROCESSING LETTERS, VOL 21, NO 5, MAY 2014 Sparse Bayesian Hierarchical Prior Modeling Based Cooperative Spectrum Sensing in Wideb Cognitive Radio Networks Feng Li Zongben Xu Abstract This

More information

DNNs for Sparse Coding and Dictionary Learning

DNNs for Sparse Coding and Dictionary Learning DNNs for Sparse Coding and Dictionary Learning Subhadip Mukherjee, Debabrata Mahapatra, and Chandra Sekhar Seelamantula Department of Electrical Engineering, Indian Institute of Science, Bangalore 5612,

More information

A new method on deterministic construction of the measurement matrix in compressed sensing

A new method on deterministic construction of the measurement matrix in compressed sensing A new method on deterministic construction of the measurement matrix in compressed sensing Qun Mo 1 arxiv:1503.01250v1 [cs.it] 4 Mar 2015 Abstract Construction on the measurement matrix A is a central

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project

More information

Lecture 3. Linear Regression II Bastian Leibe RWTH Aachen

Lecture 3. Linear Regression II Bastian Leibe RWTH Aachen Advanced Machine Learning Lecture 3 Linear Regression II 02.11.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de This Lecture: Advanced Machine Learning Regression

More information

Expectation propagation for signal detection in flat-fading channels

Expectation propagation for signal detection in flat-fading channels Expectation propagation for signal detection in flat-fading channels Yuan Qi MIT Media Lab Cambridge, MA, 02139 USA yuanqi@media.mit.edu Thomas Minka CMU Statistics Department Pittsburgh, PA 15213 USA

More information

Compressed Sensing and Neural Networks

Compressed Sensing and Neural Networks and Jan Vybíral (Charles University & Czech Technical University Prague, Czech Republic) NOMAD Summer Berlin, September 25-29, 2017 1 / 31 Outline Lasso & Introduction Notation Training the network Applications

More information

Compressed Sensing and Linear Codes over Real Numbers

Compressed Sensing and Linear Codes over Real Numbers Compressed Sensing and Linear Codes over Real Numbers Henry D. Pfister (joint with Fan Zhang) Texas A&M University College Station Information Theory and Applications Workshop UC San Diego January 31st,

More information

Learning Gaussian Process Models from Uncertain Data

Learning Gaussian Process Models from Uncertain Data Learning Gaussian Process Models from Uncertain Data Patrick Dallaire, Camille Besse, and Brahim Chaib-draa DAMAS Laboratory, Computer Science & Software Engineering Department, Laval University, Canada

More information

Acommon problem in signal processing is to estimate an

Acommon problem in signal processing is to estimate an 5758 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER 2009 Necessary and Sufficient Conditions for Sparsity Pattern Recovery Alyson K. Fletcher, Member, IEEE, Sundeep Rangan, and Vivek

More information

A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models

A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models Jingyi Jessica Li Department of Statistics University of California, Los

More information

Does Better Inference mean Better Learning?

Does Better Inference mean Better Learning? Does Better Inference mean Better Learning? Andrew E. Gelfand, Rina Dechter & Alexander Ihler Department of Computer Science University of California, Irvine {agelfand,dechter,ihler}@ics.uci.edu Abstract

More information

Sparse Linear Models (10/7/13)

Sparse Linear Models (10/7/13) STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine

More information

Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms

Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms François Caron Department of Statistics, Oxford STATLEARN 2014, Paris April 7, 2014 Joint work with Adrien Todeschini,

More information

Single-letter Characterization of Signal Estimation from Linear Measurements

Single-letter Characterization of Signal Estimation from Linear Measurements Single-letter Characterization of Signal Estimation from Linear Measurements Dongning Guo Dror Baron Shlomo Shamai The work has been supported by the European Commission in the framework of the FP7 Network

More information

Minimax MMSE Estimator for Sparse System

Minimax MMSE Estimator for Sparse System Proceedings of the World Congress on Engineering and Computer Science 22 Vol I WCE 22, October 24-26, 22, San Francisco, USA Minimax MMSE Estimator for Sparse System Hongqing Liu, Mandar Chitre Abstract

More information

of Orthogonal Matching Pursuit

of Orthogonal Matching Pursuit A Sharp Restricted Isometry Constant Bound of Orthogonal Matching Pursuit Qun Mo arxiv:50.0708v [cs.it] 8 Jan 205 Abstract We shall show that if the restricted isometry constant (RIC) δ s+ (A) of the measurement

More information

Sparse and Robust Optimization and Applications

Sparse and Robust Optimization and Applications Sparse and and Statistical Learning Workshop Les Houches, 2013 Robust Laurent El Ghaoui with Mert Pilanci, Anh Pham EECS Dept., UC Berkeley January 7, 2013 1 / 36 Outline Sparse Sparse Sparse Probability

More information

Optimality of Large MIMO Detection via Approximate Message Passing

Optimality of Large MIMO Detection via Approximate Message Passing ptimality of Large MIM Detection via Approximate Message Passing Charles Jeon, Ramina Ghods, Arian Maleki, and Christoph Studer arxiv:5.695v [cs.it] ct 5 Abstract ptimal data detection in multiple-input

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

Thresholds for the Recovery of Sparse Solutions via L1 Minimization

Thresholds for the Recovery of Sparse Solutions via L1 Minimization Thresholds for the Recovery of Sparse Solutions via L Minimization David L. Donoho Department of Statistics Stanford University 39 Serra Mall, Sequoia Hall Stanford, CA 9435-465 Email: donoho@stanford.edu

More information

Convergence Rates of Kernel Quadrature Rules

Convergence Rates of Kernel Quadrature Rules Convergence Rates of Kernel Quadrature Rules Francis Bach INRIA - Ecole Normale Supérieure, Paris, France ÉCOLE NORMALE SUPÉRIEURE NIPS workshop on probabilistic integration - Dec. 2015 Outline Introduction

More information

The Variational Gaussian Approximation Revisited

The Variational Gaussian Approximation Revisited The Variational Gaussian Approximation Revisited Manfred Opper Cédric Archambeau March 16, 2009 Abstract The variational approximation of posterior distributions by multivariate Gaussians has been much

More information

Bayesian Grouped Horseshoe Regression with Application to Additive Models

Bayesian Grouped Horseshoe Regression with Application to Additive Models Bayesian Grouped Horseshoe Regression with Application to Additive Models Zemei Xu, Daniel F. Schmidt, Enes Makalic, Guoqi Qian, and John L. Hopper Centre for Epidemiology and Biostatistics, Melbourne

More information

Bhaskar Rao Department of Electrical and Computer Engineering University of California, San Diego

Bhaskar Rao Department of Electrical and Computer Engineering University of California, San Diego Bhaskar Rao Department of Electrical and Computer Engineering University of California, San Diego 1 Outline Course Outline Motivation for Course Sparse Signal Recovery Problem Applications Computational

More information

Belief Propagation, Information Projections, and Dykstra s Algorithm

Belief Propagation, Information Projections, and Dykstra s Algorithm Belief Propagation, Information Projections, and Dykstra s Algorithm John MacLaren Walsh, PhD Department of Electrical and Computer Engineering Drexel University Philadelphia, PA jwalsh@ece.drexel.edu

More information

BAGUS: Bayesian Regularization for Graphical Models with Unequal Shrinkage

BAGUS: Bayesian Regularization for Graphical Models with Unequal Shrinkage BAGUS: Bayesian Regularization for Graphical Models with Unequal Shrinkage Lingrui Gan, Naveen N. Narisetty, Feng Liang Department of Statistics University of Illinois at Urbana-Champaign Problem Statement

More information

MMSE Denoising of 2-D Signals Using Consistent Cycle Spinning Algorithm

MMSE Denoising of 2-D Signals Using Consistent Cycle Spinning Algorithm Denoising of 2-D Signals Using Consistent Cycle Spinning Algorithm Bodduluri Asha, B. Leela kumari Abstract: It is well known that in a real world signals do not exist without noise, which may be negligible

More information

Probabilistic Graphical Models

Probabilistic Graphical Models 2016 Robert Nowak Probabilistic Graphical Models 1 Introduction We have focused mainly on linear models for signals, in particular the subspace model x = Uθ, where U is a n k matrix and θ R k is a vector

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our

More information

Scale Mixture Modeling of Priors for Sparse Signal Recovery

Scale Mixture Modeling of Priors for Sparse Signal Recovery Scale Mixture Modeling of Priors for Sparse Signal Recovery Bhaskar D Rao 1 University of California, San Diego 1 Thanks to David Wipf, Jason Palmer, Zhilin Zhang and Ritwik Giri Outline Outline Sparse

More information

Elaine T. Hale, Wotao Yin, Yin Zhang

Elaine T. Hale, Wotao Yin, Yin Zhang , Wotao Yin, Yin Zhang Department of Computational and Applied Mathematics Rice University McMaster University, ICCOPT II-MOPTA 2007 August 13, 2007 1 with Noise 2 3 4 1 with Noise 2 3 4 1 with Noise 2

More information

STA414/2104 Statistical Methods for Machine Learning II

STA414/2104 Statistical Methods for Machine Learning II STA414/2104 Statistical Methods for Machine Learning II Murat A. Erdogdu & David Duvenaud Department of Computer Science Department of Statistical Sciences Lecture 3 Slide credits: Russ Salakhutdinov Announcements

More information

Communication by Regression: Achieving Shannon Capacity

Communication by Regression: Achieving Shannon Capacity Communication by Regression: Practical Achievement of Shannon Capacity Department of Statistics Yale University Workshop Infusing Statistics and Engineering Harvard University, June 5-6, 2011 Practical

More information

LEARNING DATA TRIAGE: LINEAR DECODING WORKS FOR COMPRESSIVE MRI. Yen-Huan Li and Volkan Cevher

LEARNING DATA TRIAGE: LINEAR DECODING WORKS FOR COMPRESSIVE MRI. Yen-Huan Li and Volkan Cevher LARNING DATA TRIAG: LINAR DCODING WORKS FOR COMPRSSIV MRI Yen-Huan Li and Volkan Cevher Laboratory for Information Inference Systems École Polytechnique Fédérale de Lausanne ABSTRACT The standard approach

More information

Restricted Strong Convexity Implies Weak Submodularity

Restricted Strong Convexity Implies Weak Submodularity Restricted Strong Convexity Implies Weak Submodularity Ethan R. Elenberg Rajiv Khanna Alexandros G. Dimakis Department of Electrical and Computer Engineering The University of Texas at Austin {elenberg,rajivak}@utexas.edu

More information

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Sparse Recovery using L1 minimization - algorithms Yuejie Chi Department of Electrical and Computer Engineering Spring

More information

13 : Variational Inference: Loopy Belief Propagation and Mean Field

13 : Variational Inference: Loopy Belief Propagation and Mean Field 10-708: Probabilistic Graphical Models 10-708, Spring 2012 13 : Variational Inference: Loopy Belief Propagation and Mean Field Lecturer: Eric P. Xing Scribes: Peter Schulam and William Wang 1 Introduction

More information

Performance Trade-Offs in Multi-Processor Approximate Message Passing

Performance Trade-Offs in Multi-Processor Approximate Message Passing Performance Trade-Offs in Multi-Processor Approximate Message Passing Junan Zhu, Ahmad Beirami, and Dror Baron Department of Electrical and Computer Engineering, North Carolina State University, Email:

More information

Sparse, stable gene regulatory network recovery via convex optimization

Sparse, stable gene regulatory network recovery via convex optimization Sparse, stable gene regulatory network recovery via convex optimization Arwen Meister June, 11 Gene regulatory networks Gene expression regulation allows cells to control protein levels in order to live

More information

Learning discrete graphical models via generalized inverse covariance matrices

Learning discrete graphical models via generalized inverse covariance matrices Learning discrete graphical models via generalized inverse covariance matrices Duzhe Wang, Yiming Lv, Yongjoon Kim, Young Lee Department of Statistics University of Wisconsin-Madison {dwang282, lv23, ykim676,

More information

Lecture 16 Deep Neural Generative Models

Lecture 16 Deep Neural Generative Models Lecture 16 Deep Neural Generative Models CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago May 22, 2017 Approach so far: We have considered simple models and then constructed

More information

Bayesian Learning in Undirected Graphical Models

Bayesian Learning in Undirected Graphical Models Bayesian Learning in Undirected Graphical Models Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK http://www.gatsby.ucl.ac.uk/ Work with: Iain Murray and Hyun-Chul

More information

An Homotopy Algorithm for the Lasso with Online Observations

An Homotopy Algorithm for the Lasso with Online Observations An Homotopy Algorithm for the Lasso with Online Observations Pierre J. Garrigues Department of EECS Redwood Center for Theoretical Neuroscience University of California Berkeley, CA 94720 garrigue@eecs.berkeley.edu

More information

Expectation Propagation Algorithm

Expectation Propagation Algorithm Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,

More information

High-dimensional covariance estimation based on Gaussian graphical models

High-dimensional covariance estimation based on Gaussian graphical models High-dimensional covariance estimation based on Gaussian graphical models Shuheng Zhou Department of Statistics, The University of Michigan, Ann Arbor IMA workshop on High Dimensional Phenomena Sept. 26,

More information

arxiv:cs/ v2 [cs.it] 1 Oct 2006

arxiv:cs/ v2 [cs.it] 1 Oct 2006 A General Computation Rule for Lossy Summaries/Messages with Examples from Equalization Junli Hu, Hans-Andrea Loeliger, Justin Dauwels, and Frank Kschischang arxiv:cs/060707v [cs.it] 1 Oct 006 Abstract

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

High-dimensional graphical model selection: Practical and information-theoretic limits

High-dimensional graphical model selection: Practical and information-theoretic limits 1 High-dimensional graphical model selection: Practical and information-theoretic limits Martin Wainwright Departments of Statistics, and EECS UC Berkeley, California, USA Based on joint work with: John

More information