Scale Mixture Modeling of Priors for Sparse Signal Recovery
Bhaskar D. Rao, University of California, San Diego
Thanks to David Wipf, Jason Palmer, Zhilin Zhang and Ritwik Giri
Outline
- Sparse Signal Recovery (SSR) Problem and some Extensions
- Scale Mixture Priors: Gaussian Scale Mixture (GSM), Laplacian Scale Mixture (LSM), Power Exponential Scale Mixture (PESM)
- Bayesian Methods: MAP estimation (Type I), Hierarchical Bayes (Type II)
- Experimental Results
- Summary
Problem Description: Sparse Signal Recovery (SSR)
y = Φx + v, where y is an N × 1 measurement vector, Φ is an N × M dictionary matrix with M >> N, x is the M × 1 desired vector, sparse with k nonzero entries, and v is the measurement noise.
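A minimal NumPy sketch of this measurement model may help fix the notation; the dimensions, sparsity level, and noise standard deviation below are illustrative choices, not values taken from the slides.

```python
import numpy as np

# Sketch of the SSR model y = Phi x + v (all numeric values are illustrative).
rng = np.random.default_rng(0)
N, M, k = 50, 100, 7                        # N measurements, M dictionary atoms, k nonzeros
Phi = rng.standard_normal((N, M))           # N x M dictionary, M >> N in the SSR setting
x = np.zeros(M)
support = rng.choice(M, size=k, replace=False)
x[support] = rng.standard_normal(k)         # sparse vector with k nonzero entries
v = 0.01 * rng.standard_normal(N)           # measurement noise
y = Phi @ x + v                             # N x 1 measurement vector
```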
Extensions
- Block Sparsity
- Multiple Measurement Vectors (MMV)
- Block MMV
- MMV with time-varying sparsity
Multiple Measurement Vectors (MMV)
- Multiple measurements: L measurements
- Common sparsity profile: k nonzero rows
Applications
- Signal Representation (Mallat, Coifman, Donoho, ...)
- EEG/MEG (Leahy, Gorodnitsky, Ioannides, ...)
- Robust Linear Regression and Outlier Detection (Jin, Giannakis, ...)
- Speech Coding (Ozawa, Ono, Kroon, ...)
- Compressed Sensing (Donoho, Candes, Tao, ...)
- Magnetic Resonance Imaging (Lustig, ...)
- Sparse Channel Equalization (Fevrier, Proakis, ...)
- Face Recognition (Wright, Yang, ...)
- Cognitive Radio (Eldar, ...)
- and many more...
Potential Algorithmic Approaches
Finding the optimal solution is NP-hard, so low-complexity algorithms with reasonable performance are needed.
- Greedy search techniques: Matching Pursuit (MP), Orthogonal Matching Pursuit (OMP), ... (see the OMP sketch below)
- Minimizing diversity measures (regularization framework): tractable surrogate cost functions, e.g. l_1 minimization, ...
- Bayesian methods: make appropriate statistical assumptions on the solution (sparsity), i.e. the choice of prior
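As a concrete instance of the greedy family mentioned above, here is a minimal Orthogonal Matching Pursuit sketch; the function name and the stopping rule (a fixed number of atoms) are my own choices rather than details from the slides.

```python
import numpy as np

def omp(Phi, y, k):
    """Greedy OMP sketch: select k atoms one at a time, re-fitting by least squares."""
    M = Phi.shape[1]
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(Phi.T @ residual)))   # atom most correlated with residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef          # orthogonalize against chosen atoms
    x_hat = np.zeros(M)
    x_hat[support] = coef
    return x_hat
```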
Bayesian Methods: Choice of Prior
Super-Gaussian distributions: heavy tailed and with a sharper peak at the origin compared to a Gaussian. Tractable representations using scale mixtures:
- Gaussian Scale Mixture (GSM)
- Laplacian Scale Mixture (LSM)
- Power Exponential Scale Mixture (PESM)
Gaussian Scale Mixtures
Separability: p(x) = Π_i p(x_i), with
p(x_i) = ∫ p(x_i | γ_i) p(γ_i) dγ_i = ∫ N(x_i; 0, γ_i) p(γ_i) dγ_i

Theorem. A density p(x) which is symmetric with respect to the origin can be represented by a GSM iff p(√x) is completely monotonic on (0, ∞).

Most sparse priors over x can be represented in this GSM form. [Palmer et al., 2006]
Examples of Gaussian Scale Mixtures
- Laplacian density: p(x; a) = (a/2) exp(-a|x|); scale mixing density: p(γ) = (a²/2) exp(-(a²/2)γ), γ ≥ 0.
- Student-t distribution: p(x; a, b) = [b^a Γ(a + 1/2) / ((2π)^{1/2} Γ(a))] (b + x²/2)^{-(a+1/2)}; scale mixing density: Gamma distribution.
- Generalized Gaussian: p(x; p) = [1 / (2Γ(1 + 1/p))] exp(-|x|^p); scale mixing density: positive alpha-stable density of order p/2.
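A quick Monte Carlo check of the first example, the Laplacian as a GSM with an exponential mixing density, might look like this; the sample size and test threshold are arbitrary choices of mine.

```python
import numpy as np

# Laplacian-as-GSM check: gamma ~ Exp(rate = a^2/2), x | gamma ~ N(0, gamma)
# should give p(x) = (a/2) exp(-a |x|), hence P(|x| > t) = exp(-a t).
rng = np.random.default_rng(1)
a, n, t = 2.0, 200_000, 1.0
gamma = rng.exponential(scale=2.0 / a**2, size=n)  # scale = 1 / rate
x = rng.normal(0.0, np.sqrt(gamma))                # Gaussian with a random variance
print(np.mean(np.abs(x) > t), np.exp(-a * t))      # the two numbers should be close
```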
Generalized Scale Mixture Family
GSM corresponds to l_2-norm-based SSR algorithms, and LSM corresponds to l_1-norm-based SSR algorithms. A generalized scale mixture is needed for a unified treatment of l_1- and l_2-minimization-based SSR.
Power Exponential Scale Mixture Distributions (PESM)
Power Exponential distribution, also known as the Box and Tiao (BT) or Generalized Gaussian distribution (GGD):
p_PE(x; 0, σ, p) = K exp(-|x|^p / σ^p)
Scale mixture of Power Exponentials:
p(x_i) = ∫ p(x_i | γ_i) p(γ_i) dγ_i = ∫ p_PE(x_i; 0, γ_i, p) p(γ_i) dγ_i
Power Exponential Scale Mixture Distributions (PESM)
- Choice of p = 2: Gaussian Scale Mixtures (GSM), l_2-norm-minimization-based algorithms.
- Choice of p = 1: Laplacian Scale Mixtures (LSM), l_1-norm-minimization-based algorithms.
- PESM: unified treatment of both l_1- and l_2-based algorithms.
PESM Example: Generalized t distribution
Inverse Generalized Gamma (GG) for the scaling density:
p(γ_i) = p_GG(γ_i; p, σ, q) = η (σ/γ_i)^{pq+1} exp(-(σ/γ_i)^p)
The resulting marginal is the Generalized t (GT) density:
p_GT(x; σ, p, q) = K (1 + |x|^p / (q σ^p))^{-(q + 1/p)}
A wide class of heavy-tailed super-Gaussian densities can be represented by the GT using suitable shape parameters p and q.
Variants of GT
Table: Variants of the Generalized t distribution

  q                            p    Distribution
  q → ∞                        2    Normal
  q → ∞                        1    Laplacian (Double Exponential)
  q > 0 (degrees of freedom)   2    Student-t distribution
  q > 0 (shape parameter)      1    Generalized Double Pareto (GDP)
Bayesian Methods
- MAP Estimation (Type I)
- Hierarchical Bayes (Type II)
MAP Estimation Framework (Type I)
Problem statement:
x̂ = arg max_x p(x|y) = arg max_x p(y|x) p(x)
Choosing a Laplacian prior p(x) = (a/2) exp(-a|x|) together with a Gaussian likelihood leads to the familiar LASSO framework.
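For reference, a minimal iterative soft-thresholding (ISTA) sketch for this LASSO objective is shown below; it is one of many possible solvers, and the step size and iteration count are placeholder choices rather than recommendations from the slides.

```python
import numpy as np

def lasso_ista(Phi, y, lam, n_iter=300):
    """ISTA sketch for the Type I / LASSO objective 0.5 * ||y - Phi x||^2 + lam * ||x||_1."""
    x = np.zeros(Phi.shape[1])
    L = np.linalg.norm(Phi, 2) ** 2            # Lipschitz constant of the smooth part
    for _ in range(n_iter):
        z = x - Phi.T @ (Phi @ x - y) / L      # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft thresholding
    return x
```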
Hierarchical Bayesian Framework (Type II)
Problem statement:
γ̂ = arg max_γ p(γ|y) = arg max_γ p(y|γ) p(γ)
Using this estimate of γ, we can compute the posterior of interest, p(x|y; γ̂).
Hierarchical Bayesian Framework (Type II)
Potential advantages:
- Averaging over x leads to fewer minima in p(γ|y).
- γ can tie several parameters together, leading to fewer parameters.
- Maximizes the true posterior mass over the subspaces spanned by the nonzero indexes instead of looking for the mode.

Bayesian LASSO [a]: the Laplacian p(x) expressed as a GSM,
p(x) = ∫ p(x|γ) p(γ) dγ = ∫ [1/√(2πγ)] exp(-x²/(2γ)) · [(a²/2) exp(-(a²/2)γ)] dγ = (a/2) exp(-a|x|),
where the first factor is p(x|γ) and the second is p(γ).

[a] Bayesian Compressive Sensing Using Laplace Priors, Babacan et al.
MAP Estimation (Type I) Framework
Problem statement:
x̂ = arg max_x log p(x|y) = arg max_x [log p(y|x) + log p(x)]

Examples:
  Prior Distribution          Penalty Function          SSR Algorithm
  Normal                      ||x||_2^2                 Ridge Regression
  Laplacian                   ||x||_1                   LASSO
  Student-t distribution      Σ_i log(ɛ + x_i²)         Reweighted l_2 (Chartrand's)
  Generalized Double Pareto   Σ_i log(ɛ + |x_i|)        Reweighted l_1 (Candes's)

PESM as the sparsity-promoting prior p(x): Unified Type I Framework.
Unified Type I Framework
Choice of prior: p(x) can be any distribution in the PESM class.

EM algorithm:
- Complete-data log-likelihood: log p(y, x, γ) = log p(y|x) + log p(x|γ) + log p(γ)
- Hidden variable: γ
- Posterior needed for the E step: p(γ|x, y) = p(γ|x) (from the Markov chain structure).
Unified Type I: E step
Q(x) = E_{γ|x}[ log p(y|x) + log p(x|γ) + log p(γ) ]
E step: only the second term depends on both x and γ, so it suffices to compute E_{γ_i|x_i}[ 1/γ_i^p ].
Unified Type I: E step
Differentiating the PESM representation of the prior,
p'(x_i) = d/dx_i ∫_0^∞ p(x_i|γ_i) p(γ_i) dγ_i
        = -p |x_i|^{p-1} sign(x_i) p(x_i) ∫_0^∞ (1/γ_i^p) p(γ_i|x_i) dγ_i
        = -p |x_i|^{p-1} sign(x_i) p(x_i) E_{γ_i|x_i}[ 1/γ_i^p ]
E step:
E_{γ_i|x_i}[ 1/γ_i^p ] = -p'(x_i) / ( p |x_i|^{p-1} sign(x_i) p(x_i) )
Note: there is no need to know p(γ), as long as p(x) is known and has a PESM representation.
Unified Type I: M step
x̂^(k+1) = arg min_x (1/(2λ)) ||y - Φx||² + Σ_i w_i^(k) |x_i|^p
where w_i^(k) = E_{γ_i | x_i^(k)}[ 1/γ_i^p ].
Special case, Generalized t distribution:
w_i^(k) = (q + 1/p) / ( q σ^p + |x_i^(k)|^p )
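A sketch of one way to run this E/M pair for p = 1 with the Generalized t weights is given below; the inner weighted-l_1 step is solved here with a simple weighted ISTA loop, and all parameter values (λ, q, σ, iteration counts) are illustrative assumptions, not values from the slides.

```python
import numpy as np

def unified_type1_p1(Phi, y, lam=0.05, q=1.0, sigma=0.1, outer=10, inner=100):
    """Sketch of the unified Type I iteration for p = 1 with a Generalized t prior:
    E-step weights w_i = (q + 1) / (q*sigma + |x_i|), then a weighted-l1 M step via ISTA."""
    x = np.zeros(Phi.shape[1])
    L = np.linalg.norm(Phi, 2) ** 2
    for _ in range(outer):
        w = (q + 1.0) / (q * sigma + np.abs(x))      # E step (p = 1 case of the GT weights)
        for _ in range(inner):                       # M step: weighted LASSO
            z = x - Phi.T @ (Phi @ x - y) / L
            x = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)
    return x
```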
Hierarchical Bayesian Framework (Type II)
Estimate the posterior distribution of x using the estimated γ̂, i.e. p(x|y; γ̂). Choosing a GSM for p(x) leads to Sparse Bayesian Learning.
Sparse Bayesian Learning (Type II)
y = Φx + v
Solving for the MAP estimate of γ:
γ̂ = arg max_γ p(γ|y) = arg max_γ p(y|γ) p(γ)
What is p(y|γ)? Given γ, x is Gaussian with mean zero and covariance matrix Γ = diag(γ), i.e. p(x|γ) = N(x; 0, Γ) = Π_i N(x_i; 0, γ_i). Then p(y|γ) = N(y; 0, Σ_y) with Σ_y = σ²I + ΦΓΦ^T, i.e.
p(y|γ) = [1/√((2π)^N |Σ_y|)] exp(-(1/2) y^T Σ_y^{-1} y)
Sparse Bayesian Learning (Tipping)
y = Φx + v
Solving for the optimal γ:
γ̂ = arg max_γ p(γ|y) = arg max_γ p(y|γ) p(γ)
  = arg min_γ log|Σ_y| + y^T Σ_y^{-1} y - 2 Σ_i log p(γ_i)
where Σ_y = σ²I + ΦΓΦ^T and Γ = diag(γ).
Computational methods: many options exist for solving this optimization problem, e.g. Majorization-Minimization and Expectation-Maximization (EM).
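As a reference, one might evaluate this Type II cost with a flat, non-informative p(γ) (so the Σ_i log p(γ_i) term drops out) as in the following sketch.

```python
import numpy as np

def sbl_cost(Phi, y, gamma, sigma2):
    """Type II cost log|Sigma_y| + y^T Sigma_y^{-1} y, assuming a flat prior on gamma."""
    Sigma_y = sigma2 * np.eye(Phi.shape[0]) + Phi @ np.diag(gamma) @ Phi.T
    _, logdet = np.linalg.slogdet(Sigma_y)          # numerically stable log-determinant
    return logdet + y @ np.linalg.solve(Sigma_y, y)
```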
Sparse Bayesian Learning
y = Φx + v
Computing the posterior: because of the convenient GSM choice, the posterior can be easily computed,
p(x|y; γ̂) = N(µ_x, Σ_x), where
µ_x = E[x|y; γ̂] = Γ̂ Φ^T (σ²I + Φ Γ̂ Φ^T)^{-1} y
Σ_x = Cov[x|y; γ̂] = Γ̂ - Γ̂ Φ^T (σ²I + Φ Γ̂ Φ^T)^{-1} Φ Γ̂
- µ_x can be used as a point estimate.
- Sparsity of µ_x is achieved through sparsity in γ.
- Another parameter of interest for the EM algorithm: E[x_i² | y, γ̂] = µ_x(i)² + Σ_x(i, i)
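A direct transcription of these posterior moments into NumPy could look as follows; for brevity it uses an explicit matrix inverse rather than a more careful linear solve.

```python
import numpy as np

def sbl_posterior(Phi, y, gamma, sigma2):
    """Posterior mean mu_x and covariance Sigma_x of x given the estimated gamma."""
    Gamma = np.diag(gamma)
    Sigma_y_inv = np.linalg.inv(sigma2 * np.eye(Phi.shape[0]) + Phi @ Gamma @ Phi.T)
    K = Gamma @ Phi.T @ Sigma_y_inv
    mu_x = K @ y                       # posterior mean, usable as a point estimate
    Sigma_x = Gamma - K @ Phi @ Gamma  # posterior covariance
    return mu_x, Sigma_x
```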
EM algorithm: Updating γ
Treat (y, x) as the complete data and the vector x as the hidden variable:
log p(y, x, γ) = log p(y|x) + log p(x|γ) + log p(γ)
E step:
Q(γ | γ^k) = E_{x|y; γ^k}[ log p(y|x) + log p(x|γ) + log p(γ) ]
M step:
γ^{k+1} = arg max_γ Q(γ | γ^k) = arg max_γ E_{x|y; γ^k}[ log p(x|γ) + log p(γ) ]
        = arg min_γ Σ_{i=1}^M E_{x|y; γ^k}[ x_i²/(2γ_i) + (1/2) log γ_i - log p(γ_i) ]
Solving this optimization problem with a non-informative prior p(γ) gives
γ_i^{k+1} = E[x_i² | y, γ^k] = µ_x(i)² + Σ_x(i, i)
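Putting the E and M steps together with this non-informative-prior update gives a self-contained EM-SBL sketch; here the noise variance is treated as known and the iteration count is an arbitrary assumption.

```python
import numpy as np

def sbl_em(Phi, y, sigma2=1e-3, n_iter=200):
    """EM sketch for SBL: E step computes mu_x, Sigma_x; M step (flat p(gamma))
    sets gamma_i = mu_x[i]^2 + Sigma_x[i, i]. Returns the posterior mean and gamma."""
    N, M = Phi.shape
    gamma = np.ones(M)
    for _ in range(n_iter):
        Gamma = np.diag(gamma)
        Sigma_y_inv = np.linalg.inv(sigma2 * np.eye(N) + Phi @ Gamma @ Phi.T)
        K = Gamma @ Phi.T @ Sigma_y_inv
        mu_x = K @ y                               # posterior mean (E step)
        Sigma_x = Gamma - K @ Phi @ Gamma          # posterior covariance (E step)
        gamma = mu_x**2 + np.diag(Sigma_x)         # M step update
    return mu_x, gamma
```

On a toy problem such as the one generated earlier, thresholding the small entries of the returned µ_x typically recovers the support when k is well below N.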
Type II (SBL) properties
- Local minima are sparse, i.e. they have at most N nonzero γ_i.
- The cost function p(γ|y) is generally much smoother than the associated MAP estimation objective p(x|y): fewer local minima.
- At high signal-to-noise ratio, the global minimum is the sparsest solution; there are no structural problems.
- The method attempts to approximate the posterior distribution p(x|y) in the region with significant mass.
Algorithmic Variants
- Fixed-point iteration based on setting the derivative of the objective function to zero (Tipping)
- Sequential search for the significant γ's (Tipping and Faul)
- Majorization-Minimization based approach (Wipf and Nagarajan)
- Reweighted l_1 and l_2 algorithms (Wipf and Nagarajan)
- Approximate Message Passing (Al-Shoukairi and Rao)
Type II using PESM
In the E step we need to compute the conditional expectation. A closed form may not be available, depending on the choice of p (the distributional parameter of the PESM). Alternative: MCMC techniques.
LSM: using the fact that a Laplacian density has a GSM representation, a tractable three-layer hierarchical model can be developed.
Simulation Results
Parameters:
1. N = 50, M =
2. Dictionary elements: Normal distribution with mean 0 and standard deviation 1.
3. Distribution of nonzero elements:
   (I) Zero-mean, unit-variance Gaussian.
   (II) Student-t distribution with ν = 3 degrees of freedom (super-Gaussian).
   (III) Uniform ±1 random spikes.
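The three nonzero-coefficient distributions can be sampled as in the sketch below; the sparsity level k is a placeholder of mine, since the slide's exact values did not all survive transcription.

```python
import numpy as np

rng = np.random.default_rng(2)
k = 10                                            # illustrative number of nonzeros
gaussian_spikes = rng.standard_normal(k)          # (I)   zero-mean, unit-variance Gaussian
student_t_spikes = rng.standard_t(df=3, size=k)   # (II)  Student-t, nu = 3 (super-Gaussian)
uniform_spikes = rng.choice([-1.0, 1.0], size=k)  # (III) uniform +/-1 random spikes
```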
Simulation Results: Gaussian
Figure: Recovery performance with Gaussian-distributed nonzero coefficients

Simulation Results: Super-Gaussian
Figure: Recovery performance with super-Gaussian (Student-t) distributed nonzero coefficients

Simulation Results: Uniform
Figure: Recovery performance with uniform spikes as nonzero coefficients
Special Case: MMV
- Multiple measurements: L measurements
- Common sparsity profile: k nonzero rows
Bayesian Methods: GSM Extension
Representation for random vectors (rows of X for MMV):
X = √γ G, where G ~ N(g; 0, B) and γ is a positive random variable independent of G, so that
p(x) = ∫ p(x|γ) p(γ) dγ = ∫ N(x; 0, γB) p(γ) dγ
- B = I if the row entries are assumed independent.
- One γ per row vector; the complexity of estimating γ does not grow with L.
- The EM algorithm is also very tractable.
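A sampling sketch of this row-wise prior (one γ per row, shared across the L columns) might look as follows; the dimensions and the exponential mixing density are illustrative assumptions.

```python
import numpy as np

# Row-wise GSM prior for MMV: row_r = sqrt(gamma_r) * g_r with g_r ~ N(0, B).
rng = np.random.default_rng(3)
M, L = 100, 5
B = np.eye(L)                                         # B = I: independent row entries
gamma = rng.exponential(scale=1.0, size=M)            # one mixing variable per row (illustrative density)
G = rng.multivariate_normal(np.zeros(L), B, size=M)   # M x L Gaussian rows
X = np.sqrt(gamma)[:, None] * G                       # each row has covariance gamma_r * B
```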
MMV Empirical Comparison: 1000 trials
Summary
- Sparse Signal Recovery (SSR) and Compressed Sensing (CS) are interesting new signal processing tools with many potential applications.
- Many algorithmic options exist to solve the underlying sparse signal recovery problem: greedy search techniques, regularization methods, and Bayesian methods, among others.
- Bayesian methods offer interesting algorithmic options for the sparse signal recovery problem:
  - MAP methods (reweighted l_1 and l_2 methods)
  - Hierarchical Bayesian methods (Sparse Bayesian Learning)
- They are versatile and can be more easily employed in problems with structure.
- The algorithms can often be justified by studying the resulting objective functions.