Compressive Inference
Weihong Guo and Dan Yang
Case Western Reserve University and SAMSI
SAMSI Transition Workshop
Project of the Compressive Inference subgroup of the Imaging WG
Active members: Garvesh Raskutti, Jiayang Sun and Grace Yi Wang
May 22, 2013
Outline
1 Compressive Sensing
2 Compressive Inference
3 Method
4 Simulation
5 Conclusion
Compressive Sensing
Why: it takes too long and costs too much to collect data, too much space to store it, and too much time to analyze it or retrieve information from it; in medical applications, longer acquisitions also increase the risk of developing cancer.
[Figure: a volume of human brain scans, 175 slices. (Courtesy: oasis-brains.org.)]
Traditional vs. Compressive Sensing (CS)
[Figure: Fourier-domain sampling and the corresponding image-domain reconstructions, traditional vs. compressive acquisition.]
Formulation
Continuous setup: f : X → R is the intensity (or difference) function of an image; X ⊂ R^d is the ROI. Example: [0, 1] or [0, 1]^2.
Discrete setup: f = (f(x_1), f(x_2), ..., f(x_p)). Example: difference of intensities at the grid (1/p, 2/p, ..., 1) for grayscale images.
Compressive sensing: observe y = Af + ε, where A ∈ R^{n×p} is a sampling matrix with n ≪ p satisfying the RIP, and ε ~ N(0, σ^2 I_n). Goal: recover f or make inference about f.
Comparison with a statistical latent variable model: y = f + ε, z = Ay + γ.
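A minimal Python sketch (not from the slides) of the discrete measurement model above, assuming a Gaussian sampling matrix with i.i.d. N(0, 1/n) entries as in the simulation section; the grid size p, sample size n, and the bump signal f are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

p, n, sigma = 1000, 200, 1.0                   # grid size, measurements (n << p), noise sd
x = np.arange(1, p + 1) / p                     # grid (1/p, 2/p, ..., 1)
f = np.exp(-1e4 * (x - 0.5) ** 2)               # a smooth "image difference" (bump at 0.5)

A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, p))   # sampling matrix, A_ij ~ N(0, 1/n)
eps = rng.normal(0.0, sigma, n)                 # noise, eps ~ N(0, sigma^2 I_n)
y = A @ f + eps                                 # compressive observations
```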
Example
Low-dimensional information retrieval.
[Figure. (Courtesy: sadies-brain-tumor.org)]
Compressive Inference
Recall: given data y = Af + ε, make inference about f from y.
Examples:
1 Test H_0: f(x) = 0 for all x ∈ X
  - Applicable to comparison of images
  - Aside: find the support of f, i.e., {x : f(x) ≠ 0}
  - Multiple hypothesis testing
2 Test H_0: f = 0 vs H_1: f = s, where s is known or unknown
  - Davenport et al. (2006, 2010), Duarte et al. (2006)
  - Single hypothesis testing
  - Likelihood or Hotelling T^2
3 Test H_0i: f_i = 0 vs H_1i: f_i ≠ 0
  - Bühlmann (2012)
  - Multiple hypothesis testing
  - Sparsity (no function perspective)
  - Combination of LASSO and ridge
  - Discrete; conservative
Key: compressive inference exploits the compressibility of a smooth image (continuous f), so that complex information can be obtained from a small amount of information y.
Method
Smoothness assumption on f
Estimation of f by kernel ridge regression
Tube method on compressed sensing
Smoothness Assumption
Need to impose assumptions on the class of image-difference functions f. E.g. polynomial, Lipschitz, smoothing spline: all special cases of reproducing kernel Hilbert spaces (RKHS).
Important properties:
Mercer's theorem: K : X × X → R, K(x, x') = Σ_{k=1}^∞ λ_k φ_k(x) φ_k(x')
Statistical complexity: decay of the eigenvalues λ_k
Example
Lipschitz kernel: K(x, x') = min{x, x'}
Function class: {f : f ∈ L^2, f' ∈ L^2}
Corresponds to the Sobolev class with smoothness α = 1
Other examples
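A small sketch, assuming the grid x_i = i/p from the discrete setup, of the Gram matrix for the kernel K(x, x') = min{x, x'}; the grid size is an illustrative choice.

```python
import numpy as np

p = 500
x = np.arange(1, p + 1) / p
K = np.minimum.outer(x, x)   # (p, p) Gram matrix with K[i, j] = min(x_i, x_j)
```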
From Assumption to Penalty
Expansion of f in the RKHS:
K(x, x') = Σ_{k=1}^∞ λ_k φ_k(x) φ_k(x'),   f(x) = Σ_{k=1}^∞ a_k φ_k(x)
Hilbert ball of radius ρ:
B_H(ρ) = {f : ‖f‖_H^2 = Σ_k a_k^2 / λ_k ≤ ρ^2}
Kernel ridge regression:
min ‖y − Af‖_2^2 + λ ‖f‖_H^2   ⇔   min ‖y − AΦa‖_2^2 + λ a^T Λ^{−1} a
where (Φ)_{ik} = φ_k(x_i), Λ = diag(λ_1, λ_2, ...), a = (a_1, a_2, ...)^T.
Estimator
Minimizer: f̂_λ(x) = κ(x)^T A^T (A K A^T + λI)^{−1} y =: ⟨l(x), y⟩
where (K)_{ij} = K(x_i, x_j) and κ(x) = (K(x, x_1), ..., K(x, x_p))^T.
Linear in y.
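A sketch of the closed-form kernel ridge estimator on the grid, not the authors' code; the function name and the regularization parameter are placeholders. The rows of L are the weight vectors l(x_i), so the fit is linear in y as noted above.

```python
import numpy as np

def krr_weights(A, K, lam):
    """Weight matrix L with rows l(x_i)^T, so that f_hat = L @ y.

    A   : (n, p) sampling matrix
    K   : (p, p) kernel Gram matrix on the grid, K[i, j] = K(x_i, x_j)
    lam : ridge penalty lambda
    """
    n = A.shape[0]
    M = A @ K @ A.T + lam * np.eye(n)    # n x n system (cheap since n << p)
    # row i of L is kappa(x_i)^T A^T M^{-1}; solve rather than invert explicitly
    return np.linalg.solve(M, A @ K).T   # shape (p, n)

# usage: L = krr_weights(A, K, lam); f_hat = L @ y
```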
SCR via Tube Method
Confidence bands: f̂(x) ± c σ̂ ‖l(x)‖
Simultaneous: α = P(|f̂(x) − f(x)| > c σ̂ ‖l(x)‖ for some x ∈ X)
Tube method (Sun and Loader, 1994) for d = 1:
α ≈ (κ_0 / π) (1 + c^2/ν)^{−ν/2} + P(|t_ν| > c),
where κ_0 can be derived from l(x).
Decision: reject H_0 if there exists x such that |f̂(x)| / (σ̂ ‖l(x)‖) > c.
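A sketch, under the stated d = 1 tube approximation, of how the critical value c could be found numerically; κ_0 and the degrees of freedom ν are assumed to have been computed from l(x) already, and the solver bounds are illustrative.

```python
import numpy as np
from scipy import optimize, stats

def tube_critical_value(alpha, kappa0, nu):
    """Solve (kappa0/pi) (1 + c^2/nu)^(-nu/2) + P(|t_nu| > c) = alpha for c."""
    def excess(c):
        tail = 2.0 * stats.t.sf(c, df=nu)                     # P(|t_nu| > c)
        return (kappa0 / np.pi) * (1.0 + c**2 / nu) ** (-nu / 2.0) + tail - alpha
    return optimize.brentq(excess, 1e-8, 50.0)                # left side decreases in c

# usage: c = tube_critical_value(0.05, kappa0, nu)
```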
Simulation Setup
1 Fix p; vary n.
2 Consider functions f = 0, f_1, f_2 evaluated at x_i = i/p.
3 Generate A ∈ R^{n×p} with A_ij iid N(0, 1/n).
4 Generate ε ~ N(0, I_n).
5 Set y = Af + ε.
Hypotheses compared:
Tube:        H_0: f(x) = 0 for all x ∈ X
Bonferroni:  H_0i: f(x_i) = 0 vs H_1i: f(x_i) ≠ 0
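A self-contained Monte Carlo sketch of the size computation under H_0: f = 0, assembling the model, estimator, and tube critical value described earlier. The kernel, λ, ν, the path-length estimate of κ_0, treating σ as known, and all numeric settings are illustrative assumptions; this is not the code behind the table below.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)

# illustrative settings (not the values used for the table)
p, n, sigma, lam, alpha, n_rep = 200, 60, 1.0, 1e-3, 0.05, 500

x = np.arange(1, p + 1) / p
K = np.minimum.outer(x, x)                       # min{x, x'} kernel on the grid

A = rng.normal(0.0, 1.0 / np.sqrt(n), (n, p))    # A_ij ~ N(0, 1/n), fixed across reps
M = A @ K @ A.T + lam * np.eye(n)
L = np.linalg.solve(M, A @ K).T                  # rows are l(x_i), so f_hat = L @ y
norms = np.linalg.norm(L, axis=1)                # ||l(x_i)||

# kappa_0 estimated as the length of the normalized weight path (assumption)
T = L / norms[:, None]
kappa0 = np.linalg.norm(np.diff(T, axis=0), axis=1).sum()
nu = n - 1                                       # plug-in degrees of freedom (assumption)

def tube_critical_value(a):
    excess = lambda c: (kappa0 / np.pi) * (1 + c**2 / nu) ** (-nu / 2) \
        + 2 * stats.t.sf(c, df=nu) - a
    return optimize.brentq(excess, 1e-8, 50.0)

c = tube_critical_value(alpha)

# Monte Carlo size under H_0: f = 0, so y is pure noise
rejections = 0
for _ in range(n_rep):
    y = rng.normal(0.0, sigma, n)
    f_hat = L @ y
    if np.max(np.abs(f_hat) / (sigma * norms)) > c:   # sigma treated as known here
        rejections += 1

print("empirical size:", rejections / n_rep)
```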
Table: Test size (T: tube method; B: Bonferroni).

          n/p:   10%     20%     30%     40%     50%     100%
α = 1%    T      0.90%   1.20%   1.10%   0.70%   1.10%   0.70%
          B      0.20%   0.20%   0.00%   0.20%   0.10%   0.00%
α = 5%    T      4.80%   4.50%   5.30%   4.40%   3.10%   3.70%
          B      0.30%   0.50%   0.30%   0.60%   0.80%   0.00%
α = 10%   T      9.10%   9.40%   8.90%   9.60%   7.70%   7.70%
          B      0.40%   1.00%   1.00%   1.10%   1.10%   0.00%
Figure: f_1(x) = 2δ |x − 0.5|, x ∈ [0, 1]; f_2(x) = δ exp(−10^4 (x − 0.5)^2), x ∈ [0, 1].
Figure: Test power for f_1(x) = 2δ |x − 0.5|, x ∈ [0, 1].
Figure: Test power for f_2(x) = δ exp(−10^4 (x − 0.5)^2), x ∈ [0, 1].
Future Work
Multidimensional images (d ≥ 2); real images and video
Automating the choice of λ
Supervised selection of K
More asymptotics
Conditional tube formula (A random)
Software
Real applications: medical imaging, hidden messages, security monitoring, etc.
Thank you!