Peak Detection for Images
Armin Schwartzman
Division of Biostatistics, UC San Diego
June 2016

Overview
How can we improve detection power?
  Use a less conservative error criterion
  Take advantage of prior knowledge on the signal
How do we know the algorithm works?
  Show theoretical error control and power consistency
  Show practical error control and power consistency (simulations)

Error Criteria in Multiple Testing
Family-wise error rate (FWER): probability of getting at least one false positive
  Computed under the complete null hypothesis of no signal anywhere
  Does not consider the signal
False discovery rate (FDR): expected proportion of false positives among positives (Benjamini and Hochberg)
  Computed under the actual distribution of the data, including the signal

False Discovery Rate (FDR)
False discovery proportion (FDP): proportion of false discoveries,
  $\mathrm{FDP} = \dfrac{V}{\max(R, 1)} = \dfrac{\#\,\text{false discoveries}}{\#\,\text{discoveries}}$
False discovery rate (FDR): expected false discovery proportion,
  $\mathrm{FDR} = E[\mathrm{FDP}] = E\left[\dfrac{V}{\max(R, 1)}\right]$

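The FDP definition above can be illustrated with a small sketch. It assumes the truth is known, as in a simulation; the function name and the toy labels are hypothetical. The FDR is the average of this quantity over repeated data sets.

```python
def false_discovery_proportion(rejected, is_null):
    """FDP = V / max(R, 1) for two boolean lists of equal length."""
    if len(rejected) != len(is_null):
        raise ValueError("inputs must have the same length")
    # R: total number of discoveries (rejected hypotheses)
    r = sum(1 for rej in rejected if rej)
    # V: discoveries that are actually true nulls (false discoveries)
    v = sum(1 for rej, null in zip(rejected, is_null) if rej and null)
    return v / max(r, 1)


# Toy example: 3 discoveries, 1 of which is a true null.
rejected = [True, True, False, True, False]
is_null = [False, True, True, False, True]
print(false_discovery_proportion(rejected, is_null))
```

The `max(R, 1)` in the denominator makes the FDP equal to 0 when nothing is rejected.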
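Benjamini-Hochberg Procedure
The BH procedure: given m ordered p-values $p_{(1)} < \dots < p_{(m)}$, reject the first k null hypotheses, where
  $k = \max\{\, i : m\,p_{(i)} / i < \alpha \,\}$
Theorem: if the p-values are independent or weakly positively dependent, then
  $\mathrm{FDR} \le \dfrac{m_0}{m}\,\alpha \le \alpha$
where $m_0$ is the number of true null hypotheses.
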
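A minimal sketch of the BH step-up rule follows. The function name and the example p-values are illustrative, not from the slides. The comparison is written non-strictly, as in the usual statement of the procedure; with continuous p-values the strict and non-strict versions coincide with probability one.

```python
def benjamini_hochberg(pvalues, alpha):
    """Return the (sorted) indices of the hypotheses rejected by BH at level alpha."""
    m = len(pvalues)
    # Indices of the p-values in increasing order of p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank i with m * p_(i) / i <= alpha.
        if m * pvalues[idx] / rank <= alpha:
            k = rank
    # Reject the k smallest p-values, even if some intermediate ranks failed.
    return sorted(order[:k])


pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals, alpha=0.05))  # prints [0, 1]
```

Because the rule takes the largest qualifying rank, a small p-value can be rejected even when its own rank fails the comparison, as long as a later rank passes.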
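Fluorescence Nanoscopy
[Image: fluorescence nanoscopy data; scale bar 1 μm. Courtesy: Alex Egner, Göttingen]

Background Adjusted
[Image: background-adjusted data; scale bar 1 μm]

Pixelwise BH
[Image: pixelwise BH result; scale bar 1 μm]

Kernel Smoothing
[Image: smoothed data; scale bar 1 μm. Bandwidth = 1; $\tilde{m} = 338$ local maxima]

STEM Algorithm
[Image: STEM result; scale bar 1 μm. BH at FDR level α = 0.001; intensity threshold; R = 3 significant peaks]
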
Inference for Images
Approach: image processing
  Regularized estimation, sparse estimates, dimension reduction, peak detection
  Driven by fit, not error control
  Threshold is often heuristic
Approach: statistical, controlling the familywise error rate (FWER)
  Widely used in neuroimaging
  Random field theory
  Inference at the level of regions
  Too conservative (false negatives)
Approach: statistical, controlling the false discovery rate (FDR)
  Widely used in genomics
  Less conservative
  Discrete tests
  Inference about voxels

Statistical Peak Detection
Yields inference about spatial clusters, not voxels.
Significant clusters correspond to significant local maxima or minima.

N-Dimensional Model
Signal: unimodal peaks
  $\mu(t) = \sum_{j=1}^{J} a_j h_j(t), \quad \int h_j(t)\,dt = 1, \quad t \in [0, L]^N$
[Figure: example with L = 200, h = Gaussian, J = 6]

N-Dimensional Model
Add stationary ergodic Gaussian noise:
  $y(t) = \mu(t) + z(t), \quad z(t) \sim N(0, \sigma^2), \quad E[z(t)\,z(t+s)] = c(s)$
[Figure: example with σ = 1, c(s) = Gaussian]

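The signal-plus-noise model can be sketched in one dimension as follows. This is a simplified illustration under stated assumptions: the domain is an integer grid, the noise is white (so c(s) is zero for s ≠ 0) rather than having the Gaussian autocorrelation shown on the slide, and all function names and parameter values are hypothetical.

```python
import math
import random


def gaussian_bump(t, center, sd):
    """Gaussian density centred at `center`; integrates to 1 over the real line."""
    return math.exp(-0.5 * ((t - center) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def simulate(length, centers, heights, peak_sd, noise_sd, seed=0):
    """Return (mu, y): the signal mu(t) and the observation y(t) = mu(t) + z(t)."""
    rng = random.Random(seed)
    # mu(t) = sum_j a_j h_j(t), with each h_j a unit-mass Gaussian peak.
    mu = [sum(a * gaussian_bump(t, c, peak_sd) for a, c in zip(heights, centers))
          for t in range(length)]
    # z(t): independent N(0, noise_sd^2) draws (white noise, an assumption here).
    z = [rng.gauss(0.0, noise_sd) for _ in range(length)]
    y = [m + e for m, e in zip(mu, z)]
    return mu, y


mu, y = simulate(length=200, centers=[50, 150], heights=[10.0, 10.0],
                 peak_sd=3.0, noise_sd=1.0, seed=1)
print(len(y), round(max(mu), 3))
```

Correlated noise could be obtained by convolving the white noise with a kernel before adding it to the signal.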
Pointwise Multiple Testing
Every pixel has a p-value
Number of tests is fixed: $m = L^2 = 4 \times 10^4$ tests
Benjamini-Hochberg (BH) at FDR level α = 0.1

FWER Using the Supremum
FWER = probability that the supremum exceeds the threshold, approximated by the expected Euler characteristic
FWER level α = 0.05

Pointwise Thresholding
Every pixel has a p-value
Number of tests is fixed: $m = L^2 = 4 \times 10^4$ tests
FDR level α = 0.1

The STEM Algorithm
1. Smooth with a unimodal kernel
2. Find all the local maxima
3. Compute the p-value of each local maximum
4. Correct for multiple testing
STEM = Smooth and TEst Maxima

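1. Kernel Smoothing
Unimodal kernel; best if it has the same shape as the signal
  $x(t) = w_\gamma(t) * y(t) = \int w_\gamma(t - s)\,y(s)\,ds, \quad w_\gamma(t) = \dfrac{1}{\gamma^N}\,w\!\left(\dfrac{t}{\gamma}\right)$
[Figure: smoothed example with bandwidth γ = 6]
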
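The smoothing step can be sketched as a discrete convolution in one dimension. The Gaussian kernel shape, the truncation at 4 bandwidths, and the zero padding at the borders are assumptions of this sketch; the function names are hypothetical.

```python
import math


def gaussian_kernel(bandwidth, truncate=4.0):
    """Discrete Gaussian weights on [-radius, radius], normalised to sum to 1."""
    radius = int(truncate * bandwidth)
    weights = [math.exp(-0.5 * (k / bandwidth) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(y, bandwidth):
    """Convolve the sequence y with a Gaussian kernel, treating values outside y as 0."""
    kernel = gaussian_kernel(bandwidth)
    radius = len(kernel) // 2
    n = len(y)
    out = []
    for t in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            s = t + k - radius  # position in y covered by this kernel weight
            if 0 <= s < n:
                acc += w * y[s]
        out.append(acc)
    return out


# Smoothing a unit impulse returns the kernel itself, centred on the impulse.
impulse = [0.0] * 20 + [1.0] + [0.0] * 20
x = smooth(impulse, bandwidth=2.0)
print(x.index(max(x)))  # prints 20
```

Since the kernel is symmetric, convolution and correlation give the same result here.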
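2. Find Local Maxima
Local maxima are candidate peaks:
  $\tilde{T} = \{\, t \in [0, L]^N : \nabla x(t) = 0,\ \nabla^2 x(t) \prec 0 \,\}$
Dimension reduction: from $L^2 = 4 \times 10^4$ pixels to $\tilde{m} = 56$ local maxima
The number of local maxima is random!
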
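On a pixel grid, the continuous conditions (zero gradient, negative definite Hessian) need a discrete stand-in. The sketch below assumes one common choice: a pixel is a local maximum if it is strictly larger than its 8 neighbours. Border pixels are skipped. The function name is hypothetical.

```python
def local_maxima_2d(x):
    """Return (row, col) positions strictly larger than all 8 neighbours."""
    rows, cols = len(x), len(x[0])
    peaks = []
    # Skip the border so every candidate has a full 3 x 3 neighbourhood.
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            centre = x[i][j]
            neighbours = [x[i + di][j + dj]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)
                          if (di, dj) != (0, 0)]
            if all(centre > v for v in neighbours):
                peaks.append((i, j))
    return peaks


# A single smooth bump centred at row 2, column 3.
image = [[-((i - 2) ** 2 + (j - 3) ** 2) for j in range(6)] for i in range(5)]
print(local_maxima_2d(image))  # prints [(2, 3)]
```

The strict inequality means flat plateaus produce no local maxima, which is harmless for continuous-valued smoothed data.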
3. Compute p-values
At each local maximum, test the hypothesis
  $H_0(t): \mu(t) = 0 \quad \text{vs.} \quad H_A(t): \mu(t) > 0, \quad t \in \tilde{T}$
Define the distribution of the height of a local maximum under the complete null hypothesis $\mu(t) \equiv 0$ (Palm distribution):
  $F(u) = P[\, z(t) > u \mid t \text{ is a local max} \,]$
For each observed local maximum, its p-value is computed as
  $p(t) = F[x(t)], \quad t \in \tilde{T}$

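When a closed form for F is not at hand, the null distribution of peak heights can be approximated by simulation. The sketch below swaps in a Monte Carlo estimate for the Palm distribution; it is not the closed-form approach of the slides. It assumes a 1D grid, unit-variance white noise before smoothing, a truncated Gaussian kernel, and the usual "+1" Monte Carlo correction; all names are hypothetical.

```python
import bisect
import math
import random


def smooth_valid(y, bandwidth, truncate=3.0):
    """Gaussian smoothing, keeping only positions where the kernel fits inside y."""
    radius = int(truncate * bandwidth)
    weights = [math.exp(-0.5 * (k / bandwidth) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * y[t + k - radius] for k, w in enumerate(weights))
            for t in range(radius, len(y) - radius)]


def local_max_heights(x):
    """Heights of strict interior local maxima of a sequence."""
    return [x[t] for t in range(1, len(x) - 1) if x[t - 1] < x[t] > x[t + 1]]


def null_peak_heights(length, bandwidth, n_rep, seed=0):
    """Sorted heights of local maxima of smoothed pure noise, pooled over n_rep runs."""
    rng = random.Random(seed)
    heights = []
    for _ in range(n_rep):
        noise = [rng.gauss(0.0, 1.0) for _ in range(length)]
        heights.extend(local_max_heights(smooth_valid(noise, bandwidth)))
    return sorted(heights)


def peak_pvalue(height, null_heights):
    """Monte Carlo estimate of P[null peak height >= observed height]."""
    n_greater_equal = len(null_heights) - bisect.bisect_left(null_heights, height)
    return (n_greater_equal + 1) / (len(null_heights) + 1)


null = null_peak_heights(length=300, bandwidth=3.0, n_rep=50, seed=1)
print(len(null), peak_pvalue(1.0, null))
```

The observed data must be smoothed with the same kernel and scaled to the same noise level as the simulated null for these p-values to be meaningful.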
Isotropic Gaussian Fields
In the case N = 2, the density f(u) of the height of a local maximum has a closed form in terms of the standard normal density φ and distribution function Φ, with parameters σ and κ.
[Equation: closed-form expression for f(u)]
In practice, κ is estimated from the data.

4. Multiple Testing
Get the threshold for the p-values using Benjamini-Hochberg (BH) at level α = 0.1
Number of tests is random: $\tilde{m} = 56$ tests

Summary So Far
The problem: find local significant regions in continuous domains
What we have: test local maxima after smoothing (STEM)
What we are still missing:
  Distribution of local maxima
  Theoretical justification
  Simulations
  Data example

Theory: Assumptions
True peaks are unimodal, with finite support, and twice differentiable after kernel smoothing
Noise is Gaussian, stationary, ergodic, and thrice differentiable after kernel smoothing
Kernel is unimodal with finite support

Theory: Error Definitions
FDR = expected fraction of false positive local maxima among significant local maxima
Power = expected fraction of detected peaks
[Figure: a true peak and its smoothed version, with local maxima labeled as true or false discoveries]

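Theory: Main Result
Let $L \to \infty$ and $a = \inf_j a_j \to \infty$ with $(\log L)/a^2 \to 0$, and assume
  # peaks: $J / L^N = A_1 + O(a^{-2}), \quad 0 < A_1$
  Signal volume: $|S_1| / L^N = A_2 + O(a^{-2}), \quad 0 < A_2 < 1$
Let $u_{BH}$ be the BH threshold at level α. Then
  1. Error control: $\mathrm{FDR}(u_{BH}) \le \alpha$, up to remainder terms in $L^N$ and a that vanish in the limit
  2. Consistency: $\mathrm{Power}(u_{BH}) \to 1$, with a remainder term in a that vanishes in the limit
In repeated samples: $a = \sqrt{n}$, so $a^2 = n$.
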
Simulation: Setup
L = 200
J = 9 peaks, regularly spaced
$h_b(t)$ is an isotropic Gaussian kernel with std. dev. b = 3, truncated at ±b
$z(t)$ is zero-mean Gaussian noise with a Gaussian isotropic autocorrelation function with bandwidth ν = 1, 2, 3
Smoothing Gaussian kernel $w_\gamma(t) = h_\gamma(t)$
2,000 repetitions

Simulation: Performance
[Figure: FDR and power; panels ν = 0, ν = 1, ν = 2; curves for a = 30, a = 40, a = 50]

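Optimal Bandwidth
How to choose the kernel bandwidth γ?
  Small γ: high noise variance
  Large γ: low signal
Optimal bandwidth maximizes the SNR:
  $\arg\max \mathrm{Power}(u) \approx \arg\max P[x(\tau) > u] \approx \arg\max \dfrac{h(0)}{\sigma}$
  where h and σ are the peak shape and noise std. dev. after smoothing
For Gaussian peaks (std. dev. b), a Gaussian ACF (std. dev. ν) and a Gaussian kernel (std. dev. γ), the optimal bandwidth $\gamma_{opt}$ is a closed-form function of b and ν.
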
Functional MRI (fMRI)

fMRI Experiment: Social Thinking
[Figure: block design over time, alternating off and on periods; on blocks are labeled Social or Nonsocial]
Where in the brain is social information processed?
Find differences between on and off states
Moran et al. 2009, OpenfMRI.org

fMRI Data: Voxelwise Analysis
Data: 71 × 7 × 36 × 179 array (space × time)
At each voxel s:
1. Fit a linear model over time (Y = observations, X = stimulus):
  $Y(s) = X\beta(s) + \varepsilon(s), \quad \hat\beta(s) = (X^T X)^{-1} X^T Y(s)$
2. Compute a test statistic:
  $\tilde\beta(s) = \dfrac{\hat\beta(s)}{\mathrm{se}[\hat\beta(s)]}$

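For a single voxel, the two steps above can be sketched as follows. This assumes the simplest design, an intercept plus one on/off stimulus regressor with independent errors, for which the least-squares solution has a closed form. The function name and the toy time series are hypothetical.

```python
import math


def slope_t_statistic(stimulus, response):
    """OLS fit of response = intercept + beta * stimulus; returns (beta, se, t)."""
    n = len(stimulus)
    mean_x = sum(stimulus) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in stimulus)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(stimulus, response))
    beta = sxy / sxx                      # least-squares slope
    intercept = mean_y - beta * mean_x
    # Residual sum of squares, with n - 2 degrees of freedom (two fitted parameters).
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(stimulus, response))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    return beta, se, beta / se


# Toy voxel time series: off/on alternation with a response of about 2 units.
stimulus = [0, 1, 0, 1, 0, 1]
response = [1.0, 3.1, 0.9, 2.9, 1.1, 3.0]
beta, se, t = slope_t_statistic(stimulus, response)
print(round(beta, 3), round(t, 2))
```

Repeating this at every voxel yields the statistic map that is then smoothed and searched for local maxima.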
STEM Algorithm
[Image: STEM result on the fMRI data. Bandwidth = 1.6; $\tilde{m} = 334$ local maxima; BH at FDR level α = 0.05; R = 55 significant peaks]

Summary
The problem: find local significant regions in continuous domains
What we did: test local maxima after smoothing (STEM)
  The algorithm searches for peaks and measures error in terms of peaks
Conclusion: inference should be about the right features