Noise Analysis of Regularized EM for SPECT Reconstruction

Wenli Wang and Gene Gindi
Departments of Electrical Engineering and Radiology
SUNY at Stony Brook, Stony Brook, NY 11794

Abstract

The ability to theoretically model the propagation of photon noise through tomographic reconstruction algorithms is crucial in evaluating reconstructed image quality as a function of parameters of the algorithm. Here, we show the theoretical expressions for the propagation of Poisson noise through tomographic SPECT reconstructions using regularized EM algorithms with independent Gamma and multivariate Gaussian priors. Our analysis extends the work in [1], in which judicious linearizations were used to enable the propagation of a mean image and covariance matrix from one iteration to the next for the (unregularized) EM algorithm. To validate our theoretical analyses, we use the methodology in [2] to compare the results of theoretical calculations to Monte Carlo simulations. We also demonstrate an application of the theory to the calculation of an optimal smoothing parameter for a regularized reconstruction. The smoothing parameter is optimal in the context of a quantitation task, defined as the minimization of the expected mean-square error of an estimated number of counts in a hot lesion region. Our results thus demonstrate how the theory can be applied to a problem of potential practical use.

I. INTRODUCTION

Reconstruction algorithms are often justified in terms of simple image-quality metrics such as rms error, but a more meaningful approach advocated in recent years is to base the justification on task-performance metrics. In this approach, reconstructions are obtained for an ensemble of representative objects and noise realizations. A task is defined (e.g. lesion quantitation) and the task is performed by a mathematical observer that derives some test statistic (e.g. counts in a region) for each of the reconstructions.
An algorithm is successful insofar as it yields good average performance according to some criterion (e.g. low bias and variance in the quantitation estimate). While the above approach lends itself readily to Monte Carlo (MC) methods, it would be more usefully employed in a theoretical approach that enabled one to predict task-performance statistics as a function of object and noise statistics. Such an approach, advocated in [3], is readily applied to linear reconstruction algorithms, but becomes more difficult to apply to the nonlinear algorithms of much interest in SPECT. The difficulty here lies in the theoretical modeling of noise propagation through the nonlinear stages of these typically iterative algorithms. For the ML-EM algorithm, a solution to this problem was reported in [1]. Our own interest lies in Bayesian algorithms, where much anecdotal evidence touts the apparent efficacy of including prior information in the reconstruction.

(This paper appeared in the 1996 IEEE Nuclear Science Symposium Conference Record, pp. 933-937, Anaheim, California.)

In this work, we report two advances: (1) We show how to extend the noise-propagation formulae of [1] to the case of MAP-EM. In particular, we consider the cases of independent Gamma and multivariate Gaussian priors. (The latter category includes familiar smoothing priors.) For each case, we show how first- and second-order noise statistics are propagated through the MAP-EM algorithm. (2) We then show how such formulae may be used to solve a vexing problem associated with Bayesian approaches, namely the determination of λ, the strength of a smoothing prior. Here, λ is determined via a task-performance metric involving region-of-interest (ROI) quantitation.

II. REGULARIZED EM ALGORITHMS

In SPECT, the forward projection process can be described by

    G = Hf + N                                                    (1)

where the N-dimensional vector f denotes the unknown object and the M-dimensional random vector G denotes the projection data.
(Note that in (1) we use a single subscript to lexicographically order the 2-D quantities f and G.) Here H is the M × N system matrix. Its element H_{mn} is the probability that a photon emitted from object pixel n will be detected at data bin m. In SPECT, it includes the (approximately linear) effects of attenuation, scatter and detector response. The M-dimensional random vector N is the object-dependent Poisson noise in the projection data. To summarize notation: upper-case bold quantities denote random vectors, lower-case bold quantities denote deterministic vectors, and calligraphic letters denote matrices, with corresponding upper-case letters denoting matrix elements. We also use the convenient Hadamard notation [1] [4], in which ab and a/b denote vectors whose nth components are a_n b_n and a_n / b_n, respectively, where a_n and b_n are the nth components of a and b. Likewise, \log a and \exp a are vectors comprising components \log a_n and \exp a_n, respectively. Dot and matrix products are denoted a^T b and Ha, with T indicating a transpose. With this notation, the familiar ML-EM algorithm becomes [1]

    \hat{F}^{k+1} = \frac{\hat{F}^k}{s} \, H^T \left[ \frac{G}{H\hat{F}^k} \right]

where \hat{F}^k is the object estimate at iteration k and s is the
sensitivity vector defined as s = H^T 1, where 1 is an M-dimensional vector with all elements equal to one. Note that \hat{F}^k is a random vector since it depends on N. We may now list the two MAP-EM algorithms to be analyzed. The first, a MAP-EM algorithm for an independent Gamma prior [5] [4], is

    \hat{F}^{k+1} = \frac{\hat{F}^k \, H^T \left[ \frac{G}{H\hat{F}^k} \right] + q}{s + c}                    (2)

where c and q are N-dimensional vectors with nth components c_n = \alpha_n / \mu_n and q_n = \alpha_n - 1, respectively. The quantities \mu_n and \mu_n^2 / \alpha_n are the mean and variance of the Gamma prior

    \Pr(f_n) = \frac{(\alpha_n/\mu_n)^{\alpha_n}}{\Gamma(\alpha_n)} \, f_n^{\alpha_n - 1} \exp(-\alpha_n f_n / \mu_n)

for the nth object pixel. The Gamma prior is thus not a smoothing prior but steers each pixel estimate toward a predetermined value \mu_n, so a mean image is required. The second algorithm is the One-Step-Late (OSL) procedure of Green [6]. It is not a true MAP-EM algorithm, but if it converges, it converges to the MAP solution [7]. The regularized EM algorithm for a multivariate Gaussian prior, using the OSL strategy [6] [4], can be shown [4] to be

    \hat{F}^{k+1} = \frac{\hat{F}^k}{s + K^{-1}(\hat{F}^k - m)} \, H^T \left[ \frac{G}{H\hat{F}^k} \right]                    (3)

where m and K are the mean and covariance matrix of the multivariate Gaussian prior. We also derived [4] two specializations of the multivariate Gaussian corresponding to two smoothing priors: the membrane prior and the thin-plate prior. Both may be written as Gibbs priors with associated energy functions. The energy function U(f) for the membrane prior is

    U(f) = \sum_n \left[ f_e^2(n) + f_s^2(n) + \tfrac{1}{2} f_{ne}^2(n) + \tfrac{1}{2} f_{se}^2(n) \right]                    (4)

Here, f_e(n), f_s(n), f_{ne}(n) and f_{se}(n) are the first partial derivatives along the horizontal ("east"), vertical ("south"), northeast and southeast directions at the nth pixel, respectively. Each first derivative is approximated as the center pixel minus its appropriate neighbor, so an eight-nearest-neighbor neighborhood is involved. The membrane energy (4) is a special case of a multivariate Gaussian prior with a zero mean vector and a specific covariance matrix. The N × N inverse covariance matrix K_M^{-1}
for the membrane prior can be shown [4] to be a symmetric positive semi-definite sparse matrix with most of its elements zero, except for 9 elements in each row or column. The OSL update for the thin-plate prior is very similar to that for the membrane prior, but is not analyzed here; interested readers can refer to [4].

III. THEORETICAL NOISE ANALYSIS

Here, we establish our noise-propagation formulae using the Gamma prior; the derivation for the other priors follows along similar lines. The derivation here, necessarily skeletal, follows that in [1] but extends the case from ML to MAP. Note that the right-hand side of equation (2) is a multiplicative updating formula, so we take the logarithm of (2) and obtain an additive updating equation:

    Y^{k+1} = Y^k + \log\left( H^T \left[ \frac{G}{H\hat{F}^k} \right] + \frac{q}{\hat{F}^k} \right) - \log(s + c)                    (5)

where Y^k \equiv \log \hat{F}^k. We decompose each of the random vectors G, Y^k and \hat{F}^k in (5) into a mean plus a (zero-mean) noise term:

    G = Hf + N                                                    (6)
    Y^k = \log a^k + N_y^k                                        (7)
    \hat{F}^k = a^k \exp N_y^k \simeq a^k (1 + N_y^k) = a^k + N_{\hat{F}}^k          (8)

where Hf, \log a^k and a^k are the means (conditioned on f) of the random vectors G, Y^k and \hat{F}^k, respectively, and N, N_y^k and N_{\hat{F}}^k are the corresponding noise terms. Recall that we consider only photon noise and assume the object f is given. Thus \hat{F}^k is a random vector by virtue of the noise, as in equation (8), and not by virtue of the prior object density. In (8), N_{\hat{F}}^k = a^k N_y^k, and we have used the first of two approximations: the noise in the reconstructed object is much less than the signal in the reconstructed object, i.e. N_{\hat{F}}^k \ll a^k, which will be approximately true for useful images.
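As a concrete aside before continuing the derivation: the three update rules of Section II reduce to a few lines of array arithmetic. The sketch below is our own illustration (not code from the paper); numpy elementwise operations stand in for the Hadamard vector notation, and all names are ours.

```python
import numpy as np

def ml_em_step(f_k, H, g, s):
    # ML-EM: F^{k+1} = (F^k / s) H^T[ g / (H F^k) ]
    return (f_k / s) * (H.T @ (g / (H @ f_k)))

def map_em_gamma_step(f_k, H, g, s, c, q):
    # MAP-EM with independent Gamma prior, eq. (2):
    # F^{k+1} = (F^k H^T[ g / (H F^k) ] + q) / (s + c)
    return (f_k * (H.T @ (g / (H @ f_k))) + q) / (s + c)

def osl_gaussian_step(f_k, H, g, s, K_inv, m):
    # One-Step-Late update for a multivariate Gaussian prior, eq. (3):
    # F^{k+1} = F^k H^T[ g / (H F^k) ] / (s + K^{-1}(F^k - m))
    return f_k * (H.T @ (g / (H @ f_k))) / (s + K_inv @ (f_k - m))
```

A useful sanity check on these forms: with noise-free data g = Hf, the true object f is an exact fixed point of the ML-EM step (since H^T[Hf/Hf] = s), and of the OSL step when the prior mean m equals f.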
Inserting (6), (7) and (8) into (5), using small-signal approximations that ignore terms quadratic or higher in any of the three quantities N, N_y^k and N_{\hat{F}}^k, and equating random terms to random terms and non-random terms to non-random terms, we obtain the following results (for details see [4]):

The conditional mean of the reconstructed object, E[\hat{F}^k | f] \simeq a^k, can be obtained simply by running the MAP-EM algorithm with the noise-free projection G = Hf. That is,

    a^{k+1} = \frac{a^k \, H^T \left[ \frac{Hf}{Ha^k} \right] + q}{s + c}                    (9)

Such a result (i.e., mean image equals noise-free reconstruction) would hold for any linear algorithm, but is not obvious for MAP-EM.

The noise in the reconstructed object, N_{\hat{F}}^k, can be obtained by a linear operation on N. That is,

    N_{\hat{F}}^k = a^k (U^k N)

where U^k is an N × M matrix satisfying the recursion relation

    U^{k+1} = B^k + [C^k - A^k] U^k, \qquad U^0 = 0                    (10)

with A^k approximately a projection-backprojection operator (H^T H), B^k approximately a backprojection operator (H^T), and C^k a diagonal matrix. The component
forms for the N × N matrix A^k, the N × M matrix B^k and the N × N diagonal matrix C^k are as follows:

    A_{ij}^k = \frac{a_j^k}{s_i + q_i / a_i^k} \sum_{m=1}^{M} \frac{H_{mi} H_{mj}}{\sum_{n=1}^{N} H_{mn} a_n^k}                    (11)

    B_{ij}^k = \frac{H_{ji}}{\left(s_i + q_i / a_i^k\right)\left(\sum_{n=1}^{N} H_{jn} a_n^k\right)}                    (12)

    C_{ii}^k = \frac{s_i}{s_i + q_i / a_i^k}                    (13)

In the derivation of the matrices A^k, B^k and C^k, we have used the second approximation: the projection of the current estimate, Ha^k, fairly closely resembles the noise-free projection Hf of the object once the first few iterations wipe out biases due to the initial estimate a^0; that is, Hf / Ha^k \simeq 1. We could drop this approximation, but it would lead to more complicated forms for A^k, B^k and C^k.

The same strategy applied to equation (3) with a membrane prior leads to the equivalents of equations (9) and (10):

    a^{k+1} = \frac{a^k}{s + K_M^{-1} a^k} \, H^T \left[ \frac{Hf}{Ha^k} \right]                    (14)

    U^{k+1} = B^k + [I - A^k] U^k, \qquad U^0 = 0                    (15)

where I is the N × N identity matrix. The component forms of A^k and B^k for the membrane prior are:

    A_{ij}^k = \frac{a_j^k}{s_i} \sum_{m=1}^{M} \frac{H_{mi} H_{mj}}{\sum_{n=1}^{N} H_{mn} a_n^k} + \frac{[K_M^{-1}]_{ij} \, a_j^k}{s_i + \sum_{n=1}^{N} [K_M^{-1}]_{in} a_n^k}                    (16)

    B_{ij}^k = \frac{H_{ji}}{s_i \left(\sum_{n=1}^{N} H_{jn} a_n^k\right)}                    (17)

For any of our priors, we can write a general expression for the covariance matrix of the reconstructed object given f, denoted K_{\hat{F}|f}^k. This turns out to be

    K_{\hat{F}|f}^k \equiv E[(\hat{F}^k - a^k)(\hat{F}^k - a^k)^T \mid f] \simeq \mathrm{diag}(a^k)\, U^k \,\mathrm{diag}(Hf)\, [U^k]^T \,\mathrm{diag}(a^k)                    (18)

where diag(a^k) is a diagonal matrix with nth diagonal element a_n^k. We may express K_{\hat{F}|f}^k in terms of its ij element as

    [K_{\hat{F}|f}^k]_{ij} = a_i^k a_j^k \sum_m [U^k]_{im} [U^k]_{jm} [Hf]_m

An important special case of the above equation is i = j, which gives the variances of the components of the random vector \hat{F}^k (i.e. the "variance image"). To actually use the theoretical noise-analysis method, we first initialize a^0, compute and save the sequence of noise-free reconstructions a^k for k = 1, ..., K, then use recursion relation (10) or (15) to compute the desired U^K at iteration K, and plug into (18) to get the covariance matrix K_{\hat{F}|f}^K.
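The recipe just described can be sketched compactly. The following Python fragment is our own illustration (not the authors' code), restricted to small dense matrices: it runs the noise-free mean recursion (9), builds A^k, B^k, C^k from (11)-(13), iterates (10), and assembles the covariance via (18) for the Gamma-prior case.

```python
import numpy as np

def noise_propagation_gamma(H, f, a0, q, c, n_iter):
    """Propagate mean a^k and covariance through MAP-EM with a Gamma prior.

    Returns (a^K, K_{F|f}^K) after n_iter iterations, per eqs. (9)-(13), (18).
    Dense-matrix sketch: practical only for toy-sized problems."""
    M, N = H.shape
    s = H.T @ np.ones(M)
    g_bar = H @ f                            # noise-free projection Hf
    a = a0.astype(float).copy()
    U = np.zeros((N, M))                     # U^0 = 0
    for _ in range(n_iter):
        p = H @ a                            # (Ha^k)_m
        d = s + q / a                        # common denominator s_i + q_i/a_i^k
        A = (1.0 / d)[:, None] * (H.T @ (H / p[:, None])) * a[None, :]   # eq. (11)
        B = H.T / (d[:, None] * p[None, :])                              # eq. (12)
        C_diag = s / d                                                   # eq. (13)
        U = B + C_diag[:, None] * U - A @ U                              # eq. (10)
        a = (a * (H.T @ (g_bar / p)) + q) / (s + c)                      # eq. (9)
    V = a[:, None] * U                       # diag(a^K) U^K
    cov = (V * g_bar[None, :]) @ V.T         # eq. (18)
    return a, cov
```

Because A^k is held as a dense N × N matrix, this is only a demonstration of the recursion; for realistic image sizes one would exploit the sparsity of H.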
The conditional mean of \hat{F}^K given f is simply a^K. We thus end up with the first- and second-order reconstruction statistics at iteration K. In [4], we also derive the general lognormal joint density function for \hat{F}^k. In Section V, we illustrate validation results for these formulae using MC trials.

IV. HYPERPARAMETER ESTIMATION USING TASK PERFORMANCE

We may use the theory to estimate the smoothing hyperparameter λ of the membrane prior based on a task-performance criterion. The quantitation task we choose is estimation of the total counts in an ROI. Define the ROI by a binary (0, 1) N-dimensional vector w, with w_i = 1 if pixel i ∈ ROI. An estimate of the true number of counts θ is then given by

    \hat{\theta}_{ROI} = w^T \hat{F}                    (19)

Note that \hat{\theta}_{ROI} is a random variable since it depends on \hat{F}, as well as on the parameter λ. Also note θ = w^T f. The bias and variance of this ROI estimator are calculated in [3], with the bias given by

    b_{ROI} \equiv E[\hat{\theta}_{ROI} \mid f] - \theta = w^T b                    (20)

where b is the bias vector defined as b = E[\hat{F} \mid f] - f, and the variance given by

    \mathrm{var}(\hat{\theta}_{ROI}) \equiv E\left[ \left(\hat{\theta}_{ROI} - E[\hat{\theta}_{ROI}]\right)^2 \mid f \right] = \mathrm{tr}[K_{\hat{F}|f} W]                    (21)

where W is the N × N matrix defined as W = w w^T. Note that var(\hat{\theta}_{ROI}) is not simply the sum of the pointwise variances of the ROI pixels; it also includes contributions from the covariances of all pixel pairs in the ROI. A figure of merit that takes both bias and variance into account is the expected mean-square error (EMSE) [3], given by

    \mathrm{EMSE}(\hat{\theta}_{ROI}) \equiv E[(\hat{\theta}_{ROI} - \theta)^2 \mid f] = b_{ROI}^2 + \mathrm{var}(\hat{\theta}_{ROI})                    (22)

Our optimal λ will be the one that minimizes the EMSE of \hat{\theta}_{ROI}.

V. SIMULATION RESULTS

Following the MC methodology in [2], we validated the theoretical noise-analysis formulae for the cases of no prior (i.e. ML-EM) and the Gamma prior. The phantom (32 × 32) was a uniform disk with radius 13 pixels. We used two projection count levels, 8,000 and 50,000, to represent low and high signal-to-noise ratios.
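In code, the figures of merit of Section IV are one-liners, since tr[K ww^T] collapses to the quadratic form w^T K w. The sketch below is our own illustration; `recon_stats` is a hypothetical mapping from each candidate λ to the pair (a^K, K_{F|f}^K) produced by a noise-propagation routine, and the function names are ours.

```python
import numpy as np

def roi_figures_of_merit(w, a_K, f, cov):
    # Eqs. (20)-(22): bias, variance, and EMSE of theta_hat = w^T F.
    # tr[K w w^T] = w^T K w, so the explicit N x N matrix W is never formed.
    b_roi = w @ (a_K - f)            # eq. (20): w^T (E[F|f] - f)
    var_roi = w @ cov @ w            # eq. (21): includes all ROI pixel-pair covariances
    return b_roi, var_roi, b_roi**2 + var_roi   # eq. (22)

def optimal_lambda(recon_stats, w, f):
    # Pick the smoothing strength minimizing EMSE over the candidate grid,
    # mirroring the "inspection of the EMSE curve" procedure of Section V.
    emse = {lam: roi_figures_of_merit(w, a_K, f, cov)[2]
            for lam, (a_K, cov) in recon_stats.items()}
    return min(emse, key=emse.get), emse
```

Note that the covariance matrix is the expensive ingredient here; once computed, swapping in a different template w costs almost nothing, which is the reuse argument made in the Conclusion.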
For the Gamma prior, the mean was set to the disk for both count levels, and the standard deviation was set to 6.8% and 5.8% of the mean for count levels 8,000 and 50,000, respectively. The sample size was 8,000 realizations, which implies a relative error of about 1.6% [2]. Results (not included here) showed excellent agreement of theory and MC according to the criteria discussed below.

For the membrane prior, we used a phantom, shown in Figure 2A, that included a hot lesion. We found that the OSL convergence ranges for the lesion phantom
with 8,000 and 50,000 total projection counts were approximately λ = [0.1, 1.5] and λ = [0.1, 1.8], respectively. Within these ranges, we found that the subrange [0.2, 0.9] (for both low and high counts) captured a nice bias/variance tradeoff in \hat{\theta}_{ROI}. Our validations were thus based on 4 experiments: (1) 8,000 total projection counts, λ = 0.2; (2) 8,000 total projection counts, λ = 0.9; (3) 50,000 total projection counts, λ = 0.2; (4) 50,000 total projection counts, λ = 0.9. We again checked MC-theory agreement with 8,000 noise realizations. For each experiment, the results [8] for mean, variance and covariance are very consistent between MC simulation and theoretical analysis at iterations 10, 30, 50 and 100. Figures 1A and 1B show the excellent agreement of profiles of the mean and variance images for MC and theory. Figure 1C shows profiles of covariance images, which display the covariation of a given pixel relative to a reference pixel at the center of the lesion, for MC and theory. These figures are for experiment 3 at 100 iterations. Figures 2B-2E show a set of variance images (from left to right) obtained from the theoretical analysis for experiment 3 at iterations 10, 30, 50 and 100, respectively. As the iteration number increases, the effect of the lesion in the variance image gradually disappears, and the variance image finally resembles that of a uniform phantom with some symmetric structure. One explanation is that, since OSL-MAP-EM is a smoothed version of ML-EM, as the iterations go on the neighborhood interactions increase and finally smooth out the larger fluctuations associated with the lesion in the variance image. The symmetric fine structure, apparently due to the sensitivity vector, is not as yet easily explained. Our task was to estimate the total counts in a 3 × 3 lesion template. (We used a template smaller than the lesion to avoid the edge artifacts in the bias image.)
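The MC side of such a validation is straightforward to sketch. The fragment below is our own illustration (not the authors' code): it draws Poisson realizations of the noise-free projection, pushes each through a few OSL iterations with a zero-mean Gaussian smoothing prior (a 1-D first-difference operator stands in for the 2-D membrane prior, with strength `lam` as our stand-in for λ), and compares the ensemble mean against the noise-free reconstruction, which by the analogue of equation (14) should track it closely.

```python
import numpy as np

def osl_step(F, H, g, s, K_inv):
    # OSL update, eq. (3), with a zero-mean Gaussian prior; F has shape (N,).
    return F * (H.T @ (g / (H @ F))) / (s + K_inv @ F)

def mc_validate_mean(H, f, lam, n_iter, n_trials, seed=0):
    """Compare the noise-free reconstruction a^k against the MC ensemble mean."""
    rng = np.random.default_rng(seed)
    M, N = H.shape
    s = H.T @ np.ones(M)
    # 1-D membrane-like prior: K^{-1} = lam * D^T D, D = first differences.
    D = np.diff(np.eye(N), axis=0)
    K_inv = lam * (D.T @ D)
    g_bar = H @ f                                  # noise-free projection
    a = f.astype(float).copy()                     # noise-free recon a^k
    G = rng.poisson(g_bar, size=(n_trials, M)).astype(float)
    F = np.tile(f.astype(float), (n_trials, 1))    # one row per realization
    for _ in range(n_iter):
        a = osl_step(a, H, g_bar, s, K_inv)
        F = F * ((G / (F @ H.T)) @ H) / (s + F @ K_inv.T)   # vectorized OSL
    return a, F.mean(axis=0), F.var(axis=0)
```

Starting both runs at the true object and using high counts keeps the linearization assumptions of Section III comfortably satisfied, which is the regime in which theory and MC are expected to agree.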
The bias, variance and EMSE of the ROI estimator (equations (20), (21) and (22)) were calculated for the low projection counts (8,000) using λ = 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9, and for the high projection counts (50,000) using the same values of λ. The OSL-MAP-EM algorithm was stopped at iteration 100 in all cases. Optimization to find the best λ for quantitation was done simply by inspection of the EMSE-λ curve. Note that these optimal λs are not the same as those that minimize the reconstruction rms error, though calculation shows them to be in the same range in this case. Figures 3A and 3B show the bias-λ, bias²-λ, variance-λ and EMSE-λ curves of the ROI estimator for 8,000 and 50,000 projection counts. The bias (solid line with plus signs) begins as a small positive value at small λ, decreases with increasing λ, and finally becomes a large negative value at large λ. Thus the square of the bias (dashed line) is concave-shaped. The negative bias at high λ is easily understood: as the high-positive-contrast lesion is smoothed more strongly, the high pixel values are spread into the surrounding background, and the values of the pixels in the ROI are lowered. The variance (dash-dotted line) decreases, as expected, with increasing λ. The EMSE (solid line with circles), which is the sum of the bias squared and the variance, is a convex-shaped curve with a minimum. The optimal λs for low and high projection counts were read off as 0.3 and 0.5, respectively.

VI. CONCLUSION

We developed theoretical noise analyses for MAP-EM algorithms and demonstrated one application to task performance. The results here technically apply only to the particular phantom and the two noise levels. To generalize the analysis, one would have to consider a relevant ensemble of objects that adequately captures the range of objects likely to be encountered in the clinic. We note that the theoretical method still requires a significant amount of computation, albeit far less than MC methods.
However, once we have computed the covariance K_{\hat{F}|f}^k, which is the main burden, this same covariance may be reused in support of a variety of task-performance analyses.

VII. ACKNOWLEDGMENTS

We wish to thank Soo-Jin Lee, Ing-Tsung Hsiao and Paul J. Hong for technical help, and the ML-EM folks from the Arizona group, Harrison H. Barrett, Donald W. Wilson and Craig K. Abbey, for useful discussions. This work was supported by a Student Fellowship from the Education and Research Foundation of the Society of Nuclear Medicine, and by grant NS3879 from NIH-NINDS.

VIII. REFERENCES

[1] H. H. Barrett, D. W. Wilson, and B. M. W. Tsui, "Noise Properties of the EM Algorithm: I. Theory," Phys. Med. Biol., 39, pp. 833-846, 1994.
[2] D. W. Wilson, B. M. W. Tsui, and H. H. Barrett, "Noise Properties of the EM Algorithm: II. Monte Carlo Simulations," Phys. Med. Biol., 39, pp. 847-871, 1994.
[3] H. H. Barrett, "Objective Assessment of Image Quality: Effects of Quantum Noise and Object Variability," Journal of the Optical Society of America A, 7(7), pp. 1266-1278, July 1990.
[4] W. Wang and G. Gindi, "Noise Analysis of Regularized EM Algorithms for SPECT," Technical Report MIPL-96-, Depts. of Radiology and Electrical Engineering, State University of New York at Stony Brook, June 1996.
[5] K. Lange, M. Bahn, and R. Little, "A Theoretical Study of Some Maximum Likelihood Algorithms for Emission and Transmission Tomography," IEEE Trans. on Med. Imaging, MI-6(2), pp. 106-114, June 1987.
[6] P. J. Green, "Bayesian Reconstructions from Emission Tomography Data Using a Modified EM Algorithm," IEEE Trans. on Med. Imaging, 9(1), pp. 84-93, Mar. 1990.
[7] P. J. Green, "On Use of the EM Algorithm for Penalized Likelihood Estimation," J. Royal Statist. Soc. B, 52(3), pp. 443-452, 1990.
[8] W. Wang and G. Gindi, "Noise Analysis of Regularized EM Algorithms for SPECT: Validation and Task Performance Application to Quantitation," Technical Report MIPL-96-3, Depts.
of Radiology and Electrical Engineering, State University of New York at Stony Brook, Oct. 1996.
Fig. 1. Horizontal profile comparisons of theoretical and MC results. (A) Mean images, profiles through the lesion center. (B) Variance images, profiles through the image center. (C) Covariance images for the center pixel of the lesion, profiles through the lesion center. Lesion phantom, 50,000 projection counts, reconstructed using OSL-MAP-EM with the membrane prior, λ = 0.2, 100 iterations. Solid line: MC; *: theory.

Fig. 2. Lesion phantom (A) and variance images (B to E) obtained from the theoretical analysis at 10, 30, 50 and 100 iterations; 50,000 projection counts, reconstructed using OSL-MAP-EM with the membrane prior, λ = 0.2.

Fig. 3. Bias-λ, bias²-λ, variance-λ and EMSE-λ curves of the 3 × 3 ROI estimator using the theoretical analysis; lesion phantom, OSL-MAP-EM with the membrane prior, 100 iterations. (A) 8,000 projection counts; (B) 50,000 projection counts. Bias: solid line with plus signs; bias²: dashed line; variance: dash-dotted line; EMSE: solid line with circles.