MEDICAL IMAGING INFORMATICS: Lecture # 1



1 MEDICAL IMAGING INFORMATICS: Lecture # 1 Basics of Medical Imaging Informatics: Estimation Theory. Norbert Schuff, Professor of Radiology, Medical Center and UCSF, Norbert.schuff@ucsf.edu. Medical Imaging Informatics 2010, schuff Slide 1/31

2 What Is Medical Imaging Informatics?
Signal Processing; Digital Image Acquisition; Image Processing and Enhancement; Data Mining; Computational Anatomy; Statistics; Databases; Data-mining; Workflow and Process Modeling and Simulation; Data Management; Picture Archiving and Communication System (PACS); Imaging Informatics for the Enterprise; Image-Enabled Electronic Medical Records; Radiology Information Systems (RIS) and Hospital Information Systems (HIS); Quality Assurance; Archive Integrity and Security; Data Visualization; Image Data Compression; 3D Visualization and Multi-media; DICOM, HL7 and other Standards; Teleradiology; Imaging Vocabularies and Ontologies; Transforming the Radiological Interpretation Process (TRIP)[2]; Computer-Aided Detection and Diagnosis (CAD); Radiology Informatics Education; etc.


3 What Is The Focus Of This Course?
Learn to use computational tools to maximize information and knowledge gain.
Diagram: Measurements, Image, Model, Knowledge.
Pro-active: improve data collection; refine model.
Re-active: extract information; compare with model.

4 Challenge: Maximize Information Gain
1. Q: How can we estimate quantities of interest from a given set of uncertain (noisy) measurements? A: Apply estimation theory (1st lecture by Norbert).
2. Q: How can we measure (quantify) information? A: Apply information theory (2nd lecture by Wang).


5 Estimation Theory: Motivation Example I
Gray/White Matter Segmentation
Figure: hypothetical histogram of intensity.
GM/WM overlap 50:50. Can we do better than flipping a coin?

6 Estimation Theory: Motivation Example II
Goal: Capture dynamic signal on a static background.
Figure panels: high signal to noise; poor signal to noise.
D. Feinberg, Advanced MRI Technologies, Sebastopol, CA

7 Estimation Theory: Motivation Example III
Diffusion Spectrum Imaging: Human Cingulum Bundle
Goal: Capture directions of fiber bundles.
Dr. Van Wedeen, MGH

8 Basic Concepts of Modeling
$\theta$: target of interest, unknown
$\varphi$: measurement
$\hat{\theta}$: estimator, a good guess of $\theta$ based on measurements
Cartoon adapted from: Rajesh P. N. Rao, Bruno A. Olshausen, Probabilistic Models of the Brain. MIT Press

9 Deterministic Model
$\varphi = H\theta + \text{noise}$
$N$ = number of measurements
$M$ = number of states; $M = 1$ is possible
Usually $N > M$ and $\|\text{noise}\|^2 > 0$.
The model is deterministic, because discrete values of $\theta$ are solutions.
Note: 1) We make no assumption about $\theta$. 2) Each value is as likely as any other value.
What is the best estimator under these circumstances?

10 Least-Squares Estimator (LSE)
The best we can do is minimize the noise:
$E_{LSE} = \min_{\theta} \|\text{noise}\|^2 = \min_{\theta} \|\varphi - H\theta\|^2$
Minimizing $E_{LSE}$ with regard to $\theta$ leads to
$\partial E_{LSE} / \partial\theta = 0: \quad H^T(\varphi - H\hat{\theta}_{LSE}) = 0$
$\hat{\theta}_{LSE} = (H^T H)^{-1} H^T \varphi$
LSE is a popular choice for model fitting.
It is useful for obtaining a descriptive measure.
But LSE makes no assumptions about distributions of data or parameters.
It has no basis for statistics (deterministic model).
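As a minimal sketch of the normal-equation solution $(H^T H)\hat{\theta} = H^T\varphi$, the following fits a straight line to a few measurements. The two-column design matrix $H = [1, t]$, the function name `lse_line_fit`, and the sample values are assumptions chosen for illustration and do not come from the slides.

```python
def lse_line_fit(t, phi):
    """Least-squares estimate of (offset, slope) for phi ~ offset + slope * t.

    Solves the normal equations (H^T H) theta = H^T phi for the
    two-column design matrix H = [1, t] (an illustrative choice of model).
    """
    n = len(t)
    # Entries of H^T H and H^T phi for H = [1, t]
    s_t = sum(t)
    s_tt = sum(x * x for x in t)
    s_p = sum(phi)
    s_tp = sum(x * y for x, y in zip(t, phi))
    det = n * s_tt - s_t * s_t
    if det == 0:
        raise ValueError("H^T H is singular; need at least two distinct t values")
    # Apply the explicit inverse of the 2x2 matrix H^T H
    offset = (s_tt * s_p - s_t * s_tp) / det
    slope = (n * s_tp - s_t * s_p) / det
    return offset, slope


# Hypothetical noise-free measurements on the line phi = 1 + 2 t
t = [0.0, 1.0, 2.0, 3.0]
phi = [1.0, 3.0, 5.0, 7.0]
offset, slope = lse_line_fit(t, phi)
print(offset, slope)
```

With noisy measurements the same call returns the parameters that minimize the summed squared residuals; nothing about the noise distribution is used.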

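11 Prominent Examples of LSE
Mean value: $\hat{\theta}_{mean} = \frac{1}{N}\sum_{j=1}^{N}\varphi(j)$
Variance: $\hat{\theta}_{variance} = \frac{1}{N-1}\sum_{j=1}^{N}(\varphi(j) - \hat{\theta}_{mean})^2$
Amplitude: $\hat{\theta}_1$; Frequency: $\hat{\theta}_2$; Phase: $\hat{\theta}_3$; Decay: $\hat{\theta}_4$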
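A small sketch of the first two examples: the sample mean as the value that minimizes the summed squared error, and the sample variance. The measurement values are hypothetical, and the $N-1$ normalization of the variance is an assumption (it matches Python's `statistics.variance`).

```python
import statistics

phi = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
n = len(phi)

# Mean value: the least-squares estimate for a constant model
theta_mean = sum(phi) / n

# Variance around the estimated mean, assuming the N-1 normalization
theta_var = sum((x - theta_mean) ** 2 for x in phi) / (n - 1)


def squared_error(theta):
    """Summed squared residual of a constant model with value theta."""
    return sum((x - theta) ** 2 for x in phi)


print(theta_mean, theta_var)
# Moving away from the mean in either direction increases the squared error
print(squared_error(theta_mean), squared_error(theta_mean + 0.1))
```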

12 Likelihood Model
Likelihood: $L_{\varphi}(\theta) = p(\varphi \mid \theta)$
Pretend we know something about $\theta$.
We perform measurements for all possible values of $\theta$.
We obtain the likelihood function of $\theta$ given our measurements $\varphi$.
Note: $\theta$ is random; $\varphi$ is a fixed parameter; the likelihood is a function of both the unknown $\theta$ and the known $\varphi$.

13 Likelihood Model (cont'd)
$L_{\varphi}(\theta) = p(\varphi \mid \theta)$
New Goal: Find an estimator which gives the most likely probability distribution underlying $\theta$.
Figure: $L_{\varphi}(\theta)$

14 Maximum Likelihood Estimator (MLE)
Goal: Find the estimator which gives the most likely probability distribution underlying the observed data.
Max likelihood function: $\hat{\theta}_{MLE} = \arg\max_{\theta} p(\varphi \mid \theta)$
The MLE can be found by
$\frac{d}{d\theta}\ln p(\varphi \mid \theta)\Big|_{\theta = \hat{\theta}_{MLE}} = 0$


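15 Example I: MLE Of Normal Distribution
Normal distribution:
$p(\varphi \mid \theta, \sigma^2) \propto \exp\left[-\frac{1}{2\sigma^2}\sum_{j=1}^{N}(\varphi(j) - \theta)^2\right]$
Log of the normal distribution (normD):
$\ln p(\varphi \mid \theta, \sigma^2) = -\frac{1}{2\sigma^2}\sum_{j=1}^{N}(\varphi(j) - \theta)^2 + \text{const}$
Figure panels: Normal Distribution; Log Normal Distribution
MLE of the mean (1st derivative):
$\frac{d}{d\theta}\ln p(\cdot)\Big|_{\theta = \hat{\theta}_{MLE}} = \frac{1}{\sigma^2}\sum_{j=1}^{N}(\varphi(j) - \hat{\theta}_{MLE}) = 0$
$\hat{\theta}_{MLE} = \frac{1}{N}\sum_{j=1}^{N}\varphi(j)$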
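The closed-form result can be checked numerically. The sketch below draws simulated normal measurements, maximizes the log-likelihood over a grid of candidate means, and compares the result with the sample mean. The true mean, the noise level, the sample size, and the grid are all assumptions picked for illustration.

```python
import random


def log_likelihood(theta, phi, sigma=1.0):
    """Log of the normal likelihood, up to an additive constant."""
    return -sum((x - theta) ** 2 for x in phi) / (2.0 * sigma ** 2)


random.seed(0)
true_mean, sigma = 3.0, 1.0  # hypothetical ground truth
phi = [random.gauss(true_mean, sigma) for _ in range(200)]

# Brute-force maximization over candidate means 0.000, 0.001, ..., 6.000
grid = [i / 1000.0 for i in range(0, 6001)]
theta_grid = max(grid, key=lambda th: log_likelihood(th, phi, sigma))

# Closed-form MLE of the mean: the sample mean
theta_closed = sum(phi) / len(phi)

print(theta_grid, theta_closed)
```

The two values agree to within the grid spacing, because the log-likelihood is a downward parabola in the candidate mean with its peak at the sample mean.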

16 Example II: MLE Of Binomial Distribution (Coin Toss)
Distribution function $f(y \mid n, w)$:
$n$ = number of tosses
$w$ = probability of success
Figure panels: $f(y \mid n = 10, w = 0.7)$; $f(y \mid n = 10, w = 0.3)$; horizontal axis: $y$

17 MLE Of Coin Toss (cont'd)
Goal: Given the observed data $f(y \mid w = 0.7, n = 10)$, find the parameter $\hat{w}_{MLE}$ that most likely produced the data.
Figure: $L(w \mid y = 7, n = 10)$ with $\hat{w}_{MLE}$ marked

18 MLE Of Coin Toss (cont'd)
Likelihood function of coin tosses:
$L(w \mid y) = \frac{n!}{y!\,(n-y)!}\, w^{y} (1-w)^{n-y}$
Log likelihood function:
$\ln L(w \mid y) = \ln\frac{n!}{y!\,(n-y)!} + y\ln w + (n-y)\ln(1-w)$

19 MLE Of Coin Toss
Evaluate the MLE equation (1st derivative):
$\frac{d\ln L}{dw}\Big|_{w = \hat{w}_{MLE}} = \frac{y}{\hat{w}_{MLE}} - \frac{n-y}{1-\hat{w}_{MLE}} = 0$
$\frac{y - n\hat{w}_{MLE}}{\hat{w}_{MLE}(1-\hat{w}_{MLE})} = 0 \;\Rightarrow\; \hat{w}_{MLE} = \frac{y}{n}$
According to the MLE principle, the binomial distribution with $w = y/n$ is, for a given $n$, the most likely distribution to have generated the observed data $y$.
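A short numerical check of the coin-toss result, assuming the binomial likelihood and the observed values $y = 7$, $n = 10$ from the example. The grid of candidate probabilities and the helper name `log_likelihood` are illustrative choices.

```python
import math


def log_likelihood(w, y, n):
    """Binomial log-likelihood ln L(w | y) for y successes in n tosses."""
    return (math.log(math.comb(n, y))
            + y * math.log(w)
            + (n - y) * math.log(1.0 - w))


y, n = 7, 10  # observed successes and number of tosses

# Candidate probabilities 0.001, 0.002, ..., 0.999 (end points excluded
# because the logarithm diverges there)
grid = [i / 1000.0 for i in range(1, 1000)]
w_grid = max(grid, key=lambda w: log_likelihood(w, y, n))

# Closed-form MLE from setting the first derivative to zero
w_closed = y / n

print(w_grid, w_closed)
```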

20 Relationship between MLE and LSE
Assume:
$\theta$ is independent of the noise.
The likelihood and the noise have the same distribution: $p(\varphi \mid \theta) = p_{noise}(\varphi - H\theta \mid \theta)$
The noise is zero mean and Gaussian.
Then $p(\varphi \mid \theta)$ is maximized when LSE is minimized.
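The step from the Gaussian assumption to the least-squares criterion can be written out as follows. This is a sketch under the additional assumption that the $N$ noise samples are independent and share one variance $\sigma^2$, which the slide does not state explicitly.

```latex
\begin{align*}
p(\varphi \mid \theta)
  &= \frac{1}{(2\pi\sigma^2)^{N/2}}
     \exp\left[-\frac{1}{2\sigma^2}\,\|\varphi - H\theta\|^2\right] \\
\ln p(\varphi \mid \theta)
  &= -\frac{N}{2}\ln(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\,\|\varphi - H\theta\|^2 \\
\hat{\theta}_{MLE}
  &= \arg\max_{\theta} \ln p(\varphi \mid \theta)
   = \arg\min_{\theta} \|\varphi - H\theta\|^2
   = \hat{\theta}_{LSE}
\end{align*}
```

Only the second term of the log-likelihood depends on $\theta$, and it enters with a negative sign, so maximizing the likelihood is the same as minimizing the squared residual.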

21 Bayesian Model: Prior Knowledge
Now the daemon comes into play, but we know the daemon's preferences for $\theta$ (prior knowledge):
$\text{prior}(\theta) = p(\theta)$
New Goal: Find the estimator which gives the most likely probability distribution of $\theta$ given everything we know.

22 Bayesian Model
$\text{posterior}_{\varphi}(\theta) = C_{\varphi}\, L_{\varphi}(\theta)\, p(\theta)$
where $C_{\varphi}$ is a normalization constant.

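23 Maximum A-Posteriori (MAP) Estimator
Goal: Find the most likely $\hat{\theta}_{MAP}$ (max. posterior density of $\theta$) given $\varphi$.
$\hat{\theta}_{MAP} = \arg\max_{\theta} L_{\varphi}(\theta)\, p(\theta)$ (maximize the joint density)
$\hat{\theta}_{MAP}$ can be found by
$\frac{d}{d\theta}\ln\left[L_{\varphi}(\theta)\, p(\theta)\right] = \frac{\partial}{\partial\theta}\ln p(\varphi \mid \theta) + \frac{\partial}{\partial\theta}\ln p(\theta) = 0$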
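To make the role of the prior concrete, the sketch below reuses the coin-toss data ($y = 7$ successes in $n = 10$ tosses) and maximizes the log of likelihood times prior on a grid. The Beta-shaped prior with parameters $a = b = 5$, which favors a fair coin, is a hypothetical choice for illustration and is not taken from the slides.

```python
import math


def log_posterior(w, y, n, a, b):
    """Unnormalized log posterior: binomial log-likelihood plus log prior.

    The prior is proportional to w**(a-1) * (1-w)**(b-1), an assumed
    Beta-shaped prior; a = b = 1 gives a flat prior.
    """
    log_lik = y * math.log(w) + (n - y) * math.log(1.0 - w)
    log_prior = (a - 1.0) * math.log(w) + (b - 1.0) * math.log(1.0 - w)
    return log_lik + log_prior


y, n = 7, 10
a, b = 5.0, 5.0  # hypothetical prior preference for a fair coin

grid = [i / 1000.0 for i in range(1, 1000)]
w_mle = max(grid, key=lambda w: log_posterior(w, y, n, 1.0, 1.0))  # flat prior
w_map = max(grid, key=lambda w: log_posterior(w, y, n, a, b))

print(w_mle, w_map)
```

The MAP estimate lands between the prior's preferred value of 0.5 and the MLE of 0.7, which illustrates how the prior pulls the estimate when only a few measurements are available.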

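24 Example III: MAP Of Normal Distribution
We have a random sample $\varphi(1), \dots, \varphi(N)$ with variance $\sigma_{\varphi}^2$ and a prior on the mean $\mu$ with variance $\sigma_{\mu}^2$ (the closed form below corresponds to a zero-mean Gaussian prior):
$\frac{\partial}{\partial\hat{\mu}}\ln\left[L_{\varphi}(\hat{\mu}, \hat{\sigma}^2)\, p(\hat{\mu}, \hat{\sigma}^2)\right] = \frac{1}{\hat{\sigma}_{\varphi}^2}\sum_{j=1}^{N}(\varphi(j) - \hat{\mu}) - \frac{\hat{\mu}}{\hat{\sigma}_{\mu}^2} = 0$
The sample mean of MAP is:
$\hat{\mu}_{MAP} = \frac{\sigma_{\mu}^2}{\sigma_{\varphi}^2 + N\sigma_{\mu}^2}\sum_{j=1}^{N}\varphi(j)$
If we do not have prior information on $\mu$, i.e. $\sigma_{\mu} \to \infty$, or if $N \to \infty$: $\hat{\mu}_{MAP} \to \hat{\mu}_{MLE}, \hat{\mu}_{LSE}$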
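A minimal sketch of the closed-form MAP mean and its limiting behavior, assuming a zero-mean Gaussian prior on $\mu$ with variance $\sigma_{\mu}^2$ and measurements with known variance $\sigma_{\varphi}^2$. The function name `map_mean` and the sample values are hypothetical.

```python
def map_mean(phi, sigma_phi2, sigma_mu2):
    """MAP estimate of the mean under an assumed zero-mean Gaussian prior.

    phi        : list of measurements
    sigma_phi2 : variance of the measurement noise
    sigma_mu2  : variance of the prior on the mean
    """
    n = len(phi)
    return sigma_mu2 * sum(phi) / (sigma_phi2 + n * sigma_mu2)


phi = [2.0, 2.0, 2.0, 2.0]  # hypothetical measurements
sample_mean = sum(phi) / len(phi)

# An informative prior shrinks the estimate toward the prior mean of zero
mu_tight = map_mean(phi, sigma_phi2=1.0, sigma_mu2=1.0)

# A very broad prior carries almost no information: MAP approaches the MLE
mu_broad = map_mean(phi, sigma_phi2=1.0, sigma_mu2=1e9)

print(mu_tight, mu_broad, sample_mean)
```

Repeating the measurements many times has the same effect as broadening the prior, since the term $N\sigma_{\mu}^2$ then dominates the denominator.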

25 Posterior Distribution and Decision Rules
Figure: posterior $p(\theta \mid \varphi)$ plotted against $\theta$, with $\hat{\theta}_{MSE}$ and $\hat{\theta}_{MAP}$ marked

26 Decision Rules
Diagram elements: Measurements; Likelihood Function; Prior Distribution; Posterior Distribution; Gain Function; Result

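27 Some Desirable Properties of Estimators I
Unbiased: The mean value of the error should be zero:
$E[\hat{\theta} - \theta] = 0$, i.e. $E[\hat{\theta}] = E[\theta]$
Consistent: The error of the estimator should decrease asymptotically as the number of measurements increases (Mean Square Error, MSE):
$MSE = E[(\hat{\theta} - \theta)^2] \to 0$ for large $N$
What happens to MSE when the estimator is biased?
$MSE = E[(\hat{\theta} - \theta - b)^2] + E[b^2]$, with bias $b = E[\hat{\theta} - \theta]$
The first term is the variance, the second the squared bias.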
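The split of the MSE into a variance term and a squared-bias term can be illustrated by simulation. The sketch below uses a deliberately biased estimator, the sample mean multiplied by a shrinkage factor of 0.8. That estimator and all numerical settings are assumptions made for illustration.

```python
import random

random.seed(1)
theta_true, sigma, n, trials = 2.0, 1.0, 10, 5000  # hypothetical settings
shrink = 0.8  # hypothetical shrinkage factor that makes the estimator biased

errors = []
for _ in range(trials):
    phi = [random.gauss(theta_true, sigma) for _ in range(n)]
    theta_hat = shrink * sum(phi) / n
    errors.append(theta_hat - theta_true)

# Empirical bias, variance of the error around the bias, and MSE
bias = sum(errors) / trials
variance = sum((e - bias) ** 2 for e in errors) / trials
mse = sum(e * e for e in errors) / trials

print(bias, variance, mse, variance + bias ** 2)
```

Taking more measurements per trial shrinks the variance term, but the squared bias stays, so the MSE of a biased estimator does not go to zero.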

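28 Some Desirable Properties of Estimators II
Efficient: The covariance matrix of the error should decrease asymptotically to its minimal value for large $N$:
$\tilde{C}_{\theta} = E[(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T] \ge J^{-1}$, with elements $\tilde{C}_{ik} = E[(\hat{\theta}_i - \theta_i)(\hat{\theta}_k - \theta_k)]$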

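29 Example: Properties Of Estimators: Mean and Variance
Mean:
$E[\hat{\mu}] = E\left[\frac{1}{N}\sum_{j=1}^{N}\varphi(j)\right] = \frac{1}{N}\, N\mu = \mu$
The sample mean is an unbiased estimator of the true mean.
Variance:
$E[(\hat{\mu} - \mu)^2] = E\left[\left(\frac{1}{N}\sum_{j=1}^{N}\varphi(j) - \mu\right)^2\right] = \frac{1}{N^2}\, N\sigma^2 = \frac{\sigma^2}{N}$
The sample mean is therefore a consistent estimator, because its variance approaches zero for a large number of measurements.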
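Both properties can be checked with a quick simulation, sketched here with hypothetical values for the true mean, the noise level, and the number of measurements per trial.

```python
import random

random.seed(2)
mu, sigma, n, trials = 1.0, 2.0, 25, 20000  # hypothetical settings

# Repeat the experiment many times and record the sample mean each time
means = []
for _ in range(trials):
    phi = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(phi) / n)

# Average of the estimates (should be close to mu: unbiased)
avg_of_means = sum(means) / trials

# Spread of the estimates around mu (should be close to sigma^2 / N)
var_of_means = sum((m - mu) ** 2 for m in means) / trials
predicted = sigma ** 2 / n

print(avg_of_means, var_of_means, predicted)
```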

30 Properties Of MLE
Is consistent: the MLE asymptotically recovers the true parameter values that generated the data for $N \to \infty$.
Is efficient: the MLE asymptotically achieves the minimum error (= max. information).

31 Summary
LSE is a descriptive method to accurately fit data to a model.
MLE is a method to seek the probability distribution that makes the observed data most likely.
MAP is a method to seek the most probable parameter value given prior information about the parameters and the observed data.
If the influence of prior information decreases, e.g. with many measurements, MAP approaches MLE.

32 Some Priors in Imaging
Smoothness of the brain; anatomical boundaries; intensity distributions; anatomical shapes; physical models; point spread function; bandwidth limits; etc.

33 Estimation Theory: Motivation Example I
Gray/White Matter Segmentation
Figure: hypothetical histogram of intensity.
What works better than flipping a coin?
Design likelihood functions based on: anatomy; co-occurrence of signal intensities; others.
Determine prior distribution: population based atlas of regional intensities; model based distributions of intensities; others.

34 Estimation Theory: Motivation Example II
Goal: Capture dynamic signal on a static background.
Figure: poor signal to noise.
Improvements to identify the dynamic signal:
Design likelihood functions based on: auto-correlations; anatomical information.
Determine prior distributions from: serial measurements; multiple subjects; anatomy.
D. Feinberg, Advanced MRI Technologies, Sebastopol, CA

35 Estimation Theory: Motivation Example III
Diffusion Spectrum Imaging: Human Cingulum Bundle
Goal: Capture directions of fiber bundles.
Improvements to identify tracts:
Design likelihood functions based on: similarity measures of adjacent voxels; fiber anatomy.
Determine prior distributions from: anatomy; fiber skeletons from a population; others.
Dr. Van Wedeen, MGH

36 MAP Estimation in Image Reconstructions with Edge-Preserving Priors
Dr. Ashish Raj, Cornell U

37 MAP Estimation In Image Reconstruction
Human brain MRI. (a) The original LR data. (b) Zero-padding interpolation. (c) SR with box-PSF. (d) SR with Gaussian-PSF.
From: A. Greenspan in The Computer Journal, Advance Access published February 19, 2008

38 Improved ASL Perfusion Results
zdft = zero-filled DFT
By Dr. John Kornak, UCSF

39 Bayesian Automated Image Segmentation
Bruce Fischl, MGH

40 Segmentation Using MLE
A: Raw MRI; B: SPM2; C: EMS; D: HBSA
From Habib Zaidi, et al., NeuroImage 32 (2006)

41 Population Atlases As Priors
Dr. Sarang Joshi, U Utah, Salt Lake City

42 Population Shape Regressions Based Age-Selective Priors
Dr. Sarang Joshi, U Utah, Salt Lake City


43 Imaging Software Using MLE And MAP
Packages       Applications       Languages
VoxBo          fMRI               C/C++/IDL
MEDx           sMRI, fMRI         C/C++/Tcl/Tk
SPM            fMRI, sMRI         matlab/C
ibrain                            IDL
FSL            fMRI, sMRI, DTI    C/C++
fmristat       fMRI               matlab
BrainVoyager   sMRI               C/C++
BrainTools                        C/C++
AFNI           fMRI, DTI          C/C++
Freesurfer     sMRI               C/C++
NiPy                              Python

44 Literature
Mathematical:
H. Sorenson. Parameter Estimation: Principles and Problems. Marcel Dekker, 1980.
Signal Processing:
S. Kay. Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice Hall.
L. Scharf. Statistical Signal Processing: Detection, Estimation, and Time Series Analysis. Addison-Wesley.
Statistics:
A. Hyvarinen. Independent Component Analysis. John Wiley & Sons.
New Directions in Statistical Signal Processing: From Systems to Brain. Ed. S. Haykin. MIT Press.
