Information-Theoretic Imaging
1 Information-Theoretic Imaging with Applications in Hyperspectral Imaging and Transmission Tomography. Joseph A. O'Sullivan, Electronic Systems and Signals Research Laboratory, Department of Electrical Engineering, Washington University, jao@essrl.wustl.edu. Supported by: ONR, ARO, NIH
2 Collaborators. Faculty: Donald L. Snyder, William H. Smith, Daniel R. Fuhrmann, Jeffrey F. Williamson, Bruce R. Whiting, Richard E. Blahut, Michael I. Miller, Chrysanthe Preza, David G. Politte, James G. Blaine. Students and Post-Docs: Michael D. DeVore, Natalia A. Schmid, Metin Oz, Ryan Murphy, Jasenka Benac, Adam Cataldo
3 Outline: Information Theory and Imaging Hyperspectral Imaging - Information Value Decomposition Transmission Tomography - Estimate in Exponential Family - Object-Constrained Computerized Tomography Alternating Minimization Algorithms Applications: HSI, CT Conclusions 3
4 Some Roles of Information Theory In Imaging Problems Some Important Ideas: Roles depend on problem studied Key problems are in detection, estimation, and classification Information is quantifiable SNR as a measure has limitations Information theory often provides performance bounds 4
5 How to Measure Information. QUESTIONS: Can we measure the information in an image? Does one sensor provide more information than another? Does resolution measure information? Do more pixels in an image give more information? ANSWER: MAYBE. Information for what? Information relative to what? Ill-defined questions → clearly defined problems
6 Measuring Information Information for what? Detection Estimation Classification Image formation Information relative to what? Noise only Clutter plus noise Natural Scenery 6
7 Information Theory and Imaging. Center for Imaging Science established 1995. Brown MURI on Performance Metrics established 1997. Invited paper: J. A. O'Sullivan, R. E. Blahut, and D. L. Snyder, "Information-Theoretic Image Formation," IEEE Transactions on Information Theory, Oct. 1998. IEEE 1998 Information Theory Workshop on Detection, Estimation, Classification, and Imaging. IEEE Transactions on Information Theory Special Issue on Information-Theoretic Imaging, Aug. Complexity regularization, large deviations
8 IEEE Transactions on Information Theory, October 1998: Special Issue to commemorate the 50th anniversary of Claude E. Shannon's "A Mathematical Theory of Communication." Problem Definition. Optimality Criterion. Algorithm Development. Performance Quantification
9 Hyperspectral Imaging Team at Washington University: Donald L. Snyder, William H. Smith, Daniel R. Fuhrmann, Joseph A. O'Sullivan, Chrysanthe Preza. Digital Array Scanning Interferometer (DASI). [Photos: Snyder, Smith, Fuhrmann, O'Sullivan]
10 Hyperspectral Imaging Scene Cube Data Cube Drink from a fire hose Filter wheel, interferometer, tunable FPAs Modeling and processing: - data models - optimal algorithms - efficient algorithms 10
11 Hyperspectral Imaging Likelihood Models. Ideal data: r(y) = ∫ h(y|x) s(x) dx, where h(y|x) is the wavelength-dependent amplitude PSF of DASI, a sum of two shifted copies of h_a offset by the shear vector of the Wollaston prism, and s(x) is the scene intensity for incoherent radiation at x. Nonideal (more realistic) data: r(y) is Poisson with mean λ(y) plus Gaussian read noise; the data likelihood is the corresponding product over y of Poisson terms
12 Idealized Data Model. Data spectrum for each pixel: s_j is a linear combination of constituent spectra, s_j = Σ_{k=1}^K σ_k a_kj, i.e., S = ΣA. Problem: estimate constituents Σ and proportions A subject to nonnegativity; positivity of S assumed. Ambiguity: the factorization S = ΣA is not unique. Comments: radiometric calibration; constraints fundamental
13 Idealized Problem Statement: Maximum Likelihood ↔ Minimum I-Divergence. For Poisson-distributed data S, the loglikelihood function is
l(Σ, A) = Σ_{i=1}^I Σ_{j=1}^J [ s_ij ln( Σ_{k=1}^K σ_ik a_kj ) − Σ_{k=1}^K σ_ik a_kj ].
Maximization over Σ and A is equivalent to minimization of the I-divergence
I(S ‖ ΣA) = Σ_{i=1}^I Σ_{j=1}^J [ s_ij ln( s_ij / Σ_k σ_ik a_kj ) − s_ij + Σ_k σ_ik a_kj ].
The Information Value Decomposition problem
14 Markov Approximations. X and Y are random variables on finite sets; p(x,y) unknown. Data: N i.i.d. pairs {X_i, Y_i}. Unconstrained ML estimate of p(x,y): S = n(x,y)/N. Lower-rank Markov approximation Ŝ = ΣA (X → M → Y, M in a set of cardinality K): min_{Σ,A} I(S ‖ ΣA). Arises in factor analysis, contingency tables, economics. Problem: approximation of one matrix by another of lower rank (C. Eckart and G. Young, Psychometrika, vol. 1). SVD → IVD
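As a concrete illustration: the minimization min_{Σ,A} I(S ‖ ΣA) on the slides above has the same form as nonnegative matrix factorization under the generalized Kullback-Leibler (I-divergence) objective, for which Lee-Seung-style multiplicative updates are one standard solver. The sketch below is not the authors' IVD algorithm, just a minimal way to compute the objective and a low-rank nonnegative approximation; the function names are invented.

```python
import numpy as np

def i_div(S, M, eps=1e-12):
    """Generalized KL / I-divergence: sum_ij [ s ln(s/m) - s + m ]."""
    return float(np.sum(S * np.log((S + eps) / (M + eps)) - S + M))

def ivd_factorize(S, K, iters=2000, seed=0, eps=1e-12):
    """Approximately minimize I(S || Sigma @ A) over nonnegative
    Sigma (IxK) and A (KxJ) via multiplicative updates, which are
    monotone for the generalized KL objective."""
    I, J = S.shape
    rng = np.random.default_rng(seed)
    Sigma = rng.random((I, K)) + eps
    A = rng.random((K, J)) + eps
    for _ in range(iters):
        R = S / (Sigma @ A + eps)                      # elementwise ratios
        Sigma *= (R @ A.T) / (A.sum(axis=1) + eps)     # update constituents
        R = S / (Sigma @ A + eps)
        A *= (Sigma.T @ R) / (Sigma.sum(axis=0)[:, None] + eps)
    return Sigma, A
```

For a nonnegative matrix of rank at most K the divergence is driven toward zero; in general the updates only decrease it monotonically.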
15 CT Imaging in Presence of High Density Attenuators Brachytherapy applicators After-loading colpostats for radiation oncology Cervical cancer: 50% survival rate Dose prediction important Object-Constrained Computed Tomography (OCCT) 15
16 Filtered Back Projection. FBP: inverse Radon transform. [Figure: Truth vs. FBP reconstruction]
17 Transmission Tomography. Source-detector pairs indexed by y; pixels indexed by x. Data d(y) Poisson with means g(y : μ); loglikelihood function
l(d : μ) = Σ_{y∈Y} [ d(y) ln g(y : μ) − g(y : μ) ],
g(y : μ) = Σ_E I_0(y, E) exp[ −Σ_x h(y|x) μ(x, E) ] + β(y),
with mean unattenuated counts I_0, mean background β, attenuation function μ(x, E), E indexing energies, and
μ(x, E) = Σ_{i=1}^I μ_i(E) c_i(x).
Maximize over μ or the c_i; equivalently, minimize the I-divergence. Comment: pose search, c(x) = c_a(x : θ) + c_b(x)
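To make the forward model concrete, here is a small hypothetical discretization of g(y : μ) and the Poisson loglikelihood; the array shapes and names are invented for illustration.

```python
import numpy as np

def forward_means(I0, h, mu, beta):
    """g[y] = sum_E I0[y,E] * exp(-sum_x h[y,x] * mu[x,E]) + beta[y].
    I0: (Y,E) unattenuated counts; h: (Y,X) system matrix;
    mu: (X,E) attenuation map per energy; beta: (Y,) mean background."""
    line_integrals = h @ mu                       # (Y,E) path integrals
    return (I0 * np.exp(-line_integrals)).sum(axis=1) + beta

def poisson_loglik(d, g):
    """l(d : mu) = sum_y [ d(y) ln g(y) - g(y) ], constants dropped."""
    return float(np.sum(d * np.log(g) - g))
```

Since t ↦ d ln t − t is maximized at t = d, the loglikelihood is largest when the predicted means g match the data, which is the equivalence with minimizing the I-divergence noted on the slide.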
18 Alternating Minimization Algorithms. Define the problem as min_q Φ(q). Derive a variational representation Φ(q) = min_{p∈P} J(p, q), where J is an auxiliary function and p ranges over an auxiliary set P. Result: the double minimization min_q min_p J(p, q). Alternating minimization algorithm:
p^(l+1) = argmin_{p∈P} J(p, q^(l)),
q^(l+1) = argmin_{q∈Q} J(p^(l+1), q).
Comments: guaranteed monotonicity; J selected carefully
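The double-minimization structure can be sketched generically. The quadratic J below is an invented toy instance, used only to show the monotonicity the slide claims: alternating the two exact partial minimizers makes J(p, q) nonincreasing.

```python
def alternating_minimization(J, p_step, q_step, q0, iters=50):
    """p^(l+1) = argmin_p J(p, q^(l)); q^(l+1) = argmin_q J(p^(l+1), q).
    Returns the final iterates and the (nonincreasing) history of J values."""
    q, history = q0, []
    for _ in range(iters):
        p = p_step(q)
        q = q_step(p)
        history.append(J(p, q))
    return p, q, history

# Toy instance: J(p, q) = (p - q)^2 + (p - 3)^2, jointly convex in (p, q).
J = lambda p, q: (p - q) ** 2 + (p - 3.0) ** 2
p_step = lambda q: (q + 3.0) / 2.0      # exact minimizer over p for fixed q
q_step = lambda p: p                    # exact minimizer over q for fixed p
p, q, hist = alternating_minimization(J, p_step, q_step, q0=0.0)
```

The iterates contract toward the joint minimizer p = q = 3, with J decreasing to 0 at every step.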
19 Alternating Minimization Algorithms: I-Divergence, Linear, Exponential Families. Special case of interest: J is the I-divergence. Families of interest: linear family L(A, b) = {p : Ap = b}; exponential family E(π, B) = {q : q_i = π_i exp[ Σ_j b_ij λ_j ]}.
p^(l+1) = argmin_{p∈L} I(p ‖ q^(l)),
q^(l+1) = argmin_{q∈E} I(p^(l+1) ‖ q).
Csiszár and Tusnády; Dempster, Laird, Rubin; Blahut; Richardson; Lucy; Vardi, Shepp, and Kaufman; Cover; Miller and Snyder; O'Sullivan
20 Alternating Minimization Example. Linear family: p_1 + p_2 = 2. Exponential family: q_1 = exp(v), q_2 = exp(−v). min_{p∈L} min_{q∈E} I(p ‖ q). [Figure: alternating projections between the two families in the (p_1, p_2) plane]
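The example above can be worked numerically. For the generalized I-divergence I(p ‖ q) = Σ_i [ p_i ln(p_i/q_i) − p_i + q_i ], the minimizer over the linear family {p : p_1 + p_2 = 2} is the rescaling p = 2q/(q_1 + q_2), and the minimizer over the exponential family q = (exp(v), exp(−v)) solves the stationarity condition 2 sinh v = p_1 − p_2. These two closed forms are my own derivation of the partial minimizations, not taken verbatim from the slides.

```python
import numpy as np

def i_div(p, q):
    """Generalized I-divergence for positive vectors."""
    return float(np.sum(p * np.log(p / q) - p + q))

def p_step(q):
    """I-projection of q onto the linear family {p : p1 + p2 = 2}."""
    return 2.0 * q / q.sum()

def q_step(p):
    """Minimize I(p || (exp(v), exp(-v))) over v: 2 sinh v = p1 - p2."""
    v = np.arcsinh((p[0] - p[1]) / 2.0)
    return np.array([np.exp(v), np.exp(-v)])

q = np.array([np.exp(1.0), np.exp(-1.0)])   # start at v = 1
hist = []
for _ in range(100):
    p = p_step(q)
    q = q_step(p)
    hist.append(i_div(p, q))
```

The divergences decrease monotonically and the iterates approach v = 0, i.e. p = q = (1, 1), where the two families intersect.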
21 Information Geometry. The I-divergence is nonnegative and convex in the pair (p, q); it generalizes relative entropy and is an example of an f-divergence. First triangle equality: for p* = argmin_{p∈L} I(p ‖ q) and any p in L,
I(p ‖ q) = I(p ‖ p*) + I(p* ‖ q).
Second triangle equality: for q* = argmin_{q∈E} I(p ‖ q) and any q in E,
I(p ‖ q) = I(p ‖ q*) + I(q* ‖ q)
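The first triangle equality can be checked numerically for the projection onto a sum-constrained linear family, where the I-projection is a simple rescaling; this concrete family and the closed-form projection are my additions for illustration.

```python
import numpy as np

def i_div(p, q):
    """Generalized I-divergence for positive vectors."""
    return float(np.sum(p * np.log(p / q) - p + q))

q = np.array([1.0, 3.0])                 # arbitrary positive point
L_sum = 2.0                              # linear family {p : p1 + p2 = 2}
p_star = L_sum * q / q.sum()             # argmin over L of I(p || q)
p = np.array([1.2, 0.8])                 # any other point of L

lhs = i_div(p, q)
rhs = i_div(p, p_star) + i_div(p_star, q)
```

Here lhs equals rhs to machine precision: the projection p* splits the divergence Pythagorean-style, exactly as the triangle equality states.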
22 Variational Representations. Convex decomposition lemma: let f be convex and r_i ≥ 0 with Σ_i r_i = 1; then f( Σ_i r_i x_i ) ≤ Σ_i r_i f(x_i). Special case, f = −ln:
−ln( Σ_i q_i ) = min_{p∈P} Σ_i p_i ln( p_i / q_i ), with P = { p : p_i ≥ 0, Σ_i p_i = 1 }.
Basis for EM; see also De Pierro, Lange, Fessler
23 Shrink-Wrap Algorithm for Endmembers. S = ΣA: Σ = endmembers, K columns; A = pixel mixture proportions. SVD, then simplex volume minimization. Fuhrmann, WU
24 Alternating Minimization Algorithms for Hyperspectral Imaging. [Block diagram: initial estimated coefficients → Poisson process model → alternating minimization algorithm → estimated coefficients.] S = ΣA: given Σ and S, estimate A. Uses the I-divergence discrepancy measure. Snyder, WU; O'Sullivan, WU
25 Information Value Decomposition Applied to Hyperspectral Data Downloaded spectra from USGS website 470 Spectral components Randomly generated A with 2000 columns Ran IVD on result 25
26 Information Value Decomposition Applied to Hyperspectral Data 26
27 Information Theoretic Imaging: Application to Hyperspectral Imaging Problem Definition Optimality Criterion Algorithm Development Performance Quantification Likelihood Models for Sensors Likelihood Models for Scenes Spectrum Estimation and Decomposition Performance Quantification Applications to Available Sensors 27
28 Hyperspectral Imaging Ongoing Efforts Scene Models Including Spatial and Wavelength Characteristics Sensor Models Including Apodization Orientation Dependence of Hyperspectral Signatures Expanded Complementary Efforts Atmospheric Propagation Models Performance Bounds Measures of Added Information 28
29 CT Minimum I-Divergence Formulation.
l(d : μ) = Σ_{y∈Y} [ d(y) ln g(y : μ) − g(y : μ) ],
g(y : μ) = Σ_E I_0(y, E) exp[ −Σ_x h(y|x) μ(x, E) ] + β(y).
Maximize the loglikelihood function ↔ minimize the I-divergence. Use the convex decomposition lemma: g is a marginal over
L(d) = { p(y, E) ≥ 0 : Σ_E p(y, E) = d(y) },
I( d ‖ g(· : c) ) = min_{p∈L(d)} I(p ‖ q)
30 New Alternating Minimization Algorithm for Transmission Tomography.
min_{q∈E(I_0, H, μ)} min_{p∈L(d)} I(p ‖ q), with iterates
q̂^(k)(y, E) = I_0(y, E) exp[ −Σ_x Σ_i μ_i(E) h(y|x) ĉ_i^(k)(x) ],
p̂^(k)(y, E) = q̂^(k)(y, E) d(y) / Σ_{E'} q̂^(k)(y, E'),
b̃_i^(k)(x) = Σ_{y∈Y} Σ_E μ_i(E) h(y|x) p̂^(k)(y, E),
b̂_i^(k)(x) = Σ_{y∈Y} Σ_E μ_i(E) h(y|x) q̂^(k)(y, E)
31 New Alternating Minimization Algorithm for Transmission Tomography (continued).
ĉ_i^(k+1)(x) = ĉ_i^(k)(x) − (1 / Z_i(x)) ln( b̃_i^(k)(x) / b̂_i^(k)(x) ).
Interpretation: compare predicted data to measured data via a ratio of backprojections; update the estimate using a normalization constant Z_i(x). Comments: choice of constants; monotonic convergence; linear convergence; constraints easily incorporated
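A single-material (I = 1) toy version of the iteration on the last two slides can be sketched as follows. The geometry, the conservative scalar choice of Z, and the omission of the background term β are my simplifications; this is a sketch of the update structure, not the authors' implementation.

```python
import numpy as np

def am_ct(d, I0, h, mu_E, iters=500):
    """Single-material alternating minimization for transmission CT.
    d: (Y,) measured counts; I0: (Y,E) unattenuated counts;
    h: (Y,X) system matrix; mu_E: (E,) attenuation spectrum.
    Returns the image c and the history of I(d || g) values."""
    c = np.zeros(h.shape[1])
    Z = mu_E.max() * h.sum(axis=1).max()          # conservative normalization
    hist = []
    for _ in range(iters):
        q = I0 * np.exp(-np.outer(h @ c, mu_E))   # predicted counts per energy
        p = q * (d / q.sum(axis=1))[:, None]      # match the measured marginal
        btilde = h.T @ (p @ mu_E)                 # backprojection of p
        bhat = h.T @ (q @ mu_E)                   # backprojection of q
        c = c - np.log(btilde / bhat) / Z         # update via log of the ratio
        g = q.sum(axis=1)
        hist.append(float(np.sum(d * np.log(d / g) - d + g)))
    return c, hist
```

On noiseless data generated from a known image, the I-divergence decreases toward zero and the recovered c approaches the truth, matching the monotone-convergence claim on the slide.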
32 Iterative Algorithm with Known Applicator Pose Truth Known Pose 32
33 OCCT Iterations Known Pose OCCT 33
34 [Figure: Truth; Known-Pose reconstruction; Standard FBP; OCCT]
35 Magnified views around brachytherapy applicator FBP Truth OCCT 35
36 Conclusions Information-Theoretic Imaging Alternating Minimization Algorithms Hyperspectral Imaging - Applying Formal Methodology - Data Collections - Sensor Modeling, Likelihoods - Broad, Ongoing Efforts CT Imaging - New Image Formation Algorithm - Simulations and Scanner Data - Modeling Issues: Physics Crucial 36
38 [Diagram] Physical phenomena: physical scene, physical sensors. Mathematical abstractions: scene model c ∈ C ⊂ L², sensor models. Reproduction space Ĉ. Performance criteria: loglikelihood l(c), I-divergence, distortions d(c, ĉ)
39 Problem Definition. Physical phenomena: electromagnetic waves, physical objects. Mathematical abstraction: function spaces (L²), dynamical models. Sensor models: projection models, statistics. Reproduction space: finite-dimensional approximation. Optimality criterion. Deterministic: least squares, minimum discrimination. Stochastic: maximum likelihood, MAP, MMSE
40 Algorithm Development. Iterative algorithms: alternating minimizations, expectation-maximization. Random search: simulated annealing, diffusion, jump-diffusion. Performance Quantification. Discrepancy measures: squared error, discrimination. Performance: bias, variance, mean discrimination. Performance bounds: Cramér-Rao, Hilbert-Schmidt
41 Target Classification Problem. Given a SAR image r, determine a corresponding target class â (e.g., â = T72). Target Orientation Estimation. Given a SAR image r and a target class a, estimate the target orientation θ̂ (e.g., θ̂ = 135°). Likelihood model approach: p_{R|Θ,A}(r | θ, a), the conditional data model; p_{Θ|A}(θ | a), the prior on orientation (known or simply uniform); P(a), the prior on target class (known or simply uniform)
42 Training and Testing Problem: Function Estimation and Classification. Training data plus scene and sensor physics → function estimation; raw data → processing → image → L(r | a, θ) → inference (â = T72). Labeled training data: target type and pose. Log-likelihood parameterized by a function: mean image, variance image, etc.
43 Function Estimation and Classification Functions are estimated from - sample data sets only - physical model datasets only (PRISM, XPATCH, etc.) - combination of these Training sets are finite Computational and likelihood models have a finite number of parameters Estimation error Approximation error Some regularization is needed 43
44 Function Estimation. Let S and T be two target types, with f_S and f_T the corresponding functions, f_S : D → R or R_+. Spatial domain D: surface of the object, or a rectangular region in R²; continuous. Pose space Θ: typically SE(3); simpler to consider SE(2) or SO(2); Θ is continuous. Assume a null function f_0
45 Function Estimation: Sieve. Function space F; family of subsets F_λ ⊂ F such that a maximum-likelihood estimate of f exists for each F_λ (Ulf Grenander). Spline sieves: λ = 1/Q, where Q = number of coefficients. Nonnegative functions, nonnegative expansions (Pierre Moulin, et al.): f(θ) = Σ_k a(k) φ_k(θ), a(k) ≥ 0
46 Sieve Geometry. [Figure: target function f_S in F, with estimates f_1 ∈ F_1 and f_2 ∈ F_2 from nested sieve subsets]
47 Discrepancy Measures. Estimation: relative entropy between parameterized distributions, d(f_S, f_λ) — the approximation error; expected relative entropy for estimated functions, E{d(f_λ, f̂)} — the estimation error. Classification: probability of error, rate functions; decompose into approximation and estimation error. Examine performance as a function of λ; optimize
48 Motivation Many reported approaches to ATR from SAR Performance and database complexity are interrelated We seek to provide a framework for comparison that: - Allows direct comparison under identical conditions - Removes dependency on implementation details 2S1 T62 BTR 60 D7 ZIL 131 ZSU 23/4 Publicly available SAR data from the MSTAR program 48
49 Approaches: Conditionally Gaussian. Model each pixel as complex Gaussian plus uncorrelated noise:
p_{R|Θ,A}(r | θ, a) = Π_i [ 1 / (π (K_i(θ, a) + N_0)) ] exp[ −|r_i|² / (K_i(θ, a) + N_0) ].
GLRT classification and MAP estimation:
â_Bayes(r) = argmax_a max_k p̂(r | θ_k, a),
θ̂_HS(r, a) = argmax_{θ_k} p̂(r | θ_k, a).
J. A. O'Sullivan and S. Jacobs, IEEE-AES 2000; J. A. O'Sullivan, M. D. DeVore, V. Kedia, and M. Miller, IEEE-AES, to appear
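A minimal sketch of the conditionally Gaussian GLRT above, assuming zero-mean complex pixels with variance K_i(θ, a) + N_0; the template construction and function names are invented for illustration.

```python
import numpy as np

def cg_loglik(r, K, N0):
    """log p(r | theta, a) for independent complex pixels r_i ~ CN(0, K_i + N0)."""
    v = K + N0
    return float(np.sum(-np.log(np.pi * v) - np.abs(r) ** 2 / v))

def glrt_classify(r, templates, N0):
    """GLRT: maximize the loglikelihood over orientation windows, then over class.
    templates: dict mapping class label -> list of variance images K(theta_k, a)."""
    scores = {a: max(cg_loglik(r, K, N0) for K in windows)
              for a, windows in templates.items()}
    return max(scores, key=scores.get)
```

Drawing an image from one class's variance template and classifying it recovers that class with overwhelming probability when the templates are well separated.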
50 Approach Select 240 combinations of implementation parameters Execute algorithms at each parameterization Scatter plot the performance-complexity pairs Determine the best achievable performance at any complexity BMP2 Variance Image at 6 Sizes ZIL131 Variance Image at 6 Sizes Six different image sizes from 128x128 to 48x48 50
51 System Parameters and Complexity. Approximate μ(θ, a) and σ²(θ, a) as piecewise constant in θ. Implementations parameterized by: w — number of constant intervals in θ; d — width of training intervals in θ; N² — number of pixels in an image. Database complexity = log₁₀(# floating point values / target type)
52 Performance and Complexity. Forty combinations of angular resolution and training interval width. [Figures: variance image of a T62 tank, one window trained over 360°; variance images of a T72 tank, 72 windows trained over 10° each]
53 Problem Statement Directly compare conditionally Gaussian, log-magnitude MSE, and quarter power MSE ATR Algorithms - identical training and testing data - identical spatial and orientation windows Plot performance vs. complexity - probability of classification error - orientation estimation error - log-database size as complexity Use 10 class MSTAR SAR images Approach Select 240 combinations of implementation parameters Execute algorithms at each parameterization Scatter plot the performance-complexity pairs Determine the best achievable performance at any complexity 53
54 Approaches: Log-Magnitude. Minimize the distance between r_dB = 20 log₁₀ |r| and dB templates μ_LM. Make decisions according to:
â_LM(r) = argmin_a min_k d²( r_dB, μ_LM(θ_k, a) ),
θ̂_LM(r, a) = argmin_k d²( r_dB, μ_LM(θ_k, a) ).
Alternatively, use a form of normalization:
d²( r_dB − r̄_dB, μ_LM(θ_k, a) − μ̄_LM(θ_k, a) ).
G. Owirka and L. Novak, SPIE 2230, 1994; L. Novak, et al., IEEE-AES, Jan.
55 Approaches: Quarter Power. Minimize the distance between r_QP = |r|^{1/2} and quarter-power templates μ_QP. Make decisions according to:
â_QP(r) = argmin_a min_k d²( r_QP, μ_QP(θ_k, a) ),
θ̂_QP(r, a) = argmin_k d²( r_QP, μ_QP(θ_k, a) ).
Or, normalized by vector magnitude:
d²( r_QP / ‖r_QP‖, μ_QP(θ_k, a) / ‖μ_QP(θ_k, a)‖ ).
S. W. Worrell, et al., SPIE 3070, 1997; discussions with M. Bryant of Wright Laboratory
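The two template matchers on the last two slides share one structure, differing only in the feature map applied to the complex image; a minimal sketch, where the function names and toy templates are my own:

```python
import numpy as np

def db_features(r, eps=1e-12):
    """Log-magnitude features: r_dB = 20 log10 |r|."""
    return 20.0 * np.log10(np.abs(r) + eps)

def qp_features(r):
    """Quarter-power features: square root of the pixel magnitude."""
    return np.sqrt(np.abs(r))

def mse_classify(feat, templates):
    """Pick the class whose best orientation template minimizes squared distance.
    templates: dict mapping class label -> list of feature templates."""
    d2 = {a: min(float(np.sum((feat - t) ** 2)) for t in ts)
          for a, ts in templates.items()}
    return min(d2, key=d2.get)
```

An image matching one class's template exactly is assigned to that class, since its squared distance is zero; the normalized variants on the slides subtract the mean (log-magnitude) or divide by the vector norm (quarter power) before the same comparison.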
56 Conditionally Gaussian Results 56
57 Performance-Complexity Legend Forty combinations of number of piecewise constant intervals and training window width 57
58 Log-Magnitude Results Recognition without normalization Arithmetic mean normalized 58
59 Quarter Power Results Recognition without normalization Recognition with normalization 59
60 Normalized Conditionally Gaussian Results 60
61 Side-by-Side Results. Comparison in terms of: performance achievable at a given complexity; complexity required to achieve a given performance
62 Target Classification Results. Probability of correct classification: 97.2%. [Confusion matrix over the ten MSTAR classes — 2S1, BMP2, BRDM2, BTR60, BTR70, D7, T62, T72, ZIL131, ZSU23/4 — with per-class percentages; numerical entries lost in transcription]
63 Conclusions General problem in training/testing posed as estimation/classification Method of sieves (polynomial splines chosen) Comprehensive performance-complexity study for - Ten class MSTAR problem - Conditionally Gaussian model - Log-magnitude MSE - Quarter power MSE Provided a framework for direct comparison of alternatives and selection of implementation parameters Analysis ongoing 63
Introduction to Machine Learning Lecture 14 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Density Estimation Maxent Models 2 Entropy Definition: the entropy of a random variable
More informationZ. Hradil, J. Řeháček
Likelihood and entropy for quantum tomography Z. Hradil, J. Řeháček Department of Optics Palacký University,Olomouc Czech Republic Work was supported by the Czech Ministry of Education. Collaboration SLO
More informationVariable selection for model-based clustering
Variable selection for model-based clustering Matthieu Marbac (Ensai - Crest) Joint works with: M. Sedki (Univ. Paris-sud) and V. Vandewalle (Univ. Lille 2) The problem Objective: Estimation of a partition
More informationAdaptive Multi-Modal Sensing of General Concealed Targets
Adaptive Multi-Modal Sensing of General Concealed argets Lawrence Carin Balaji Krishnapuram, David Williams, Xuejun Liao and Ya Xue Department of Electrical & Computer Engineering Duke University Durham,
More informationNaïve Bayes Introduction to Machine Learning. Matt Gormley Lecture 18 Oct. 31, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Naïve Bayes Matt Gormley Lecture 18 Oct. 31, 2018 1 Reminders Homework 6: PAC Learning
More informationActive and Semi-supervised Kernel Classification
Active and Semi-supervised Kernel Classification Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London Work done in collaboration with Xiaojin Zhu (CMU), John Lafferty (CMU),
More informationQUANTIZATION FOR DISTRIBUTED ESTIMATION IN LARGE SCALE SENSOR NETWORKS
QUANTIZATION FOR DISTRIBUTED ESTIMATION IN LARGE SCALE SENSOR NETWORKS Parvathinathan Venkitasubramaniam, Gökhan Mergen, Lang Tong and Ananthram Swami ABSTRACT We study the problem of quantization for
More informationSubmodular Functions Properties Algorithms Machine Learning
Submodular Functions Properties Algorithms Machine Learning Rémi Gilleron Inria Lille - Nord Europe & LIFL & Univ Lille Jan. 12 revised Aug. 14 Rémi Gilleron (Mostrare) Submodular Functions Jan. 12 revised
More informationp(d θ ) l(θ ) 1.2 x x x
p(d θ ).2 x 0-7 0.8 x 0-7 0.4 x 0-7 l(θ ) -20-40 -60-80 -00 2 3 4 5 6 7 θ ˆ 2 3 4 5 6 7 θ ˆ 2 3 4 5 6 7 θ θ x FIGURE 3.. The top graph shows several training points in one dimension, known or assumed to
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationNONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function
More informationAN NONNEGATIVELY CONSTRAINED ITERATIVE METHOD FOR POSITRON EMISSION TOMOGRAPHY. Johnathan M. Bardsley
Volume X, No. 0X, 0X, X XX Web site: http://www.aimsciences.org AN NONNEGATIVELY CONSTRAINED ITERATIVE METHOD FOR POSITRON EMISSION TOMOGRAPHY Johnathan M. Bardsley Department of Mathematical Sciences
More informationStatistical Learning Theory and the C-Loss cost function
Statistical Learning Theory and the C-Loss cost function Jose Principe, Ph.D. Distinguished Professor ECE, BME Computational NeuroEngineering Laboratory and principe@cnel.ufl.edu Statistical Learning Theory
More informationOutline of Today s Lecture
University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Jeff A. Bilmes Lecture 12 Slides Feb 23 rd, 2005 Outline of Today s
More informationMarkov Chains and Hidden Markov Models
Chapter 1 Markov Chains and Hidden Markov Models In this chapter, we will introduce the concept of Markov chains, and show how Markov chains can be used to model signals using structures such as hidden
More informationREAL-TIME TIME-FREQUENCY BASED BLIND SOURCE SEPARATION. Scott Rickard, Radu Balan, Justinian Rosca. Siemens Corporate Research Princeton, NJ 08540
REAL-TIME TIME-FREQUENCY BASED BLIND SOURCE SEPARATION Scott Rickard, Radu Balan, Justinian Rosca Siemens Corporate Research Princeton, NJ 84 fscott.rickard,radu.balan,justinian.roscag@scr.siemens.com
More informationIntroduction to Machine Learning
Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 20: Expectation Maximization Algorithm EM for Mixture Models Many figures courtesy Kevin Murphy s
More informationCOS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION
COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 9: LINEAR REGRESSION SEAN GERRISH AND CHONG WANG 1. WAYS OF ORGANIZING MODELS In probabilistic modeling, there are several ways of organizing models:
More informationLogistic Regression. William Cohen
Logistic Regression William Cohen 1 Outline Quick review classi5ication, naïve Bayes, perceptrons new result for naïve Bayes Learning as optimization Logistic regression via gradient ascent Over5itting
More informationPART I INTRODUCTION The meaning of probability Basic definitions for frequentist statistics and Bayesian inference Bayesian inference Combinatorics
Table of Preface page xi PART I INTRODUCTION 1 1 The meaning of probability 3 1.1 Classical definition of probability 3 1.2 Statistical definition of probability 9 1.3 Bayesian understanding of probability
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationBayesian Modeling and Classification of Neural Signals
Bayesian Modeling and Classification of Neural Signals Michael S. Lewicki Computation and Neural Systems Program California Institute of Technology 216-76 Pasadena, CA 91125 lewickiocns.caltech.edu Abstract
More informationMAP Reconstruction From Spatially Correlated PET Data
IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL 50, NO 5, OCTOBER 2003 1445 MAP Reconstruction From Spatially Correlated PET Data Adam Alessio, Student Member, IEEE, Ken Sauer, Member, IEEE, and Charles A Bouman,
More informationDetection and Estimation Theory
ESE 54 Detection and Estiation Theory Joseph A. O Sullivan Sauel C. Sachs Professor Electronic Systes and Signals Research Laboratory Electrical and Systes Engineering Washington University 11 Urbauer
More informationLearning with Noisy Labels. Kate Niehaus Reading group 11-Feb-2014
Learning with Noisy Labels Kate Niehaus Reading group 11-Feb-2014 Outline Motivations Generative model approach: Lawrence, N. & Scho lkopf, B. Estimating a Kernel Fisher Discriminant in the Presence of
More informationThe Method of Types and Its Application to Information Hiding
The Method of Types and Its Application to Information Hiding Pierre Moulin University of Illinois at Urbana-Champaign www.ifp.uiuc.edu/ moulin/talks/eusipco05-slides.pdf EUSIPCO Antalya, September 7,
More informationScan Time Optimization for Post-injection PET Scans
Presented at 998 IEEE Nuc. Sci. Symp. Med. Im. Conf. Scan Time Optimization for Post-injection PET Scans Hakan Erdoğan Jeffrey A. Fessler 445 EECS Bldg., University of Michigan, Ann Arbor, MI 4809-222
More informationCS340 Winter 2010: HW3 Out Wed. 2nd February, due Friday 11th February
CS340 Winter 2010: HW3 Out Wed. 2nd February, due Friday 11th February 1 PageRank You are given in the file adjency.mat a matrix G of size n n where n = 1000 such that { 1 if outbound link from i to j,
More informationExpectation propagation for signal detection in flat-fading channels
Expectation propagation for signal detection in flat-fading channels Yuan Qi MIT Media Lab Cambridge, MA, 02139 USA yuanqi@media.mit.edu Thomas Minka CMU Statistics Department Pittsburgh, PA 15213 USA
More informationMachine Learning Lecture Notes
Machine Learning Lecture Notes Predrag Radivojac January 25, 205 Basic Principles of Parameter Estimation In probabilistic modeling, we are typically presented with a set of observations and the objective
More informationLINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
LINEAR CLASSIFIERS Classification: Problem Statement 2 In regression, we are modeling the relationship between a continuous input variable x and a continuous target variable t. In classification, the input
More informationStatistical Filters for Crowd Image Analysis
Statistical Filters for Crowd Image Analysis Ákos Utasi, Ákos Kiss and Tamás Szirányi Distributed Events Analysis Research Group, Computer and Automation Research Institute H-1111 Budapest, Kende utca
More informationNo. of dimensions 1. No. of centers
Contents 8.6 Course of dimensionality............................ 15 8.7 Computational aspects of linear estimators.................. 15 8.7.1 Diagonalization of circulant andblock-circulant matrices......
More informationIntroduction to Machine Learning
Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin
More informationPart 4: Conditional Random Fields
Part 4: Conditional Random Fields Sebastian Nowozin and Christoph H. Lampert Colorado Springs, 25th June 2011 1 / 39 Problem (Probabilistic Learning) Let d(y x) be the (unknown) true conditional distribution.
More informationMultivariate Regression
Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the
More informationBayesian Learning. Two Roles for Bayesian Methods. Bayes Theorem. Choosing Hypotheses
Bayesian Learning Two Roles for Bayesian Methods Probabilistic approach to inference. Quantities of interest are governed by prob. dist. and optimal decisions can be made by reasoning about these prob.
More informationHidden Markov Models
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Hidden Markov Models Matt Gormley Lecture 22 April 2, 2018 1 Reminders Homework
More information