Multi-Modal Inverse Scattering for Detection and Classification Of General Concealed Targets:
1 Multi-Modal Inverse Scattering for Detection and Classification Of General Concealed Targets: Landmines, Targets Under Trees, Underground Facilities Research Team: Lawrence Carin, Leslie Collins, Qing Liu Duke University Waymond Scott and James McClellan Georgia Institute of Technology George Papanicolaou Stanford University Sponsors: DARPA (Dr. Doug Cochran), ARO (Dr. Russell Harmon)
2 Research Overview
Sensing of landmines, targets under trees, underground structures, and targets in a water channel are very distinct missions, although they fall under the general problem of sensing concealed targets in the presence of a complex, stochastic environment.
Rather than focusing on one of these areas, we exploit their inter-relationships to investigate the general concealed-target problem.
Particular examples will be investigated by connecting members of the MURI to appropriate members of the user community (e.g., landmines: Army Countermine Office, Ft. Belvoir, VA).
Undertake a multi-modal (multi-sensor) approach to effect inversion, with insights from the evolving inversion used to refine/optimize the inter- and intra-sensor parameters.
Will exploit the fact that future (and current) DoD systems will rely increasingly on autonomous vehicles (e.g., multiple robots, UAVs, and UUVs that can be positioned to optimize the sensor platform as the inversion is undertaken).
3 Multi-Modal Inversion of General Targets Embedded In Arbitrary Stochastic Layered Media
[Diagram: general adaptive algorithms and phenomenological insights feed application-specific questions/issues for landmines, underground structures, and concealed ground structures]
4 [Flow diagram: multiple sensors, with multi-sensor physics-based constrained RDOF and phenomenology from forward models, feed direct inversion (statistical inversion, reverse-time migration, general nonlinear inversion) and adaptive processing (adaptive coarse-to-fine pruning, adaptive HMM, multi-sensor mutual information, multi-sensor adaptive Bayesian); intra- and inter-sensor parameters are optimized until inversion confidence is high and the inversion is complete]
5 Key to Program Success: Close Coupling to User Community
MURI team members have a track record of working together and with the responsible parties within the DoD (and contractors).
To assure that the research addresses relevant issues, and is transitioned to the appropriate organizations, it is essential that the MURI team members work closely with the user community.
Close relationships exist with the DoD landmine community (NVESD), and similar relationships are planned for the TUT and underground-structures problems.
Research relevance is assured through the goal of processing data generated by the appropriate DoD community.
6 Duke Team Presentations
8:50-9:20 Lawrence Carin, Optimal Experiments for Adaptive Sensing: Regression, Detection and Classification
9:20-9:50 Leslie Collins, Bayesian Adaptive Multi-Modality Processing for the Discrimination of Landmines
9:50-10:20 Qing Liu, Fast Forward Solvers for Electromagnetic and Elastic Scattering
10:20-10:45 Break
10:45-11:15 George Papanicolaou, Imaging in Clutter
11:15-12:15 Waymond Scott and James McClellan, Detection of Obscured Objects: Signal Processing & Modeling
7 Optimal Experiments for Adaptive Sensing: Regression, Detection and Classification Lawrence Carin, Xuejun Liao, Shihao Yu and Yan Zhang Department of Electrical & Computer Engineering Duke University Durham, NC
8 Outline Optimal adaptive regression with application to EMI sensing of buried targets Optimal adaptive detection and classification with application to multi-aspect sensing (SAR, UUV, UAV) Optimal adaptive training for buried targets with application to UXO Future directions
9 Magnetization Tensor for EMI Sensing
Magnetic field of a rotated EMI dipole:
$$H_z(\Theta; \mathbf{r}_s, \omega_s) = \frac{(\mathbf{r}-\mathbf{r}_s)^T\, U^T M(\omega_s)\, U\, (\mathbf{r}-\mathbf{r}_s)}{\left[(\mathbf{r}-\mathbf{r}_s)^T(\mathbf{r}-\mathbf{r}_s)\right]^{3}}$$
$$M(\omega) = \mathrm{diag}\!\left[\, m_{p0} + \sum_k \frac{\omega\, m_{pk}}{\omega_{pk} + j\omega},\;\; m_{p0} + \sum_k \frac{\omega\, m_{pk}}{\omega_{pk} + j\omega},\;\; m_{z0} + \sum_k \frac{\omega\, m_{zk}}{\omega_{zk} + j\omega} \,\right]$$
The target EMI dipole parameters to be inverted are
$$\Theta = [\, m_{p0}, m_{pk}, \omega_{pk}, m_{z0}, m_{zk}, \omega_{zk}, x, y, z, \theta, \phi \,]^T$$
The sensor parameters that may be varied are
$$\mathbf{p} = [\, x_s, y_s, z_s, \omega_s \,]^T$$
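The dipole model on this slide can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' code: the function names and the toy parameter values are assumptions, and the range dependence follows the dipole form on this slide.

```python
import numpy as np

def magnetization_tensor(omega, m_p0, m_pk, w_pk, m_z0, m_zk, w_zk):
    """Frequency-dependent diagonal magnetization tensor M(omega);
    m_pk, w_pk, m_zk, w_zk are arrays over the pole index k."""
    m_p = m_p0 + np.sum(omega * m_pk / (w_pk + 1j * omega))
    m_z = m_z0 + np.sum(omega * m_zk / (w_zk + 1j * omega))
    return np.diag([m_p, m_p, m_z])

def dipole_field_z(r, r_s, U, M):
    """Dipole response H_z for a target at r observed from sensor
    position r_s, with target orientation matrix U."""
    d = r - r_s
    return (d @ U.T @ M @ U @ d) / (d @ d) ** 3

# toy target: one pole pair per axis, rotated by theta about the y-axis
theta = 0.3
U = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
M = magnetization_tensor(1e3, 1.0, np.array([0.5]), np.array([2e3]),
                         2.0, np.array([0.8]), np.array([1e3]))
Hz = dipole_field_z(np.array([0.0, 0.0, -0.5]),
                    np.array([0.2, 0.1, 0.1]), U, M)
```

Note how the azimuthal symmetry of the target appears directly in the tensor: the first two diagonal entries of M are identical.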
10 [Figure: mobile induction sensor at position r_s above a target at r]
11 [Figure: target dipole orientation (θ, φ) in the x-y-z sensor coordinate system]
12 Measurement Model
The set of N measurement parameters and observations is represented as $\{(\mathbf{p}_n, O_n)\}_{n=1}^N$. Each observation is modeled as the magnetic field plus additive WGN:
$$O = H_z(\Theta; \mathbf{r}_s, \omega_s) + G$$
Given the N measurements $\{(\mathbf{p}_n, O_n)\}_{n=1}^N$, the target model parameters are estimated as
$$\hat{\Theta} = \arg\min_{\Theta} g(\Theta) = \arg\min_{\Theta} \sum_{n=1}^{N} \left| H_z(\Theta; \mathbf{p}_n) - O_n(\mathbf{p}_n) \right|^2$$
$$\text{subject to: } m_{p0},\, m_{pk},\, \omega_{pk},\, m_{z0},\, m_{zk},\, \omega_{zk} \geq 0$$
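The least-squares fit on this slide can be illustrated with a toy scalar forward model standing in for H_z. Everything here is a hypothetical stand-in: a single-pole response replaces the full dipole model, and a crude grid search replaces the constrained nonlinear optimizer.

```python
import numpy as np

# hypothetical scalar forward model standing in for H_z: a single-pole
# frequency response with theta = (amplitude, pole frequency)
def forward(p, theta):
    amp, w0 = theta
    return amp * p / (w0 + 1j * p)          # p plays the role of omega_s

rng = np.random.default_rng(0)
theta_true = (2.0, 500.0)
p_n = np.array([50.0, 100.0, 400.0, 900.0])     # measurement frequencies
noise = 0.01 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
o_n = forward(p_n, theta_true) + noise          # observations O_n

def g(theta):
    """Least-squares misfit g(theta) = sum_n |f(p_n; theta) - O_n|^2."""
    r = forward(p_n, theta) - o_n
    return float(np.sum(np.abs(r) ** 2))

# crude grid search over the nonnegative parameters, standing in for
# the constrained nonlinear optimizer
amps = np.linspace(0.5, 4.0, 36)
w0s = np.linspace(100.0, 1000.0, 46)
theta_hat = min(((a, w) for a in amps for w in w0s), key=g)
```

With low noise the grid minimizer lands on the grid point nearest the true parameters, which is the behavior the adaptive-sampling results later in the talk improve upon.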
13 Statistical Characterization
The error between the observation and the model is modeled as WGN, where we assume the model is correct:
$$e(\mathbf{p}; \Theta) = O - f(\mathbf{p}; \Theta), \qquad e = e_R + j e_I$$
The associated density function is
$$p(e; \Theta) = \sqrt{\frac{\beta}{2\pi}}\, \exp\!\left(-\frac{\beta e_R^2}{2}\right) \sqrt{\frac{\beta}{2\pi}}\, \exp\!\left(-\frac{\beta e_I^2}{2}\right)$$
The associated Fisher information matrix (iid measurements) is
$$J = \beta \sum_{n=1}^{N} \mathrm{Re}\!\left\{ \left[\nabla_{\Theta} f(\mathbf{p}_n; \Theta)\right] \left[\nabla_{\Theta} f(\mathbf{p}_n; \Theta)\right]^H \right\}$$
The assumption of additive WGN is analogous to linearizing the model about $\hat{\Theta}$ with arbitrary noise:
$$f(\mathbf{p}; \Theta) \approx f(\mathbf{p}; \hat{\Theta}) + \left[\nabla_{\Theta} f(\mathbf{p}; \hat{\Theta})\right]^T (\Theta - \hat{\Theta})$$
14 Information Gain of New Measurement
A measure of the information in N iid measurements:
$$q(\{\mathbf{p}_1, \ldots, \mathbf{p}_N\}) = \left| J(\mathbf{p}_1, \ldots, \mathbf{p}_N) \right|, \qquad J(\mathbf{p}_1, \ldots, \mathbf{p}_N) = \sum_{n=1}^{N} J(\mathbf{p}_n)$$
$$J(\mathbf{p}_n) = \beta\, \mathrm{Re}\!\left\{ \left[\nabla_{\Theta} f(\mathbf{p}_n; \Theta)\right] \left[\nabla_{\Theta} f(\mathbf{p}_n; \Theta)\right]^H \right\}$$
Information gain from adding measurement N+1:
$$\delta q(\mathbf{p}) = \ln q(\{\mathbf{p}_1, \ldots, \mathbf{p}_N, \mathbf{p}_{N+1}\}) - \ln q(\{\mathbf{p}_1, \ldots, \mathbf{p}_N\}) = \ln \left| I + \beta\, F^T B_N^{-1} F \right|$$
$$F = [F_R, F_I] = \left[ \mathrm{Re}\{\nabla_{\Theta} f(\mathbf{p}_{N+1}; \Theta)\},\; \mathrm{Im}\{\nabla_{\Theta} f(\mathbf{p}_{N+1}; \Theta)\} \right], \qquad B_N = \sum_{n=1}^{N} J(\mathbf{p}_n)$$
Sensor parameters for measurement N+1:
$$\mathbf{p}_{N+1} = \arg\max_{\mathbf{p}} \delta q(\mathbf{p}) = \arg\max_{\mathbf{p}} \ln \left| I + \beta\, F^T B_N^{-1} F \right|$$
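The selection rule on this slide can be sketched directly, assuming the model gradients at the past measurements are available. The gradients below are random placeholders for ∇_Θ f; in practice they come from the forward model.

```python
import numpy as np

rng = np.random.default_rng(1)
beta = 100.0                       # inverse noise variance
dim = 4                            # dimension of Theta

# placeholder complex gradients grad_Theta f(p_n; Theta) at 6 past
# measurements (stand-ins for the forward-model gradients)
grads = rng.standard_normal((6, dim)) + 1j * rng.standard_normal((6, dim))
B_N = beta * sum(np.real(np.outer(g, g.conj())) for g in grads)

def info_gain(grad_new):
    """delta q(p) = ln det(I + beta F^T B_N^{-1} F), F = [Re grad, Im grad]."""
    F = np.column_stack([grad_new.real, grad_new.imag])
    G = np.eye(2) + beta * F.T @ np.linalg.solve(B_N, F)
    return float(np.log(np.linalg.det(G)))

# greedy choice among candidate next-measurement settings
candidates = rng.standard_normal((20, dim)) + 1j * rng.standard_normal((20, dim))
gains = [info_gain(c) for c in candidates]
best = int(np.argmax(gains))
```

Because the argument of the determinant is the identity plus a positive-semidefinite matrix, the information gain is never negative: a new measurement can only add information.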
15 Accounting for Measurement Cost
The previous theory chose new sensor parameters based on information gain alone; it is desirable to account as well for measurement cost. A simple augmentation of the cost function penalizes the change in sensor parameters:
$$\delta q_{\psi}(\mathbf{p}) = \ln \left| I + \beta\, F^T B_N^{-1} F \right| - \mu\, c(\mathbf{p}, \mathbf{p}_N), \qquad c(\mathbf{p}, \mathbf{p}_N) = (\mathbf{p} - \mathbf{p}_N)^T W^T W\, (\mathbf{p} - \mathbf{p}_N)$$
16 [Figure: sensor positions (digits) and source dipoles (stars) in the x-y plane, with sensing angles]
17 [Figure: sensor frequency (Hz) used in the fixed grid vs. iteration number]
18 [Figure: sensor positions and source dipoles in the x-y plane]
19 [Figure: optimal sensor frequency (Hz) from the search strategy vs. iteration number]
20 [Figure: mean squared error (MSE) of the model-parameter estimate vs. number of data points, for optimal and fixed sampling points]
21 No Constraints On Sensor Motion [Figure: sensor positions and source dipoles in the x-y plane]
22 [Figure: optimal sensor frequency (Hz) as a function of iteration number]
23 Constrained Sensor Motion [Figure: sensor positions and source dipoles in the x-y plane]
24 [Figure: optimal sensor frequency (Hz) as a function of iteration number]
25 Information-Gain Surface, Measurement One
26 Information-Gain Surface, Measurement Two
27 Information-Gain Surface, Measurement Three
28 Information-Gain Surface, Measurement Four
29 Information-Gain Surface, Measurement Five
30 [Table: true dipole parameters (m_p0, m_p1, ω_p1, m_z0, m_z1, ω_z1, x, y, z, θ, φ) for the true dipoles, compared with dipoles fitted from all selected data and from the initial single datum]
31 Outline Optimal adaptive regression with application to EMI sensing of buried targets Optimal adaptive detection and classification with application to multi-aspect sensing (SAR, UUV, UAV) Optimal adaptive training for buried targets with application to UXO Future directions
32 Basic Construct
[Figure: sensors S1-S4 viewing a target from different aspects]
Scattering data can be segmented into angular bins, each characterized by particular physics. Each such angular range is termed a state (S1, S2, ..., SN).
33 Hidden Markov Models
[Figure: three-state HMM with initial-state probabilities π_1, π_2, π_3, states s_1, s_2, s_3, and transition probabilities a_ij]
34 [Figure: five example targets at aspect angle φ, including a target with ribs and a target with internal oscillators]
35 Theory of Optimal Experiments
In previous HMM classification studies the path of the sensor was fixed, with uniform spatial sampling. For a UUV/UAV, one may be able to adaptively control the sensor path.
Question: can we optimize the UUV/UAV path to most efficiently classify a submerged target (mine)?
Employ the theory of optimal experiments. Assume that we wish to distinguish among K targets, and we perform t measurements, or observations, O_t = {O_1, ..., O_t}.
ML classification: choose target T_i if p(O_t | T_i) > p(O_t | T_k) for all k ≠ i.
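The ML rule on this slide can be sketched with a scaled forward algorithm for discrete-observation HMMs. The two-state models and emission tables below are toy placeholders, not the target models from this study.

```python
import numpy as np

def hmm_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm (pi: initial-state
    probabilities, A: transitions, B: per-state emission table)."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return ll

pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
# hypothetical emission tables for two targets (rows: states,
# columns: discrete scattering symbols)
B1 = np.array([[0.9, 0.1], [0.8, 0.2]])
B2 = np.array([[0.1, 0.9], [0.2, 0.8]])

obs = [0, 0, 1, 0, 0, 0]           # sequence typical of target 1
ll1 = hmm_loglik(obs, pi, A, B1)
ll2 = hmm_loglik(obs, pi, A, B2)
choice = 1 if ll1 > ll2 else 2     # ML classification rule
```

The running rescaling of alpha keeps the recursion numerically stable for long sequences while accumulating the exact log-likelihood.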
36 Theory of Optimal Experiments - 2
Assume that we have performed t measurements, yielding O_t = {O_1, ..., O_t} at the relative angles θ_2, ..., θ_t.
Note: since we don't know the target orientation, we can only control the relative sensor angles with respect to the target.
Assume we have the target-dependent statistical models p(O_1, ..., O_t | θ_2, ..., θ_t, T_k), for example from an HMM.
At what relative angle θ_{t+1} should we perform measurement t+1 to optimize target classification?
37 Theory of Optimal Experiments - 3
Consider the posterior probabilities p(T_k | O_1, ..., O_t, θ_2, ..., θ_t) for all targets.
Idea: choose measurement t+1 to minimize the entropy characteristic of these density functions:
$$H_t(\theta_2, \ldots, \theta_t) = -\sum_{k=1}^{K} p(T_k \mid O_1, \ldots, O_t, \theta_2, \ldots, \theta_t) \log p(T_k \mid O_1, \ldots, O_t, \theta_2, \ldots, \theta_t)$$
Issue: when considering different candidate angles θ_{t+1}, we do not actually have access to O_{t+1}.
Solution: we compute the expected entropy, integrating over our density function for O_{t+1}.
Key: with an HMM we need not assume independent measurements, and the HMM computation is efficient.
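The expected-entropy criterion can be sketched for a discrete observation alphabet. The posterior and the predictive tables below are invented for illustration; the angles are hypothetical candidates.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

post = np.array([0.5, 0.3, 0.2])    # current posterior over K=3 targets

# hypothetical predictive tables p(O_{t+1} | T_k, theta) over a binary
# observation alphabet, one table per candidate relative angle (deg)
pred = {
    15: np.array([[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]]),   # discriminative
    90: np.array([[0.6, 0.4], [0.6, 0.4], [0.6, 0.4]]),   # uninformative
}

def expected_entropy(P):
    """E_O[H(posterior after observing O)], O marginalized over targets."""
    p_obs = post @ P                         # p(O) = sum_k post_k p(O | T_k)
    h = 0.0
    for o, po in enumerate(p_obs):
        new_post = post * P[:, o] / po       # Bayes update for outcome o
        h += po * entropy(new_post)
    return h

best_angle = min(pred, key=lambda a: expected_entropy(pred[a]))
```

The uninformative angle leaves the expected entropy exactly at the current posterior entropy, while the discriminative angle strictly lowers it, so the criterion prefers the discriminative look.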
38 Theory of Optimal Experiments - 4
We can perform this efficiently utilizing our previous work with HMMs.
[Figure: responses for Targets 1-5, frequency (kHz) vs. aspect angle (deg)]
39 Confusion Matrix (5 points)
[Tables: confusion matrices over targets T1-T5]
Uniform sampling at 5°: average correct classification 77.72%.
Uniform sampling at 45°: average 86.89%.
40 Optimal sampling: average 92.89%.
41 Objective Function [Figure: change in entropy (δ) vs. sample rate, with θ_5 marked]
42 [Figure: classification probability for Targets 1-5 vs. number of observations, for one set of sample positions and δ]
43 [Figure: classification probability for Targets 1-5 vs. number of observations, for a second set of sample positions and δ]
44 [Figure: entropy vs. number of observations for adaptive, 5-degree, and 48-degree sampling]
45 Optimal Multi-Aspect Target Detection
The previous results considered optimal sensing for the classification problem, where we assumed knowledge of HMMs for each of the targets.
Often we have HMMs for known targets but may come across objects we haven't seen previously.
Can we extend the theory of optimal experiments to the case in which we have HMMs for only a subset of the objects we may encounter? This is relevant for target detection.
46 Optimal Multi-Aspect Target Detection - 2
Assume that we have performed t measurements, yielding O_t = {O_1, ..., O_t} at the relative angles θ_2, ..., θ_t.
The log-likelihood of these measurements, for the known target T, is log p(O_1, O_2, ..., O_t | θ_1, θ_2, ..., θ_t, T).
We perform the next measurement at the change of angle θ_{t+1} that maximizes the expected increase in log-likelihood:
$$\theta_{t+1} = \arg\max_{\theta_{t+1}} E_{O_{t+1} \mid O_1, \ldots, O_t, \theta_1, \ldots, \theta_t, \theta_{t+1}} \left[ \log p(O_1, \ldots, O_t, O_{t+1} \mid \theta_1, \ldots, \theta_t, \theta_{t+1}, T) \right]$$
This corresponds to minimizing the entropy (uncertainty) of whether the target under interrogation corresponds to target T.
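Under the model for target T, the expectation of the next log-likelihood increment over O_{t+1} is the negative entropy of the predictive distribution, so peaked predictions are preferred. A toy sketch of that comparison follows; the candidate angles and predictive tables are hypothetical.

```python
import numpy as np

angles = [5, 22, 48]                        # candidate relative angles (deg)
# hypothetical predictive distributions p(O_{t+1} | O_1..O_t, theta, T)
# over a 4-symbol observation alphabet, one row per candidate angle
p_next = np.array([[0.70, 0.10, 0.10, 0.10],
                   [0.40, 0.30, 0.20, 0.10],
                   [0.25, 0.25, 0.25, 0.25]])

def expected_loglik_gain(p):
    """E_O[log p(O | history, theta, T)] under the model itself,
    which equals the negative entropy of the predictive distribution."""
    return float(np.sum(p * np.log(p)))

gains = [expected_loglik_gain(p) for p in p_next]
best = angles[int(np.argmax(gains))]
```

The most peaked prediction wins, mirroring the slide's point that the rule drives the sensor toward looks that most confirm (or deny) the hypothesis "this is target T."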
47 NSWC Data: 6 Targets [Figure: log-magnitude responses vs. frequency (kHz) for targets including a bullet, plastic cone, 55-gal. drum, Rock 1, and Rock 2]
48 [Figure: ROC curves (P_d vs. P_f) between Target 5 and the NSWC targets (Dobeck), for adaptive, 5-degree, and 22-degree sampling]
49 [Figure: log-likelihood under uniform 22° sampling vs. optimal sampling]
50 Outline Optimal adaptive regression with application to EMI sensing of buried targets Optimal adaptive detection and classification with application to multi-aspect sensing (SAR, UUV, UAV) Optimal adaptive training for buried targets with application to UXO Future directions
51 Background
UXO versus clutter: a classification problem, with labeled and unlabeled data.
Data are easy to observe, but excavation is very costly.
Cost is reduced through smart learning, with no prior training data required.
52 Step 1: Collect Data
53 Step 2: Choose Representative Signatures (Basis Functions), No Digging
54 Step 3: Dig First Hole to Learn Label
55 Step 4: Refine Classifier and then Dig Hole Two
56 Step 5: Refine Classifier and then Dig Hole Three, etc.
57 Step 6: Obtain Set of Labeled Signatures and Complete Classifier Design after Information Gain Accrued By New Labels is Minimal
58 Classifier design: A Sketch
Building the classifier has three parts:
1. Select basis functions sequentially to build the structure of the classifier, using all available (unlabeled) data. The structure is designed to capture the maximum information from the data set.
2. Sequentially select the data most informative to the classifier structure determined in Part 1, discover their labels, and refine the information in the structure.
3. Estimate the weights of the classifier with the labeled data selected in Part 2.
59 Classifier design: Kernel Machine
A kernel machine is a linear classifier in a high-dimensional space; it can produce complex decision boundaries while keeping generalization ability high.
$$f_n(\mathbf{x}) = \sum_{i=1}^{n} w_{n,i}\, K(\mathbf{c}_i, \mathbf{x}) + w_{n,0} = \mathbf{w}_n^T \Phi_n(\mathbf{x})$$
Basis functions: $\Phi_n(\cdot) = [1, K(\mathbf{c}_1, \cdot), K(\mathbf{c}_2, \cdot), \ldots, K(\mathbf{c}_n, \cdot)]^T$
Weights: $\mathbf{w}_n = [w_{n,0}, w_{n,1}, w_{n,2}, \ldots, w_{n,n}]^T$
Kernel: $K(\mathbf{c}_i, \mathbf{x})$ measures the similarity between $\mathbf{c}_i$ and $\mathbf{x}$ in the high-dimensional space.
Relevant vectors: $\mathbf{c}_i,\; i = 1, \ldots, n$
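A minimal sketch of this kernel machine with an RBF kernel; the centers, weights, and kernel width here are illustrative choices, not values from the study.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def phi(x, centers, kernel=rbf):
    """Basis vector Phi_n(x) = [1, K(c_1, x), ..., K(c_n, x)]^T."""
    return np.array([1.0] + [kernel(c, x) for c in centers])

def f(x, centers, w):
    """Kernel-machine decision function f_n(x) = w_n^T Phi_n(x)."""
    return float(w @ phi(x, centers))

centers = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]   # relevant vectors
w = np.array([-0.5, 1.0, 1.0])                           # [w_0, w_1, w_2]
score = f(np.array([0.0, 0.0]), centers, w)
```

The sign of the score would give the UXO-versus-clutter decision; the leading 1 in the basis vector carries the bias weight w_0.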
60 Classifier design: Selection of Basis Functions (1)
With the whole unlabeled data set X = {x_1, x_2, ..., x_N}, basis functions are selected sequentially by a Fisher Information Criterion (FIC) to capture the data characteristics.
Assume a linear kernel machine has n basis functions; then the Fisher information matrix for the weights is
$$M_n = \sum_{i=1}^{N} \Phi_n(\mathbf{x}_i)\, \Phi_n^T(\mathbf{x}_i)$$
M_n indicates the information from the data set X for the current structure of the classifier. Its inverse is the Cramer-Rao Bound (CRB).
61 Classifier design: Selection of Basis Functions (2)
When a new kernel Φ_{n+1}(·) = K(c_{n+1}, ·), c_{n+1} ∈ X, is added to the basis functions, the new Fisher information matrix is
$$M_{n+1} = \sum_{i=1}^{N} \Phi_{n+1}(\mathbf{x}_i)\, \Phi_{n+1}^T(\mathbf{x}_i)$$
The information gain from adding the new kernel is defined as follows; it can also be viewed as the drop in the CRB:
$$I_{\mathrm{kernel}} = \ln \mathrm{Det}(M_{n+1}) - \ln \mathrm{Det}(M_n)$$
62 Classifier design: Selection of Basis Functions (3)
For the new kernel K(c_{n+1}, ·), writing φ_{n+1}(x) = K(c_{n+1}, x), the information gain is equal to
$$\ln I_{\mathrm{kernel}}(\mathbf{c}_{n+1}) = \ln\!\left( \sum_{i=1}^{N} \varphi_{n+1}^2(\mathbf{x}_i) - \left[\sum_{i=1}^{N} \varphi_{n+1}(\mathbf{x}_i)\, \Phi_n(\mathbf{x}_i)\right]^T M_n^{-1} \left[\sum_{i=1}^{N} \varphi_{n+1}(\mathbf{x}_i)\, \Phi_n(\mathbf{x}_i)\right] \right)$$
The optimal new kernel is then selected by a greedy search to maximize the information gain:
$$\mathbf{c}_{n+1} = \arg\max_{\mathbf{c} \in X \setminus \{\mathbf{c}_1, \ldots, \mathbf{c}_n\}} \ln I_{\mathrm{kernel}}(\mathbf{c})$$
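The greedy search can be implemented with a Schur-complement update: adding a kernel column changes the log-determinant of the Fisher matrix by the log of the column's squared norm orthogonal to the current basis. A sketch on synthetic data follows; the RBF kernel, its width, and the data sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((40, 2))          # unlabeled feature vectors

def kcol(X, c, gamma=0.5):
    """Candidate kernel column [K(c, x_1), ..., K(c, x_N)]^T (RBF)."""
    return np.exp(-gamma * np.sum((X - c) ** 2, axis=1))

def select_bases(X, n_bases):
    """Greedy FIC selection: repeatedly add the center whose kernel
    column maximizes ln(phi^T phi - b^T M_n^{-1} b), b = A^T phi."""
    A = np.ones((len(X), 1))              # design matrix, rows Phi_n(x_i)^T
    chosen = []
    for _ in range(n_bases):
        M_inv = np.linalg.inv(A.T @ A)    # inverse of current Fisher matrix
        best_gain, best_j = -np.inf, -1
        for j in range(len(X)):
            if j in chosen:
                continue
            col = kcol(X, X[j])
            b = A.T @ col
            gain = np.log(col @ col - b @ M_inv @ b)   # Schur complement
            if gain > best_gain:
                best_gain, best_j = gain, j
        chosen.append(best_j)
        A = np.column_stack([A, kcol(X, X[best_j])])
    return chosen, A

chosen, A = select_bases(X, 3)
```

Each iteration costs one matrix inverse plus a pass over the candidates, rather than recomputing a determinant per candidate.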
63 Classifier design: Selection of Labeled Data (1)
After a set of most-informative basis functions Φ_{n0}(·) has been selected, labeled data are required to estimate the weights and thereby determine the classifier.
A second FIC is utilized to assign a priority to each unlabeled datum, and those most informative for the estimation are labeled first by digging them out.
The classifier is therefore determined with substantially less excavation effort, which is the most costly part of the UXO problem.
64 Classifier design: Selection of Labeled Data (2)
Labeled-data selection is also a sequential process. Assume we have a subset of labeled data X_L ⊂ X, X_L = {x_1, x_2, ..., x_L}; the Fisher information matrix with the data set X_L is
$$M_n(X_L) = \sum_{l=1}^{L} \Phi_n(\mathbf{x}_l)\, \Phi_n^T(\mathbf{x}_l)$$
When a new datum x_{L+1} is added, the new Fisher information matrix with the data set X_{L+1} is
$$M_n(X_{L+1}) = M_n(X_L) + \Phi_n(\mathbf{x}_{L+1})\, \Phi_n^T(\mathbf{x}_{L+1})$$
65 Classifier design: Selection of Labeled Data (3)
The information gain from adding a new datum is defined by
$$I_{\mathrm{data}} = \ln \mathrm{Det}(M_{n_0}(X_{L+1})) - \ln \mathrm{Det}(M_{n_0}(X_L))$$
For the datum x_{L+1}, the information gain is
$$\ln I_{\mathrm{data}}(\mathbf{x}_{L+1}) = \ln\!\left( 1 + \Phi_n^T(\mathbf{x}_{L+1})\, M_n^{-1}(X_L)\, \Phi_n(\mathbf{x}_{L+1}) \right)$$
The next most-informative datum is then selected by a greedy search, and its label is recovered for the estimation of the weights:
$$\mathbf{x}_{L+1} = \arg\max_{\mathbf{x} \in X \setminus X_L} \ln I_{\mathrm{data}}(\mathbf{x})$$
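A sketch of this sequential labeling rule on synthetic basis vectors. The small ridge term is an added assumption of this sketch, to keep M invertible before enough data have been labeled.

```python
import numpy as np

rng = np.random.default_rng(4)
Phi = rng.standard_normal((100, 5))        # rows: basis vectors Phi_n(x_i)^T

def select_to_label(Phi, n_label, ridge=1e-6):
    """Pick data maximizing ln(1 + Phi^T M^{-1} Phi): the largest drop
    in the Cramer-Rao bound on the weights per excavated item."""
    labeled = []
    M = ridge * np.eye(Phi.shape[1])       # ridge keeps M invertible early
    for _ in range(n_label):
        M_inv = np.linalg.inv(M)
        gains = np.array([np.log1p(v @ M_inv @ v) for v in Phi])
        gains[labeled] = -np.inf           # never relabel the same datum
        j = int(np.argmax(gains))
        labeled.append(j)
        M = M + np.outer(Phi[j], Phi[j])   # rank-one Fisher update
    return labeled

picked = select_to_label(Phi, 8)
```

The rank-one structure of the update is what makes the per-datum gain a scalar, so the greedy search over all unlabeled data is cheap.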
66 Classifier design: Weights Estimation
With n_0 basis functions Φ_{n0}(·) and L_0 labeled data {X_{L0}, Y_{L0}} selected by FIC, the weights w_{n0} of the classifier are obtained via a Best Linear Unbiased Estimate (BLUE):
$$M_n(X_L) = \sum_{l=1}^{L} \Phi_n(\mathbf{x}_l)\, \Phi_n^T(\mathbf{x}_l)$$
$$\mathbf{w}_n = M_n^{-1}(X_L) \sum_{l=1}^{L} \Phi_n(\mathbf{x}_l)\, y_l$$
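On noise-free synthetic labels the BLUE reduces to solving M w = Σ_l Φ(x_l) y_l and recovers the generating weights exactly; the sizes and true weights below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
Phi_L = rng.standard_normal((30, 4))       # basis vectors of the labeled data
w_true = np.array([1.0, -2.0, 0.5, 3.0])   # hypothetical generating weights
y = Phi_L @ w_true                         # noise-free labels for the check

# w = M^{-1} sum_l Phi(x_l) y_l, with M = sum_l Phi(x_l) Phi(x_l)^T
M = Phi_L.T @ Phi_L
w_hat = np.linalg.solve(M, Phi_L.T @ y)
```

Using `solve` on the normal equations avoids forming the explicit inverse of M, which is both cheaper and better conditioned.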
67 Overall Procedure
Collect magnetometer and induction data at a given site (no training data).
Select a set of example signatures for the basis functions (no digging required).
Sequentially dig up UXO/clutter, one at a time, to learn the labels of particular signatures.
Using the signatures and labels, refine the classifier; using the new classifier, choose the next example to dig and learn its label, then refine again.
Continue digging and learning labels until the information gain to the classifier is minimal; the classifier design is then complete.
Using the final classifier, discriminate UXO from clutter among the remaining un-excavated objects.
68 Results
The proposed method is tested with data from the JPG V demonstration site, which was deployed to assess different technologies for UXO detection and discrimination under realistic scenarios. Both GEM-3 data and magnetometer data are considered.
Features are extracted via model fitting; magnetometer feature extraction is similar to that for the induction sensor, but much simpler. A typical set of features is utilized for the classification.
The data are from two adjoining areas and include a total of 40 UXO: 60 mm and 81 mm mortars; 57 mm, 76 mm, 105 mm, 152 mm, and 155 mm projectiles; and 2.75-inch rockets. After the model-fitting-based prescreening operation, 260 clutter items remain.
69 Results
There are 16 UXO and 112 clutter items in area 3, which was originally designated as the training area. We observe that most UXO types present in one area also appear in the other, which suggests a deliberate effort to keep the training data matched to the test data; we feel this may not be feasible in practice.
The ROC curve of the proposed active-training method is presented in comparison with results of a standard Kernel Matching Pursuit (KMP) classifier over 100 random training selections. KMP is the classifier most closely related to the method proposed here, and it has achieved superior results compared with several state-of-the-art methods.
70 [Figure: ROC comparison with 128 labeled data]
71 [Figure: ROC comparison with 90 labeled data]
72 [Figure: ROC comparison with 60 labeled data]
73 [Figure: ROC comparison with 40 labeled data]
74 Future Directions
Thus far we have considered optimization of the parameters (e.g., position, frequency) of single sensors: EMI, acoustic, and magnetometer.
We must now place optimal sensing in the context of multiple sensors operating collectively (e.g., a team of robots and/or UAVs).
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationBayesian Networks Inference with Probabilistic Graphical Models
4190.408 2016-Spring Bayesian Networks Inference with Probabilistic Graphical Models Byoung-Tak Zhang intelligence Lab Seoul National University 4190.408 Artificial (2016-Spring) 1 Machine Learning? Learning
More informationGeneralization Error on Pruning Decision Trees
Generalization Error on Pruning Decision Trees Ryan R. Rosario Computer Science 269 Fall 2010 A decision tree is a predictive model that can be used for either classification or regression [3]. Decision
More information(2) I. INTRODUCTION (3)
2686 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 5, MAY 2010 Active Learning and Basis Selection for Kernel-Based Linear Models: A Bayesian Perspective John Paisley, Student Member, IEEE, Xuejun
More informationHidden Markov Models (HMM) and Support Vector Machine (SVM)
Hidden Markov Models (HMM) and Support Vector Machine (SVM) Professor Joongheon Kim School of Computer Science and Engineering, Chung-Ang University, Seoul, Republic of Korea 1 Hidden Markov Models (HMM)
More informationLinear Models for Classification
Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,
More informationFinal Exam, Spring 2006
070 Final Exam, Spring 2006. Write your name and your email address below. Name: Andrew account: 2. There should be 22 numbered pages in this exam (including this cover sheet). 3. You may use any and all
More informationAugmented Statistical Models for Speech Recognition
Augmented Statistical Models for Speech Recognition Mark Gales & Martin Layton 31 August 2005 Trajectory Models For Speech Processing Workshop Overview Dependency Modelling in Speech Recognition: latent
More informationIntroduction to Logistic Regression
Introduction to Logistic Regression Guy Lebanon Binary Classification Binary classification is the most basic task in machine learning, and yet the most frequent. Binary classifiers often serve as the
More informationFeature selection. c Victor Kitov August Summer school on Machine Learning in High Energy Physics in partnership with
Feature selection c Victor Kitov v.v.kitov@yandex.ru Summer school on Machine Learning in High Energy Physics in partnership with August 2015 1/38 Feature selection Feature selection is a process of selecting
More informationDiscriminative Learning in Speech Recognition
Discriminative Learning in Speech Recognition Yueng-Tien, Lo g96470198@csie.ntnu.edu.tw Speech Lab, CSIE Reference Xiaodong He and Li Deng. "Discriminative Learning in Speech Recognition, Technical Report
More informationComplexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning
Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning Christos Dimitrakakis Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
More informationSLAM Techniques and Algorithms. Jack Collier. Canada. Recherche et développement pour la défense Canada. Defence Research and Development Canada
SLAM Techniques and Algorithms Jack Collier Defence Research and Development Canada Recherche et développement pour la défense Canada Canada Goals What will we learn Gain an appreciation for what SLAM
More informationProbabilistic Time Series Classification
Probabilistic Time Series Classification Y. Cem Sübakan Boğaziçi University 25.06.2013 Y. Cem Sübakan (Boğaziçi University) M.Sc. Thesis Defense 25.06.2013 1 / 54 Problem Statement The goal is to assign
More informationCSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18
CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$
More informationPassive Sonar Detection Performance Prediction of a Moving Source in an Uncertain Environment
Acoustical Society of America Meeting Fall 2005 Passive Sonar Detection Performance Prediction of a Moving Source in an Uncertain Environment Vivek Varadarajan and Jeffrey Krolik Duke University Department
More informationSparse Sensing for Statistical Inference
Sparse Sensing for Statistical Inference Model-driven and data-driven paradigms Geert Leus, Sundeep Chepuri, and Georg Kail ITA 2016, 04 Feb 2016 1/17 Power networks, grid analytics Health informatics
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More informationOverview of Statistical Tools. Statistical Inference. Bayesian Framework. Modeling. Very simple case. Things are usually more complicated
Fall 3 Computer Vision Overview of Statistical Tools Statistical Inference Haibin Ling Observation inference Decision Prior knowledge http://www.dabi.temple.edu/~hbling/teaching/3f_5543/index.html Bayesian
More informationDimension Reduction (PCA, ICA, CCA, FLD,
Dimension Reduction (PCA, ICA, CCA, FLD, Topic Models) Yi Zhang 10-701, Machine Learning, Spring 2011 April 6 th, 2011 Parts of the PCA slides are from previous 10-701 lectures 1 Outline Dimension reduction
More informationCS6375: Machine Learning Gautam Kunapuli. Decision Trees
Gautam Kunapuli Example: Restaurant Recommendation Example: Develop a model to recommend restaurants to users depending on their past dining experiences. Here, the features are cost (x ) and the user s
More informationVariations. ECE 6540, Lecture 10 Maximum Likelihood Estimation
Variations ECE 6540, Lecture 10 Last Time BLUE (Best Linear Unbiased Estimator) Formulation Advantages Disadvantages 2 The BLUE A simplification Assume the estimator is a linear system For a single parameter
More informationLECTURE 18: NONLINEAR MODELS
LECTURE 18: NONLINEAR MODELS The basic point is that smooth nonlinear models look like linear models locally. Models linear in parameters are no problem even if they are nonlinear in variables. For example:
More informationSupport Vector Machine. Industrial AI Lab. Prof. Seungchul Lee
Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /
More informationDoug Cochran. 3 October 2011
) ) School Electrical, Computer, & Energy Arizona State University (Joint work with Stephen Howard and Bill Moran) 3 October 2011 Tenets ) The purpose sensor networks is to sense; i.e., to enable detection,
More informationIntroduction to Maximum Likelihood Estimation
Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:
More informationParametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012
Parametric Models Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Maximum Likelihood Estimation Bayesian Density Estimation Today s Topics Maximum Likelihood
More informationECE521 Lecture7. Logistic Regression
ECE521 Lecture7 Logistic Regression Outline Review of decision theory Logistic regression A single neuron Multi-class classification 2 Outline Decision theory is conceptually easy and computationally hard
More informationCOMP 551 Applied Machine Learning Lecture 21: Bayesian optimisation
COMP 55 Applied Machine Learning Lecture 2: Bayesian optimisation Associate Instructor: (herke.vanhoof@mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp55 Unless otherwise noted, all material posted
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 18, 2016 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass
More informationRobotics 2 AdaBoost for People and Place Detection
Robotics 2 AdaBoost for People and Place Detection Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Wolfram Burgard v.1.0, Kai Arras, Oct 09, including material by Luciano Spinello and Oscar Martinez Mozos
More informationEmpirical Risk Minimization
Empirical Risk Minimization Fabrice Rossi SAMM Université Paris 1 Panthéon Sorbonne 2018 Outline Introduction PAC learning ERM in practice 2 General setting Data X the input space and Y the output space
More informationp(d θ ) l(θ ) 1.2 x x x
p(d θ ).2 x 0-7 0.8 x 0-7 0.4 x 0-7 l(θ ) -20-40 -60-80 -00 2 3 4 5 6 7 θ ˆ 2 3 4 5 6 7 θ ˆ 2 3 4 5 6 7 θ θ x FIGURE 3.. The top graph shows several training points in one dimension, known or assumed to
More informationProbabilistic Graphical Models: MRFs and CRFs. CSE628: Natural Language Processing Guest Lecturer: Veselin Stoyanov
Probabilistic Graphical Models: MRFs and CRFs CSE628: Natural Language Processing Guest Lecturer: Veselin Stoyanov Why PGMs? PGMs can model joint probabilities of many events. many techniques commonly
More informationA Bayesian Perspective on Residential Demand Response Using Smart Meter Data
A Bayesian Perspective on Residential Demand Response Using Smart Meter Data Datong-Paul Zhou, Maximilian Balandat, and Claire Tomlin University of California, Berkeley [datong.zhou, balandat, tomlin]@eecs.berkeley.edu
More informationLast updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
Last updated: Oct 22, 2012 LINEAR CLASSIFIERS Problems 2 Please do Problem 8.3 in the textbook. We will discuss this in class. Classification: Problem Statement 3 In regression, we are modeling the relationship
More informationRobust Monte Carlo Methods for Sequential Planning and Decision Making
Robust Monte Carlo Methods for Sequential Planning and Decision Making Sue Zheng, Jason Pacheco, & John Fisher Sensing, Learning, & Inference Group Computer Science & Artificial Intelligence Laboratory
More information1 [15 points] Frequent Itemsets Generation With Map-Reduce
Data Mining Learning from Large Data Sets Final Exam Date: 15 August 2013 Time limit: 120 minutes Number of pages: 11 Maximum score: 100 points You can use the back of the pages if you run out of space.
More informationCOMS 4721: Machine Learning for Data Science Lecture 20, 4/11/2017
COMS 4721: Machine Learning for Data Science Lecture 20, 4/11/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University SEQUENTIAL DATA So far, when thinking
More informationEstimation Tasks. Short Course on Image Quality. Matthew A. Kupinski. Introduction
Estimation Tasks Short Course on Image Quality Matthew A. Kupinski Introduction Section 13.3 in B&M Keep in mind the similarities between estimation and classification Image-quality is a statistical concept
More informationModeling Multiple-mode Systems with Predictive State Representations
Modeling Multiple-mode Systems with Predictive State Representations Britton Wolfe Computer Science Indiana University-Purdue University Fort Wayne wolfeb@ipfw.edu Michael R. James AI and Robotics Group
More informationCondition Monitoring of Single Point Cutting Tool through Vibration Signals using Decision Tree Algorithm
Columbia International Publishing Journal of Vibration Analysis, Measurement, and Control doi:10.7726/jvamc.2015.1003 Research Paper Condition Monitoring of Single Point Cutting Tool through Vibration
More informationData Structures for Efficient Inference and Optimization
Data Structures for Efficient Inference and Optimization in Expressive Continuous Domains Scott Sanner Ehsan Abbasnejad Zahra Zamani Karina Valdivia Delgado Leliane Nunes de Barros Cheng Fang Discrete
More informationSeq2Seq Losses (CTC)
Seq2Seq Losses (CTC) Jerry Ding & Ryan Brigden 11-785 Recitation 6 February 23, 2018 Outline Tasks suited for recurrent networks Losses when the output is a sequence Kinds of errors Losses to use CTC Loss
More informationOutline Lecture 2 2(32)
Outline Lecture (3), Lecture Linear Regression and Classification it is our firm belief that an understanding of linear models is essential for understanding nonlinear ones Thomas Schön Division of Automatic
More informationIntelligent Systems (AI-2)
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 24, 2016 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,
More informationIndependent Component Analysis and Unsupervised Learning
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien National Cheng Kung University TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent
More informationMachine Learning Basics: Maximum Likelihood Estimation
Machine Learning Basics: Maximum Likelihood Estimation Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics 1. Learning
More information