Multi-Modal Inverse Scattering for Detection and Classification of General Concealed Targets: Landmines, Targets Under Trees, Underground Facilities


1 Multi-Modal Inverse Scattering for Detection and Classification of General Concealed Targets: Landmines, Targets Under Trees, Underground Facilities

Research Team: Lawrence Carin, Leslie Collins, Qing Liu (Duke University); Waymond Scott and James McClellan (Georgia Institute of Technology); George Papanicolaou (Stanford University)

Sponsors: DARPA (Dr. Doug Cochran), ARO (Dr. Russell Harmon)

2 Research Overview
- Sensing of landmines, targets under trees, underground structures, and targets in a water channel are very distinct missions, although they all fall under the general problem of sensing concealed targets in the presence of a complex, stochastic environment
- Rather than focusing on one of these areas, we exploit their inter-relationships to investigate the general concealed-target problem
- Particular examples will be investigated by connecting members of the MURI to appropriate members of the user community (e.g., landmines: Army Countermine Office, Ft. Belvoir, VA)
- Undertake a multi-modal (multi-sensor) approach to effect inversion, with insights from the evolving inversion used to refine/optimize the inter- and intra-sensor parameters
- Exploit the fact that future (and current) DoD systems will rely increasingly on autonomous vehicles (e.g., multiple robots, UAVs, and UUVs that can be positioned to optimize the sensor platform as the inversion is undertaken)

3 Multi-Modal Inversion of General Targets Embedded in Arbitrary Stochastic Layered Media
[Diagram: general adaptive algorithms and phenomenological insights feed application-specific questions/issues across the problem areas: landmines, underground structures, concealed ground structures]

4 Multiple Sensors
[Flow diagram: multi-sensor, physics-based, constrained-RDOF processing. Phenomenology from forward models feeds direct inversion (statistical inversion, reverse-time migration, general nonlinear inversion) and adaptive processing (adaptive coarse-to-fine pruning, adaptive HMM, multi-sensor mutual information, multi-sensor adaptive Bayesian methods); these optimize intra-sensor and inter-sensor parameters in a loop until inversion confidence is high and the inversion is complete]

5 Key to Program Success: Close Coupling to User Community
- MURI team members have a track record of working together and with the responsible parties within the DoD (and contractors)
- To assure that the research addresses relevant issues, and is transitioned to the appropriate organizations, it is essential that the MURI team members work closely with the user community
- Close relationships exist with the DoD landmine community (NVESD), and similar relationships are planned for the TUT and underground-structures problems
- Research relevance is assured through the goal of processing data generated by the appropriate DoD community

6 Duke Team Presentations
8:50-9:20 Lawrence Carin, Optimal Experiments for Adaptive Sensing: Regression, Detection and Classification
9:20-9:50 Leslie Collins, Bayesian Adaptive Multi-Modality Processing for the Discrimination of Landmines
9:50-10:20 Qing Liu, Fast Forward Solvers for Electromagnetic and Elastic Scattering
10:20-10:45 Break
10:45-11:15 George Papanicolaou, Imaging in Clutter
11:15-12:15 Waymond Scott and James McClellan, Detection of Obscured Objects: Signal Processing & Modeling

7 Optimal Experiments for Adaptive Sensing: Regression, Detection and Classification
Lawrence Carin, Xuejun Liao, Shihao Ji and Yan Zhang
Department of Electrical & Computer Engineering, Duke University, Durham, NC

8 Outline
- Optimal adaptive regression, with application to EMI sensing of buried targets
- Optimal adaptive detection and classification, with application to multi-aspect sensing (SAR, UUV, UAV)
- Optimal adaptive training for buried targets, with application to UXO
- Future directions

9 Magnetization Tensor for EMI Sensing
Magnetic field of a rotated EMI dipole:
$$H_z(\Theta; \mathbf{r}_s, \omega_s) = \frac{(\mathbf{r}-\mathbf{r}_s)^T\, U^T M(\omega_s)\, U\, (\mathbf{r}-\mathbf{r}_s)}{\left[(\mathbf{r}-\mathbf{r}_s)^T(\mathbf{r}-\mathbf{r}_s)\right]^{3}}$$
$$M(\omega) = \mathrm{diag}\!\left[\; m_{p0} + \sum_k \frac{\omega\, m_{pk}}{\omega - j\omega_{pk}},\;\; m_{p0} + \sum_k \frac{\omega\, m_{pk}}{\omega - j\omega_{pk}},\;\; m_{z0} + \sum_k \frac{\omega\, m_{zk}}{\omega - j\omega_{zk}} \;\right]$$
The target EMI dipole parameters to be inverted are
$$\Theta = [\, m_{p0},\, m_{pk},\, \omega_{pk},\, m_{z0},\, m_{zk},\, \omega_{zk},\, x,\, y,\, z,\, \theta,\, \phi \,]^T$$
The sensor parameters that may be varied are
$$\mathbf{p} = [\, x_s,\, y_s,\, z_s,\, \omega_s \,]^T$$
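As a concrete reference, a minimal numerical sketch of this forward model in Python/NumPy follows. The Euler-angle convention for the rotation U and the overall field normalization are assumptions; the slide fixes only the structure of the expression.

```python
import numpy as np

def magnetization_tensor(omega, m_p0, m_pk, w_pk, m_z0, m_zk, w_zk):
    """Frequency-dependent magnetization tensor M(omega).

    m_pk, w_pk, m_zk, w_zk are arrays over the pole index k; the
    transverse (p) and axial (z) responses are sums of one-pole terms.
    """
    m_p = m_p0 + np.sum(omega * np.asarray(m_pk) / (omega - 1j * np.asarray(w_pk)))
    m_z = m_z0 + np.sum(omega * np.asarray(m_zk) / (omega - 1j * np.asarray(w_zk)))
    return np.diag([m_p, m_p, m_z])

def rotation(theta, phi):
    """Rotation U from sensor coordinates to the dipole frame
    (assumed convention: rotate by phi about z, then theta about y)."""
    cz, sz = np.cos(phi), np.sin(phi)
    cy, sy = np.cos(theta), np.sin(theta)
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    return Ry @ Rz

def h_z(r, r_s, omega_s, theta, phi, m_params):
    """Quasi-static z-component of the rotated dipole's field, for
    sensor position r_s and operating frequency omega_s."""
    d = np.asarray(r, dtype=float) - np.asarray(r_s, dtype=float)
    U = rotation(theta, phi)
    M = magnetization_tensor(omega_s, **m_params)
    return (d @ U.T @ M @ U @ d) / (d @ d) ** 3
```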

10 [Figure: mobile induction sensor above the ground, with sensor position r_s and observation point r in the sensor coordinate system]

11 [Figure: target dipole geometry; the dipole orientation is specified by angles θ and φ in the x-y-z sensor coordinate system]

12 Measurement Model
Set of N measurement parameters and observations, represented as $\{(\mathbf{p}_n, O_n)\}_{n=1}^{N}$
The observation is modeled as the magnetic field plus additive WGN:
$$O = H_z(\Theta; \mathbf{r}_s, \omega_s) + G$$
Given N measurements $\{(\mathbf{p}_n, O_n)\}_{n=1}^{N}$, the target model parameters are estimated as
$$\hat{\Theta} = \arg\min_{\Theta} g(\Theta) = \arg\min_{\Theta} \sum_{n=1}^{N} \left| H_z(\Theta; \mathbf{p}_n) - O_n(\mathbf{p}_n) \right|^2 \quad \text{subject to: } m_{p0}, m_{pk}, \omega_{pk}, m_{z0}, m_{zk}, \omega_{zk} \geq 0$$
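A hedged sketch of this constrained least-squares fit, using SciPy's bounded least-squares solver; `model(theta, p)` stands in for the $H_z$ forward model above, and the packing of $\Theta$ into a flat vector (with the nonnegativity constraints expressed as box bounds) is an assumption:

```python
import numpy as np
from scipy.optimize import least_squares

def fit_dipole(p_list, obs, model, theta0, lower, upper):
    """Estimate Theta from N (sensor-parameter, observation) pairs.

    Nonnegativity of the amplitude/pole parameters is imposed through
    the box bounds (lower, upper); complex residuals are stacked as
    real and imaginary parts so the 2-norm matches sum |H_z - O|^2.
    """
    obs = np.asarray(obs)

    def residuals(theta):
        pred = np.array([model(theta, p) for p in p_list])
        r = pred - obs
        return np.concatenate([r.real, r.imag])

    sol = least_squares(residuals, theta0, bounds=(lower, upper))
    return sol.x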

13 Statistical Characterization
The error between the observation and the model is modeled as WGN, where we are assuming that the model is correct:
$$e(\mathbf{p}; \Theta) = O - f(\mathbf{p}; \Theta) = e_R(\mathbf{p}, \Theta) + j\, e_I(\mathbf{p}, \Theta)$$
The associated density function is
$$p(e; \Theta) = \frac{\beta}{2\pi} \exp\!\left( -\frac{\beta}{2} e_R^2 \right) \exp\!\left( -\frac{\beta}{2} e_I^2 \right)$$
Associated Fisher information matrix (iid):
$$J = \beta \sum_{n=1}^{N} \mathrm{Re}\left\{ [\nabla_\Theta f(\mathbf{p}_n; \Theta)][\nabla_\Theta f(\mathbf{p}_n; \Theta)]^H \right\}$$
The assumption of additive WGN is analogous to linearization with arbitrary noise:
$$f(\mathbf{p}; \Theta) \approx f(\mathbf{p}; \hat{\Theta}) + [\nabla_\Theta f(\mathbf{p}; \hat{\Theta})]^T (\Theta - \hat{\Theta})$$

14 Information Gain of New Measurement
Measure of information in N iid measurements:
$$q(\{\mathbf{p}_1, \ldots, \mathbf{p}_N\}) = J(\mathbf{p}_1, \ldots, \mathbf{p}_N) = \sum_{n=1}^{N} J(\mathbf{p}_n), \qquad J(\mathbf{p}_n) = \beta\, \mathrm{Re}\left\{ [\nabla_\Theta f(\mathbf{p}_n; \Theta)][\nabla_\Theta f(\mathbf{p}_n; \Theta)]^H \right\}$$
Information gain from adding measurement N+1:
$$\delta q(\mathbf{p}) = \ln q(\{\mathbf{p}_1, \ldots, \mathbf{p}_N, \mathbf{p}_{N+1}\}) - \ln q(\{\mathbf{p}_1, \ldots, \mathbf{p}_N\}) = \ln\left| I + \beta\, F^T B_N^{-1} F \right|$$
$$F = [F_R, F_I] = \left[ \mathrm{Re}\{\nabla_\Theta f(\mathbf{p}_{N+1}; \Theta)\},\; \mathrm{Im}\{\nabla_\Theta f(\mathbf{p}_{N+1}; \Theta)\} \right], \qquad B_N = \sum_{n=1}^{N} J_n$$
Sensor parameters for measurement N+1:
$$\mathbf{p}_{N+1} = \arg\max_{\mathbf{p}} \delta q(\mathbf{p}) = \arg\max_{\mathbf{p}} \ln\left| I + \beta\, F^T B_N^{-1} F \right|$$
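A minimal sketch of this greedy selection over a discrete candidate set; `grad_f_of(p)` is a hypothetical callback assumed to return the complex gradient $\nabla_\Theta f(\mathbf{p}; \Theta)$ evaluated at the current parameter estimate:

```python
import numpy as np

def info_gain(grad_f, B_N, beta):
    """delta-q(p) = ln det(I + beta F^T B_N^{-1} F), where the columns
    of F are the real and imaginary parts of the complex gradient."""
    F = np.column_stack([grad_f.real, grad_f.imag])   # dim(Theta) x 2
    A = np.eye(F.shape[1]) + beta * F.T @ np.linalg.solve(B_N, F)
    _, logdet = np.linalg.slogdet(A)                  # 2 x 2 log-determinant
    return logdet

def next_measurement(candidates, grad_f_of, B_N, beta):
    """Sensor parameters for measurement N+1 (greedy over candidates)."""
    return max(candidates, key=lambda p: info_gain(grad_f_of(p), B_N, beta))
```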

15 Accounting for Measurement Cost
The previous theory chose new sensor parameters based on information gain alone; it is desirable to account as well for measurement cost. A simple augmentation of the cost function:
$$\psi(\mathbf{p}) = \delta q(\mathbf{p}) - \mu\, c(\mathbf{p}) = \ln\left| I + \beta\, F^T B_N^{-1} F \right| - \mu\, (\mathbf{p} - \mathbf{p}_N)^T W^T W (\mathbf{p} - \mathbf{p}_N)$$
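Reusing `info_gain` from the sketch above, the cost-penalized criterion might look as follows; the weighting W (e.g., trading travel distance against retuning) and the choice of μ are as on the slide:

```python
import numpy as np

def penalized_gain(p, p_prev, grad_f_of, B_N, beta, mu, W):
    """psi(p) = delta-q(p) - mu (p - p_prev)^T W^T W (p - p_prev)."""
    dp = W @ (np.asarray(p) - np.asarray(p_prev))
    return info_gain(grad_f_of(p), B_N, beta) - mu * (dp @ dp)
```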

16 [Figure: fixed measurement grid; digits mark sensor positions and stars mark source dipoles in the x-y plane (positions in meters)]

17 [Figure: sensor frequency (Hz) used in the fixed grid, versus iteration number]

18 [Figure: adaptively selected sensor positions (digits) and source dipoles (stars) in the x-y plane (positions in meters)]

19 [Figure: optimal sensor frequency (Hz) from the search strategy, versus iteration number]

20 [Figure: mean squared error (MSE) of the model-parameter estimate versus number of data points used, comparing optimal sampling points with fixed sampling points]

21 No Constraints on Sensor Motion
[Figure: selected sensor positions (digits) and source dipoles (stars) in the x-y plane (positions in meters)]

22 [Figure: optimal sensor frequency (Hz) as a function of iteration number]

23 Constrained Sensor Motion
[Figure: selected sensor positions (digits) and source dipoles (stars) in the x-y plane (positions in meters)]

24 [Figure: optimal sensor frequency (Hz) as a function of iteration number, under constrained sensor motion]

25 Information-Gain Surface, Measurement One

26 Information-Gain Surface, Measurement Two

27 Information-Gain Surface, Measurement Three

28 Information-Gain Surface, Measurement Four

29 Information-Gain Surface, Measurement Five

30 [Table: true dipole parameters versus dipole parameters fitted from all selected data and from the initial single datum, for dipoles 1-3, over m_p0, m_p1, ω_p1, m_z0, m_z1, ω_z1, x, y, z, θ, φ]

31 Outline
- Optimal adaptive regression, with application to EMI sensing of buried targets
- Optimal adaptive detection and classification, with application to multi-aspect sensing (SAR, UUV, UAV)
- Optimal adaptive training for buried targets, with application to UXO
- Future directions

32 Basic Construct
[Figure: sensor states S1-S4 distributed in angle around a target]
- Scattering data can be segmented into angular bins, each characterized by particular physics
- Each such angular range is termed a state (S1, S2, ..., SN)

33 Hidden Markov Models
[Figure: three-state HMM with initial probabilities π1, π2, π3, states s1, s2, s3, and transition probabilities a11, a12, a13, a21, a22, a23, a31, a33]

34 [Figure: Targets 1-5 with aspect angle φ indicated; Target 4 has ribs and Target 5 has internal oscillators]

35 Theory of Optimal Experiments
- In previous HMM classification studies the path of the sensor was fixed, with uniform spatial sampling
- For a UUV/UAV, one may be able to adaptively control the sensor path
- Question: can we optimize the UUV/UAV path to most efficiently classify a submerged target (mine)?
- Employ the theory of optimal experiments
- Assume that we wish to distinguish between K targets, and we perform t measurements, or observations, $O_t = \{O_1, \ldots, O_t\}$
- ML classification: choose target i if $p(O_t \mid T_i) > p(O_t \mid T_k)$ for all $T_k \neq T_i$

36 Theory of Optimal Experiments - 2
- Assume that we have performed t measurements, $O_t = \{O_1, \ldots, O_t\}$, at the relative angles $\theta_2, \ldots, \theta_t$
- Note: since we don't know the target orientation, we can only control the relative sensor angles with respect to the target
- Assume we have the target-dependent statistical models $p(O_1, \ldots, O_t \mid \theta_2, \ldots, \theta_t, T_k)$, for example from an HMM
- At what relative angle $\theta_{t+1}$ should we perform measurement t+1, to optimize target classification?

37 Theory of Optimal Experiments - 3
- Consider the probabilities $p(T_k \mid O_1, \ldots, O_t, \theta_2, \ldots, \theta_t)$ for all targets
- Idea: choose measurement t+1 to minimize the entropy characteristic of these density functions:
$$H_t(\theta_2, \ldots, \theta_t) = -\sum_{k=1}^{K} p(T_k \mid O_1, \ldots, O_t, \theta_2, \ldots, \theta_t)\, \log p(T_k \mid O_1, \ldots, O_t, \theta_2, \ldots, \theta_t)$$
- Issue: when considering different $\theta_{t+1}$ we do not actually have access to $O_{t+1}$
- Solution: we compute the expected entropy, integrating over our density function for $O_{t+1}$
- Key: with an HMM we need not assume independent measurements, and the HMM computation is efficient
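A sketch of the expected-entropy criterion over a discretized observation alphabet. Here `post` holds the current target posterior and `obs_models[k](theta)` is assumed to return the k-th target's HMM predictive distribution $p(O_{t+1} \mid O_1, \ldots, O_t, \theta, T_k)$; both names are hypothetical stand-ins:

```python
import numpy as np

def expected_entropy(theta_next, post, obs_models):
    """Expected posterior entropy after measuring at relative angle
    theta_next.  post[k] = p(T_k | data so far); obs_models[k](theta)
    returns a predictive pmf over a discrete observation alphabet."""
    K = len(post)
    lik = np.array([obs_models[k](theta_next) for k in range(K)])  # (K, |O|)
    p_o = post @ lik                  # predictive p(o), marginalized over targets
    H = 0.0
    for j, pj in enumerate(p_o):
        if pj <= 0:
            continue
        new_post = post * lik[:, j] / pj          # Bayes update for outcome j
        H -= pj * np.sum(new_post * np.log(new_post + 1e-300))
    return H

def next_angle(angles, post, obs_models):
    """Relative angle for measurement t+1: minimize expected entropy."""
    return min(angles, key=lambda th: expected_entropy(th, post, obs_models))
```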

38 Theory of Optimal Experiments - 4
We can perform this efficiently utilizing our previous work with HMMs.
[Figure: frequency (kHz) versus angle (deg) responses for Targets 1-5]

39 Confusion Matrix (5 points)
[Table: confusion matrix over targets T1-T5 for uniform sampling at 5°; average correct classification = 77.72%]
[Table: confusion matrix over targets T1-T5 for uniform sampling at 45°; average = 86.89%]

40 [Table: confusion matrix over targets T1-T5 for optimal sampling; average = 92.89%]

41 Objective Function
[Figure: change in entropy versus sampling angle, used to select θ_5]

42 [Figure: classification probability for Targets 1-5 versus number of observations, for one set of sample positions]

43 [Figure: classification probability for Targets 1-5 versus number of observations, for a second set of sample positions]

44 [Figure: entropy versus number of observations, comparing adaptive sampling with uniform 5-degree and 48-degree sampling]

45 Optimal Multi-Aspect Target Detection
- Previous results considered optimal sensing for the classification problem, assuming knowledge of HMMs for each of the targets
- Often we have HMMs for known targets but may come across objects we haven't seen previously
- Can we extend the theory of optimal experiments to the case for which we only have HMMs for a subset of the objects we may encounter?
- Relevant for target detection

46 Optimal Multi-Aspect Target Detection - 2
- Assume that we have performed t measurements, $O_t = \{O_1, \ldots, O_t\}$, at the relative angles $\theta_2, \ldots, \theta_t$
- The log-likelihood of these measurements, for the known target T, is $\log p(O_1, O_2, \ldots, O_t \mid \theta_1, \theta_2, \ldots, \theta_t, T)$
- We perform the next measurement at the change of angle $\theta_{t+1}$ that maximizes the expected increase in log-likelihood:
$$\theta_{t+1} = \arg\max_{\theta_{t+1}} E_{O_{t+1} \mid O_1, \ldots, O_t,\, \theta_1, \ldots, \theta_t,\, \theta_{t+1}} \left[ \log p(O_1, \ldots, O_t, O_{t+1} \mid \theta_1, \ldots, \theta_t, \theta_{t+1}, T) \right]$$
- This corresponds to minimizing the entropy (uncertainty) of whether the target under interrogation corresponds to target T
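For this detection variant the same machinery applies with a single known-target model: the expected increase in log-likelihood reduces to the negative entropy of the HMM's predictive distribution. A sketch over a discrete observation alphabet, with `pred_model(theta)` a hypothetical callback returning $p(O_{t+1} \mid O_1, \ldots, O_t, \theta, T)$:

```python
import numpy as np

def expected_loglik_gain(theta_next, pred_model):
    """E[log p(O_1..O_{t+1}|...)] - log p(O_1..O_t|...) equals
    E[log p(O_{t+1} | O_1..O_t, theta_next, T)], i.e. the negative
    entropy of the predictive distribution."""
    p = np.asarray(pred_model(theta_next))
    p = p[p > 0]
    return np.sum(p * np.log(p))

def next_angle_detection(angles, pred_model):
    """Change of angle maximizing the expected log-likelihood increase."""
    return max(angles, key=lambda th: expected_loglik_gain(th, pred_model))
```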

47 NSWC Data: 6 Targets
[Figure: acoustic responses (kHz) for six targets: Bullet, Plastic Cone, 55 Gal. Drum, Rock 1, Rock 2, Log]

48 [Figure: ROC curves (P_d versus P_f) between Target 5 and the Dobeck NSWC targets, comparing adaptive sampling with uniform 5-degree and 22-degree sampling]

49 [Figure: log-likelihood under uniform 22-degree sampling versus optimal sampling]

50 Outline
- Optimal adaptive regression, with application to EMI sensing of buried targets
- Optimal adaptive detection and classification, with application to multi-aspect sensing (SAR, UUV, UAV)
- Optimal adaptive training for buried targets, with application to UXO
- Future directions

51 Background
- UXO versus clutter: a classification problem
- Labeled data and unlabeled data
- Data are easy to observe, but excavation is very costly
- Reduce cost through smart (active) learning, with no prior training data required

52 Step 1: Collect Data

53 Step 2: Choose Representative Signatures (Basis Functions), No Digging

54 Step 3: Dig First Hole to Learn Label

55 Step 4: Refine Classifier and then Dig Hole Two

56 Step 5: Refine Classifier and then Dig Hole Three, etc.

57 Step 6: Obtain Set of Labeled Signatures and Complete Classifier Design after Information Gain Accrued By New Labels is Minimal

58 Classifier Design: A Sketch
Three parts to building the classifier:
1. Select basis functions sequentially to build the structure of the classifier, using all available (unlabeled) data. The structure is designed to capture the maximum information from the data set.
2. Sequentially select the data most informative to the classifier structure determined in Part 1, discover their labels, and refine the information of the structure.
3. Estimate the weights of the classifier with the labeled data selected in Part 2.

59 Classifier Design: Kernel Machine
A kernel machine is a linear classifier in a high-dimensional space; it can produce complex decision boundaries while keeping generalization ability high.
$$f_n(\mathbf{x}) = \sum_{i=1}^{n} w_{n,i}\, K(\mathbf{c}_i, \mathbf{x}) + w_{n,0} = \mathbf{w}_n^T \Phi_n(\mathbf{x})$$
Basis functions: $\Phi_n(\cdot) = [1,\, K(\mathbf{c}_1, \cdot),\, K(\mathbf{c}_2, \cdot),\, \ldots,\, K(\mathbf{c}_n, \cdot)]^T$
Weights: $\mathbf{w}_n = [w_{n,0},\, w_{n,1},\, w_{n,2},\, \ldots,\, w_{n,n}]^T$
Kernel: $K(\mathbf{c}_i, \mathbf{x})$ compares the similarity between $\mathbf{c}_i$ and $\mathbf{x}$ in the high-dimensional space
Relevant vectors: $\mathbf{c}_i,\; i = 1, \ldots, n$
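A minimal sketch of this decision function; the RBF form of K is an illustrative assumption, since the slides leave the kernel unspecified:

```python
import numpy as np

def rbf_kernel(c, x, gamma=1.0):
    """One plausible kernel choice; any similarity K(c, x) would do."""
    return np.exp(-gamma * np.sum((np.asarray(c) - np.asarray(x)) ** 2))

def phi(x, centers, kernel=rbf_kernel):
    """Basis vector Phi_n(x) = [1, K(c_1, x), ..., K(c_n, x)]^T."""
    return np.array([1.0] + [kernel(c, x) for c in centers])

def f(x, w, centers):
    """Kernel-machine output f_n(x) = w_n^T Phi_n(x)."""
    return np.asarray(w) @ phi(x, centers)
```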

60 Classifier Design: Selection of Basis Functions (1)
With the whole unlabeled data set $X = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$, basis functions are selected sequentially by a Fisher Information Criterion (FIC) to capture the data characteristics.
Assume a linear kernel machine has n basis functions; then the Fisher information matrix for the weights is
$$M_n = \sum_{i=1}^{N} \Phi_n(\mathbf{x}_i)\, \Phi_n^T(\mathbf{x}_i)$$
$M_n$ indicates the information from the data set X for the current structure of the classifier. Its inverse is the Cramer-Rao bound.

61 Classifier Design: Selection of Basis Functions (2)
When a new kernel $\Phi_{n+1}(\cdot) = K(\mathbf{c}_{n+1}, \cdot)$, $\mathbf{c}_{n+1} \in X$, is added to the basis functions, the new Fisher information matrix is
$$M_{n+1} = \sum_{i=1}^{N} \Phi_{n+1}(\mathbf{x}_i)\, \Phi_{n+1}^T(\mathbf{x}_i)$$
The information gain from adding a new kernel is defined as follows; it can also be viewed as the drop in the Cramer-Rao bound:
$$I_{\mathrm{kernel}} = \ln \mathrm{Det}(M_{n+1}) - \ln \mathrm{Det}(M_n)$$

62 Classifier Design: Selection of Basis Functions (3)
For the new kernel $K(\mathbf{c}_{n+1}, \cdot)$, the information gain is equal to
$$\ln I_{\mathrm{kernel}}(\mathbf{c}_{n+1}) = \ln\!\left[ \sum_{i=1}^{N} \phi_{n+1}^2(\mathbf{x}_i) - \left( \sum_{i=1}^{N} \Phi_n(\mathbf{x}_i)\, \phi_{n+1}(\mathbf{x}_i) \right)^{\!T} M_n^{-1} \left( \sum_{i=1}^{N} \Phi_n(\mathbf{x}_i)\, \phi_{n+1}(\mathbf{x}_i) \right) \right]$$
where $\phi_{n+1}(\cdot) = K(\mathbf{c}_{n+1}, \cdot)$. The optimal new kernel is then selected by a greedy search to maximize the information gain:
$$\mathbf{c}_{n+1} = \arg\max_{\mathbf{c} \in X \setminus \{\mathbf{c}_1, \ldots, \mathbf{c}_n\}} \ln I_{\mathrm{kernel}}(\mathbf{c})$$
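A sketch of the greedy FIC search. The gain above follows from the determinant identity for the bordered matrix $M_{n+1}$ (a Schur complement), so each candidate costs one linear solve; the kernel and pool layout follow the sketch after slide 59:

```python
import numpy as np

def logdet_gain(Phi, phi_new):
    """ln I_kernel for a candidate basis: phi_new holds the candidate
    kernel evaluated at all N data points, Phi is the current N x n
    design.  Uses ln det(M_{n+1}) - ln det(M_n) = ln(d - b^T M_n^{-1} b)."""
    M = Phi.T @ Phi
    b = Phi.T @ phi_new
    d = phi_new @ phi_new
    return np.log(d - b @ np.linalg.solve(M, b))

def select_bases(X, kernel, n_bases):
    """Greedy selection of kernel centers from the unlabeled pool X."""
    N = len(X)
    K = np.array([[kernel(ci, xj) for xj in X] for ci in X])  # candidate columns
    Phi = np.ones((N, 1))                   # start from the constant basis
    chosen = []
    for _ in range(n_bases):
        best = max((i for i in range(N) if i not in chosen),
                   key=lambda i: logdet_gain(Phi, K[i]))
        chosen.append(best)
        Phi = np.column_stack([Phi, K[best]])
    return chosen, Phi
```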

63 Classifier Design: Selection of Labeled Data (1)
After a set of the most informative basis functions $\Phi_{n_0}(\cdot)$ has been selected, labeled data are required to estimate the weights and thereby determine the classifier.
Another FIC is utilized to assign a priority to each unlabeled datum; those most informative to the estimation are labeled first by digging them up. The classifier is therefore determined with substantially less excavation effort, the most costly part of the UXO problem.

64 Classifier Design: Selection of Labeled Data (2)
Labeled-data selection is also a sequential process. Assuming we have a subset of labeled data $X_L \subset X$, $X_L = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_L\}$, the Fisher information matrix with the data set $X_L$ is
$$M_n(X_L) = \sum_{l=1}^{L} \Phi_n(\mathbf{x}_l)\, \Phi_n^T(\mathbf{x}_l)$$
When a new datum $\mathbf{x}_{L+1}$ is added, the new Fisher information matrix with the data set $X_{L+1}$ is
$$M_n(X_{L+1}) = M_n(X_L) + \Phi_n(\mathbf{x}_{L+1})\, \Phi_n^T(\mathbf{x}_{L+1})$$

65 Classifier Design: Selection of Labeled Data (3)
The information gain from adding a new datum is defined by
$$I_{\mathrm{data}} = \ln \mathrm{Det}(M_{n_0}(X_{L+1})) - \ln \mathrm{Det}(M_{n_0}(X_L))$$
For the datum $\mathbf{x}_{L+1}$, the information gain is
$$\ln I_{\mathrm{data}}(\mathbf{x}_{L+1}) = \ln\!\left[ 1 + \Phi_n^T(\mathbf{x}_{L+1})\, M_n^{-1}(X_L)\, \Phi_n(\mathbf{x}_{L+1}) \right]$$
The next most informative datum is then selected by a greedy search, and its label is recovered for the estimation of the weights:
$$\mathbf{x}_{L+1} = \arg\max_{\mathbf{x} \in X \setminus X_L} \ln I_{\mathrm{data}}(\mathbf{x})$$
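A sketch of the labeled-data selection; by the matrix-determinant lemma the gain reduces to the closed form above, so each candidate again costs one linear solve (the small ridge term is an assumption added for the early iterations, when $M_n(X_L)$ may be singular):

```python
import numpy as np

def data_gain(Phi_L, phi_x):
    """ln I_data(x) = ln(1 + Phi_n(x)^T M_n(X_L)^{-1} Phi_n(x))."""
    M = Phi_L.T @ Phi_L + 1e-8 * np.eye(Phi_L.shape[1])  # small ridge
    return np.log(1.0 + phi_x @ np.linalg.solve(M, phi_x))

def next_dig(Phi_all, labeled_idx):
    """Index of the unlabeled signature whose label is most informative."""
    Phi_L = Phi_all[labeled_idx]
    pool = [i for i in range(len(Phi_all)) if i not in labeled_idx]
    return max(pool, key=lambda i: data_gain(Phi_L, Phi_all[i]))
```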

66 Classifier Design: Weight Estimation
With $n_0$ basis functions $\Phi_{n_0}(\cdot)$ and $L_0$ labeled data $\{X_{L_0}, Y_{L_0}\}$ selected by the FIC, the weights $\mathbf{w}_{n_0}$ of the classifier are obtained via a Best Linear Unbiased Estimate (BLUE):
$$M_{n_0}(X_{L_0}) = \sum_{l=1}^{L_0} \Phi_{n_0}(\mathbf{x}_l)\, \Phi_{n_0}^T(\mathbf{x}_l), \qquad \mathbf{w}_{n_0} = M_{n_0}^{-1}(X_{L_0}) \sum_{l=1}^{L_0} \Phi_{n_0}(\mathbf{x}_l)\, y_l$$
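This is ordinary least squares on the selected labeled set, which is the BLUE under iid noise. A one-function sketch:

```python
import numpy as np

def blue_weights(Phi_L, y_L):
    """w_{n0} = M_{n0}(X_L)^{-1} sum_l Phi_{n0}(x_l) y_l."""
    M = Phi_L.T @ Phi_L
    return np.linalg.solve(M, Phi_L.T @ np.asarray(y_L))
```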

67 Overall Procedure
- Collect magnetometer and induction data at a given site (NO training data)
- Select a set of example signatures for the basis functions (NO digging required)
- Sequentially dig up UXO/clutter, one at a time, to learn the labels of particular signatures
- Using the signatures and labels, refine the classifier
- Using the new classifier, choose the next example to dig and learn its label
- Refine the classifier
- Continue digging and learning labels until the information gain to the classifier is minimal
- Classifier design complete
- Using the final classifier, discriminate UXO from clutter for the remaining un-excavated objects
(A hypothetical end-to-end driver is sketched below.)
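Tying the pieces together, a hypothetical driver built from the sketches above; `excavate` stands in for recovering a ground-truth label by digging, and the stopping threshold `min_gain` is an assumption:

```python
import numpy as np

def active_uxo_training(X, kernel, n_bases, excavate, min_gain=1e-3):
    """Sketch of the slide's procedure: choose bases from unlabeled
    signatures, then alternately dig the most informative item and
    refit, stopping when the information gain of a new label is minimal."""
    chosen, Phi = select_bases(X, kernel, n_bases)   # no digging required
    labeled, labels = [0], [excavate(0)]             # first hole (arbitrary)
    while len(labeled) < len(X):
        i = next_dig(Phi, labeled)
        if data_gain(Phi[labeled], Phi[i]) < min_gain:
            break                                    # accrued gain is minimal
        labeled.append(i)
        labels.append(excavate(i))                   # costly ground truth
    w = blue_weights(Phi[labeled], np.array(labels))
    return w, chosen, labeled
```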

68 Results
- The proposed method is tested with data from the JPG V demonstration site, which was deployed to assess different technologies for UXO detection and discrimination under realistic scenarios
- Both GEM-3 (induction) data and magnetometer data are considered. Features are extracted via model fitting; magnetometer feature extraction is similar to that of the induction sensor, but much simpler. A typical set of features is utilized for the classification
- The data come from two adjoining areas and include a total of 40 UXO: 60 mm and 81 mm mortars; 57 mm, 76 mm, 105 mm, 152 mm, and 155 mm projectiles; and 2.75 inch rockets. After the model-fitting-based prescreening operation, 260 clutter items remain

69 Results
- There are 16 UXO and 112 clutter items in Area 3, which was originally designated as the training area. We observe that most UXO types present in one area also appear in the other, which appears to reflect a manual effort to keep the training data matched to the test data; we feel this may not be feasible in practice
- The ROC curve of the proposed active-training method is presented in comparison with results of a standard Kernel Matching Pursuit (KMP) classifier from 100 random training selections. KMP is the classifier most closely related to the method proposed here, and it has achieved superior results compared with some state-of-the-art methods

70 Number of labeled data: 128

71 Number of labeled data: 90

72 Number of labeled data: 60

73 Number of labeled data: 40

74 Future Directions
- Thus far we have considered optimization of the parameters (e.g., position, frequency) of single sensors: EMI, acoustic, magnetometer
- We must now place optimal sensing into the context of multiple sensors
- Multiple sensors operating collectively (e.g., a team of robots and/or UAVs)
