Pose Estimation in SAR using an Information Theoretic Criterion


Jose C. Principe, Dongxin Xu, John W. Fisher III
Computational NeuroEngineering Laboratory, University of Florida

Abstract

This paper describes a pose estimation algorithm based on an information theoretic formulation. We formulate pose estimation statistically and show that pose can be estimated from a low dimensional feature space obtained by maximizing the mutual information between the aspect angle and the output of the mapper. We use the Havrda-Charvat definition of entropy to implement a nonparametric estimator based on the Parzen window method. Results on the MSTAR data set are presented and show the good performance of the methodology.

1.0 Introduction

Knowing the relative position of a vehicle with respect to the sensor (normally called the aspect angle of the observation, or the pose) is an important piece of information for vehicle recognition. Since pattern classifiers are statistical machines, without the pose information the classifier has to be trained with all possible poses to become invariant to aspect angle during operation. This is the principle of classifiers based on the synthetic discriminant function (SDF) so widely used in optical correlators [1], and of template based classifiers [2]. Even if the classifier is built around Bayesian principles or neural networks, all possible aspect angles have to be included during training to describe the object reliably. In SAR this is not a simple task due to the enormous variability of the scattering phenomenology created by man-made objects. This argument suggests that one could alternatively divide the classification into two stages: first find the pose of the object, and then decide the class by selecting a classifier trained exclusively for that pose. Notice that this approach drastically reduces the complexity of classifier training. This is in fact the principle used in the MSTAR architecture [3], where classification is divided into an indexing module followed by search and match.

However, the approach utilized in MSTAR is based on the traditional method of a priori selecting landmarks on the vehicle and then comparing them for the best match with a database of features taken at different angles. This solution has several drawbacks. First, it is computationally expensive (the search has to be done on-line). Second, it is highly dependent on the quality of the landmarks. Edges have proved useful in optical images, but in SAR point scatterers are normally preferred due to the different image formation characteristics. The issue is that point scatterers vary abruptly with depression angle and pose, so the stability of the method is still under investigation. Third, the size of the problem space increases drastically with the number of objects and the precision required when local features are utilized.

Instead of thinking that the system complexity is intrinsic to the problem [4], we submit that the problem formulation also affects the complexity of the solution. If the landmarks are local, then it is obvious that the problem does not scale up well. Our approach is to extract optimal features directly from the data by training an adaptive system. The advantages are the following. First, the method is very fast: once the system is trained, during testing the image is presented and the output of the system is the estimate of pose, i.e. we have created a content addressable memory (CAM). Any microprocessor can do this in real time. Second, the system is not sensitive to the detection of landmarks, which is a big advantage primarily when we do not know how much information is carried in the landmarks. Until the information theoretic formulation proposed here, this optimal feature extraction could only be done using principal component analysis (PCA) or linear discriminant analysis. PCA provides only global (rough) information about the objects (second order statistics), and the information it provides may not be directly related to pose, which is just one aspect of the input image, so the results may be disappointing. Our method of mutual information maximization, however, uses the full information contained in the probability density function (pdf), so it can exploit local information if that is needed to solve the problem, and the model parameters are tuned directly to the pose, which is our only interest here.

This paper starts with a statistical formulation of the problem of pose estimation, describes a method of computing entropy from samples and how to construct a mutual information estimator, and presents preliminary results on the MSTAR data set.

2.0 A statistical formulation of pose estimation

Suppose that we have collected data in pairs (x_i, a_i), i = 1, ..., N, where the image x_i can be regarded as a vector in a high dimensional space, x_i \in R^m (m is usually in the thousands), and a_i is a vector of ground truth information relative to the image contents. For the general case of pose estimation, a_i is a six dimensional vector containing the translational and rotational information [5]. Here we will treat the one degree of freedom (1DOF) pose estimation problem, where x_i is a SAR image of a land vehicle obtained at a given depression angle and a_i is the azimuth (aspect) angle of the vehicle. The MSTAR data set [6] can be readily utilized to test the accuracy of 1DOF pose estimation algorithms. In general, the estimation of the aspect angle (here called pose) given a particular image x \in R^m can be formulated as a MAP (maximum a posteriori probability) problem:

\hat{a} = \arg\max_a f_{A|X}(a|x)    (1)

where f_{A|X}(a|x) is the a posteriori probability density function (pdf) of the aspect angle A given x. This formulation implies that the best estimate of the aspect angle given x is the one which maximizes the a posteriori probability. Although the aspect angle A is a continuous variable, we can discretize it for convenience, where the possible values are a_i, i = 1, ..., N, i.e. all the angles in the training set. Since we have no a priori knowledge about the aspect angle, the uniform distribution is the most reasonable assumption about the probability density function of A, in the sense that it is the direct result of the MaxEnt principle [7]. Under these conditions, the above MAP problem is equivalent to ML (maximum likelihood):

\hat{a} = \arg\max_i P(a_i|x) = \arg\max_i \frac{P(a_i) f_{X|A}(x|a_i)}{f_X(x)} = \arg\max_i f_{X|A}(x|a_i)    (2)

where P(a_i|x), i = 1, ..., N, is the a posteriori probability of the discrete variable A given x, P(a_i) is the a priori probability of A, which here is the uniform distribution, i.e. P(a_i) = constant for i = 1, ..., N, f_{X|A}(x|a_i) is the conditional pdf of the image x for a particular aspect angle a_i, and f_X(x) is the marginal pdf of x.
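As a concrete illustration of (2), the minimal sketch below (our own, not the paper's implementation) discretizes the aspect angle and, assuming a Parzen estimate of each conditional density over some hypothetical low dimensional representation of the data, picks the angle with the largest likelihood. The names parzen_density and features_by_angle are assumptions made only for this example.

```python
import numpy as np

def gaussian_kernel(sq_dist, sigma2, k):
    """Isotropic Gaussian kernel in k dimensions evaluated at a squared distance."""
    return np.exp(-sq_dist / (2.0 * sigma2)) / ((2.0 * np.pi * sigma2) ** (k / 2.0))

def parzen_density(y, samples, sigma2):
    """Parzen window estimate of f(y) from an (N x k) array of samples."""
    sq_dist = np.sum((samples - y) ** 2, axis=1)
    return float(np.mean(gaussian_kernel(sq_dist, sigma2, samples.shape[1])))

def ml_pose(y, angles, features_by_angle, sigma2=0.01):
    """Equation (2): with a uniform prior over the discretized angles, the MAP
    estimate reduces to the angle a_i maximizing the conditional likelihood."""
    likelihoods = [parzen_density(y, features_by_angle[a], sigma2) for a in angles]
    return angles[int(np.argmax(likelihoods))]
```

Here features_by_angle maps each discretized angle to the training samples observed at (or near) that angle. As argued next, applying such a density estimate directly to the raw image x is impractical, which is what motivates the feature extraction step.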

Therefore, from (2), the problem becomes the estimation of the conditional pdf of x for all the possible angles a_i, i = 1, ..., N. Since x is a very high dimensional vector and any assumption about the form of the pdf is not appropriate for realistic pose estimation in SAR, a non-parametric method should be used. However, nonparametric pdf estimation of x becomes very unreliable, since x lies in a very high dimensional space and training data is limited. So dimensionality reduction, or feature extraction, is necessary in this case, which means that instead of estimating the angle directly from the image x, we estimate it from a feature space of the image x.

Generally, a feature is the output of a mapping. Let y = h(x, w) be a feature set for x, where h: R^m -> R^k is a mapping, also called the feature extractor, y \in R^k, k << m, and w is the parameter set of the feature extractor. Now, the problem according to (2) becomes:

\hat{a} = \arg\max_i f_{Y|A}(y|a_i),  y = h(x, w)    (3)

In this framework, the key issue is how to choose the parameter set w. We propose to apply Information Theory [8]. From the information theoretic point of view, a mapping or feature extractor is an information transmission channel. The parameters of the mapping should be chosen so that it transmits as much information as possible. Here, the problem requires that the mapping transmit the most information about the aspect angle, i.e. the feature y should best represent the aspect angle. According to Information Theory, the quantitative measure for this purpose is the mutual information between the feature y and the aspect angle a. So, parameter selection can be formulated as:

w_{opt} = \arg\max_w I(y = h(x, w), a)    (4)

where I(y, a) is the mutual information between y and a; that is, the optimal parameter set is the one which maximizes the mutual information between the feature and the angle. The mutual information measure relies directly on pdfs. As mentioned above, non-parametric pdf estimation should be used, so the Parzen window method [9] is selected here. Unfortunately, Shannon's mutual information measure becomes too complex to be implementable with the Parzen window pdf estimate. In the next section, we introduce our method of mutual information estimation based on the Havrda-Charvat entropy.

Pose estimation using the Havrda-Charvat entropy

Figure 1 shows the proposed block diagram for pose estimation: the image x is passed through the mapper y = f(x, w) to produce the two dimensional feature y = (y_1, y_2), the mutual information I(Y, A) between the feature and the aspect angles A is estimated, and it is used to adapt the parameters w.

Figure 1. Pose estimation with the MLP.

From Information Theory, the mutual information can be computed as the difference between the entropy and the conditional entropy:

I(Y, A) = H_{H2}(Y) - H_{H2}(Y|A)    (5)

where Y is the feature and A is the aspect angle. For reasons that are connected to the estimation of entropy from samples, here we utilize the Havrda-Charvat definition of entropy [10]

H_{H\alpha}(Y) = \frac{1}{1-\alpha} \left( \int f_Y(y)^{\alpha} dy - 1 \right)    (6)

with \alpha = 2, which will also be called the quadratic entropy. For a more in depth discussion of several definitions of entropy see [10]. So H_{H2}(Y) is the quadratic entropy of the output and H_{H2}(Y|A) is the conditional quadratic entropy. Since the MLP is a universal mapper [11], it is used in this application as the mapping function (here we use a 6,400x3x2 configuration). Now the problem can be described as finding the parameters w of the MLP so that the mutual information between the output of the MLP and the aspect angle is maximized, i.e. we let the output convey the most information about the aspect angle. We can think of this scheme as information filtering, as opposed to the more traditional image filtering so commonly utilized in image processing.

Suppose the training data set consists of pairs {x_i, a_i}, where x_i is a SAR image of a vehicle and a_i is its true azimuth (aspect) angle. The feature y_i = h(x_i, w) is a 2 dimensional vector (y_{1i}, y_{2i}), from which the aspect can be easily measured as the angle of the vector. We can discretize the angles uniformly around the curve described by the output vector, as shown in Figure 2, where a circumference is assumed for simplicity.

Figure 2. Structure for the angle information: the discretized angles a_0, a_1, ..., a_N are placed around a circumference in the (y_1, y_2) output plane.

In our problem formulation, the pose is a random variable which must be described statistically.
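The geometry of Figure 2 can be summarized in two operations, sketched below as an illustration (our own helper functions, not code from the paper): the pose of a test image is read back as the angle of the 2-D output vector, and a continuous aspect angle is assigned to its nearest discretized training angle.

```python
import numpy as np

def pose_from_feature(y):
    """Read the pose as the angle of the 2-D output (y1, y2); only the
    direction of the vector carries the pose, its amplitude does not."""
    return float(np.rad2deg(np.arctan2(y[1], y[0])) % 360.0)

def nearest_angle_index(a_deg, train_angles_deg):
    """Assign a continuous aspect angle to the closest discretized angle a_i."""
    diff = (np.asarray(train_angles_deg, dtype=float) - a_deg + 180.0) % 360.0 - 180.0
    return int(np.argmin(np.abs(diff)))
```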

We create a local structure by weighting the samples at adjacent angles:

samples:  a_{i-l}, ..., a_{i-1}, a_i, a_{i+1}, ..., a_{i+l}
weights:  w_{-l}, ..., w_{-1}, w_0, w_1, ..., w_l

The neighborhood size was experimentally set at l = 2 nearest neighbors, and the weighting was selected as a Gaussian decay. Effectively this arrangement says that there is a fuzzy correspondence between several possible angles and each one of the sampled points on the unit circumference.

The reason we selected the HC quadratic entropy is related to the Parzen window estimator presented in [12]. Let y_i \in R^k, i = 1, ..., N, be a set of samples from a random variable Y in the k-dimensional feature space. One interesting question is what entropy should be associated with this set of data points. One answer lies in the estimation of the data pdf by the Parzen window method using a Gaussian kernel:

f_Y(y) = \frac{1}{N} \sum_{i=1}^{N} G(y - y_i, \sigma^2)    (7)

where G(y, \sigma^2) = (2\pi)^{-k/2} \sigma^{-k} \exp\left( -\frac{y^T y}{2\sigma^2} \right) is the Gaussian kernel in k-dimensional space and \sigma^2 is the variance. When Shannon's entropy is used along with this pdf estimate, the measure becomes very complex. Fortunately, the HC quadratic entropy of (6) leads to a simpler form, and we obtain the following entropy measure for a set of discrete data points {y_i}:

H({y_i}) = H_{H2}(Y|{y_i}) = 1 - \int f_Y(y)^2 dy = 1 - V({y_i})

V({y_i}) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \int G(y - y_i, \sigma^2) G(y - y_j, \sigma^2) dy = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} G(y_i - y_j, 2\sigma^2)    (8)
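A direct transcription of the estimator (8) is sketched below with numpy (our own code; the kernel size sigma2 is a free parameter that has to be chosen by the user). The double sum over Gaussian kernels of pairwise differences is also what makes the method O(N^2) in the number of training exemplars, a point returned to later.

```python
import numpy as np

def gaussian(diff, sigma2):
    """G(d, sigma2): isotropic Gaussian kernel in k dimensions at difference d."""
    k = diff.shape[-1]
    norm = (2.0 * np.pi * sigma2) ** (k / 2.0)
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma2)) / norm

def information_potential(y, sigma2):
    """V({y_i}) of (8): (1/N^2) sum_i sum_j G(y_i - y_j, 2*sigma2)."""
    diffs = y[:, None, :] - y[None, :, :]           # N x N x k pairwise differences
    return float(np.mean(gaussian(diffs, 2.0 * sigma2)))

def quadratic_entropy(y, sigma2):
    """Havrda-Charvat quadratic entropy estimate H_2({y_i}) = 1 - V({y_i})."""
    return 1.0 - information_potential(y, sigma2)
```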

With this estimator, the mutual information related to the quadratic HC entropy becomes

I(Y, A) \approx \frac{1}{N} \sum_{i=1}^{N} \int \left[ \sum_{j=-l}^{l} w_j G(y - y_{i+j}, \sigma^2) \right]^2 dy - \int \left[ \frac{1}{N} \sum_{i=1}^{N} G(y - y_i, \sigma^2) \right]^2 dy    (9)

The second term estimates the entropy due to all the input images, while the first term estimates the conditional entropy. In order to train the MLP, we take the derivative of (9) with respect to the parameters and interpret it as an injected error for the back-propagation algorithm [12]. In this way, the feature extraction mapping for pose estimation can be obtained. After training, the test image x is presented to the MLP, and its output y is evaluated under the discrete conditional pdf f_{Y|A}(y|a_i) estimated in the output feature space; the pose is then obtained using (3).

3.0 Experimental Results

This algorithm was validated on the MSTAR public release database [6]. We trained the pose estimator with the BMP-2 vehicle class, type sn-c21, at a depression angle of 15 degrees. We simply clipped the chips (128x128) from pixel 20 to 99 both vertically and horizontally (giving an image chip size of 80x80) to preserve the image of the vehicle and its shadow. No fine centering of the vehicle was attempted. The training set was constructed from 53 chips taken approximately 3.5 degrees apart to cover angles from 0 to 180 degrees. The algorithm takes about 100 batch iterations to converge (very repeatable performance).
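For reference, the fixed crop described above amounts to the following sketch (assuming the chip has already been loaded as a 128x128 magnitude array; loading and any amplitude scaling are outside the scope of this example):

```python
import numpy as np

def crop_chip(chip):
    """Crop a 128x128 MSTAR chip to rows and columns 20-99, keeping the 80x80
    region that contains the vehicle and its shadow (no fine centering)."""
    assert chip.shape == (128, 128)
    return chip[20:100, 20:100]

def to_input_vector(chip):
    """Flatten the 80x80 crop into the 6,400-dimensional input of the MLP."""
    return crop_chip(chip).reshape(-1).astype(np.float64)
```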

In Figure 3 the circle at left (diamonds) represents the training results in the feature space. Notice that the MLP trained with our criterion created an output that is almost a perfect circle. The circle can be interpreted as the best output distribution to maximize the mutual information between the input and the pose. This result is intuitively appealing, but notice that it was discovered automatically by our algorithm (i.e. we did not enforce the circle as a desired response). The triangles at the left show typical results on a test set (the chips of the same vehicle not used for training). It is interesting that the amplitude for the test set fluctuates a lot, but the outputs tend to move inwards along the radial direction, preserving the quality of the pose estimation. This means that the algorithm created an output metric that preserves angle relationships, as we expected. The figure at the right shows the true and estimated pose; the vertical axis is the angle and the horizontal axis is the exemplar index.

Figure 3. BMP-2, sn-c21 (180 degree training). Left: output feature space (y1, y2); right: angle versus image number.

The testing was conducted on the rest of the chips from the same vehicle and on two other vehicle types (sn-9563 and sn-9566) which represent different configurations (all at the same depression angle). We also tested the pose estimator on a different class, the T-72, using the type sn-s7. Table 1 quantifies the results.

Table 1: Testing with training on BMP2/sn-c21

class/type      error mean (degrees)    error std. dev. (degrees)
BMP2/sn-c21
BMP2/sn-9563
BMP2/sn-9566
T72/sn-s7

Notice that the pose estimation error when testing on the same vehicle type is basically the same as the resolution of the training set (3.5 degrees), which means that the accuracy of the estimator is very good. Therefore we expect that more precise pose estimates are achievable by creating training sets with more images at a finer resolution in pose. Table 1 also shows that the algorithm generalizes very well both to other vehicle types and even to other vehicle classes.
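The error statistics reported in Table 1 presume an angular difference that wraps around; a small evaluation helper of the kind one might use (our own sketch, not code from the paper) is:

```python
import numpy as np

def angular_error(est_deg, true_deg, period=360.0):
    """Signed angular difference wrapped to (-period/2, period/2]."""
    d = (np.asarray(est_deg, dtype=float) - np.asarray(true_deg, dtype=float)) % period
    return np.where(d > period / 2.0, d - period, d)

def error_stats(est_deg, true_deg, period=360.0):
    """Mean and standard deviation of the absolute pose error in degrees."""
    e = np.abs(angular_error(est_deg, true_deg, period))
    return float(np.mean(e)), float(np.std(e))
```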

We notice a degradation in performance on the T72, but it is a smooth roll-off. If we want to obviate this degradation of performance with vehicle type, we should utilize more than one vehicle in the training set, which at the same time would address the resolution issue raised above. However, we have to state that the algorithm for mutual information estimation is O(N^2), which means that there is a practical limit on the number of exemplars N utilized in training.

In order to quantify the robustness of the algorithm to vehicle occlusion, we progressively replaced one vehicle image with the background (an image of the BMP2 not used in training). We observed that although the amplitude of the output feature decreased appreciably when the bright return of the vehicle was substituted by the darker background (the triangles in the left portion of Figure 4), the pose estimate held up remarkably well (right portion of Figure 4). In this case the pose was within +/- 5 degrees up to 50% occlusion and within +/- 10 degrees up to 95% occlusion (which occurs at increment 36 in the plot). In our opinion this smooth degradation is one of the advantages of using a distributed system as the mapper, and the same behavior has been extensively reported in the associative memory literature [13]. However, different occlusion directions may yield different performance (it all depends upon which portions of the image are occluded).

Figure 4. Results of pose estimation with vehicle occlusion. Vehicle pose is 58 degrees. Left: output feature space (y1, y2); right: estimated angle versus occlusion sequence.
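The occlusion sweep can be emulated by overwriting a growing portion of the chip with background values before it is cropped and presented to the network; the sketch below is an illustration only (the sweep direction and the background level are assumptions, since the text does not specify them).

```python
import numpy as np

def occlude(chip, fraction, background=None):
    """Replace the leading `fraction` of the chip columns with a background
    value, emulating progressive occlusion of the vehicle by clutter."""
    out = chip.astype(np.float64).copy()
    if background is None:
        background = float(np.median(out))   # crude estimate of the clutter level
    n_cols = int(round(fraction * out.shape[1]))
    out[:, :n_cols] = background
    return out

# Illustrative sweep from no occlusion to 95% occlusion:
# occluded_chips = [occlude(chip, f) for f in np.linspace(0.0, 0.95, 37)]
```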

4.0 Conclusions

This paper reports on our present efforts to create a robust and easy to implement pose estimator for SAR imagery. The need for such an algorithm stems from our goal of creating accurate and robust classifiers. Knowing the pose of the vehicle will streamline the size and training of the classifier module, which should translate into better performance.

Our pose estimation framework is statistical in nature and uses information directly, through the manipulation of entropy estimated from examples. We address the enormous complexity of the input space by creating an adaptive system with optimal parameters. This is probably the best way to deal with and conquer complexity. We project the input data onto a subspace such that some property of the input relevant to our problem is optimally preserved. This can be thought of as information filtering, as opposed to the more conventional signal filtering. The issue is the choice of the criterion for optimization. We were fully aware of the limitations of the second order methods traditionally utilized in pattern recognition, so we sought a method that would use the full information about the pdf of the input class. The mutual information between the feature and the pose becomes the criterion of choice. This criterion measures the uncertainty about the pose that remains given the feature (the output of the mapper). By maximizing the mutual information we are decreasing that uncertainty, i.e. we are transferring as much information as possible between the feature and the pose. There are also other reasons to use mutual information for classification, such as the decrease of the lower bound on the classification error according to Fano's inequality [14].

The big issue is the estimation of entropy from examples. In [12] we proposed a Parzen window estimate of the pdf, along with the mean squared difference between the uniform distribution and the estimated one, to manipulate the output entropy. The derivative of the criterion can be used as an injected error to adapt the parameters of our mapper (linear or nonlinear) using the backpropagation algorithm. In this paper we couple this entropy estimator with the Havrda-Charvat definition of quadratic entropy to come up with an estimator of mutual information.

The preliminary results of our method are very promising. We successfully trained our pose estimator with MSTAR vehicles.

The accuracy on the test set is similar to that on the training set for the same vehicle, and the performance degrades gracefully for other vehicle types. Even with severe occlusion of the training vehicle (up to 95% occlusion) we obtain estimates of pose within +/- 10 degrees.

Further testing of the algorithm is required, as well as further refinements to the theory. The image set is realistic, but still simple (1DOF). Extension to more degrees of freedom will be pursued next, as well as training with more vehicles. Our pose estimator is based on the fact that the angle is discretized; it is important for accuracy to utilize the angle as a continuous variable, which will require a new estimator for the conditional entropy. It is also important to understand the algorithm better and to compare its performance with alternative approaches. One of the bottlenecks of the method is that the computation is O(N^2), which imposes a practical limit on the size of the training set.

Acknowledgments: This work was partially supported by DARPA-Air Force grant F.

References

[1] Kumar B., Minimum variance synthetic discriminant functions, J. Opt. Soc. Am. A 3(1).
[2] Duda R. and Hart P., Pattern Classification and Scene Analysis, Wiley.
[3] MSTAR Kickoff Meeting Proceedings, Washington.
[4] Minardi M., Moving & stationary target acquisition and recognition, WL talk, September.
[5] Lowe D., Solving parameters of object models from image descriptions, in Proc. ARPA IU Workshop.
[6] MSTAR (Public) Targets, CDROM, Veda Inc., Ohio.
[7] Jaynes E., Information theory and statistical mechanics, Phys. Rev., vol. 106.
[8] Shannon C.E., A mathematical theory of communication, Bell Sys. Tech. J. 27, 1948.
[9] Parzen E., On the estimation of a probability density function and the mode, Ann. Math. Stat. 33, 1962, p. 1065.

[10] Kapur J.N., Measures of Information and Their Applications, John Wiley & Sons.
[11] Haykin S., Neural Networks: A Comprehensive Foundation, Macmillan, 1994.
[12] Fisher J. and Principe J., Entropy manipulation of arbitrary nonlinear mappings, in Proc. IEEE Workshop on Neural Networks for Signal Processing VII, pp. 14-23.
[13] Kohonen T., Self-Organization and Associative Memory, Springer-Verlag.
[14] Fisher J.W. III, Nonlinear Extensions to the Minimum Average Correlation Energy Filter, Ph.D. dissertation, Dept. of ECE, University of Florida.
