Machine Learning. Measuring Distance. several slides from Bryan Pardo


Why measure distance? Nearest neighbor requires a distance measure. Also: local search methods require a measure of locality (Friday); clustering requires a distance measure (later); search engines require a measure of similarity; etc.

Euclidean Distance. What people intuitively think of as distance (illustrated on a two-dimensional plot, axes Dimension 1 and Dimension 2):

d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}

Generalized Euclidean Distance:

d(x, y) = \left( \sum_{i=1}^{n} (x_i - y_i)^2 \right)^{1/2}

where x = (x_1, x_2, ..., x_n), y = (y_1, y_2, ..., y_n), and n = the number of dimensions.

L_p norms. Manhattan, Euclidean, and Hamming distance are all special cases of this (p changes the norm):

d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}

L_1 norm: Manhattan Distance. L_2 norm: Euclidean Distance. Hamming Distance: the L_1 norm with x_i, y_i \in \{0, 1\}.
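As a quick sketch (the function name is mine, not from the slides), the whole L_p family fits in one Python function:

```python
def lp_distance(x, y, p):
    """L_p distance: (sum over i of |x_i - y_i|^p)^(1/p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

# p selects the norm:
manhattan = lp_distance((0, 0), (3, 4), 1)            # L_1 norm
euclidean = lp_distance((0, 0), (3, 4), 2)            # L_2 norm
hamming = lp_distance((0, 0, 1, 0), (0, 1, 1, 1), 1)  # L_1 on binary vectors
```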

Weighting Dimensions (a think-pair-share exercise over two clustering figures). Put the point in the cluster with the closest center of gravity? Which cluster should the red point go in? How do I measure distance in a way that gives the right answer for both situations?


Weighted Norms. You can compensate by weighting your dimensions:

d(x, y) = \left( \sum_{i=1}^{n} w_i |x_i - y_i|^p \right)^{1/p}

This lets you turn your circle of equal distance into an ellipse with axes parallel to the dimensions of the vectors.
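Adding the weights is a one-line change to the L_p sketch (helper name mine):

```python
def weighted_lp_distance(x, y, w, p):
    """Weighted L_p distance: (sum over i of w_i * |x_i - y_i|^p)^(1/p)."""
    return sum(wi * abs(a - b) ** p for wi, a, b in zip(w, x, y)) ** (1.0 / p)

# With w = (1, 1) this is plain Euclidean distance; shrinking a weight
# stretches the circle of equal distance into an axis-aligned ellipse.
d = weighted_lp_distance((0, 0), (3, 4), (1.0, 0.25), 2)
```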

Mahalanobis distance. The region of constant Mahalanobis distance around the mean of a distribution forms an ellipsoid. The axes of this ellipsoid don't have to be parallel to the dimensions describing the vector. Images from: http://www.aiaccess.net/english/glossaries/glosmod/e_gm_mahalanobis.htm

Calculating Mahalanobis:

d(x, y) = \sqrt{(x - y)^T S^{-1} (x - y)}

This matrix S is called the covariance matrix and is calculated from the data distribution.
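A minimal NumPy sketch of that formula (variable names mine; S is estimated from made-up sample data with np.cov):

```python
import numpy as np

def mahalanobis(x, y, S):
    """d(x, y) = sqrt((x - y)^T S^-1 (x - y))."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(diff @ np.linalg.inv(S) @ diff))

# Estimate the covariance matrix S from a data distribution (rows = samples):
data = np.array([[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2]])
S = np.cov(data, rowvar=False)
d = mahalanobis(data[0], data.mean(axis=0), S)
```

With S equal to the identity matrix, this reduces to the ordinary Euclidean distance.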

Take-away on Mahalanobis: it is good for non-spherically symmetric distributions; it accounts for scaling of the coordinate axes; and it can reduce to the Euclidean distance (when S is the identity matrix).

What is a metric? A metric has these four qualities (otherwise, call it a measure):

d(x, y) = 0 iff x = y (reflexivity)
d(x, y) \geq 0 (non-negativity)
d(x, y) = d(y, x) (symmetry)
d(x, z) \leq d(x, y) + d(y, z) (triangle inequality)
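The four axioms are easy to spot-check numerically for a candidate measure; a small sketch for the Euclidean case (helper names mine):

```python
import itertools
import math
import random

def euclid(x, y):
    return math.dist(x, y)  # Euclidean distance (Python 3.8+)

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(10)]

# Spot-check the four metric axioms on sample points:
for x, y, z in itertools.product(pts, repeat=3):
    assert euclid(x, x) == 0.0                   # reflexivity
    assert euclid(x, y) >= 0.0                   # non-negativity
    assert euclid(x, y) == euclid(y, x)          # symmetry
    assert euclid(x, z) <= euclid(x, y) + euclid(y, z) + 1e-12  # triangle
```

A check like this can only falsify an axiom on samples, not prove it, but it catches measures such as driving distance with one-way streets, which fails the symmetry test.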

Metric or not? Driving distance with 1-way streets. Categorical stuff: is distance(Jazz to Blues to Rock) no less than distance(Jazz to Rock)?

Categorical Variables. Consider feature vectors for genre & vocals: Genre: {Blues, Jazz, Rock, Hip Hop}; Vocals: {vocals, no vocals}.

s1 = {rock, vocals}
s2 = {jazz, no vocals}
s3 = {rock, no vocals}

Which two songs are more similar?

One Solution: Hamming distance.

                        Blues  Jazz  Rock  Hip Hop  Vocals
s1 = {rock, vocals}       0     0     1      0        1
s2 = {jazz, no_vocals}    0     1     0      0        0
s3 = {rock, no_vocals}    0     0     1      0        0

Hamming Distance = number of different bits in two binary vectors.

Hamming Distance:

d(x, y) = \sum_{i=1}^{n} |x_i - y_i|

where x = (x_1, ..., x_n), y = (y_1, ..., y_n), and x_i, y_i \in \{0, 1\}.
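Applied to one-hot rows encoded as in the genre/vocals table, a Hamming sketch in Python:

```python
def hamming(x, y):
    """Number of positions at which two binary vectors differ."""
    return sum(a != b for a, b in zip(x, y))

# Columns: Blues, Jazz, Rock, Hip Hop, Vocals
s1 = (0, 0, 1, 0, 1)  # {rock, vocals}
s2 = (0, 1, 0, 0, 0)  # {jazz, no_vocals}
s3 = (0, 0, 1, 0, 0)  # {rock, no_vocals}
# s1 and s3 come out closest: they differ only in the Vocals bit.
```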

Defining your own distance (an example). How often does artist x quote artist y?

Quote Frequency Q    Beethoven  Beatles  Liz Phair
Beethoven                7         0         0
Beatles                  4         5         0
Liz Phair                ?         1         2

Let's build a distance measure!

Defining your own distance (an example), continued. With Q(x, y) = the value in the quote-frequency table above:

f(x, y) = \frac{Q(x, y)}{\sum_{z \in artists} Q(x, z)}

d(x, y) = 1 - f(x, y)

Missing data. What if, for some category on some examples, there is no value given? Approaches:
- Discard all examples missing the category
- Fill in the blanks with the mean value
- Only use a category in the distance measure if both examples give a value

Dealing with missing data:

d(x, y) = \frac{n}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i (x_i - y_i)^2

where w_i = 1 if both x_i and y_i are defined, else w_i = 0.
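A sketch of that formula, using None to mark a missing value (the function name and the None convention are mine, not from the slides):

```python
def distance_with_missing(x, y):
    """Squared distance over the dimensions defined in both examples,
    rescaled by n / sum(w_i) to stay comparable across pairs."""
    n = len(x)
    w = [1 if a is not None and b is not None else 0 for a, b in zip(x, y)]
    if sum(w) == 0:
        raise ValueError("no dimension is defined in both examples")
    total = sum((a - b) ** 2 for wi, a, b in zip(w, x, y) if wi)
    return (n / sum(w)) * total

# Dimension 2 is skipped; the rest is rescaled by 3/2:
d = distance_with_missing((1.0, None, 3.0), (2.0, 5.0, 3.5))
```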

Edit Distance. Query = string from a finite alphabet. Target = string from a finite alphabet. Cost of the edits = distance.

Target: C A G E D - -
Query:  C E A E D
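One standard way to compute it is the Wagner-Fischer dynamic program for Levenshtein distance. A sketch with unit edit costs (the slides do not commit to a specific cost model):

```python
def edit_distance(query, target):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn query into target (unit costs)."""
    m, n = len(query), len(target)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete everything
    for j in range(n + 1):
        d[0][j] = j  # insert everything
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if query[i - 1] == target[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

d = edit_distance("CEAED", "CAGED")
```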

Semantic Relatedness: d(Portland, Hippies) << d(Portland, Monster trucks)

Semantic Relatedness. Several measures have been proposed. One that works well is Milne-Witten: SR_MW(x, y) = the fraction of Wikipedia in-links to either x or y that link to both.
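Reading that definition as set overlap of in-link sets, a toy sketch (the in-link sets here are invented for illustration, and the published Milne-Witten measure uses a log-based formula; this implements only the slide's verbal description):

```python
def sr_mw(inlinks_x, inlinks_y):
    """Fraction of pages that link to either article and link to both."""
    return len(inlinks_x & inlinks_y) / len(inlinks_x | inlinks_y)

# Hypothetical in-link sets (page ids), made up for the example:
portland = {1, 2, 3, 4, 5}
hippies = {3, 4, 5, 6}
monster_trucks = {7, 8}
# sr_mw(portland, hippies) is high; sr_mw(portland, monster_trucks) is 0.
```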

One more distance measure: Kullback-Leibler divergence. It is related to entropy & information gain, and it is not a metric, since it is not symmetric. Take EECS 428: Information Theory to find out more.
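A short sketch showing the asymmetry (discrete distributions, log base 2; a demonstration only, not a full treatment):

```python
from math import log2

def kl_divergence(p, q):
    """D_KL(p || q) = sum over i of p_i * log2(p_i / q_i)."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = (0.5, 0.5)
q = (0.9, 0.1)
# kl_divergence(p, q) != kl_divergence(q, p): KL fails the symmetry axiom,
# so it is a divergence, not a metric.
```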