Machine Learning. Topic 4: Measuring Distance

Machine Learning Topic 4: Measuring Distance
Bryan Pardo, Machine Learning: EECS 349, Fall 2009

Why measure distance?
Clustering requires distance measures.
Local methods require a measure of locality.
Search engines require a measure of similarity.

Euclidean Distance
What people intuitively think of as distance.
$$ d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2} $$
[Figure: two points plotted against the axes Dimension 1 and Dimension 2]

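Generalized Euclidean Distance
n = the number of dimensions
$$ d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^2 \right)^{1/2} $$
where $x = \{x_1, x_2, \ldots, x_n\}$, $y = \{y_1, y_2, \ldots, y_n\}$ and $x_i, y_i \in \mathbb{R}$.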

L^p norms
L^p norms are all special cases of this:
$$ d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p} $$
p changes the norm.
L^1 norm = Manhattan Distance: p = 1
L^2 norm = Euclidean Distance: p = 2
Hamming Distance: p = 1 and $x_i, y_i \in \{0, 1\}$
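As a rough illustration of how p selects the norm, here is a minimal Python sketch of the formula above. The function name `minkowski_distance` is a name chosen for this example and does not come from the slides.

```python
def minkowski_distance(x, y, p=2.0):
    """L^p distance between two equal-length numeric sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of dimensions")
    if p < 1:
        raise ValueError("p must be >= 1 for this to be a norm")
    # Sum the p-th powers of the per-dimension gaps, then take the p-th root.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


x, y = (0.0, 0.0), (3.0, 4.0)
manhattan = minkowski_distance(x, y, p=1)  # |3| + |4|
euclidean = minkowski_distance(x, y, p=2)  # sqrt(9 + 16)
# With p = 1 and binary vectors, the result counts the differing bits.
hamming = minkowski_distance((0, 1, 1, 0), (1, 1, 0, 0), p=1)
```

The same two points are 7 apart under the L^1 norm and 5 apart under the L^2 norm, so the choice of p changes which neighbors count as "close".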

Weighting Dimensions
Put the point in the cluster with the closest center of gravity.
Which cluster should the red point go in?
How do I measure distance in a way that gives the right answer for both situations?

Weighted Norms
You can compensate by weighting your dimensions.
$$ d(x, y) = \left( \sum_{i=1}^{n} w_i \, |x_i - y_i|^p \right)^{1/p} $$
This lets you turn your circle of equal distance into an ellipse with axes parallel to the dimensions of the vectors.
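A small sketch of the weighted norm, assuming non-negative per-dimension weights. The function name `weighted_minkowski` and the example weights are illustrative choices, not values from the slides.

```python
def weighted_minkowski(x, y, w, p=2.0):
    """Weighted L^p distance; w holds one non-negative weight per dimension."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    if any(wi < 0 for wi in w):
        raise ValueError("weights must be non-negative")
    return sum(wi * abs(a - b) ** p for a, b, wi in zip(x, y, w)) ** (1.0 / p)


origin = (0.0, 0.0)
a = (1.0, 0.0)    # 1 unit away along dimension 1
b = (0.0, 10.0)   # 10 units away along dimension 2

# Equal weights: a is the nearer point.
unweighted = (weighted_minkowski(origin, a, (1, 1)),
              weighted_minkowski(origin, b, (1, 1)))

# Down-weighting dimension 2 stretches the circle of equal distance
# into an axis-aligned ellipse, and b becomes the nearer point.
weighted = (weighted_minkowski(origin, a, (1, 0.001)),
            weighted_minkowski(origin, b, (1, 0.001)))
```

Changing only the weights flips which point is nearest, which is the effect the slide describes.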

Mahalanobis distance
The region of constant Mahalanobis distance around the mean of a distribution forms an ellipsoid.
The axes of this ellipsoid don't have to be parallel to the dimensions describing the vector.
Images from: http://www.aiaccess.net/english/glossaries/glosmod/e_gm_mahalanobis.htm

Calculating Mahalanobis
$$ d(x, y) = \sqrt{(x - y)^T S^{-1} (x - y)} $$
The matrix S is called the covariance matrix and is calculated from the data distribution; $S^{-1}$ is its inverse.
Let's look at the demo here: http://www.aiaccess.net/english/glossaries/glosmod/e_gm_mahalanobis.htm#animation%20mahalanobis
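To make the calculation concrete, here is a minimal sketch restricted to two dimensions so that the covariance matrix can be inverted by hand. The helper names, the sample points, and the use of the sample covariance (dividing by n - 1) are assumptions made for this example.

```python
import math


def covariance_2d(points):
    """Return (mean, S) for 2-D points, with S the sample covariance matrix."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis_2d(x, y, S):
    """sqrt((x - y)^T S^{-1} (x - y)) for a 2x2 covariance matrix S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    inv = ((d / det, -b / det), (-c / det, a / det))  # inverse of a 2x2 matrix
    dx, dy = x[0] - y[0], x[1] - y[1]
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)


# Data spread twice as widely along dimension 1 as along dimension 2.
data = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
mean, S = covariance_2d(data)
d_wide = mahalanobis_2d(mean, (2.0, 0.0), S)    # 2 units along the wide axis
d_narrow = mahalanobis_2d(mean, (0.0, 1.0), S)  # 1 unit along the narrow axis

# With the identity matrix as S, the distance is plain Euclidean distance.
d_identity = mahalanobis_2d((0.0, 0.0), (3.0, 4.0), ((1.0, 0.0), (0.0, 1.0)))
```

In this data set the two test points lie on the same ellipse of constant Mahalanobis distance, even though their Euclidean distances from the mean differ by a factor of two.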

Take-away on Mahalanobis
Is good for non-spherically symmetric distributions.
Accounts for scaling of coordinate axes.
Can reduce to Euclidean (when the covariance matrix is the identity).

What is a metric?
A metric has these four qualities; otherwise, call it a measure.
$d(x, y) = 0$ iff $x = y$ (reflexivity)
$d(x, y) \geq 0$ (non-negative)
$d(x, y) = d(y, x)$ (symmetry)
$d(x, y) + d(y, z) \geq d(x, z)$ (triangle inequality)
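The four qualities can be tested by brute force on a small finite set of points. The sketch below is only a spot check (passing it does not prove that a function is a metric in general), and `metric_violations` is a name invented for this example.

```python
from itertools import product


def metric_violations(points, d, tol=1e-12):
    """List every metric axiom that d breaks on the given finite set of points."""
    violations = []
    for x, y in product(points, repeat=2):
        dxy = d(x, y)
        if dxy < -tol:
            violations.append(("non-negativity", x, y))
        if (abs(dxy) <= tol) != (x == y):
            violations.append(("reflexivity", x, y))
        if abs(dxy - d(y, x)) > tol:
            violations.append(("symmetry", x, y))
    for x, y, z in product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z) - tol:
            violations.append(("triangle inequality", x, y, z))
    return violations


points = [0.0, 1.0, 2.0]
absolute = metric_violations(points, lambda a, b: abs(a - b))
squared = metric_violations(points, lambda a, b: (a - b) ** 2)
```

Absolute difference passes all four checks on these points, while squared difference fails the triangle inequality (going 0 to 1 to 2 costs 2, but going 0 to 2 directly costs 4), so it is a measure rather than a metric.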

Metric or not?
Driving distance with 1-way streets.
Categorical stuff: Is distance(Jazz to Blues to Rock) no less than distance(Jazz to Rock)?

Categorical Variables
Consider feature vectors for genre & vocals:
Genre: {Blues, Jazz, Rock, Zydeco}
Vocals: {vocals, no vocals}
s1 = {rock, vocals}
s2 = {jazz, no vocals}
s3 = {rock, no vocals}
Which two songs are more similar?

One Solution: Hamming distance

      Blues  Jazz  Rock  Zydeco  Vocals
s1      0     0     1      0       1
s2      0     1     0      0       0
s3      0     0     1      0       0

s1 = {rock, vocals}
s2 = {jazz, no_vocals}
s3 = {rock, no_vocals}
Hamming Distance = number of bits different between binary vectors

Hamming Distance
$$ d(x, y) = \sum_{i=1}^{n} |x_i - y_i| $$
where $x = \{x_1, x_2, \ldots, x_n\}$, $y = \{y_1, y_2, \ldots, y_n\}$ and $x_i, y_i \in \{0, 1\}$.
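The song example can be worked through in a few lines of Python. This is a sketch: the `encode` helper and the bit order (one bit per genre, then one bit for vocals) are assumptions chosen to mirror the genre and vocals features above.

```python
GENRES = ("blues", "jazz", "rock", "zydeco")


def encode(genre, has_vocals):
    """One bit per genre, followed by one bit for vocals."""
    bits = [1 if genre == g else 0 for g in GENRES]
    bits.append(1 if has_vocals else 0)
    return tuple(bits)


def hamming(x, y):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(x, y))


s1 = encode("rock", True)    # rock, vocals
s2 = encode("jazz", False)   # jazz, no vocals
s3 = encode("rock", False)   # rock, no vocals

d12 = hamming(s1, s2)
d13 = hamming(s1, s3)
d23 = hamming(s2, s3)
```

Under this encoding s1 and s3 are the most similar pair: they differ in a single bit. A genre mismatch costs two bits (one switched off, one switched on) while a vocals mismatch costs one, which is a side effect of the encoding worth keeping in mind.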

Defining your own distance (an example)
How often does artist x quote artist y?

Quote Frequency   Beethoven  Beatles  Liz Phair
Beethoven             7         0         0
Beatles               4         5         0
Liz Phair             ?         1         2

Let's build a distance measure!

Defining your own distance (an example)

Quote Frequency   Beethoven  Beatles  Liz Phair
Beethoven             7         0         0
Beatles               4         5         0
Liz Phair             ?         1         2

Quote frequency: $Q_f(x, y)$ = value in table
Distance:
$$ d(x, y) = 1 - \frac{Q_f(x, y)}{\sum_{z \in \text{Artists}} Q_f(x, z)} $$
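One way to turn the quote table into a distance is sketched below. Two things are assumptions here: that rows are the quoting artist and columns the quoted artist, and that d(x, y) is one minus the share of x's quotes that go to y. The Liz Phair row is left out because it contains an unknown entry.

```python
# quotes[x][y] = how often artist x quotes artist y (rows with known values only).
quotes = {
    "Beethoven": {"Beethoven": 7, "Beatles": 0, "Liz Phair": 0},
    "Beatles": {"Beethoven": 4, "Beatles": 5, "Liz Phair": 0},
}


def quote_distance(table, x, y):
    """1 - (quotes of y by x) / (all quotes by x); the normalisation is an assumption."""
    total = sum(table[x].values())
    if total == 0:
        raise ValueError(f"{x} quotes nobody, so the distance is undefined")
    return 1.0 - table[x][y] / total


d_beatles_beethoven = quote_distance(quotes, "Beatles", "Beethoven")
d_beethoven_beatles = quote_distance(quotes, "Beethoven", "Beatles")
d_beatles_self = quote_distance(quotes, "Beatles", "Beatles")
```

Under these assumptions the measure is not a metric: the Beatles are closer to Beethoven than Beethoven is to the Beatles (no symmetry), and the Beatles are at a non-zero distance from themselves (no reflexivity).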

Missing data
What if, for some category, on some examples, there is no value given?
Approaches:
Discard all examples missing the category.
Fill in the blanks with the mean value.
Only use a category in the distance measure if both examples give a value.

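Dealing with missing data
Give each dimension i a weight that records whether it can be used:
$$ w_i = \begin{cases} 0 & \text{if both } x_i \text{ and } y_i \text{ are defined} \\ 1 & \text{otherwise} \end{cases} $$
The distance d(x, y) is then defined in terms of these weights $w_i$ (the expression itself is illegible in the source).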
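A hedged sketch of the third approach (only use a dimension if both examples give a value). The slide's own formula is not legible in this transcription, so the rescaling by n / (n - number of unusable dimensions) is an assumption made here to keep distances comparable across pairs with different amounts of missing data. Missing values are represented by `None`, and the function name is hypothetical.

```python
def distance_with_missing(x, y, p=1.0):
    """L^p-style distance that skips dimensions where either value is None."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    # w_i = 0 if both values are defined, 1 otherwise.
    w = [0 if (a is not None and b is not None) else 1 for a, b in zip(x, y)]
    usable = n - sum(w)
    if usable == 0:
        raise ValueError("no dimension is defined in both examples")
    total = sum(abs(a - b) ** p for a, b, wi in zip(x, y, w) if wi == 0)
    # Assumed rescaling: act as if the usable dimensions stood in for all n.
    return (n / usable * total) ** (1.0 / p)


x = (1, None, 3, 4)
y = (2, 5, None, 6)
d_missing = distance_with_missing(x, y)                       # uses dimensions 1 and 4 only
d_complete = distance_with_missing((1, 2, 3), (2, 4, 6))      # nothing missing
```

With nothing missing, the function reduces to the ordinary L^1 distance.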

Edit Distance
Query = string from finite alphabet
Target = string from finite alphabet
Cost of Edits = Distance
Target: C A G E D - -
Query:  C E A E D
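The slide does not say which edit costs are used. Assuming unit cost for every insertion, deletion and substitution (the Levenshtein distance), the cost of edits can be computed with the standard dynamic-programming recurrence, sketched here. The function and parameter names are illustrative.

```python
def edit_distance(query, target, insert_cost=1, delete_cost=1, substitute_cost=1):
    """Minimum total cost of edits that turn query into target."""
    # prev[j] = cost of turning the empty prefix of query into target[:j].
    prev = [j * insert_cost for j in range(len(target) + 1)]
    for i in range(1, len(query) + 1):
        cur = [i * delete_cost]  # turn query[:i] into the empty string
        for j in range(1, len(target) + 1):
            same = query[i - 1] == target[j - 1]
            cur.append(min(
                prev[j] + delete_cost,                          # drop query[i-1]
                cur[j - 1] + insert_cost,                       # add target[j-1]
                prev[j - 1] + (0 if same else substitute_cost)  # match or replace
            ))
        prev = cur
    return prev[-1]


d_notes = edit_distance("CEAED", "CAGED")
d_words = edit_distance("kitten", "sitting")
```

With unit costs, the query CEAED is two edits away from the target CAGED, for example by substituting A for E and G for A. Different cost settings would give a different distance for the same pair of strings.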

One more distance measure
Kullback-Leibler divergence
Related to entropy & information gain.
Not a metric, since it is not symmetric.
Take EECS 428: Information Theory to find out more.
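The lack of symmetry is easy to see numerically. The sketch below uses the standard definition D_KL(P || Q) = sum_i p_i ln(p_i / q_i) for discrete distributions. The two example distributions are made up for illustration.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue            # by convention, 0 * ln(0 / q) contributes 0
        if qi == 0:
            return math.inf     # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


p = (0.5, 0.5)
q = (0.9, 0.1)
d_pq = kl_divergence(p, q)
d_qp = kl_divergence(q, p)
```

Here D_KL(P || Q) is about 0.51 nats and D_KL(Q || P) is about 0.37 nats, so swapping the arguments changes the answer.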