Some Reading. Clustering and Unsupervised Learning. Some Data. K-Means Clustering. CS 536: Machine Learning Littman (Wu, TA)

Size: px
Start display at page:

Download "Some Reading. Clustering and Unsupervised Learning. Some Data. K-Means Clustering. CS 536: Machine Learning Littman (Wu, TA)"

Transcription

1 Some Readng Clusterng and Unsupervsed Learnng CS 536: Machne Learnng Lttman (Wu, TA) Not sure what to suggest for K-Means and sngle-lnk herarchcal clusterng. Klenberg (00). An mpossblty theorem for clusterng n NIPS-15. Landauer, Laham, and Foltz (1999). Learnng human-lke knowledge by sngular value decomposton: A progress report n NIPS-10. K-Means Clusterng Sldes from Andrew Moore (CMU). Some Data Ths could easly be modeled by a Gaussan Mxture (wth 5 components) But let s look at a satsfyng, frendly and nfntely popular alternatve

2 Lossy Compresson Frst Idea Want to transmt ponts, lossly, wth only two bts per pont. Loss = Sum Squared Error between decoded and orgnal coordnates. What encoder/ decoder wll lose the least nformaton? Break nto a grd, decode each bt-par as the mddle of each grd-cell. Any better deas? Second Idea K-Means Break nto a grd, decode each bt-par as the centrod of all data n that grd-cell Other deas? Ask user how many clusters he d lke. (say, k=5) Randomly guess k cluster center locatons Each datapont fnds out whch center t s closest to. Each center fnds the centrod of the ponts t owns and umps there Repeat untl termnated!

3 K-Means Illustrated For fast mplementaton, see: Dan Pelleg and Andrew Moore. Acceleratng Exact k- means Algorthms wth Geometrc Reasonng. Proc. Conference on Knowledge Dscovery n Databases 1999, (KDD99) (avalable on

4

5 K-Means Questons What s t tryng to optmze? Are we sure t wll termnate? Are we sure t wll fnd an optmal clusterng? How should we start t? How could we automatcally choose the number of centers?.we ll deal wth these questons over the next few sldes

6 Dstorton Gven.. an encoder functon: ENCODE :! m " [1..k] a decoder functon: DECODE : [1..k] "! m Defne Dstorton = R!( x " DECODE[ ENCODE( x )]) = 1 We may as well wrte DECODE[ ] = c so Dstorton = R! = 1 ( x " c ENCODE ( x )) The Mnmal Dstorton What propertes must centers c 1, c,, c k have when dstorton s mnmzed? Dstorton = R! = 1 ( x " c ENCODE ( x )) (1) x must be encoded by ts nearest center.why? c ENCODE ( x ) = arg mn ( x! c ) c "{ c, c,... c } 1...at the mnmal dstorton k The Mnmal Dstorton () Partal dervatve of Dstorton wth respect to each center locaton must be zero. (Center at centrod). Dstorton " Dstorton " c = = = = = R # = 1 k = 1 $ OwnedBy( c ) " " c ( x! c # # $ OwnedBy( c )! ENCODE( x ) # # ( x! c ( x! c ( x! c $ OwnedBy( c ) 0 (for a mnmum) ) ) ) ) OwnedBy(c ) = the set of records owned by center c. Improvng the Confguraton What propertes can be changed for centers c 1, c,, c k have when dstorton s not mnmzed? (1) Change encodng so that x s encoded by ts nearest center () Set each center to the centrod of ponts t owns. There s no pont applyng ether operaton twce n successon. But t can be proftable to alternate. And that s K-means! Easy to prove ths procedure wll termnate n a state at whch nether (1) or () change the confguraton. Why?

7 Is It Optmal? Not necessarly. Can you nvent a confguraton that has converged, but does not have the mnmum dstorton? Seekng Good Optma Idea 1: Be careful about where you start. (Use dataponts, far apart f possble.) Idea : Do many runs of K-Means, each from a dfferent random startng pont. Many deas floatng around. Common Uses of K-Means Often used as an exploratory data analyss tool In one-dmenson, a good way to quantze realvalued varables nto k non-unform buckets Used on acoustc data n speech understandng to convert waveforms nto one of k categores (known as Vector Quantzaton) Also used for choosng color palettes on old fashoned graphcal dsplay devces! (Note: How choose numbers of centers?) Sngle Lnkage Herarchcal Clusterng 1. Say Every pont s ts own cluster. Fnd most smlar par of clusters 3. Merge t nto a parent cluster 4. Repeat

8 Some Detals How do we defne smlarty between clusters? Mnmum dstance between ponts n clusters (smply Eucldan Mnmum Spannng Trees) Maxmum dstance between ponts n clusters Average dstance between ponts n clusters You re left wth a nce dendrogram (taxonomy, herarchy, tree) of dataponts Sngle Lnkage Comments It s nce that you get a herarchy nstead of an amorphous collecton of groups If you want k groups, ust cut the (k-1) longest lnks There s no real statstcal or nformatontheoretc foundaton to ths. Also known n the trade as Herarchcal Agglomeratve Clusterng (note the acronym) Stuff to Know All the detals of K-means The theory behnd K-means as an optmzaton algorthm How K-means can get stuck The outlne of herarchcal clusterng Clusterng Desderata Abstractly, clusterng algorthm f maps dstance matrx D between obects to partton P of obects. What s an deal clusterng algorthm? 1. Rchness: for all P, exsts D s.t. f(d) = P.. Scale nvarance: for all c>0 f(d) = f(cd). 3. Consstency: for any D, f f(d)=p, then for any D wth smaller wthn-cluster dstances and larger between-cluster dstances, f(d )=P.!

9 Any Par of These Possble Usng sngle-lnk clusterng, we can vary the stoppng condton and get: Rchness and consstency: Stop when dstances are more than r. Scale nvarance and consstency: Stop when number of clusters s k. Rchness and scale nvarance: Stop when dstances exceed r/max D. All Three Impossble! It s mpossble to specfy a clusterng algorthm that satsfes all three condtons smultaneously (Klenberg 0). Essentally, the proof shows that consstency and scale nvarance severely constran what parttons can be created. Why Clusterng? Whch brngs up the queston, why dd we want to do clusterng anyway? What s the task? Can we accomplsh the same thngs by assgnng dstances or smlartes to pars of obects? Alternatves to Clusterng An embeddng algorthm e maps dstances D to locatons L n some vector space so that dst(l, L ) approxmates D. multdmensonal scalng (MDS) sngular value decomposton (SVD) non-negatve matrx factorzaton (NMF) non-lnear embeddng (LLE, Isomap) and more No hard decson on clusters.

10 Latent Semantc Analyss Start wth a set of text documents and form a term by document matrx M. Say the smlarty between two words s defned by the dot product of the vectors of documents n whch the words appear. Near synonyms often not assgned hgh smlarty score f they tend not to cooccur. Clusterng could help. Dfferent dea: embed the words n a lower dmensonal space, capturng smlarty. Wll tend to drop nsgnfcant dfferences. Example LSA Applcatons Tran on 30k+ documents, 16k+ words. Scores at human level n vocabulary, multple-choce psych exams. Can grade essay tests. Recognzes synonyms and antonyms as beng closely related (but separable). Smlarty patterns more lke kds (r=.5) than adults (r=.3). Exhbts prmng-lke effects. Useful for IR, and cross-language IR. Can udge coherence, predct what to read. Open Queston What s somethng we can do wth clusterng that we don t know how to do wth embeddng? hgh-level representatons? composablty? collectng cluster-based statstcs? Lots to thnk about.

Instance-Based Learning and Clustering

Instance-Based Learning and Clustering Instane-Based Learnng and Clusterng R&N 04, a bt of 03 Dfferent knds of Indutve Learnng Supervsed learnng Bas dea: Learn an approxmaton for a funton y=f(x based on labelled examples { (x,y, (x,y,, (x n,y

More information

VQ widely used in coding speech, image, and video

VQ widely used in coding speech, image, and video at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng

More information

Clustering gene expression data & the EM algorithm

Clustering gene expression data & the EM algorithm CG, Fall 2011-12 Clusterng gene expresson data & the EM algorthm CG 08 Ron Shamr 1 How Gene Expresson Data Looks Entres of the Raw Data matrx: Rato values Absolute values Row = gene s expresson pattern

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

Which Separator? Spring 1

Which Separator? Spring 1 Whch Separator? 6.034 - Sprng 1 Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng 3 Margn of a pont " # y (w $ + b) proportonal

More information

Chapter Newton s Method

Chapter Newton s Method Chapter 9. Newton s Method After readng ths chapter, you should be able to:. Understand how Newton s method s dfferent from the Golden Secton Search method. Understand how Newton s method works 3. Solve

More information

Chapter 8 SCALAR QUANTIZATION

Chapter 8 SCALAR QUANTIZATION Outlne Chapter 8 SCALAR QUANTIZATION Yeuan-Kuen Lee [ CU, CSIE ] 8.1 Overvew 8. Introducton 8.4 Unform Quantzer 8.5 Adaptve Quantzaton 8.6 Nonunform Quantzaton 8.7 Entropy-Coded Quantzaton Ch 8 Scalar

More information

p 1 c 2 + p 2 c 2 + p 3 c p m c 2

p 1 c 2 + p 2 c 2 + p 3 c p m c 2 Where to put a faclty? Gven locatons p 1,..., p m n R n of m houses, want to choose a locaton c n R n for the fre staton. Want c to be as close as possble to all the house. We know how to measure dstance

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Lecture Nov

Lecture Nov Lecture 18 Nov 07 2008 Revew Clusterng Groupng smlar obects nto clusters Herarchcal clusterng Agglomeratve approach (HAC: teratvely merge smlar clusters Dfferent lnkage algorthms for computng dstances

More information

CSE 252C: Computer Vision III

CSE 252C: Computer Vision III CSE 252C: Computer Vson III Lecturer: Serge Belonge Scrbe: Catherne Wah LECTURE 15 Kernel Machnes 15.1. Kernels We wll study two methods based on a specal knd of functon k(x, y) called a kernel: Kernel

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Instance-Based Learning (a.k.a. memory-based learning) Part I: Nearest Neighbor Classification

Instance-Based Learning (a.k.a. memory-based learning) Part I: Nearest Neighbor Classification Instance-Based earnng (a.k.a. memory-based learnng) Part I: Nearest Neghbor Classfcaton Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

Lecture 3. Ax x i a i. i i

Lecture 3. Ax x i a i. i i 18.409 The Behavor of Algorthms n Practce 2/14/2 Lecturer: Dan Spelman Lecture 3 Scrbe: Arvnd Sankar 1 Largest sngular value In order to bound the condton number, we need an upper bound on the largest

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Support Vector Machines

Support Vector Machines CS 2750: Machne Learnng Support Vector Machnes Prof. Adrana Kovashka Unversty of Pttsburgh February 17, 2016 Announcement Homework 2 deadlne s now 2/29 We ll have covered everythng you need today or at

More information

Clustering & Unsupervised Learning

Clustering & Unsupervised Learning Clusterng & Unsupervsed Learnng Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Wnter 2012 UCSD Statstcal Learnng Goal: Gven a relatonshp between a feature vector x and a vector y, and d data samples (x,y

More information

EM and Structure Learning

EM and Structure Learning EM and Structure Learnng Le Song Machne Learnng II: Advanced Topcs CSE 8803ML, Sprng 2012 Partally observed graphcal models Mxture Models N(μ 1, Σ 1 ) Z X N N(μ 2, Σ 2 ) 2 Gaussan mxture model Consder

More information

EDMS Modern Measurement Theories. Multidimensional IRT Models. (Session 6)

EDMS Modern Measurement Theories. Multidimensional IRT Models. (Session 6) EDMS 74 - Modern Measurement Theores Multdmensonal IRT Models (Sesson 6) Sprng Semester 8 Department of Measurement, Statstcs, and Evaluaton (EDMS) Unversty of Maryland Dr. André A. Rupp, (3) 45 363, ruppandr@umd.edu

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

PHYS 705: Classical Mechanics. Calculus of Variations II

PHYS 705: Classical Mechanics. Calculus of Variations II 1 PHYS 705: Classcal Mechancs Calculus of Varatons II 2 Calculus of Varatons: Generalzaton (no constrant yet) Suppose now that F depends on several dependent varables : We need to fnd such that has a statonary

More information

A Bayes Algorithm for the Multitask Pattern Recognition Problem Direct Approach

A Bayes Algorithm for the Multitask Pattern Recognition Problem Direct Approach A Bayes Algorthm for the Multtask Pattern Recognton Problem Drect Approach Edward Puchala Wroclaw Unversty of Technology, Char of Systems and Computer etworks, Wybrzeze Wyspanskego 7, 50-370 Wroclaw, Poland

More information

Lossy Compression. Compromise accuracy of reconstruction for increased compression.

Lossy Compression. Compromise accuracy of reconstruction for increased compression. Lossy Compresson Compromse accuracy of reconstructon for ncreased compresson. The reconstructon s usually vsbly ndstngushable from the orgnal mage. Typcally, one can get up to 0:1 compresson wth almost

More information

Clustering & (Ken Kreutz-Delgado) UCSD

Clustering & (Ken Kreutz-Delgado) UCSD Clusterng & Unsupervsed Learnng Nuno Vasconcelos (Ken Kreutz-Delgado) UCSD Statstcal Learnng Goal: Gven a relatonshp between a feature vector x and a vector y, and d data samples (x,y ), fnd an approxmatng

More information

Clustering. CS4780/5780 Machine Learning Fall Thorsten Joachims Cornell University

Clustering. CS4780/5780 Machine Learning Fall Thorsten Joachims Cornell University Clusterng CS4780/5780 Mahne Learnng Fall 2012 Thorsten Joahms Cornell Unversty Readng: Mannng/Raghavan/Shuetze, Chapters 16 (not 16.3) and 17 (http://nlp.stanford.edu/ir-book/) Outlne Supervsed vs. Unsupervsed

More information

Grover s Algorithm + Quantum Zeno Effect + Vaidman

Grover s Algorithm + Quantum Zeno Effect + Vaidman Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the

More information

Linear Classification, SVMs and Nearest Neighbors

Linear Classification, SVMs and Nearest Neighbors 1 CSE 473 Lecture 25 (Chapter 18) Lnear Classfcaton, SVMs and Nearest Neghbors CSE AI faculty + Chrs Bshop, Dan Klen, Stuart Russell, Andrew Moore Motvaton: Face Detecton How do we buld a classfer to dstngush

More information

Lecture 10 Support Vector Machines. Oct

Lecture 10 Support Vector Machines. Oct Lecture 10 Support Vector Machnes Oct - 20-2008 Lnear Separators Whch of the lnear separators s optmal? Concept of Margn Recall that n Perceptron, we learned that the convergence rate of the Perceptron

More information

Spectral Clustering. Shannon Quinn

Spectral Clustering. Shannon Quinn Spectral Clusterng Shannon Qunn (wth thanks to Wllam Cohen of Carnege Mellon Unverst, and J. Leskovec, A. Raaraman, and J. Ullman of Stanford Unverst) Graph Parttonng Undrected graph B- parttonng task:

More information

Linear Feature Engineering 11

Linear Feature Engineering 11 Lnear Feature Engneerng 11 2 Least-Squares 2.1 Smple least-squares Consder the followng dataset. We have a bunch of nputs x and correspondng outputs y. The partcular values n ths dataset are x y 0.23 0.19

More information

CS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015

CS 3710: Visual Recognition Classification and Detection. Adriana Kovashka Department of Computer Science January 13, 2015 CS 3710: Vsual Recognton Classfcaton and Detecton Adrana Kovashka Department of Computer Scence January 13, 2015 Plan for Today Vsual recognton bascs part 2: Classfcaton and detecton Adrana s research

More information

Support Vector Machines. Vibhav Gogate The University of Texas at dallas

Support Vector Machines. Vibhav Gogate The University of Texas at dallas Support Vector Machnes Vbhav Gogate he Unversty of exas at dallas What We have Learned So Far? 1. Decson rees. Naïve Bayes 3. Lnear Regresson 4. Logstc Regresson 5. Perceptron 6. Neural networks 7. K-Nearest

More information

Why Bayesian? 3. Bayes and Normal Models. State of nature: class. Decision rule. Rev. Thomas Bayes ( ) Bayes Theorem (yes, the famous one)

Why Bayesian? 3. Bayes and Normal Models. State of nature: class. Decision rule. Rev. Thomas Bayes ( ) Bayes Theorem (yes, the famous one) Why Bayesan? 3. Bayes and Normal Models Alex M. Martnez alex@ece.osu.edu Handouts Handoutsfor forece ECE874 874Sp Sp007 If all our research (n PR was to dsappear and you could only save one theory, whch

More information

Section 8.3 Polar Form of Complex Numbers

Section 8.3 Polar Form of Complex Numbers 80 Chapter 8 Secton 8 Polar Form of Complex Numbers From prevous classes, you may have encountered magnary numbers the square roots of negatve numbers and, more generally, complex numbers whch are the

More information

Cluster Validation Determining Number of Clusters. Umut ORHAN, PhD.

Cluster Validation Determining Number of Clusters. Umut ORHAN, PhD. Cluster Analyss Cluster Valdaton Determnng Number of Clusters 1 Cluster Valdaton The procedure of evaluatng the results of a clusterng algorthm s known under the term cluster valdty. How do we evaluate

More information

Spatial Statistics and Analysis Methods (for GEOG 104 class).

Spatial Statistics and Analysis Methods (for GEOG 104 class). Spatal Statstcs and Analyss Methods (for GEOG 104 class). Provded by Dr. An L, San Dego State Unversty. 1 Ponts Types of spatal data Pont pattern analyss (PPA; such as nearest neghbor dstance, quadrat

More information

Unsupervised Learning

Unsupervised Learning Unsupervsed Learnng Kevn Swngler What s Unsupervsed Learnng? Most smply, t can be thought of as learnng to recognse and recall thngs Recognton I ve seen that before Recall I ve seen that before and I can

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

Cathy Walker March 5, 2010

Cathy Walker March 5, 2010 Cathy Walker March 5, 010 Part : Problem Set 1. What s the level of measurement for the followng varables? a) SAT scores b) Number of tests or quzzes n statstcal course c) Acres of land devoted to corn

More information

The Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD

The Gaussian classifier. Nuno Vasconcelos ECE Department, UCSD he Gaussan classfer Nuno Vasconcelos ECE Department, UCSD Bayesan decson theory recall that we have state of the world X observatons g decson functon L[g,y] loss of predctng y wth g Bayes decson rule s

More information

Kernels in Support Vector Machines. Based on lectures of Martin Law, University of Michigan

Kernels in Support Vector Machines. Based on lectures of Martin Law, University of Michigan Kernels n Support Vector Machnes Based on lectures of Martn Law, Unversty of Mchgan Non Lnear separable problems AND OR NOT() The XOR problem cannot be solved wth a perceptron. XOR Per Lug Martell - Systems

More information

Image classification. Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing i them?

Image classification. Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing i them? Image classfcaton Gven te bag-of-features representatons of mages from dfferent classes ow do we learn a model for dstngusng tem? Classfers Learn a decson rule assgnng bag-offeatures representatons of

More information

Bezier curves. Michael S. Floater. August 25, These notes provide an introduction to Bezier curves. i=0

Bezier curves. Michael S. Floater. August 25, These notes provide an introduction to Bezier curves. i=0 Bezer curves Mchael S. Floater August 25, 211 These notes provde an ntroducton to Bezer curves. 1 Bernsten polynomals Recall that a real polynomal of a real varable x R, wth degree n, s a functon of the

More information

CS4495/6495 Introduction to Computer Vision. 3C-L3 Calibrating cameras

CS4495/6495 Introduction to Computer Vision. 3C-L3 Calibrating cameras CS4495/6495 Introducton to Computer Vson 3C-L3 Calbratng cameras Fnally (last tme): Camera parameters Projecton equaton the cumulatve effect of all parameters: M (3x4) f s x ' 1 0 0 0 c R 0 I T 3 3 3 x1

More information

Report on Image warping

Report on Image warping Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.

More information

ECE559VV Project Report

ECE559VV Project Report ECE559VV Project Report (Supplementary Notes Loc Xuan Bu I. MAX SUM-RATE SCHEDULING: THE UPLINK CASE We have seen (n the presentaton that, for downlnk (broadcast channels, the strategy maxmzng the sum-rate

More information

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 16

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 16 STAT 39: MATHEMATICAL COMPUTATIONS I FALL 218 LECTURE 16 1 why teratve methods f we have a lnear system Ax = b where A s very, very large but s ether sparse or structured (eg, banded, Toepltz, banded plus

More information

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,

More information

THE SUMMATION NOTATION Ʃ

THE SUMMATION NOTATION Ʃ Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the

More information

Clustering with Gaussian Mixtures

Clustering with Gaussian Mixtures Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n gvng your own lectures. Feel free to use these sldes verbatm, or to modfy them to ft your

More information

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 3: Large deviations bounds and applications Lecturer: Sanjeev Arora prnceton unv. F 13 cos 521: Advanced Algorthm Desgn Lecture 3: Large devatons bounds and applcatons Lecturer: Sanjeev Arora Scrbe: Today s topc s devaton bounds: what s the probablty that a random varable

More information

Common loop optimizations. Example to improve locality. Why Dependence Analysis. Data Dependence in Loops. Goal is to find best schedule:

Common loop optimizations. Example to improve locality. Why Dependence Analysis. Data Dependence in Loops. Goal is to find best schedule: 15-745 Lecture 6 Data Dependence n Loops Copyrght Seth Goldsten, 2008 Based on sldes from Allen&Kennedy Lecture 6 15-745 2005-8 1 Common loop optmzatons Hostng of loop-nvarant computatons pre-compute before

More information

Natural Language Processing and Information Retrieval

Natural Language Processing and Information Retrieval Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support

More information

Dynamic Programming. Preview. Dynamic Programming. Dynamic Programming. Dynamic Programming (Example: Fibonacci Sequence)

Dynamic Programming. Preview. Dynamic Programming. Dynamic Programming. Dynamic Programming (Example: Fibonacci Sequence) /24/27 Prevew Fbonacc Sequence Longest Common Subsequence Dynamc programmng s a method for solvng complex problems by breakng them down nto smpler sub-problems. It s applcable to problems exhbtng the propertes

More information

The Second Anti-Mathima on Game Theory

The Second Anti-Mathima on Game Theory The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player

More information

Machine Learning. Measuring Distance. several slides from Bryan Pardo

Machine Learning. Measuring Distance. several slides from Bryan Pardo Machne Learnng Measurng Dstance several sldes from Bran Pardo 1 Wh measure dstance? Nearest neghbor requres a dstance measure Also: Local search methods requre a measure of localt (Frda) Clusterng requres

More information

3.1 ML and Empirical Distribution

3.1 ML and Empirical Distribution 67577 Intro. to Machne Learnng Fall semester, 2008/9 Lecture 3: Maxmum Lkelhood/ Maxmum Entropy Dualty Lecturer: Amnon Shashua Scrbe: Amnon Shashua 1 In the prevous lecture we defned the prncple of Maxmum

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

Outline. Clustering: Similarity-Based Clustering. Supervised Learning vs. Unsupervised Learning. Clustering. Applications of Clustering

Outline. Clustering: Similarity-Based Clustering. Supervised Learning vs. Unsupervised Learning. Clustering. Applications of Clustering Clusterng: Smlarty-Based Clusterng CS4780/5780 Mahne Learnng Fall 2013 Thorsten Joahms Cornell Unversty Supervsed vs. Unsupervsed Learnng Herarhal Clusterng Herarhal Agglomeratve Clusterng (HAC) Non-Herarhal

More information

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute

More information

Lecture 12: Classification

Lecture 12: Classification Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna

More information

The Geometry of Logit and Probit

The Geometry of Logit and Probit The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

EEL 6266 Power System Operation and Control. Chapter 3 Economic Dispatch Using Dynamic Programming

EEL 6266 Power System Operation and Control. Chapter 3 Economic Dispatch Using Dynamic Programming EEL 6266 Power System Operaton and Control Chapter 3 Economc Dspatch Usng Dynamc Programmng Pecewse Lnear Cost Functons Common practce many utltes prefer to represent ther generator cost functons as sngle-

More information

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of

More information

Similarities, Distances and Manifold Learning

Similarities, Distances and Manifold Learning Smlartes, Dstances and Manfold Learnng Prof. Rchard C. Wlson Dept. of Computer Scence Unversty of York Part I: Eucldean Space Poston, Smlarty and Dstance Manfold Learnng n Eucldean space Some famous technques

More information

Newton s Method for One - Dimensional Optimization - Theory

Newton s Method for One - Dimensional Optimization - Theory Numercal Methods Newton s Method for One - Dmensonal Optmzaton - Theory For more detals on ths topc Go to Clck on Keyword Clck on Newton s Method for One- Dmensonal Optmzaton You are free to Share to copy,

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

find (x): given element x, return the canonical element of the set containing x;

find (x): given element x, return the canonical element of the set containing x; COS 43 Sprng, 009 Dsjont Set Unon Problem: Mantan a collecton of dsjont sets. Two operatons: fnd the set contanng a gven element; unte two sets nto one (destructvely). Approach: Canoncal element method:

More information

Note on EM-training of IBM-model 1

Note on EM-training of IBM-model 1 Note on EM-tranng of IBM-model INF58 Language Technologcal Applcatons, Fall The sldes on ths subject (nf58 6.pdf) ncludng the example seem nsuffcent to gve a good grasp of what s gong on. Hence here are

More information

Société de Calcul Mathématique SA

Société de Calcul Mathématique SA Socété de Calcul Mathématque SA Outls d'ade à la décson Tools for decson help Probablstc Studes: Normalzng the Hstograms Bernard Beauzamy December, 202 I. General constructon of the hstogram Any probablstc

More information

Hopfield Training Rules 1 N

Hopfield Training Rules 1 N Hopfeld Tranng Rules To memorse a sngle pattern Suppose e set the eghts thus - = p p here, s the eght beteen nodes & s the number of nodes n the netor p s the value requred for the -th node What ll the

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

Structure from Motion. Forsyth&Ponce: Chap. 12 and 13 Szeliski: Chap. 7

Structure from Motion. Forsyth&Ponce: Chap. 12 and 13 Szeliski: Chap. 7 Structure from Moton Forsyth&once: Chap. 2 and 3 Szelsk: Chap. 7 Introducton to Structure from Moton Forsyth&once: Chap. 2 Szelsk: Chap. 7 Structure from Moton Intro he Reconstructon roblem p 3?? p p 2

More information

Clustering Techniques for Information Retrieval

Clustering Techniques for Information Retrieval Clusterng Technques for Informaton Retreval Berln Chen Department of Computer Scence & Informaton Engneerng Natonal Tawan Normal Unversty References:. Chrstopher D. Mannng, Prabhaar Raghavan and Hnrch

More information

Vapnik-Chervonenkis theory

Vapnik-Chervonenkis theory Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown

More information

NP-Completeness : Proofs

NP-Completeness : Proofs NP-Completeness : Proofs Proof Methods A method to show a decson problem Π NP-complete s as follows. (1) Show Π NP. (2) Choose an NP-complete problem Π. (3) Show Π Π. A method to show an optmzaton problem

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Protein Structure Comparison

Protein Structure Comparison Proten Structure Comparson Proten Structure Representaton CPK: hard sphere model Ball-and-stck Cartoon Degrees of Freedom n Protens Bond length Dhedral angle 3 4 Bond angle + Proten Structure: Varables

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

} Often, when learning, we deal with uncertainty:

} Often, when learning, we deal with uncertainty: Uncertanty and Learnng } Often, when learnng, we deal wth uncertanty: } Incomplete data sets, wth mssng nformaton } Nosy data sets, wth unrelable nformaton } Stochastcty: causes and effects related non-determnstcally

More information

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens

THE CHINESE REMAINDER THEOREM. We should thank the Chinese for their wonderful remainder theorem. Glenn Stevens THE CHINESE REMAINDER THEOREM KEITH CONRAD We should thank the Chnese for ther wonderful remander theorem. Glenn Stevens 1. Introducton The Chnese remander theorem says we can unquely solve any par of

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline Outlne Bayesan Networks: Maxmum Lkelhood Estmaton and Tree Structure Learnng Huzhen Yu janey.yu@cs.helsnk.f Dept. Computer Scence, Unv. of Helsnk Probablstc Models, Sprng, 200 Notces: I corrected a number

More information

Support Vector Machines CS434

Support Vector Machines CS434 Support Vector Machnes CS434 Lnear Separators Many lnear separators exst that perfectly classfy all tranng examples Whch of the lnear separators s the best? Intuton of Margn Consder ponts A, B, and C We

More information

ρ some λ THE INVERSE POWER METHOD (or INVERSE ITERATION) , for , or (more usually) to

ρ some λ THE INVERSE POWER METHOD (or INVERSE ITERATION) , for , or (more usually) to THE INVERSE POWER METHOD (or INVERSE ITERATION) -- applcaton of the Power method to A some fxed constant ρ (whch s called a shft), x λ ρ If the egenpars of A are { ( λ, x ) } ( ), or (more usually) to,

More information

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty Addtonal Codes usng Fnte Dfference Method Benamn Moll 1 HJB Equaton for Consumpton-Savng Problem Wthout Uncertanty Before consderng the case wth stochastc ncome n http://www.prnceton.edu/~moll/ HACTproect/HACT_Numercal_Appendx.pdf,

More information

Complex Numbers Alpha, Round 1 Test #123

Complex Numbers Alpha, Round 1 Test #123 Complex Numbers Alpha, Round Test #3. Wrte your 6-dgt ID# n the I.D. NUMBER grd, left-justfed, and bubble. Check that each column has only one number darkened.. In the EXAM NO. grd, wrte the 3-dgt Test

More information

C4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z )

C4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z ) C4B Machne Learnng Answers II.(a) Show that for the logstc sgmod functon dσ(z) dz = σ(z) ( σ(z)) A. Zsserman, Hlary Term 20 Start from the defnton of σ(z) Note that Then σ(z) = σ = dσ(z) dz = + e z e z

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

Tracking with Kalman Filter

Tracking with Kalman Filter Trackng wth Kalman Flter Scott T. Acton Vrgna Image and Vdeo Analyss (VIVA), Charles L. Brown Department of Electrcal and Computer Engneerng Department of Bomedcal Engneerng Unversty of Vrgna, Charlottesvlle,

More information

Singular Value Decomposition: Theory and Applications

Singular Value Decomposition: Theory and Applications Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real

More information

Learning with Maximum Likelihood

Learning with Maximum Likelihood Learnng wth Mamum Lelhood Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n gvng your own lectures. Feel free to use these sldes verbatm,

More information