
Where to put a facility? Given locations p_1, ..., p_m in R^n of m houses, we want to choose a location c in R^n for the fire station. We want c to be as close as possible to all the houses. We know how to measure the distance between a proposed location c and a point, but different houses have different ideas about where to put the firehouse. How do we combine their preferences into a single location?
Choose the point that minimizes the average station-to-house distance (the same as minimizing the sum of station-to-house distances). This could be really bad for houses that are outside of the town center.
Choose the point that minimizes the maximum station-to-house distance. This could be really bad for most of the houses!
Choose the point that minimizes the sum of squared station-to-house distances,

    ||p_1 - c||^2 + ||p_2 - c||^2 + ||p_3 - c||^2 + ... + ||p_m - c||^2

This is a sort of compromise: like the average, but if some house is very far away, its squared distance is very large. (These three different measures are called L_1, L_∞, and L_2.)
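For concreteness, here is a minimal numpy sketch (my own illustration; the house coordinates and the use of numpy are assumptions, not from the lecture) that evaluates the three criteria for one candidate location c:

import numpy as np

# Hypothetical house locations (one row per house) and one candidate station location c.
houses = np.array([[0.0, 0.0], [2.0, 1.0], [3.0, 4.0], [10.0, 0.0]])
c = np.array([2.0, 1.0])

dists = np.linalg.norm(houses - c, axis=1)   # station-to-house distances

l1_objective = dists.sum()          # sum (equivalently, average) of distances
linf_objective = dists.max()        # maximum distance
l2_objective = (dists ** 2).sum()   # sum of squared distances

print(l1_objective, linf_objective, l2_objective)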

Putting a facility in the location that minimizes the sum of squared distances. Given locations p_1, ..., p_m in R^n of m houses, we want to choose a location c in R^n for the fire station so as to minimize the sum of squared distances

    ||p_1 - c||^2 + ||p_2 - c||^2 + ... + ||p_m - c||^2

Question: How do we find this location? Answer:

    c = (1/m)(p_1 + p_2 + ... + p_m)

This point is called the centroid of p_1, ..., p_m. It is the average for vectors: in fact, for i = 1, ..., n, entry i of the centroid is the average of entry i of all the points. The centroid p̄ satisfies the equation m p̄ = p_1 + ... + p_m. Therefore Σ_i (p_i - p̄) equals the zero vector.
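A quick numpy check of this answer (the sample points are made up, not from the lecture): the centroid is the entry-wise mean of the points, and no perturbed candidate does better on the sum of squared distances.

import numpy as np

points = np.array([[0.0, 0.0], [2.0, 1.0], [3.0, 4.0], [10.0, 0.0]])

centroid = points.mean(axis=0)   # entry i of the centroid = average of entry i over all points

def sum_sq_dist(c):
    return ((points - c) ** 2).sum()

# The centroid does at least as well as any perturbed candidate location.
other = centroid + np.array([0.5, -0.3])
assert sum_sq_dist(centroid) <= sum_sq_dist(other)
print(centroid, sum_sq_dist(centroid))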

Proving that the centroid minimizes the sum of squared distances. Let q be any point. We show that the sum of squared q-to-datapoint distances is at least the sum of squared p̄-to-datapoint distances. For i = 1, ..., m,

    ||p_i - q||^2 = ||(p_i - p̄) + (p̄ - q)||^2

Summing over i = 1, ..., m,

    Σ_i ||p_i - q||^2
      = Σ_i ||(p_i - p̄) + (p̄ - q)||^2
      = Σ_i <(p_i - p̄) + (p̄ - q), (p_i - p̄) + (p̄ - q)>
      = Σ_i [ <p_i - p̄, p_i - p̄> + <p_i - p̄, p̄ - q> + <p̄ - q, p_i - p̄> + <p̄ - q, p̄ - q> ]
      = Σ_i [ ||p_i - p̄||^2 + <p_i - p̄, p̄ - q> + <p̄ - q, p_i - p̄> + ||p̄ - q||^2 ]
      = Σ_i ||p_i - p̄||^2 + Σ_i <p_i - p̄, p̄ - q> + Σ_i <p̄ - q, p_i - p̄> + Σ_i ||p̄ - q||^2
      = Σ_i ||p_i - p̄||^2 + <Σ_i (p_i - p̄), p̄ - q> + <p̄ - q, Σ_i (p_i - p̄)> + Σ_i ||p̄ - q||^2

Proving that the centroid minimizes the sum of squared distances (continued). Let q be any point. We show that the sum of squared q-to-datapoint distances is at least the sum of squared p̄-to-datapoint distances. Summing over i = 1, ..., m,

    Σ_i ||p_i - q||^2
      = Σ_i ||p_i - p̄||^2 + Σ_i <p_i - p̄, p̄ - q> + Σ_i <p̄ - q, p_i - p̄> + Σ_i ||p̄ - q||^2
      = Σ_i ||p_i - p̄||^2 + <Σ_i (p_i - p̄), p̄ - q> + <p̄ - q, Σ_i (p_i - p̄)> + Σ_i ||p̄ - q||^2
      = Σ_i ||p_i - p̄||^2 + <0, p̄ - q> + <p̄ - q, 0> + Σ_i ||p̄ - q||^2
      = Σ_i ||p_i - p̄||^2 + 0 + 0 + m ||p̄ - q||^2
      = (sum of squared p̄-to-datapoint distances) + m · (squared p̄-to-q distance)

Since m ||p̄ - q||^2 >= 0, this is at least the sum of squared p̄-to-datapoint distances, with equality exactly when q = p̄.

k-means clustering using Lloyd's Algorithm. k-means clustering: Given data points (vectors) p_1, ..., p_m in R^n, select k centers c_1, ..., c_k so as to minimize the sum of squared distances of data points to the nearest centers. That is, define the function

    f(x, [c_1, ..., c_k]) = min { ||x - c_i||^2 : i in {1, ..., k} }

This function returns the squared distance from x to whichever of c_1, ..., c_k is nearest. The goal of k-means clustering is to select points c_1, ..., c_k so as to minimize

    f(p_1, [c_1, ..., c_k]) + f(p_2, [c_1, ..., c_k]) + ... + f(p_m, [c_1, ..., c_k])

The purpose is to partition the data points into k groups (called clusters).
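The objective just defined translates directly into code. A minimal Python sketch (numpy arrays stand in for the lecture's vectors; the sample points and centers are made up):

import numpy as np

def f(x, centers):
    """Squared distance from x to whichever of the centers is nearest."""
    return min(np.sum((x - c) ** 2) for c in centers)

def kmeans_objective(points, centers):
    """Sum of f(p, centers) over all data points p."""
    return sum(f(p, centers) for p in points)

pts = [np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([9.0, 9.0])]
ctrs = [np.array([0.5, 0.5]), np.array([9.0, 9.0])]
print(kmeans_objective(pts, ctrs))   # -> 1.0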

k-means clustering using Lloyd's Algorithm. Select k centers to minimize the sum of squared distances of data points to the nearest centers. This combines two ideas: 1. Assign each data point to the nearest center. 2. Choose the centers so as to be close to the data points. This suggests an algorithm (see the sketch below). Start with k centers somewhere, perhaps randomly chosen. Then repeatedly perform the following steps:
1. Assign each data point to its nearest center.
2. Move each center to be as close as possible to the data points assigned to it; that is, let the new location of the center be the centroid of its assigned points.
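Here is the sketch referred to above: a minimal, self-contained implementation of Lloyd's algorithm in numpy. The fixed iteration count, the random initialization from the data, and the empty-cluster handling are my own simplifications, not part of the lecture.

import numpy as np

def lloyd(points, k, iterations=100, seed=0):
    rng = np.random.default_rng(seed)
    # Start with k centers chosen randomly from among the data points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Step 1: assign each data point to its nearest center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assignment = dists.argmin(axis=1)
        # Step 2: move each center to the centroid of the points assigned to it.
        for j in range(k):
            assigned = points[assignment == j]
            if len(assigned) > 0:      # keep the old center if no points were assigned to it
                centers[j] = assigned.mean(axis=0)
    return centers, assignment

# Example: two obvious clusters.
pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
centers, assignment = lloyd(pts, k=2)
print(centers, assignment)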

[9] Orthogonalization

Finding the closest point in a plane. Goal: Given a point b and a plane, find the point in the plane closest to b.

Finding the closest point in a plane. Goal: Given a point b and a plane, find the point in the plane closest to b. By translation, we can assume the plane includes the origin. The plane is then a vector space V. Let {v_1, v_2} be a basis for V. Goal: Given a point b, find the point in Span {v_1, v_2} closest to b. Example: v_1 = [8, 2, 2] and v_2 = [4, -2, 4], b = [5, 5, 2]; the point in the plane closest to b is [6, 3, 0].
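A quick numerical check of this example, using numpy's least-squares solver as a stand-in for the method the lecture goes on to develop:

import numpy as np

# Columns of A are the basis vectors v1, v2 of the plane V.
A = np.array([[8.0, 4.0],
              [2.0, -2.0],
              [2.0, 4.0]])
b = np.array([5.0, 5.0, 2.0])

# Least-squares coefficients x minimize ||A x - b||; A x is then the closest point in Span {v1, v2}.
x, *_ = np.linalg.lstsq(A, b, rcond=None)
closest = A @ x
print(closest)                             # -> approximately [6. 3. 0.]
print(np.allclose((b - closest) @ A, 0))   # residual is orthogonal to v1 and v2 -> True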

Closest-point problem in higher dimensions. Goal: An algorithm that, given a vector b and vectors v_1, ..., v_n, finds the vector in Span {v_1, ..., v_n} that is closest to b. Special case: We can use the algorithm to determine whether b lies in Span {v_1, ..., v_n}: if the vector in Span {v_1, ..., v_n} closest to b is b itself, then clearly b is in the span; if not, then b is not in the span. Let A be the matrix whose columns are v_1, ..., v_n. Using the linear-combinations interpretation of matrix-vector multiplication, a vector in Span {v_1, ..., v_n} can be written Ax. Thus testing whether b is in Span {v_1, ..., v_n} is equivalent to testing whether the equation Ax = b has a solution. More generally: Even if Ax = b has no solution, we can use the algorithm to find the point in {Ax : x in R^n} closest to b. Moreover: We hope to extend the algorithm to also find the best solution x̂.
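To make the special case concrete, here is a small sketch (the vectors are made up) that tests span membership by comparing b with the closest point Ax:

import numpy as np

def in_span(vectors, b, tol=1e-10):
    """Return True if b is (numerically) in Span{vectors}."""
    A = np.column_stack(vectors)               # columns are the spanning vectors
    x, *_ = np.linalg.lstsq(A, b, rcond=None)  # coefficients of the closest point Ax
    return np.linalg.norm(A @ x - b) <= tol    # closest point equals b  <=>  b is in the span

v1, v2 = np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])
print(in_span([v1, v2], np.array([2.0, 3.0, 5.0])))   # True:  2*v1 + 3*v2
print(in_span([v1, v2], np.array([0.0, 0.0, 1.0])))   # False: not a combination of v1, v2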

High-dimensional projection onto / orthogonal to. For any vector b and any vector a, define vectors b^{||a} and b^{⊥a} so that b = b^{||a} + b^{⊥a}, there is a scalar σ in R such that b^{||a} = σ a, and b^{⊥a} is orthogonal to a. Definition: For a vector b and a vector space V, we define the projection of b onto V (written b^{||V}) and the projection of b orthogonal to V (written b^{⊥V}) so that b = b^{||V} + b^{⊥V}, where b^{||V} is in V and b^{⊥V} is orthogonal to every vector in V. [Figure: b decomposed as b = b^{||V} + b^{⊥V}, the projection onto V plus the projection orthogonal to V.]
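A minimal numpy illustration of the one-vector case (the particular vectors are arbitrary): the coefficient σ = <b, a> / <a, a> gives the part of b along a, and what remains is orthogonal to a.

import numpy as np

b = np.array([5.0, 5.0, 2.0])
a = np.array([8.0, 2.0, 2.0])

sigma = (b @ a) / (a @ a)    # coefficient of the projection along a
b_par = sigma * a            # b^{||a}: the part of b lying on the line through a
b_perp = b - b_par           # b^{⊥a}: the remaining part, orthogonal to a

assert np.isclose(b_perp @ a, 0.0)        # orthogonality check
assert np.allclose(b_par + b_perp, b)     # the two parts add back up to b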

High-Dimensional Fire Engine Lemma. Definition: For a vector b and a vector space V, we define the projection of b onto V (written b^{||V}) and the projection of b orthogonal to V (written b^{⊥V}) so that b = b^{||V} + b^{⊥V}, where b^{||V} is in V and b^{⊥V} is orthogonal to every vector in V. One-Dimensional Fire Engine Lemma: The point in Span {a} closest to b is b^{||a}, and the distance is ||b^{⊥a}||. High-Dimensional Fire Engine Lemma: The point in a vector space V closest to b is b^{||V}, and the distance is ||b^{⊥V}||.

Finding the projection of b orthogonal to Span {a_1, ..., a_n}. High-Dimensional Fire Engine Lemma: Let b be a vector and let V be a vector space. The vector in V closest to b is b^{||V}. The distance is ||b^{⊥V}||. Suppose V is specified by generators v_1, ..., v_n. Goal: An algorithm for computing b^{||V} (and hence b^{⊥V} = b - b^{||V}) in this case. Input: vector b, vectors v_1, ..., v_n. Output: the projection of b onto Span {v_1, ..., v_n}. We already know how to solve this when n = 1; let's try to generalize...

def project_along(b, v):
    # projection of b along v: ((b . v)/(v . v)) v, or the zero vector if v is (almost) zero
    return (0 if v.is_almost_zero() else (b*v)/(v*v)) * v

project_onto(b, vlist)

def project_along(b, v):
    # projection of b along v: ((b . v)/(v . v)) v, or the zero vector if v is (almost) zero
    return (0 if v.is_almost_zero() else (b*v)/(v*v)) * v

def project_onto(b, vlist):
    # attempted generalization: sum the projections of b along each vector in vlist
    return sum([project_along(b, v) for v in vlist])

The reviews are in... "Short, elegant, ... and flawed." "Beautiful, if only it worked!" "A tragic failure."

Failure of project_onto. [Figure: vectors v_1, v_2, and b in R^2.] Try it out on a vector b and vlist = [v_1, v_2] in R^2, so V = Span {v_1, v_2}. In this case, b is in Span {v_1, v_2}, so b^{||V} = b. The algorithm tells us to find the projection of b along v_1 and the projection of b along v_2. The sum of these projections should be equal to b... but it is not.


What went wrong with project_onto? Suppose we run the algorithm on b and vlist = [v_1, ..., v_n]. Let V denote Span {v_1, ..., v_n}. For each vector v_i in vlist, the vector returned by project_along(b, v_i) is σ_i v_i, where σ_i is <b, v_i> / <v_i, v_i> (or 0, if v_i is the zero vector). The vector returned by project_onto(b, vlist) is the sum σ_1 v_1 + σ_2 v_2 + ... + σ_n v_n. Let b̂ denote the returned vector. We want to check that b̂ is b^{||V}... Is b̂ in V? It is a linear combination of v_1, ..., v_n, so YES. If b̂ were b^{||V}, then b - b̂ would be b^{⊥V}, so b - b̂ would be orthogonal to all vectors in V. In particular, it would be orthogonal to the generators v_1, ..., v_n. Is it? To check, calculate the inner product of b - b̂ with each of v_1, ..., v_n. Consider, for example, the generator v_1.

    <b - b̂, v_1> = <b, v_1> - <b̂, v_1> = <b, v_1> - <σ_1 v_1 + σ_2 v_2 + ... + σ_n v_n, v_1>

Is this zero? Expanding the last inner product gives σ_1 <v_1, v_1>, but also cross-terms like σ_2 <v_2, v_1> and σ_n <v_n, v_1>. Since σ_1 = <b, v_1> / <v_1, v_1>, the term σ_1 <v_1, v_1> exactly cancels <b, v_1>, so the whole expression is zero only if the cross-terms vanish, which in general they do not. The cross-terms keep the algorithm from working correctly.

How to repair project_onto? Don't change the procedure. Fix the spec. Require that vlist consists of mutually orthogonal vectors: the i-th vector in the list is orthogonal to the j-th vector in the list for every i ≠ j.
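A small numpy re-implementation of project_along and project_onto (standing in for the lecture's Vec-based code, with the example vectors from the plane example above) illustrating the repaired spec: on a mutually orthogonal list the result is the true projection, on a non-orthogonal list it is not. The names u1, u2 and the step of subtracting from v2 its projection along v1 are my own sketch of how to get an orthogonal pair spanning the same plane.

import numpy as np

def project_along(b, v, eps=1e-12):
    # same formula as the lecture's procedure, written for numpy arrays
    return np.zeros_like(b) if v @ v < eps else ((b @ v) / (v @ v)) * v

def project_onto(b, vlist):
    return sum(project_along(b, v) for v in vlist)

b = np.array([5.0, 5.0, 2.0])

# Non-orthogonal generators of the plane: the sum of the two projections is NOT b^{||V}.
v1, v2 = np.array([8.0, 2.0, 2.0]), np.array([4.0, -2.0, 4.0])
print(project_onto(b, [v1, v2]))          # [8. 0.5 3.5], not the closest point [6, 3, 0]

# Mutually orthogonal generators of the same plane: now the answer is correct.
u1, u2 = v1, v2 - project_along(v2, v1)   # u2 is v2 with its v1-component removed
print(project_onto(b, [u1, u2]))          # approximately [6. 3. 0.]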