
Where to put a facility? Given locations p_1, ..., p_m in R^n of m houses, we want to choose a location c in R^n for the fire station. We want c to be as close as possible to all the houses. We know how to measure the distance between a proposed location c and a point, but different houses have different ideas about where to put the firehouse. How do we combine their preferences into a single location?
Choose the point that minimizes the average station-to-house distance (the same as minimizing the sum of station-to-house distances). This could be really bad for houses that are outside of the town center.
Choose the point that minimizes the maximum station-to-house distance. This could be really bad for most of the houses!
Choose the point that minimizes the sum of squared station-to-house distances,

    ||p_1 - c||^2 + ||p_2 - c||^2 + ||p_3 - c||^2 + ... + ||p_m - c||^2

This is a sort of compromise: like the average, but if some house is very far away, its squared distance is very large. (These three different measures are called L_1, L_∞, and L_2.)
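For concreteness, here is a minimal numpy sketch (my own illustration; the house coordinates and the use of numpy are assumptions, not from the lecture) that evaluates the three criteria for one candidate location c:

import numpy as np

# Hypothetical house locations (one row per house) and one candidate station location c.
houses = np.array([[0.0, 0.0], [2.0, 1.0], [3.0, 4.0], [10.0, 0.0]])
c = np.array([2.0, 1.0])

dists = np.linalg.norm(houses - c, axis=1)   # station-to-house distances

l1_objective = dists.sum()          # sum (equivalently, average) of distances
linf_objective = dists.max()        # maximum distance
l2_objective = (dists ** 2).sum()   # sum of squared distances

print(l1_objective, linf_objective, l2_objective)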

Putting a facility in the location that minimizes the sum of squared distances. Given locations p_1, ..., p_m in R^n of m houses, we want to choose a location c in R^n for the fire station so as to minimize the sum of squared distances

    ||p_1 - c||^2 + ||p_2 - c||^2 + ... + ||p_m - c||^2

Question: How do we find this location? Answer:

    c = (1/m)(p_1 + p_2 + ... + p_m)

This point is called the centroid of p_1, ..., p_m. It is the average for vectors: in fact, for i = 1, ..., n, entry i of the centroid is the average of entry i of all the points. The centroid p̄ satisfies the equation m p̄ = p_1 + ... + p_m. Therefore Σ_i (p_i - p̄) equals the zero vector.
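A quick numpy check of this answer (the sample points are made up, not from the lecture): the centroid is the entry-wise mean of the points, and no perturbed candidate does better on the sum of squared distances.

import numpy as np

points = np.array([[0.0, 0.0], [2.0, 1.0], [3.0, 4.0], [10.0, 0.0]])

centroid = points.mean(axis=0)   # entry i of the centroid = average of entry i over all points

def sum_sq_dist(c):
    return ((points - c) ** 2).sum()

# The centroid does at least as well as any perturbed candidate location.
other = centroid + np.array([0.5, -0.3])
assert sum_sq_dist(centroid) <= sum_sq_dist(other)
print(centroid, sum_sq_dist(centroid))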

Proving that the centroid minimizes the sum of squared distances. Let q be any point. We show that the sum of squared q-to-datapoint distances is at least the sum of squared p̄-to-datapoint distances. For i = 1, ..., m,

    ||p_i - q||^2 = ||(p_i - p̄) + (p̄ - q)||^2

Summing over i = 1, ..., m,

    Σ_i ||p_i - q||^2
      = Σ_i ||(p_i - p̄) + (p̄ - q)||^2
      = Σ_i <(p_i - p̄) + (p̄ - q), (p_i - p̄) + (p̄ - q)>
      = Σ_i [ <p_i - p̄, p_i - p̄> + <p_i - p̄, p̄ - q> + <p̄ - q, p_i - p̄> + <p̄ - q, p̄ - q> ]
      = Σ_i [ ||p_i - p̄||^2 + <p_i - p̄, p̄ - q> + <p̄ - q, p_i - p̄> + ||p̄ - q||^2 ]
      = Σ_i ||p_i - p̄||^2 + Σ_i <p_i - p̄, p̄ - q> + Σ_i <p̄ - q, p_i - p̄> + Σ_i ||p̄ - q||^2
      = Σ_i ||p_i - p̄||^2 + <Σ_i (p_i - p̄), p̄ - q> + <p̄ - q, Σ_i (p_i - p̄)> + Σ_i ||p̄ - q||^2

Proving that the centroid minimizes the sum of squared distances (continued). Let q be any point. We show that the sum of squared q-to-datapoint distances is at least the sum of squared p̄-to-datapoint distances. Summing over i = 1, ..., m,

    Σ_i ||p_i - q||^2
      = Σ_i ||p_i - p̄||^2 + Σ_i <p_i - p̄, p̄ - q> + Σ_i <p̄ - q, p_i - p̄> + Σ_i ||p̄ - q||^2
      = Σ_i ||p_i - p̄||^2 + <Σ_i (p_i - p̄), p̄ - q> + <p̄ - q, Σ_i (p_i - p̄)> + Σ_i ||p̄ - q||^2
      = Σ_i ||p_i - p̄||^2 + <0, p̄ - q> + <p̄ - q, 0> + Σ_i ||p̄ - q||^2
      = Σ_i ||p_i - p̄||^2 + 0 + 0 + m ||p̄ - q||^2
      = (sum of squared p̄-to-datapoint distances) + m · (squared p̄-to-q distance)

Since m ||p̄ - q||^2 >= 0, this is at least the sum of squared p̄-to-datapoint distances, with equality exactly when q = p̄.

k-means clustering using Lloyd's Algorithm. k-means clustering: Given data points (vectors) p_1, ..., p_m in R^n, select k centers c_1, ..., c_k so as to minimize the sum of squared distances of data points to the nearest centers. That is, define the function

    f(x, [c_1, ..., c_k]) = min { ||x - c_i||^2 : i in {1, ..., k} }

This function returns the squared distance from x to whichever of c_1, ..., c_k is nearest. The goal of k-means clustering is to select points c_1, ..., c_k so as to minimize

    f(p_1, [c_1, ..., c_k]) + f(p_2, [c_1, ..., c_k]) + ... + f(p_m, [c_1, ..., c_k])

The purpose is to partition the data points into k groups (called clusters).
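The objective just defined translates directly into code. A minimal Python sketch (numpy arrays stand in for the lecture's vectors; the sample points and centers are made up):

import numpy as np

def f(x, centers):
    """Squared distance from x to whichever of the centers is nearest."""
    return min(np.sum((x - c) ** 2) for c in centers)

def kmeans_objective(points, centers):
    """Sum of f(p, centers) over all data points p."""
    return sum(f(p, centers) for p in points)

pts = [np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([9.0, 9.0])]
ctrs = [np.array([0.5, 0.5]), np.array([9.0, 9.0])]
print(kmeans_objective(pts, ctrs))   # -> 1.0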

k-means clustering using Lloyd's Algorithm. Select k centers to minimize the sum of squared distances of data points to the nearest centers. This combines two ideas: 1. Assign each data point to the nearest center. 2. Choose the centers so as to be close to the data points. This suggests an algorithm (see the sketch below). Start with k centers somewhere, perhaps randomly chosen. Then repeatedly perform the following steps:
1. Assign each data point to its nearest center.
2. Move each center to be as close as possible to the data points assigned to it; that is, let the new location of the center be the centroid of its assigned points.
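Here is the sketch referred to above: a minimal, self-contained implementation of Lloyd's algorithm in numpy. The fixed iteration count, the random initialization from the data, and the empty-cluster handling are my own simplifications, not part of the lecture.

import numpy as np

def lloyd(points, k, iterations=100, seed=0):
    rng = np.random.default_rng(seed)
    # Start with k centers chosen randomly from among the data points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Step 1: assign each data point to its nearest center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assignment = dists.argmin(axis=1)
        # Step 2: move each center to the centroid of the points assigned to it.
        for j in range(k):
            assigned = points[assignment == j]
            if len(assigned) > 0:      # keep the old center if no points were assigned to it
                centers[j] = assigned.mean(axis=0)
    return centers, assignment

# Example: two obvious clusters.
pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
centers, assignment = lloyd(pts, k=2)
print(centers, assignment)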

[9] Orthogonalization

Finding the closest point in a plane. Goal: Given a point b and a plane, find the point in the plane closest to b.

Finding the closest point in a plane. Goal: Given a point b and a plane, find the point in the plane closest to b. By translation, we can assume the plane includes the origin. The plane is then a vector space V. Let {v_1, v_2} be a basis for V. Goal: Given a point b, find the point in Span {v_1, v_2} closest to b. Example: v_1 = [8, 2, 2] and v_2 = [4, -2, 4], b = [5, 5, 2]; the point in the plane closest to b is [6, 3, 0].
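A quick numerical check of this example, using numpy's least-squares solver as a stand-in for the method the lecture goes on to develop:

import numpy as np

# Columns of A are the basis vectors v1, v2 of the plane V.
A = np.array([[8.0, 4.0],
              [2.0, -2.0],
              [2.0, 4.0]])
b = np.array([5.0, 5.0, 2.0])

# Least-squares coefficients x minimize ||A x - b||; A x is then the closest point in Span {v1, v2}.
x, *_ = np.linalg.lstsq(A, b, rcond=None)
closest = A @ x
print(closest)                             # -> approximately [6. 3. 0.]
print(np.allclose((b - closest) @ A, 0))   # residual is orthogonal to v1 and v2 -> True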

Closest-point problem in higher dimensions. Goal: An algorithm that, given a vector b and vectors v_1, ..., v_n, finds the vector in Span {v_1, ..., v_n} that is closest to b. Special case: We can use the algorithm to determine whether b lies in Span {v_1, ..., v_n}: if the vector in Span {v_1, ..., v_n} closest to b is b itself, then clearly b is in the span; if not, then b is not in the span. Let A be the matrix whose columns are v_1, ..., v_n. Using the linear-combinations interpretation of matrix-vector multiplication, a vector in Span {v_1, ..., v_n} can be written Ax. Thus testing whether b is in Span {v_1, ..., v_n} is equivalent to testing whether the equation Ax = b has a solution. More generally: Even if Ax = b has no solution, we can use the algorithm to find the point in {Ax : x in R^n} closest to b. Moreover: We hope to extend the algorithm to also find the best solution x̂.
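To make the special case concrete, here is a small sketch (the vectors are made up) that tests span membership by comparing b with the closest point Ax:

import numpy as np

def in_span(vectors, b, tol=1e-10):
    """Return True if b is (numerically) in Span{vectors}."""
    A = np.column_stack(vectors)               # columns are the spanning vectors
    x, *_ = np.linalg.lstsq(A, b, rcond=None)  # coefficients of the closest point Ax
    return np.linalg.norm(A @ x - b) <= tol    # closest point equals b  <=>  b is in the span

v1, v2 = np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])
print(in_span([v1, v2], np.array([2.0, 3.0, 5.0])))   # True:  2*v1 + 3*v2
print(in_span([v1, v2], np.array([0.0, 0.0, 1.0])))   # False: not a combination of v1, v2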

High-dimensional projection onto / orthogonal to. For any vector b and any vector a, define vectors b^{||a} and b^{⊥a} so that b = b^{||a} + b^{⊥a}, there is a scalar σ in R such that b^{||a} = σ a, and b^{⊥a} is orthogonal to a. Definition: For a vector b and a vector space V, we define the projection of b onto V (written b^{||V}) and the projection of b orthogonal to V (written b^{⊥V}) so that b = b^{||V} + b^{⊥V}, where b^{||V} is in V and b^{⊥V} is orthogonal to every vector in V. [Figure: b decomposed as b = b^{||V} + b^{⊥V}, the projection onto V plus the projection orthogonal to V.]
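A minimal numpy illustration of the one-vector case (the particular vectors are arbitrary): the coefficient σ = <b, a> / <a, a> gives the part of b along a, and what remains is orthogonal to a.

import numpy as np

b = np.array([5.0, 5.0, 2.0])
a = np.array([8.0, 2.0, 2.0])

sigma = (b @ a) / (a @ a)    # coefficient of the projection along a
b_par = sigma * a            # b^{||a}: the part of b lying on the line through a
b_perp = b - b_par           # b^{⊥a}: the remaining part, orthogonal to a

assert np.isclose(b_perp @ a, 0.0)        # orthogonality check
assert np.allclose(b_par + b_perp, b)     # the two parts add back up to b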

High-Dimensional Fire Engine Lemma. Definition: For a vector b and a vector space V, we define the projection of b onto V (written b^{||V}) and the projection of b orthogonal to V (written b^{⊥V}) so that b = b^{||V} + b^{⊥V}, where b^{||V} is in V and b^{⊥V} is orthogonal to every vector in V. One-Dimensional Fire Engine Lemma: The point in Span {a} closest to b is b^{||a}, and the distance is ||b^{⊥a}||. High-Dimensional Fire Engine Lemma: The point in a vector space V closest to b is b^{||V}, and the distance is ||b^{⊥V}||.

Finding the projection of b orthogonal to Span {a_1, ..., a_n}. High-Dimensional Fire Engine Lemma: Let b be a vector and let V be a vector space. The vector in V closest to b is b^{||V}. The distance is ||b^{⊥V}||. Suppose V is specified by generators v_1, ..., v_n. Goal: An algorithm for computing b^{||V} (and hence b^{⊥V} = b - b^{||V}) in this case. Input: vector b, vectors v_1, ..., v_n. Output: the projection of b onto Span {v_1, ..., v_n}. We already know how to solve this when n = 1; let's try to generalize...

def project_along(b, v):
    # projection of b along v: ((b . v)/(v . v)) v, or the zero vector if v is (almost) zero
    return (0 if v.is_almost_zero() else (b*v)/(v*v)) * v

project_onto(b, vlist)

def project_along(b, v):
    # projection of b along v: ((b . v)/(v . v)) v, or the zero vector if v is (almost) zero
    return (0 if v.is_almost_zero() else (b*v)/(v*v)) * v

def project_onto(b, vlist):
    # attempted generalization: sum the projections of b along each vector in vlist
    return sum([project_along(b, v) for v in vlist])

The reviews are in... "Short, elegant, ... and flawed." "Beautiful, if only it worked!" "A tragic failure."

Failure of project_onto. [Figure: vectors v_1, v_2, and b in R^2.] Try it out on a vector b and vlist = [v_1, v_2] in R^2, so V = Span {v_1, v_2}. In this case, b is in Span {v_1, v_2}, so b^{||V} = b. The algorithm tells us to find the projection of b along v_1 and the projection of b along v_2. The sum of these projections should be equal to b... but it is not.


What went wrong with project_onto? Suppose we run the algorithm on b and vlist = [v_1, ..., v_n]. Let V denote Span {v_1, ..., v_n}. For each vector v_i in vlist, the vector returned by project_along(b, v_i) is σ_i v_i, where σ_i is <b, v_i> / <v_i, v_i> (or 0, if v_i is the zero vector). The vector returned by project_onto(b, vlist) is the sum σ_1 v_1 + σ_2 v_2 + ... + σ_n v_n. Let b̂ denote the returned vector. We want to check that b̂ is b^{||V}... Is b̂ in V? It is a linear combination of v_1, ..., v_n, so YES. If b̂ were b^{||V}, then b - b̂ would be b^{⊥V}, so b - b̂ would be orthogonal to all vectors in V. In particular, it would be orthogonal to the generators v_1, ..., v_n. Is it? To check, calculate the inner product of b - b̂ with each of v_1, ..., v_n. Consider, for example, the generator v_1.

    <b - b̂, v_1> = <b, v_1> - <b̂, v_1> = <b, v_1> - <σ_1 v_1 + σ_2 v_2 + ... + σ_n v_n, v_1>

Is this zero? Expanding the last inner product gives σ_1 <v_1, v_1>, but also cross-terms like σ_2 <v_2, v_1> and σ_n <v_n, v_1>. Since σ_1 = <b, v_1> / <v_1, v_1>, the term σ_1 <v_1, v_1> exactly cancels <b, v_1>, so the whole expression is zero only if the cross-terms vanish, which in general they do not. The cross-terms keep the algorithm from working correctly.

How to repair project_onto? Don't change the procedure. Fix the spec. Require that vlist consists of mutually orthogonal vectors: the i-th vector in the list is orthogonal to the j-th vector in the list for every i ≠ j.
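A small numpy re-implementation of project_along and project_onto (standing in for the lecture's Vec-based code, with the example vectors from the plane example above) illustrating the repaired spec: on a mutually orthogonal list the result is the true projection, on a non-orthogonal list it is not. The names u1, u2 and the step of subtracting from v2 its projection along v1 are my own sketch of how to get an orthogonal pair spanning the same plane.

import numpy as np

def project_along(b, v, eps=1e-12):
    # same formula as the lecture's procedure, written for numpy arrays
    return np.zeros_like(b) if v @ v < eps else ((b @ v) / (v @ v)) * v

def project_onto(b, vlist):
    return sum(project_along(b, v) for v in vlist)

b = np.array([5.0, 5.0, 2.0])

# Non-orthogonal generators of the plane: the sum of the two projections is NOT b^{||V}.
v1, v2 = np.array([8.0, 2.0, 2.0]), np.array([4.0, -2.0, 4.0])
print(project_onto(b, [v1, v2]))          # [8. 0.5 3.5], not the closest point [6, 3, 0]

# Mutually orthogonal generators of the same plane: now the answer is correct.
u1, u2 = v1, v2 - project_along(v2, v1)   # u2 is v2 with its v1-component removed
print(project_onto(b, [u1, u2]))          # approximately [6. 3. 0.]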