Clustering Affine Subspaces: Algorithms and Hardness

Size: px

Start display at page:

Download "Clustering Affine Subspaces: Algorithms and Hardness"

Silvia Bridges
5 years ago
Views:

1 Clusterng Affne Subspaces: Algorthms and Hardness Thess by Euwoong Lee In Partal Fulfllment of the Requrements for the Degree of Master of Scence Calforna Insttute of Technology Pasadena, Calforna 01 (Submtted )

2 c 01 Euwoong Lee All Rghts Reserved

3 Acknowledgements The author s ndebted to Professor Leonard J. Schulman, who supervsed the thess, for helpful crtcsm and advce.

4 v Abstract We study a generalzaton of the famous k-center problem where each object s an affne subspace of dmenson, and gve ether the frst or sgnfcantly mproved algorthms and hardness results for many combnatons of parameters. Ths generalzaton from ponts ( = 0) s motvated by the analyss of ncomplete data, a pervasve challenge n statstcs: ncomplete data objects n R d can be modeled as affne subspaces. We gve three algorthmc results for dfferent values of k, under the assumpton that all subspaces are axs-parallel, the man case of nterest because of the correspondence to mssng entres n data tables. 1) k = 1: Two polynomal tme approxmaton schemes whch runs n poly(, 1/ɛ)nd. ) k = : O( 1/4 )-approxmaton algorthm whch runs n poly(n, d, ) 3) General k: Polynomal tme approxmaton scheme whch runs n O( k log k(1+1/ɛ )) nd We also prove nearly matchng hardness results; n both the general (not necessarly axs-parallel) case (for k ) and n the axs-parallel case (for k 3), the runnng tme of an approxmaton algorthm wth any approxmaton rato cannot be polynomal n even one of k and, unless P = NP. Furthermore, assumng that the 3-SAT problem cannot be solved subexponentally, the dependence on both k and must be exponental n the general case (n the axs-parallel case, only the dependence on k drops to Ω( k) ). The smplcty of the frst and the thrd algorthm suggests that they mght be actually used n statstcal applcatons. The second algorthm, whch demonstrates a theoretcal gap between the axs-parallel and general case for k =, dsplays a strong connecton between geometrc clusterng and classcal colorng problems on graphs and hypergraphs, va a new Helly-type theorem.

5 v Contents Acknowledgements Abstract v 1 Introducton Related Work Our Results Prelmnares 6 3 Two algorthms for 1-center for axs-parallel flats 8 4 Clusterng Hardness 16 5 Clusterng Algorthms 4 6 -Clusterng of axs-parallel flats 7 7 Conclusons and Future Work 31 Bblography 3

6 1 Chapter 1 Introducton Clusterng s one of the most mportant problems n computatonal theory. Due to ts wde range of applcatons, t has been nvestgated for decades and contnues to be actvely studed not only n theoretcal computer scence, but n other dscplnes such as statstcs, data mnng and machne learnng. Generally speakng, the goal of clusterng s to partton a gven set of n data objects nto k groups so that the objects wthn each group are smlar. One way to represent data objects and measure smlarty s to regard each data object wth d features as a pont n the Eucldean space R d, where each coordnate corresponds to each feature; two objects are smlar when the Eucldean dstance between them s small. Based on ths observaton, the famous k-means clusterng (mnmzng the sum of the squared dstance from each pont to the nearest center), k-medan clusterng (mnmzng the sum of the dstances), and k-center clusterng (mnmzng the maxmum dstance) problems have been suggested and studed both theoretcally and practcally. However, n realty, data objects often do not come fully equpped wth a mappng nto Eucldean space. A smple and ubqutous example can be found n a questonnare where a few questons are left blank. Lkewse, n a longtudnal medcal study, a patent s mssed appontment creates mssng real-valued entres n ther record. Assume that the answers to these unanswered questons, or unmeasured values, actually exst (but are not known to us). Then a true data pont exsts, and les on the axs-parallel affne subspace fxed by the answers or measurements whch we do possess. No more nformaton, however, can be retreved from the ncomplete data object. Ths s a huge and pervasve problem n statstcal practce, and there are many methods to deal wth these ncomplete data n the statstcal lterature [1]. Lstwse deleton deletes any ncomplete data object, and parwse deleton only consders avalable entres n each feature to compute statstcal data such as the average and the varance. The former mght throw away too much nformaton, whle the latter cannot be drectly appled to clusterng, snce t only gves a way to retreve some statstcal nformaton about the whole set but does not specfy how to represent each data object to cluster. More sophstcated methods are those fllng n the mssng entres usng

7 heurstcs. They nclude substtutng wth the sample mean and replacng a mssng element usng a learnng algorthm or crteron (EM, max lkelhood). Whle these methods work better than the smple deleton methods n practce [1], lttle s known about ther theoretcal performance, except n the very specal and almost-never satsfed case that the data s mssng at random wth no correlaton wth the underlyng mssng entres. It s unusual that the expermenter even has any good probablstc model for the mssngness of entres, and the absence of such a model precludes guarantees of the type one s accustomed to for statstcal procedures n the theoretcal statstcs lterature. As an alternatve, Gao et al. [13] frst suggested a new approach rooted n the geometry of a data set. A data object that s lackng nformaton about one or more features corresponds to an affne subspace, also known as a flat, n R d, whose dmenson s the number of mssng features. Snce the dstance between a flat and a pont (we stll model each center as a pont) s well-defned, the classcal clusterng problems such as k-means, k-medan, and k-center can be defned on a set of flats; we smply need to fnd k centers to mnmze the objectve functon, based on the dstance between each flat to ts nearest center. Ths approach has the strong theoretcal guarantee that f we fnd the optmal clusterng of a set of flats wth respect to some objectve, we automatcally compute the set of ponts, one n each flat, such that t mnmzes the objectve over all such sets of ponts. In terms of clusterng ncomplete data, ths method computes the best guess about ncomplete entres (.e., guess about a true pont n each flat) that gves the most-clusterable complete data among all guesses. Therefore, f the set of data objects s guaranteed to be well-clustered, ths geometrc approach yelds the correct clusterng as well as the correct guesses for mssng entres. We choose to extend the k-center problem to ths settng, whch mnmzes the maxmum dstance from any flat to ts nearest center. Among many clusterng problems, the k-center problem has the most classc and smplest theoretcal guarantees; there s a smple -approxmaton algorthm [17], and t s NP-hard to approxmate wth an approxmaton rato smaller than 1.8 [11]. However, the only geometrc property that enables a smple -approxmaton algorthm, the trangle nequalty, does not hold at all; even though two ponts are wthn some fxed dstance from one flat, the two ponts can be arbtrarly far apart. Ths fact changes the classcal k-center problem dramatcally so that we need new algorthms and hardness results, even when there s only one cluster (k = 1). Ths paper starts by gvng effcent algorthms for k = 1 and moves on to general k where we gve an approxmaton scheme and the matchng hardness results, closng wth the dfference between the case where all flats are axs-parallel and the general case.

8 1.1 Related Work 3 Dfferent clusterng problems have been studed extensvely. The k-medan problem has been often studed n a fnte metrc space, where the problem can be solved exactly n polynomal tme f k s fxed. For general k, there s a known (1 + /e)-hardness of approxmaton result [1] and substantal work on approxmaton algorthms [18, 7,, 1, 9], wth the best guarantee a 3 + ɛ approxmaton. The k-means problem has been often studed n the Eucldean space, and even -means problem was shown to be NP-hard [8]. Instead, many PTAS s have been suggested for the Eucldean k-means and k-medan problem, wth the best runnng tme polynomal n n and d but exponental n k[5, 6, 6, 9, 19, 3, 10]. Recently, the focus has been on the well-clusterable or stable nstances and approxmaton schemes wth better runnng tmes have been suggested for those nstances[7, 4, 3, ]. The k-center problem has relatvely smple hstory where a smple -approxmaton algorthm and an almost matchng hardness result were provded for general k [17, 11]. As n the Eucldean k-means and k-medan, there s a PTAS whch runs n tme exponental n k [6]. However, whle the 1-mean problem n the Eucldean space and the 1-medan problem n a fnte metrc space can be solved trvally, the 1-center problem n the Eucldean space, whch tres to fnd the smallest ball contanng all data ponts, s nontrval. There are several approxmaton schemes for ths 1-center problem, also known as the Mnmum Enclosng Ball problem [5, 9, 8], wth the best runnng tme O( nd ɛ ) to fnd an enclosng ball of radus at most (1 + ɛ) tmes the optmal radus. In [13], Gao et al. frst suggested the problem of clusterng flats as a means to cope wth ncomplete data, and studed the 1-center problem. They proved the exstence of a ɛ core set of sze O( 4 4 ɛ ); there s a subset of O( ɛ ) flats of whch the optmal radus s at least 1 1+ɛ of the optmal radus of the entre set. Ther man result s based on an Intrnsc-dmenson Helly theorem, whch says that there s a 1-core set of sze +. However, ther algorthm to fnd such a core set requres tme exponental n, so they suggested two alternatve algorthms for the problem. The frst algorthm usng convex programmng requres tme Õ( n(d 3 + d n)poly( )) to get a constant addtve approxmaton, and the second algorthm usng LP-type programmng requres expected tme O(n(poly( )d + poly( ) )) to get a constant multplcatve approxmaton. In the subsequent paper [14], the same authors studed an actual clusterng problem wth k =, 3 n the restrcted case = 1, and developed -approxmaton algorthms whose runnng tme s quas-lnear n n and d. To the best of our knowledge, all cases [ = 1 and k 4] and [ and k ] were completely open (even for axs-parallel flats). For the combnatons of small k and already studed, t was unclear whether there were faster approxmaton schemes.

9 1. Our Results 4 We present the algorthms and hardness results for clusterng flats for many possble combnatons of k and, where each of them ether s the frst result or sgnfcantly mproves the prevous results for the gven values for k and. Our algorthms and hardness results are summarzed n Table 1.1 and 1. respectvely. They can be classfed nto three categores. 1) Improved algorthms for k = 1 and the frst algorthm for general k: In Secton 3, we revst the case k = 1 where all flats are axs-parallel and suggest two PTAS s; both are smple to descrbe but ther runnng tmes are lnear n n and d, whch are even better than the fastest known algorthm, whch uses a convex programmng solver. We also partally explan why the problem gets harder as ncreases. The key lemma to many 1-center algorthms, whch shows an upper bound on the dstance between any pont and the optmal center n terms of the radus of the mnmum ntersectng ball centered at that pont, gets relaxed as ncreases. Extendng one of these algorthms, n Secton 5, we present a PTAS for fxed k and for the axs-parallel case. The 1 O( k log k(1+ runnng tme to get a (1 + ɛ)-approxmaton s ɛ )) nd, whch s tght n the sense that t s exponental n both k and but lnear n both n and d. tme: ) The frst and almost matchng lower bounds on approxmaton rato and runnng In Secton 4, we show that actual k-center clusterng(k ) s greatly harder than fndng the mnmum ntersectng ball. In fact, we show that the runnng tme of any approxmaton algorthm has to depend exponentally on both k and ; for fxed k there s no polynomal (n n, d, ) tme algorthm that guarantees any approxmaton rato, and for fxed 1 there s no polynomal (n n, d, k) tme approxmaton algorthm ether. In the axs-parallel case, the same result holds for k 3 and 3, meanng that the axs-parallel case s not much easer than the general case when k and become large. Furthermore, assumng that the 3-SAT problem cannot be solved n subexponental tme, our reductons also show that there cannot be any approxmaton algorthm for the general case whose runnng tme s subexponental n ether k or ; our algorthm s almost tght n each parameter. 3) Axs-parallel vs General, Connecton to Graph Colorng, and New Helly-type theorem Fnally, n Secton 6, we study axs-parallel -center, where our napproxmablty rato drops from n the general case to 3. The fact that axs-parallel -center s greatly easer than even slghtly more general problems shows the relatonshp between clusterng flats n the Eucldean space and the fundamental colorng problems n graphs and hypergraphs. We gve a polynomal tme approxmaton algorthm for the axs-parallel -center problem usng an algorthm for the GRAPH -COLORING problem, the only verson of the colorng problems that can be solved effcently. As a tool, we prove another Helly-type theorem for a set of axs-parallel flats; when all parwse dstances are at most r, then there s a ball of radus O(d 1 4 r) that ntersects every flat. We also gve

10 5 an example where the optmal radus s Ω(d 1 4 r) k (R 1 + δ, O( n(d 3 + d n)poly( ) log (n/δ))) [13] ((1 + ɛ)r, O( nd ɛ log ɛ )) ((1 + ɛ)r, O(nd log 1/ɛ + n log n log 1/ɛ)) [14] (O( 1/4 )R, O(dn logn)) 3 ((1 + ɛ)r, O(nd log 1/ɛ + n log n log 1/ɛ ɛ )) [14] ((1 + ɛ)r, O( (1+1/ɛ )) nd) 4... ((1 + ɛ)r, O(klogk(1+1/ɛ )) nd) ((1 + ɛ)r, O( klogk(1+1/ɛ )) nd) Table 1.1: Algorthms for dfferent k and. Each par (x, y) ndcates that t takes tme y untl the objectve s less than x, when R denotes the optmum. Pror work s ndcated by plan text and a ctaton, and apples to general flats; results of ths paper are ndcated by boldface, and apply to axs-parallel flats. Algorthms n column... work for any value of, and algorthms n row 4... work for any value of k. k 1 (PTAS) , [11] 3 Table 1.: Lower bounds on the multplcatve approxmaton rato of polynomal algorthms. Pror work s ndcated by plan text and a ctaton; results of ths paper are ndcated by boldface. In each cell, the top-rght corner represents the general case and the bottom-left corner represents the axs-parallel case. Results n column 3... hold for any value of, and results n row 3... hold for any value of k.

11 6 Chapter Prelmnares A -flat n R d (d > ) s a -dmensonal affne space. Let L = {l 1,..., l n } be the set of flats that we want to cluster. For smplcty, we consder the stuaton where all of them are of dmenson. All our algorthms and analyses work even when L contans flats of dfferent dmensons and s taken to be the greatest dmenson of any flat. For c R d and r R +, let B c,r be the closed ball of radus r centered at c. The ball B c,r ntersects L f t ntersects each flat n L. Let R (L) be the mnmum radus of any ntersectng ball of L. A mnmum ntersectng ball s B c,r for some c R d that ntersects L. Unlke the stuaton where L conssts of ponts, there can be more than one mnmum ntersectng ball. Let C (L) be the set of ponts such that c C (L) f and only f c s the center of a mnmum ntersectng ball of L. When t s clear from the context, we do not use L n these notatons. Let d(p, q) be the Eucldean dstance between two ponts p and q, d(p, l) = mn q l d(p, q) be the dstance between a pont p and a set of ponts l, and R(c, L) = max l L d(c, l ) be the mnmum radus of any ntersectng ball centered at c. Note that R R(c) for all c R d. For a flat l and ts translaton l that contans the orgn(.e., l s a subspace of Rd ), let d l (p, q) be the length of the vector p q projected to l, and d l (p, q) be the length of the vector p q projected to the orthogonal complement of l. By the Pythagorean theorem, (d(p, q)) = (d l (p, q)) + (d l (p, q)). Sometmes we consder the stuaton where all flats are axs-parallel. In ths case, each flat l s represented as l = { (x 1,..., x d ) (x j1,..., x j ) R } where each x u (u / {j 1,..., j }) s fxed. We sometmes wrte l as a d-dmensonal vector, wrtng n the jth coordnate when t s not fxed. For example, l j = (, 5,, 3, 1) = { (x 1, 5, x 3, 3, 1) (x 1, x 3 ) R } represents a -flat that s parallel to the frst and thrd unt vector. Call jth coordnate trval when there s no flat that has the fxed jth coordnate. We can assume that no coordnate s trval snce otherwse smply removng ths coordnate from all flats wll decrease and d by 1 whle not affectng the clusterng cost for any k. Let m j and M j be the mnmum and the maxmum of the jth coordnate over all flats that have the fxed jth coordnate, respectvely. Snce the dameter of the mnmum ntersectng ball must be at least the largest dfference n the jth coordnate for any j, R (L) max j M j m j. Furthermore,

12 7 when the jth coordnate of c s outsde of [m j, M j ] (say c j < m j ), changng c j to m j does not ncrease d(c, l ) for any l, whch means that there s an optmal center c C n the hypercube [m 1, M 1 ]... [m d, M d ]. Therefore, any pont c n ths hypercube satsfes R(c, L) dr, and d(c, C ) dr.

13 8 Chapter 3 Two algorthms for 1-center for axs-parallel flats We gve two approxmaton schemes for the case k = 1, whch s fndng the smallest ball that ntersects every flat. The frst algorthm resembles Lloyd s method for the k-means problem [4]; n each teraton we fnd the closest pont to the current center n each flat, and move the current center to mnmze the maxmum dstance to these ponts. However, unlke Lloyd s method for k > 1, the convergence tme of ths teratve method to the global optmum can be shown to be fast, as we shown n Theorem 3.1. Frst, we take the ntal center to be any pont n l ([m 1, M 1 ]... [m d, M d ]) for any. For any optmal center c n the hypercube, d l (c, c ) d l (c, l ) + d l (l, c ) 0 + R R and d l (c, c ) max j (M j m j ) 4(R ). Therefore, (d(c, c )) O( (R ) ) and R(c) O( (R ) ). Algorthm 1 Lloyd-Type Algorthm Gven a set of -flats L = {l 1,..., l n } and the ntal center c l 1 ([m 1, M 1 ]... [m d, M d ]) 1: For = 1,..., n, fnd p l whch s closer to c than any other pont n l (.e. p s the projecton of c to l ). : Fnd an approxmate mnmum enclosng ball for p 1,..., p n usng an exstng algorthm (see Secton 1.1). Update c to be the center of the mnmum enclosng ball. 3: Iterate 1 and for the precomputed number of teratons. The convergence of Algorthm 1 depends on how fast R(c, L) s mproved n each teraton. Gven c and ts projectons p 1,..., p n to l 1,..., l n respectvely, we show that there exsts c such that R(c, {p 1,..., p n }) s sgnfcantly less than R(c, {p 1,..., p n }) = R(c, L). Snce n each step an (approxmate) mnmum enclosng ball of {p 1,..., p n } s found, we are guaranteed that our updated center reduces R(c, L) sgnfcantly, leadng to the convergence of the algorthm. Ths c s found from the lne segment jonng the current center c and one of the optmal centers c C. Theorem 3.1. If all flats are axs-parallel, Algorthm 1 satsfes R(1 ɛ 0 ) < R n O( log + ) ɛ 0 teratons.

14 9 Proof. We fnd c on the lne segment jonng the current center c and one of the optmal centers such that R(c, {p 1,..., p n }) s sgnfcantly less than R(c, {p 1,..., p n }) = R(c, L). Let c C be the optmal center whch s the closest to c. Let c = c + s c c (c c) for some s that wll be determned later. In other words, we consder the pont obtaned by movng c by s towards c. If c s sgnfcantly better than c for {p 1,..., p n }, so wll the new center computed by the mnmum enclosng ball algorthm, and t mples that R(c) s sgnfcantly mproved. Fx one teraton between Lne 1 and Lne of Algorthm 1 and let R = R(c) be the current radus. Suppose R = βr for some β (0, 1). We classfy each flat nto one of three dfferent categores. The frst category conssts of l s such that d(c, p ) R. Snce we moved the center by s, d(c, p ) R + s by usng the smple trangle nequalty. The second category conssts of l s such that R < d(c, p ) αr for some constant α (β, 1) that wll be determned later. Snce d(c, l ) R < d(c, p ), the angle p cc = p cc cannot be greater than π/. Therefore, d(c, p ) (αr) + s. The thrd category, whch conssts of l s such that αr < d(c, p ) R, s the only category where each member s dstance s guaranteed to be decreased by choosng the new center, and therefore needs the most nvolved analyss. Fgure 3.1 shows the three dfferent categores. c c c c c c c R αr R p l Fgure 3.1: 3 possble categores dependng on d(c, p ) n d =, = 1 Fx one l n the thrd category. Assume wthout loss of generalty that t s parallel to the frst coordnates. In other words, l = (,...,, x +1,..., x d ) = { (x 1,..., x d ) (x 1,..., x ) R } where x +1,..., x d are fxed. To show that d(c, p ) s sgnfcantly less than d(c, p ), we show an upper bound of d l (c, c ) and a lower bound of d l (c, c ). Ths means that the vector (c c) has the sgnfcant component orthogonal to l. As shown n Fgure 3., by choosng s approprately, we can make d(c, p ) sgnfcantly less than d(c, p ). Lemma 3.. d l (c, c ) R Proof. Let c = (c 1,..., c d ), c = (c 1,..., c d ), and assume wthout loss of generalty that c j > c j for all j. For each jth coordnate(j = 1,..., ), there exsts a flat n L that has the fxed jth coordnate smaller than c j, snce otherwse we can slghtly ncrease c j to reduce d(c, c ) whle

15 c c d l (c, c ) c 10 d l (c, c ) p l Fgure 3.: Larger d l (c, c ) guarantees smaller d(c, p ). not ncreasng R(c ), contradctng the fact that c s the optmal center closest to c. Therefore, R max 1 j (c j c j ) 1 (c1 c 1 ) (c c ) = 1 d l (c, c ). Lemma 3.3. d l (c, c ) d(c, l ) R αr R = (α β)r. Proof. Snce d(p, l ) = d l (p, l ) = d l (p, q) for any p R d and ts projecton q on l, d(c, l ) = d l (c, p ) and R d(c, l ) = d l (c, p ). Therefore, the trangle nequalty d l (c, c ) d l (c, p ) d l (p, c ) mples the frst nequalty. The last nequalty just follows from the assumpton made on the thrd category. Lemma 3.4. Let s l = d l (c, c ). (d l (c, p )) R Rs l 1 ( β α ) + (s l ). Proof. Snce ths lemma deals only wth d l, we restrct ourselves to the orthogonal complement of l, where l s the translaton of l that s a subspace. Fgure 3.3 shows the stuaton projected to (l ). In ths space, l = p s a sngle pont. Let θ be the angle c cl = c cl. By the law of cosnes, (d l (c, p )) R Rs l cos θ + (s l ). Snce d l (c, p ) < d l (c, p ), we know that θ < π/. By the law of snes, cos θ 1 ( β α ). sn θ d l (c,p = sn cc p ) d l (c,p 1 R ) αr, so θ arcsn αr = arcsn β α. Therefore, s l c c (c, c ) θ c αr d l (c, c ) R d l (c, c ) R p = l Fgure 3.3: The whole space projected to (l ) Fnally, we are ready to combne the results on d l and d l to compute how we can get close to p by movng along c c. By Lemma 3. we have d l (c, c ) R, and by Lemma 3.3 we have

16 11 d l (c, c ) (α β)r. Therefore, f we make c = c + s c c (c c), (d l (c, c )) s d l (c, c ) s (α β) +(α β). Take s to be R (α β) +(α β) 1 ( β α ). +(α β) and (d(c, p )) = (d l (c, p )) + (d l (c, p )) s + (α β) + R Rs l 1 ( β α ) + (s l ) = s + (α β) + (α β) R Rs + (α β) 1 ( β α ) + s (α β) + (α β) = R (α β) Rs + (α β) 1 ( β α ) + s = R s Combnng all three categores, (d(c, p )) max((r + s), (αr) + s, R s ). R s (αr) + s f and only f s (1 α ) R, and R s (R + s) f and only f s β+ β R. 1 α = 1/ and β 1/4 makes ( 3 4 )R s R and satsfes these two nequaltes, meanng that the maxmum dstance to each flat s stll attaned n a flat n the thrd category. We can apply the above analyss for the thrd category to argue that R (c) s decreased by at least (1 Ω( 1 ))R n each teraton untl R < 4R. From the nequalty (1 c ) e c for any 0 < c <, R decreases exponentally n every O( ) teraton. Snce we can take the ntal c such that R(c) = O( R ), we need only O( log ) teratons untl R < 4R. Ths analyss also shows that R(c) converges to R, and to get 1 1 ɛ 0 -approxmaton we need O( ) teratons. For each teraton after R < 4R, choose ɛ be such that R ɛ 0 R = β = 1 ɛ and let α = 1 ɛ, makng s = ɛ3 ( 3ɛ) ( +ɛ )(1 ɛ) R. R < 4R mples that ɛ < 3 8. As above, we need to check (1 α that the maxmum dstance to each flat s attaned n a flat n the thrd category. ) becomes ɛ( ɛ), and ɛ 3 ( 3ɛ) ɛ) ( ( + ɛ )/(ɛ( ) = )(1 ɛ) = ɛ 3 ( 3ɛ) ( + ɛ )(1 ɛ) ɛ( ɛ) ɛ ( 3ɛ) ( + ɛ )(1 ɛ) ( ɛ) ɛ ( + ɛ )(1 ɛ) ɛ ( (1 ɛ) ) ( 3 5 ) 1

17 1 shows that the maxmum dstance n the second category cannot exceed the maxmum dstance n the thrd category. For the frst category, we have β+ β = ɛ ɛ 4ɛ R ɛr. ɛ 3 ( 3ɛ) ( ( + ɛ )(1 ɛ) )/(ɛ ) = 1 ɛ( 3ɛ) ( + ɛ )(1 ɛ) ɛ( 3ɛ) (1 ɛ) snce ɛ 3ɛ (1 ɛ) < 1 for all 0 < ɛ < 3/8. Therefore, the maxmum dstance n the thrd category agan domnates, and we can apply the same argument to show that R (c) s decreased by s ɛ 3 ( 3ɛ) ( +ɛ )(1 ɛ) R = Ω( ɛ3 )R. Therefore, as long as R > 1 1 ɛ 0 R, R s decreased by Ω( ɛ3 0 ). If we start from R < 4R, we need O( ɛ 3 0 ) teratons untl R > 1 1 ɛ 0 R. Snce R and ɛ decrease together and ɛ n the early teratons s much bgger than ɛ 0, usng the dea of Gao et al. [13], we can prove a better upper bound on the number of teratons. Assume ɛ 0 = 1/ m. For each = 1,..., m, β can be reduced from (1 1/ 1 ) to (1 1/ ) n just O( 1/ (1/ ) 3 ) = O( ) teratons. Summng over = 1,..., m, the last term s greater than the sum of the other terms, so the total runnng tme s O( m ) = O( ɛ 0 ). Combnng the frst part of teratons that make R < 4R, the number of teratons untl R(1 ɛ 0 ) < R s O( log + ɛ 0 ). Snce Algorthm 1 uses an algorthm for fndng the mnmum enclosng ball for ponts as a subroutne, ts overall tme complexty depends on the algorthm we use. Currently, the best algorthm for the Mnmum Enclosng Ball problem runs n tme O( nd ɛ ) to get a (1 + ɛ)-approxmaton [9, 8]. Snce we am to reduce R by Ω( 1 ) n the frst phase to have R < 4R and Ω( ɛ3 0 ) n the second phase to have R < R 1 ɛ, we need a (1 + Ω( 1 ))-approxmaton n the frst phase and (1 + Ω( ɛ3 0 ))- approxmaton n the second phase. Therefore, the total runnng tme s O(nd( log + )). ɛ 5 0 Even though the runnng tme s better than that of the algorthm whch uses a convex programmng solver n terms of n, d, and, t has the relatvely bad dependence on ɛ. The next algorthm, nspred by the work of Pangrahy [8], shows that fndng a mnmum ntersectng ball for flats can be done nearly as fast as fndng the mnmum enclosng ball for ponts except the natural penalty dependng on. Ths algorthm starts from guessng the optmal radus, whch can be done easly by usng bnary search. Wth the ball of radus (1 + ɛ)r, we fnd a flat that s not covered by the ball, move the ball untl t hts the flat. Its convergence s guaranteed by the fact that d(c, c ) wll be decreased n each teraton. Theorem 3.5. If all flats are axs-parallel and R s gven, Algorthm computes an ntersectng ball wth radus R < R (1 + ɛ) n tme O( nd ɛ ).

18 13 Algorthm Movng Center Algorthm Gven a set of -flats L = {l 1,..., l n }, the ntal center c l 1 [m 1, M 1 ]... [m d, M d ], and the optmal radus R = R (L), 1: Fnd the farthest flat l from the current center c. Fnd p l such that d(c, l ) = d(c, p ) (.e. p s the projecton of c to l ). : If d(c, l ) < (1 + ɛ)r, we found a (1 + ɛ)-approxmate mnmum ntersectng ball. Stop. 3: Move c towards p untl B c,r contans p (can compute n closed form). 4: Iterate 1,, and 3 for the precomputed number of teratons. Proof. Let l be the flat chosen n the th teraton. Let c be the center before we move t towards l, and c be the center after the move. Snce c s on the lne segment jonng c and ts projecton p to l and every dstance from a pont l s defned n l, we can restrct our attenton to the orthogonal complement of l. In ths space, l = p becomes a sngle pont. d(c, l ) = d l (c, l ) R, d(c, l ) = d l (c, l ) > (1 + ɛ)r, d(c, l ) = d l (c, l ) = R Snce d l (c, l ) d l (c, l ), l c c s acute and cc c s obtuse as shown n Fgure 3.4. Therefore, (d l (c, c )) (d l (c, c)) (d l (c, c )) (d l (c, c)) ɛ. Therefore, n each teraton, (d l (c, c)) s decreased by ɛ, and so s (d(c, c)). Snce we take the ntal center c such that d(c, c ) O( )R, so we need only O( ɛ ) teratons. Each teraton only conssts of computng the dstance and the projecton to each flat from the current center, whch takes tme O(nd). The total runnng tme s O( nd ɛ ). p R R ɛr c c c Fgure 3.4: The center of the ball gets closer to the optmal poston n each teraton Note that our ntal center also satsfes R(c) O( R ). Therefore, to guess the optmal radus wth error < ɛ, we need at most O(log ɛ ) tmes of guessng. Therefore, the overall runnng tme s O( nd ɛ log ɛ ). Why s the runnng tme stll worse than the orgnal Mnmum Enclosng Ball problem, and why does t depend on? Earler we mentoned that the trangle nequalty does not hold when 1. Algorthmcally, the lack of the trangle nequalty prevented us from usng a smple but powerful

19 lemma frst proved n [16]; when c and c 14 are the current and the optmal center respectvely, we can fnd a pont p such that d(c, p) = R and d(c, p) (R ) + d(c, c ), whch mples d(c, c ) R (R ). Many algorthms for the Mnmum Enclosng Ball problem [6, 8] used ths lemma to mprove ther runnng tme. Also n our algorthms, ths lemma would guarantee a tghter bound of the dstance between the current center and the optmal center, gvng a better runnng tme. However, the next lemma shows that the bound of ths key lemma gets gradually relaxed as ncreases. When s bg, d(c, c ) can be nearly as bg as R, whch does not mprove our convergence analyses. Lemma 3.6. Let c be the current center, c C be the closest optmal center to c, and R = R(c, L) be the radus of the mnmum ntersectng ball centered at c. There s an nstance where (d(c, c )) (R (R ) ) 1 (R ) ( ). Proof. Our counterexample has n = d = +, and conssts of n flats of dfferent dmensons. Note that ths can be easly extended to the case where all flats have the same dmenson, by addng at most coordnates and gvng ether the same fxed value or a free varable to each added coordnate. L = {l 1, l, p 1,..., p }. l 1 and l are of dmenson and each p s of dmenson less than. Let l 1 = ( I, 0,,..., ), l = (+I, 0,,..., ) and p = (0,..., 0, I, 0,,..., ) where I > 0 appears n the + 1th coordnate. In other words, p has the frst coordnates 0, + 1th coordnate I, + th coordnate 0, and the remanng coordnates. Clam 3.7. The orgn s the center of unque the mnmum ntersectng ball,.e. C = {O}. Proof. Snce the frst coordnate of l 1 and l dffer by I, R I. It s obvous that B O,I ntersects all flats, snce each flat has only one nonzero and nonfree coordnate of whch the value s I. We argue that there s no other center c that acheves I. By l 1 and l, t s obvous that to acheve the radus I, the frst coordnate of c must be zero. We consder p 1,..., p n ths order. Inductvely, when we look at p and the frst coordnates are fxed to 0, t means that the dstance from c to each of l 1, l, p 1,..., p 1 n the frst coordnates s I. Snce the value of ( + 1)th coordnate of each of these flats s 0 or, c must have the the + 1 coordnate 0. Agan, the dstance from c to p n the frst + 1 coordnates s I, so the nducton s complete untl all coordnates of c must be zero.

20 15 Let c = (0, x, x 1,..., x, x)i. (d(c, l 1 )) = (1 + x +1 )I (d(c, p 1 )) = ((1 x ) + x )I = (1 + x +1 x )I (d(c, p )) = (x +1 + (1 x 1 ) + x 1 )I = (1 + x +1 + x x 1 )I... (d(c, p )) = (x x +3 + (1 x +1 ) + x +1 )I = (1 + x x + x +1 )I 1 + x x + x +1 1 for small x, so (R(c )) = (d(c, l 1 )) = (1 + x +1 )I. (R(c )) (R ) = x +1 I, but (d(c, c )) x I (x +1 ) 1 I I = ((R(c )) (R ) ) 1 (R )

21 16 Chapter 4 Clusterng Hardness A k-clusterng C = (L 1,..., L n, c 1,..., c k ) s a partton of L nto dsjont subsets (clusters) L 1,..., L k wth centers c 1,..., c k. Let R(C) = max R(c, L ) be the maxmum radus of any cluster of C, whch we also call the radus of the clusterng C. Let R (k, L) be the optmal radus of any k-clusterng. Agan, we do not use L or C n these notatons when t s clear from the context. Whle we have a FPTAS for the 1-center problem, that s not the case when we have a real clusterng problem wth k. The fundamental concept of clusterng - parttonng objects nto k groups so that the objects n each group are smlar - has been consdered n even the earlest problems n computatonal theory when each object corresponds to a vertex n a graph and there s an edge when two objects are smlar or related; the VERTEX COVER and DOMINATING SET problem ask to fnd k centers that cover ther relatonshps (ncdent edges) or drectly smlar objects (neghbors), respectvely. k-coloring, whch s fndng a partton of the vertex set nto k dsjont sets such that the subgraph nduced by each set s a coclque (here, the exstence of an edge means that the two vertces are dfferent), looks more lke a verson of the clusterng problem. Even more boldly, some varants of the 3-SAT problem can be sad to look for a partton of a set of varables nto dsjont sets (those assgned true or false) to satsfy the gven constrants, f flppng true and false does not change the satsfablty (.e., clusters are symmetrc). These many fundamental graph-theoretc or logc problems that are related to the concept of clusterng, and ther hardness results have not been reflected very much n the typcal clusterng problems studed n the Eucldean space. One of the reasons s that t s hard to fnd a rght embeddng of a gven graph to the Eucldean space such that the relatonshps are preserved. When objects correspond to -flats, however, such embeddngs can be found more easly, enablng us to prove that clusterng flats s much harder than clusterng ponts. Some nterestng graph-theoretc propertes are preserved n the reducton to our problem, as the dstncton of easy -COLORING and hard 3-COLORING s reflected n the strct better approxmablty wth centers than wth 3 centers n the axs-parallel case. We start by reducng the k-coloring problem to our problem, whch shows that k-clusterng s hard even for fxed k 3 and the axs-parallel restrcton.

22 17 Theorem 4.1. For fxed k 3 and any I, there s no algorthm that computes a I-approxmate k-clusterng and runs n tme polynomal of n, d, unless P = NP. Proof. We reduce the GRAPH k-coloring problem(k 3) to our problem. Gven G = (V, E) wth V = { } { v 1,..., v V and E = (v1,1, v 1, ),..., (v E,1, v E, ) }, we construct the set of n = V flats L = {l 1,..., l n } n d = E -dmensonal coordnates. Each flat l corresponds to each vertex v and each jth coordnate corresponds to each edge (v j,1, v j, ). For each 1 n and 1 j m, (l ) j s decded as the followng: v / (v j,1, v j, ): (l ) j =,.e. s does not have a fxed value n the jth coordnate. v = v j,1 : (l ) j = 1 v = v j, : (l ) j = +1 G s k-colorable f and only f there s a k-clusterng where each cluster does not have two flats that have the fxed values 1 and +1 n the same coordnate, whch ensures R (k, L) = 0. Therefore, G s k-colorable f and only f there s a k-clusterng wth R (k, L) = 0. If there s an algorthm that computes a I-approxmate k-clusterng and runs n tme polynomal of n, d,, we can solve the GRAPH k-coloring problem, mplyng P = NP. Note that all the flats used n the above theorem are axs-parallel, so both axs-parallel and general case are hard when k 3. Snce -COLORING s n P, a harder verson of colorng(hypergraph -COLORING) s used to show that -clusterng n the general (not necessarly axs-parallel) case s stll hard. Theorem 4.. There s no algorthm that computes a I-approxmate -clusterng and runs n tme polynomal of n, d, for any I unless P = NP. Proof. We reduce the SET SPLITTING problem, also known as the HYPERGRAPH -COLORING problem, to our problem. The SET SPLITTING problem, gven a collecton S of subsets of a fnte set U, decdes whether there s a partton of U nto two subsets U 1 and U such that no subset n S s entrely contaned n ether U 1 or U. Ths problem remans NP-complete even f all s S have s 3. Therefore, we assume that s = 3 for all s S wthout losng NP-completeness. Let U = {u 1,..., u n } be the gven fnte set, and S = {s 1,..., s m } be the collecton of subsets. For each j, s j = {u j,1, u j,, u j,3 } S. Gven such an nstance, we construct the set of n flats L = {l 1,..., l n } n d = m-dmensonal coordnates. Thnk (j 1)th and (j)th coordnates are assocated wth s j. For each 1 n and 1 j m, (l ) j 1 and (l ) j are decded as the followng. Let A = (0, 1), B = ( 3, 1 ), C = ( 3, 1 ) be the vertces of the trangle that contans 0 n the mddle.

23 18 u / s j : (l ) j 1 = (l ) j =,.e. s does not have a fxed value n these coordnates. u = u j,1 : {(l ) j 1, (l ) j } R s the lne passng A and B. u = u j, : {(l ) j 1, (l ) j } R s the lne passng A and C. u = u j,3 : {(l ) j 1, (l ) j } R s the lne passng B and C.... ( 1, )... (j 1, j)... l 1... / l... \... /... l \... l Table 4.1: An example where S = {u 1, u, u 3 } and S j = {u, u 3, u 4 }. Three lnes n the same two columns form a regular trangle around the orgn. Let (C 1, C ) be a -clusterng of L. If l j,1, l j,, l j,3 are n the same cluster for any j, the mnmum ntersectng ball for that cluster must cover the trangle formed n (j 1)th and (j)th coordnates, forcng R (k, L) 1/. If l j,1, l j,, l j,3 are splt for every j, each cluster has at most of them and we can place the center on the ntersecton of those two lnes, so R = 0. Therefore, R (k, L) = 0 f and only f the gven nstance of the SET SPLITTING problem has a satsfactory parttonng. If there s an algorthm that computes a I-approxmate -clusterng and runs n tme polynomal of n, d,, we can solve the SET SPLITTING problem wth s = 3 for all s S, mplyng P = NP. Ths constructon s not axs-parallel. By replacng three edges of a trangle wth three vertces wth some addtonal changes smlar to the NP-hardness of -means[8], we can make each flat used n the reducton axs-parallel. However, n ths case the napproxmablty rato reduces from nfnty to a constant less than. In Secton 6, we show a O( 1/4 )-approxmaton algorthm for k =, ndcatng the fundamental dfference between the axs-parallel case and the general case n -clusterng. Theorem 4.3. When all flats are restrcted to be axs-parallel, there s no algorthm that computes a 3 -approxmate -clusterng and runs n tme polynomal of n, d, unless P = NP. Proof. We reduce the NOT ALL EQUAL 3-SAT problem, whch s qute smlar to the SET SPLIT- TING PROBLEM ntroduced above, to our problem. Gven a set of varables x 1,..., x n and a set of clauses C = {C 1,..., C m } where each C j conssts of three lterals x j,1, x j,, x j,3, negated or unnegated, the problem asks to fnd the truth assgnment to each x such that no clause has three true lterals or three false lterals. It s also NP-hard to fnd an exact soluton for the NOT ALL EQUAL 3-SAT problem.

24 19 Gven an nstance of the NOT ALL EQUAL 3-SAT problem, we construct the set of 6m flats L = { l,j, l,j x C j } n d = (n + m)-dmensonal coordnates. The number of flats s 6m snce we generate two flats, l,j for the lteral and l,j for ts negaton, per each appearance of a lteral. The purpose of the frst n coordnates s to make sure that all l,j s for fxed are n the same cluster and all l,j s n the other, and the purpose of remanng m coordnates s to penalze the clause wth the three lterals wth the same value. For the frst n coordnates, (l,j ) = 0, (l,j ) = M for a large M, and (l,j ) = (l,j ) = when. Therefore, l,j and l,j cannot be n the same cluster even f j j. Snce we have only two clusters, for fxed, all l,j s have to gather n the one cluster and all l,j s have to gather n the other. The (n + j 1)th and (n + j)th coordnates are assocated wth C j. For each 1 n and 1 j, j m such that x C j (.e. l,j L), (l,j ) j 1, (l,j ) j, (l,j ) j 1 and (l,j ) j are decded as the followng: j j : (l,j ) j 1 = (l,j ) j = (l,j ) j = (l,j ) j =,.e. l,j and l,j n these coordnates. does not have a fxed value x = x j,1 : (l,j ) j 1 = 0, (l,j ) j = 1. (l,j ) j 1 = 3, (l,j ) j = 1. Swap the role of l,j and l,j f x j,1 s negated. x = x j, : (l,j ) j 1 = 3, (l,j) j = 1. (l,j ) j 1 = 3, (l,j ) j = 1. Swap the role of l,j and l,j f x j, s negated. x = x j,3 : (l,j ) j 1 = 3, (l,j) j = 1. (l,j ) j 1 = 0, (l,j ) j = 1. Swap the role of l,j and l,j f x j,3 s negated (j 1, j)... l 1,j l 1,j M l,j l,j M l 3,j l 3,j M Table 4.: An example where C j = (x 1, x, x 3 ). (0, 1), ( 3, 1 ), ( 3, 1 ) are represented as up, left, rght arrow vectors, respectvely. The above table shows the example of a clause C j whch conssts of x j,1 = x 1, x j, = x, x j,3 = x 3. Snce l,j and l,j must be separated, there are at most 3 = 8 clusterngs (f we do consder the order of the clusters), and we cannot have the three same ponts n one cluster. Furthermore, there are exactly two cases where we can have three dfferent ponts n one cluster, namely l 1,j, l,j, l 3,j n one cluster and l 1,j, l,j, l 3,j n the other (agan, when we do not consder the order of the clusters,

25 0 t becomes one case). They are exactly the two assgnments that make all the lterals true or false at the same tme. By symmetry, ths argument holds for any clause wth arbtrary negated or unnegated lterals. When all l,j s gather n the one cluster and all l,j s gather n the other for fxed, the frst n coordnates do not affect the radus of the clusterng at all. For each two coordnates correspondng to C j, there are only 6 flats wth fxed values, 3 for each cluster. And these flats do not have fxed values for any other coordnate correspondng to C j C j. Therefore, the (n + j 1)th and (n + j)th coordnates of the center are decded by these coordnates of the 3 flats (ether three ponts or two ponts) only, and the dstances of these flats to the center are also decded by only those two coordnates. It mples that the radus of a clusterng s 1 f and only f there s a clause for whch each cluster corresponds to the assgnment resultng n three lterals wth the same value. Otherwse, the radus of clusterng s 3. Therefore, there exsts a satsfyng assgnment to an nstance of the NOT ALL EQUAL 3- SAT problem f and only f the optmal radus s 3. If there s an algorthm that computes a 3 -approxmate -clusterng and runs n tme polynomal of n, d,, we can solve the NOT ALL EQUAL 3-SAT problem, mplyng P = NP. The three theorems above show that t s mpossble to fnd a good approxmaton algorthm that runs n tme polynomal n n, d,, even for fxed k. We also study the case where s fxed. Smlarly to what we have proved, t s mpossble to fnd any approxmaton algorthm that runs n tme polynomal n n, d, k, even for fxed. The reducton s from the VERTEX-COVER problem. Theorem 4.4. For fxed 1, there s no algorthm that computes a I-approxmate k-clusterng and runs n tme polynomal of n, d, k for any I unless P = NP. Proof. We reduce the famous VERTEX COVER problem to our problem wth = 1. Gven G = (V, E) wth V = { } { v 1,..., v V and E = (v1,1, v 1, ),..., (v E,1, v E, ) }, we construct the regular d = ( V 1)-smplex whose V vertces corresponds to each vertex n V. L conssts of n = E lnes where each lne corresponds to each edge; l s the lne passng v,1 and v,. If there s a vertex cover of sze k, usng the vertces n the vertex cover as centers ensures R (k, L) = 0, snce each lne contans at least one center. Conversely, f there s a k-clusterng wth R (k, L) = 0, we can move each center to some vertex whle mantanng R = 0 snce there s no ntersecton of any two lnes except the vertces of the smplex. Therefore, there s a vertex cover of sze k f and only f there s a k-clusterng wth R (k, L) = 0. If there s an algorthm that computes a I-approxmate k-clusterng and runs n tme polynomal of n, d, k for any I, we can solve the VERTEX COVER problem, mplyng P = NP.

26 1 Agan, the flats used n the reducton are not axs-parallel. By allowng a slghtly larger dmenson and more nvolved analyss, we can use the same dea to extend the proof to the axs-parallel case. Theorem 4.5. When all flats are restrcted to be axs-parallel, for fxed 3, there s no algorthm that computes a I-approxmate k-clusterng and runs n tme polynomal of n, d, k for any I unless P = NP. Proof. We reduce the 3-REGULAR VERTEX COVER problem to our problem. Frst, we prove that the problem remans NP-hard even though all graphs are restrcted to have no cycle of length 4. For any gven graph G = (V, E), construct the graph G wth V + E vertces and 3 E edges by splttng each edge of G nto three edges and addng two vertces of degree two n each juncton. If G has a vertex cover of sze k, G has a vertex cover of sze k + E snce f an edge of G s covered, at least one edge of G correspondng to the orgnal edge s covered, so we need only one vertex to cover the remanng two edges of G correspondng to the orgnal edge. For the other drecton, suppose G has a vertex cover of sze k+ E and ths vertex cover contans k vertces of G. Snce ths vertex cover has to cover each mddle edge (the mddle of three peces of the orgnal edge) that does not contan any vertex of G (there are E of them), k k. Also, f the two endponts of any mddle edge are n the vertex cover, movng one of them to the nearest vertex of G stll ensures that all edges of G are covered (t s okay even f the nearest vertex was already n the cover). Ths may ncrease k, but stll k k. Now we clam that these k vertces of G form a vertex cover of G. If there s an uncovered edge of G by these vertces, the only way to cover the three edges of G s to nclude two new vertces nto the vertex cover of G, but we already removed these cases. Therefore, G has a vertex cover of sze k f and only f G has a vertex cover of sze k + E, completng the frst reducton. Gven G = (V, E) wth V = { } { } v 1,..., v V and E = e1,..., e E, we construct the set of n = E flats L = {l 1,..., l n } n d = V -dmensonal coordnates. Thnk jth coordnate s assocated wth v j. For each 1 n and 1 j m, (l ) j s decded as the followng: v j s an endpont of e : (l ) j = 1. v j s a neghbor of an endpont of e : (l ) j =. Otherwse: (l ) j = 0. Note that n a graph of whch the maxmum degree s 3, one edge can have at most 4 vertces (excludng ts endponts) that are adjacent to ts vertces. However, usng the reducton above to splt each edge nto three, we are sure that no adjacent vertces are both of degree 3, so each edge can have at most 3 such vertces. For fxed j, f c R d s such that c j = 1 and c j = 0 for j j, c ntersects all flats correspondng to the edges ncdent on v j. Therefore, f there s a vertex cover of sze k, there s a k-clusterng

27 C wth R(C) = 0. If there s no vertex cover of sze k, for any partton of the flats(edges) nto k clusters, there must be a cluster whose edges are not covered by a sngle vertex. There are two cases. There exsts e = (v x, v y ), e = (v z, v w ) such that x, y, z, w are dfferent: Note that (l ) x = (l ) y = 1. However, t s mpossble that (l ) x =, (l ) y =, because ths mples both v x and v y are adjacent to at least one of v z and v w, whch results n a cycle of length 3 or 4. There must be at least one coordnate n whch l s 1 and l s 0, mplyng that R (k, L) > 0. Each par of edges shares an endpont: Take any edge (v x, v z ) from ths cluster. Snce ths cluster s not covered by ether v x or v z, there must be an edge e = (v x, v y ) and e = (v z, v w ) for some y z and w x. y = w means a cycle of length 3, so y w, mplyng that x, y, z, w are all dfferent agan. Therefore, G admts a vertex cover of sze k f and only f R (k, L) = 0. If there s an algorthm that computes a I-approxmate k-clusterng and runs n tme polynomal of n, d, k for any I, we can solve the 3-REGULAR VERTEX COVER problem, mplyng P = NP. Note that our reductons do not ncrease the sze of each nstance much. Therefore, f we assume the exponental tme hypothess formalzed n [0], whch mples that the 3-SAT problem cannot be solved n o(n), our reductons also mply that we cannot expect an algorthm whose runnng tme s subexponental n k or, wth the only excepton beng the axs-parallel case for fxed. Theorem 4.6. Assumng the exponental tme hypothess, for fxed k 3, there s no algorthm that computes I-approxmate k-clusterng and runs n tme o( ) poly(n, d), both for the general and the axs-parallel case for any I. For fxed 1, there s no such algorthm for the general case whch runs n tme o(k) poly(n, d). In the axs-parallel case, for fxed 3, there s no such algorthm whch runs n tme o( k) poly(n, d). Proof. Note that f there s an approxmaton algorthm wth any factor I, the problems that have been used n our reductons wll be solved because all of our reductons mapped the acceptng nstances to the clusterng nstances wth R = 0, and the rejectng nstances to those wth R > 0. Assumng the exponental tme hypothess, the 3-SAT problem, the GRAPH 3-COLORING problem, and the VERTEX COVER problem (wth k as a part of nput, not a parameter) cannot be solved n tme o(n+m), where n s the number of vertces (varables) and m s the number of edges (clauses) [1]. Our reducton from the GRAPH 3-COLORING n Theorem 4.1, where we used d = E dmensons, shows that, for fxed k, we cannot obtan an algorthm whose runnng tme s subexponental n. Ths result holds both n the general case and n the axs-parallel case snce the range of the reducton conssts of axs-parallel nstances.

28 3 For fxed, our reducton n Theorem 4.4 preserves k n the orgnal VERTEX-COVER problem, so we cannot have a subexponental (n k) algorthm n the general case. In the axs-parallel case, the reducton from the 3-SAT problem to the 3-REGULAR VERTEX COVER problem ncreases the sze (n + m) of the problem at most quadratcally [15], and the reducton to the 3-REGULAR VERTEX COVER wthout any cycle of length 4 n Theorem 4.5 ncreases the sze of the graph lnearly(n = n + m, m = 3m). Therefore, an algorthm for the axs-parallel case whch runs n tme o( k) poly(n, d) solves the 3-REGULAR VERTEX COVER problem n tme o( n+m) and the 3-SAT problem n tme o(n+m), contradctng the exponental tme hypothess.

29 4 Chapter 5 Clusterng Algorthms In the prevous secton, we showed that there cannot be any polynomal tme approxmaton algorthm even when one of k or s fxed. Therefore, the natural next step s to fnd an approxmaton algorthm whch s fast for small k and. Usng Algorthm (Movng Center) for computng the mnmum ntersectng ball, we can obtan a PTAS whose runnng tme s lnear n n and d for fxed k and. Note that algorthm starts wth an ntal center whose dstance to the closest optmal center s bounded by O( R ). However, n clusterng, t s mpossble, because R can be arbtrarly small compared to max j (M j m j ); gven two ponts, the radus of the mnmum enclosng ball s at least half of the dstance between these ponts, but the radus of the optmal clusterng becomes zero when we allow two clusters. Thus, we need to study how Algorthm behaves f t starts wth an arbtrary center. Fortunately, wth some modfcaton on the algorthm, ths does not affect the runnng tme greatly. Snce the dfference between the optmal center and the current center n the jth coordnate s bounded by R once the jth coordnate s consdered (the center has move to a flat wth a fxed value for the jth coordnate), and we have only unconsdered coordnates after the frst move, we need only + 1 addtonal steps to move to a reasonable poston. Fgure 5.1 shows a smple example n whch an arbtrary ntal center moves to the desred poston n = + 1 steps. Theorem 5.1. If all flats are axs-parallel and R s gven, Algorthm wth an arbtrary ntal center c computes an ntersectng ball wth radus R < R (1 + ɛ) n tme O(nd ( 1 ɛ + 1)). start end Fgure 5.1: A smple example n whch an arbtrary ntal center moves to the desred poston

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem