Supplement to "Clustering with Statistical Error Control"

Michael Vogt, University of Bonn
Matthias Schmid, University of Bonn

In this supplement, we provide the proofs that are omitted in the paper. In particular, we derive Theorems 4.1–4.3 from Section 4. Throughout the supplement, we use the symbol $C$ to denote a universal real constant which may take a different value on each occurrence.

Auxiliary results

In the proofs of Theorems 4.1–4.3, we frequently make use of the following uniform convergence result.

Lemma S.1. Let $Z_s = \{Z_{st} : 1 \le t \le T\}$ be sequences of real-valued random variables for $1 \le s \le S$ with the following properties: (i) for each $s$, the random variables in $Z_s$ are independent of each other, and (ii) $E[Z_{st}] = 0$ and $E[|Z_{st}|^{\phi}] \le C < \infty$ for some $\phi > 2$ and $C > 0$ that depend neither on $s$ nor on $t$. Suppose that $S = T^q$ with $0 \le q < \phi/2 - 1$. Then
\[ \mathbb{P}\Big( \max_{1 \le s \le S} \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st} \Big| > T^{\eta} \Big) = o(1), \]
where the constant $\eta > 0$ can be chosen as small as desired.

Proof of Lemma S.1. Define $\tau_{S,T} = (ST)^{1/\{(2+\delta)(q+1)\}}$ with some sufficiently small $\delta > 0$. In particular, let $\delta > 0$ be so small that $(2+\delta)(q+1) < \phi$. Moreover, set
\[ Z_{st}^{\le} = Z_{st} 1(|Z_{st}| \le \tau_{S,T}) - E\big[ Z_{st} 1(|Z_{st}| \le \tau_{S,T}) \big] \]
\[ Z_{st}^{>} = Z_{st} 1(|Z_{st}| > \tau_{S,T}) - E\big[ Z_{st} 1(|Z_{st}| > \tau_{S,T}) \big] \]
and write $Z_{st} = Z_{st}^{\le} + Z_{st}^{>}$.
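In what follows, we show that
\[ \mathbb{P}\Big( \max_{1 \le s \le S} \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{>} \Big| > C T^{\eta} \Big) = o(1) \tag{S.1} \]
\[ \mathbb{P}\Big( \max_{1 \le s \le S} \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{\le} \Big| > C T^{\eta} \Big) = o(1) \tag{S.2} \]
for any fixed constant $C > 0$. Combining (S.1) and (S.2) immediately yields the statement of Lemma S.1.

We start with the proof of (S.1): It holds that
\[ \mathbb{P}\Big( \max_{1 \le s \le S} \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{>} \Big| > C T^{\eta} \Big) \le Q_1^{>} + Q_2^{>}, \]
where
\[ Q_1^{>} := \sum_{s=1}^S \mathbb{P}\Big( \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st} 1(|Z_{st}| > \tau_{S,T}) \Big| > \frac{C}{2} T^{\eta} \Big) \le \sum_{s=1}^S \mathbb{P}\big( |Z_{st}| > \tau_{S,T} \text{ for some } 1 \le t \le T \big) \]
\[ \le \sum_{s=1}^S \sum_{t=1}^T \mathbb{P}\big( |Z_{st}| > \tau_{S,T} \big) \le \sum_{s=1}^S \sum_{t=1}^T E\Big[ \frac{|Z_{st}|^{\phi}}{\tau_{S,T}^{\phi}} \Big] \le \frac{C S T}{\tau_{S,T}^{\phi}} = o(1) \]
and
\[ Q_2^{>} := \sum_{s=1}^S \mathbb{P}\Big( \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T E\big[ Z_{st} 1(|Z_{st}| > \tau_{S,T}) \big] \Big| > \frac{C}{2} T^{\eta} \Big) = 0 \]
for $S$ and $T$ sufficiently large, since
\[ \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T E\big[ Z_{st} 1(|Z_{st}| > \tau_{S,T}) \big] \Big| \le \frac{1}{\sqrt{T}} \sum_{t=1}^T E\Big[ \frac{|Z_{st}|^{\phi}}{\tau_{S,T}^{\phi-1}} 1(|Z_{st}| > \tau_{S,T}) \Big] \le \frac{C \sqrt{T}}{\tau_{S,T}^{\phi-1}} = o(T^{\eta}). \]
This yields (S.1).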
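We next turn to the proof of (S.2): We apply the crude bound
\[ \mathbb{P}\Big( \max_{1 \le s \le S} \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{\le} \Big| > C T^{\eta} \Big) \le \sum_{s=1}^S \mathbb{P}\Big( \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{\le} \Big| > C T^{\eta} \Big) \]
and show that for any $1 \le s \le S$,
\[ \mathbb{P}\Big( \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{\le} \Big| > C T^{\eta} \Big) \le C_0 T^{-\rho}, \tag{S.3} \]
where $C_0$ is a fixed constant and $\rho > 0$ can be chosen as large as desired by picking $\eta$ slightly larger than $1/2 - 1/(2+\delta)$. Since $S = O(T^q)$, this immediately implies (S.2).

To prove (S.3), we make use of the following facts:

(i) For a random variable $Z$ and $\lambda > 0$, Markov's inequality says that
\[ \mathbb{P}(\pm Z > \delta) \le \frac{E \exp(\pm \lambda Z)}{\exp(\lambda \delta)}. \]

(ii) Since $|Z_{st}^{\le}/\sqrt{T}| \le 2\tau_{S,T}/\sqrt{T}$, it holds that $\lambda_{S,T} |Z_{st}^{\le}/\sqrt{T}| \le 1/2$, where we set $\lambda_{S,T} = \sqrt{T}/(4\tau_{S,T})$. As $\exp(x) \le 1 + x + x^2$ for $|x| \le 1/2$, this implies that
\[ E\Big[ \exp\Big( \pm \lambda_{S,T} \frac{Z_{st}^{\le}}{\sqrt{T}} \Big) \Big] \le 1 + \frac{\lambda_{S,T}^2}{T} E\big[ (Z_{st}^{\le})^2 \big] \le \exp\Big( \frac{\lambda_{S,T}^2}{T} E\big[ (Z_{st}^{\le})^2 \big] \Big). \]

(iii) By definition of $\lambda_{S,T}$, it holds that
\[ \lambda_{S,T} = \frac{\sqrt{T}}{4 (ST)^{\frac{1}{(2+\delta)(q+1)}}} = \frac{\sqrt{T}}{4 T^{\frac{q+1}{(2+\delta)(q+1)}}} = \frac{T^{\frac{1}{2} - \frac{1}{2+\delta}}}{4}. \]

Using (i)–(iii) and writing $E(Z_{st}^{\le})^2 \le C_Z < \infty$, we obtain that
\[ \mathbb{P}\Big( \Big| \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{\le} \Big| > C T^{\eta} \Big) \le \mathbb{P}\Big( \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{\le} > C T^{\eta} \Big) + \mathbb{P}\Big( -\frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{\le} > C T^{\eta} \Big) \]
\[ \le \exp\big( -\lambda_{S,T} C T^{\eta} \big) \Big\{ E\Big[ \exp\Big( \lambda_{S,T} \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{\le} \Big) \Big] + E\Big[ \exp\Big( -\lambda_{S,T} \frac{1}{\sqrt{T}} \sum_{t=1}^T Z_{st}^{\le} \Big) \Big] \Big\} \]
\[ = \exp\big( -\lambda_{S,T} C T^{\eta} \big) \Big\{ \prod_{t=1}^T E\Big[ \exp\Big( \lambda_{S,T} \frac{Z_{st}^{\le}}{\sqrt{T}} \Big) \Big] + \prod_{t=1}^T E\Big[ \exp\Big( -\lambda_{S,T} \frac{Z_{st}^{\le}}{\sqrt{T}} \Big) \Big] \Big\} \]
\[ \le 2 \exp\big( -\lambda_{S,T} C T^{\eta} \big) \prod_{t=1}^T \exp\Big( \frac{\lambda_{S,T}^2}{T} E\big[ (Z_{st}^{\le})^2 \big] \Big) \le 2 \exp\big( C_Z \lambda_{S,T}^2 - C \lambda_{S,T} T^{\eta} \big) \]
\[ = 2 \exp\Big( \frac{C_Z}{16} T^{1 - \frac{2}{2+\delta}} - \frac{C}{4} T^{\frac{1}{2} - \frac{1}{2+\delta}} T^{\eta} \Big) \le C_0 T^{-\rho}, \]
where $\rho > 0$ can be chosen arbitrarily large if we pick $\eta$ slightly larger than $1/2 - 1/(2+\delta)$.

Proof of Theorem 4.1

We first prove that
\[ \mathbb{P}\big( \hat{\mathcal{H}}^{[K_0]} \le q(\alpha) \big) = (1 - \alpha) + o(1). \tag{S.4} \]
To do so, we derive a stochastic expansion of the individual statistics $\hat{\Delta}_i^{[K_0]}$.

Lemma S.2. It holds that
\[ \hat{\Delta}_i^{[K_0]} = \Delta_i^{[K_0]} + R_i^{[K_0]}, \]
where
\[ \Delta_i^{[K_0]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \Big\{ \frac{\varepsilon_{ij}^2}{\sigma^2} - 1 \Big\} \Big/ \kappa \]
and the remainder $R_i^{[K_0]}$ has the property that
\[ \mathbb{P}\Big( \max_{1 \le i \le n} \big| R_i^{[K_0]} \big| > p^{-\xi} \Big) = o(1) \tag{S.5} \]
for some $\xi > 0$.

The proof of Lemma S.2 as well as those of the subsequent Lemmas S.3–S.5 are postponed until the proof of Theorem 4.1 is complete. With the help of Lemma S.2, we can bound the probability of interest
\[ P_{\alpha} := \mathbb{P}\big( \hat{\mathcal{H}}^{[K_0]} \le q(\alpha) \big) = \mathbb{P}\Big( \max_{1 \le i \le n} \hat{\Delta}_i^{[K_0]} \le q(\alpha) \Big) \]
as follows: Since
\[ \max_{1 \le i \le n} \Delta_i^{[K_0]} - \max_{1 \le i \le n} \big| R_i^{[K_0]} \big| \le \max_{1 \le i \le n} \hat{\Delta}_i^{[K_0]} \le \max_{1 \le i \le n} \Delta_i^{[K_0]} + \max_{1 \le i \le n} \big| R_i^{[K_0]} \big|, \]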
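it holds that
\[ P_{\alpha}^{<} \le P_{\alpha} \le P_{\alpha}^{>}, \]
where
\[ P_{\alpha}^{<} = \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le q(\alpha) - \max_{1 \le i \le n} \big| R_i^{[K_0]} \big| \Big) \]
\[ P_{\alpha}^{>} = \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le q(\alpha) + \max_{1 \le i \le n} \big| R_i^{[K_0]} \big| \Big). \]
As the remainder has the property (S.5), we further obtain that
\[ P_{\alpha}^{\ll} + o(1) \le P_{\alpha} \le P_{\alpha}^{\gg} + o(1), \tag{S.6} \]
where
\[ P_{\alpha}^{\ll} = \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le q(\alpha) - p^{-\xi} \Big) \]
\[ P_{\alpha}^{\gg} = \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le q(\alpha) + p^{-\xi} \Big). \]
With the help of strong approximation theory, we can derive the following result on the asymptotic behaviour of the probabilities $P_{\alpha}^{\ll}$ and $P_{\alpha}^{\gg}$.

Lemma S.3. It holds that
\[ P_{\alpha}^{\ll} = (1 - \alpha) + o(1) \quad \text{and} \quad P_{\alpha}^{\gg} = (1 - \alpha) + o(1). \]

Together with (S.6), this immediately yields that $P_{\alpha} = (1 - \alpha) + o(1)$, thus completing the proof of (S.4).

We next show that for any $K < K_0$,
\[ \mathbb{P}\big( \hat{\mathcal{H}}^{[K]} \le q(\alpha) \big) = o(1). \tag{S.7} \]
Consider a fixed $K < K_0$ and let $S \in \{ \hat{G}_k^{[K]} : 1 \le k \le K \}$ be any cluster with the following property:
\[ \#S \ge \underline{n} := \min_{1 \le k \le K_0} \#G_k, \text{ and } S \text{ contains elements from at least two different classes } G_{k_1} \text{ and } G_{k_2}. \tag{S.8} \]
It is not difficult to see that a cluster with the property (S.8) must always exist under our conditions. By $\mathcal{C} \subseteq \{ \hat{G}_k^{[K]} : 1 \le k \le K \}$, we denote the collection of clusters that have the property (S.8). With this notation at hand, we can derive the following stochastic expansion of the individual statistics $\hat{\Delta}_i^{[K]}$.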
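Lemma S.4. For any $i \in S$ and $S \in \mathcal{C}$, it holds that
\[ \hat{\Delta}_i^{[K]} = \frac{1}{\kappa \sigma^2 \sqrt{p}} \sum_{j=1}^p d_{ij}^2 + R_i^{[K]}, \]
where $d_{ij} = \mu_{ij} - (\#S)^{-1} \sum_{i' \in S} \mu_{i'j}$ and the remainder $R_i^{[K]}$ has the property that
\[ \mathbb{P}\Big( \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_i^{[K]} \big| > p^{\frac{1}{2} - \xi} \Big) = o(1) \tag{S.9} \]
for some small $\xi > 0$.

Using (S.9) and the fact that
\[ \hat{\mathcal{H}}^{[K]} = \max_{1 \le i \le n} \hat{\Delta}_i^{[K]} \ge \max_{S \in \mathcal{C}} \max_{i \in S} \hat{\Delta}_i^{[K]} \ge \max_{S \in \mathcal{C}} \max_{i \in S} \Big\{ \frac{1}{\kappa \sigma^2 \sqrt{p}} \sum_{j=1}^p d_{ij}^2 \Big\} - \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_i^{[K]} \big|, \]
we obtain that
\[ \mathbb{P}\big( \hat{\mathcal{H}}^{[K]} \le q(\alpha) \big) \le \mathbb{P}\Big( \max_{S \in \mathcal{C}} \max_{i \in S} \hat{\Delta}_i^{[K]} \le q(\alpha) \Big) \]
\[ \le \mathbb{P}\Big( \max_{S \in \mathcal{C}} \max_{i \in S} \Big\{ \frac{1}{\kappa \sigma^2 \sqrt{p}} \sum_{j=1}^p d_{ij}^2 \Big\} - \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_i^{[K]} \big| \le q(\alpha) \Big) \]
\[ \le \mathbb{P}\Big( \max_{S \in \mathcal{C}} \max_{i \in S} \Big\{ \frac{1}{\kappa \sigma^2 \sqrt{p}} \sum_{j=1}^p d_{ij}^2 \Big\} \le q(\alpha) + p^{\frac{1}{2} - \xi} \Big) + o(1). \tag{S.10} \]
The arguments from the proof of Lemma S.3, in particular (S.22), imply that $q(\alpha) \le C \sqrt{\log n}$ for some fixed constant $C > 0$ and sufficiently large $n$. Moreover, we can prove the following result.

Lemma S.5. It holds that
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \Big\{ \frac{1}{\sqrt{p}} \sum_{j=1}^p d_{ij}^2 \Big\} \ge c \sqrt{p} \]
for some fixed constant $c > 0$.

Since $q(\alpha) \le C \sqrt{\log n}$ and $\sqrt{\log n / p} = o(1)$ by (C3), Lemma S.5 allows us to infer that
\[ \mathbb{P}\Big( \max_{S \in \mathcal{C}} \max_{i \in S} \Big\{ \frac{1}{\kappa \sigma^2 \sqrt{p}} \sum_{j=1}^p d_{ij}^2 \Big\} \le q(\alpha) + p^{\frac{1}{2} - \xi} \Big) = o(1). \]
Together with (S.10), this yields that $\mathbb{P}( \hat{\mathcal{H}}^{[K]} \le q(\alpha) ) = o(1)$.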
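The bound $q(\alpha) \le C\sqrt{\log n}$ used above reflects the classical extreme-value scaling for the maximum of $n$ standard normal variables, which is stated as (R1) in the proof of Lemma S.3. The following is a minimal numerical sketch of that scaling, not part of the original argument. It assumes exactly standard normal variables (the statistics in the paper are only approximately normal), and the function names `gumbel_norming`, `prob_max_below` and `gaussian_max_quantile` are hypothetical helpers introduced for illustration.

```python
import math


def gumbel_norming(n):
    """Norming constants (a_n, b_n) from (R1) for the max of n standard normals."""
    root = math.sqrt(2.0 * math.log(n))
    a_n = 1.0 / root
    b_n = root - (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * root)
    return a_n, b_n


def prob_max_below(n, x):
    """Exact P(max of n i.i.d. N(0,1) <= x) = Phi(x)^n, evaluated via the upper tail."""
    upper_tail = 0.5 * math.erfc(x / math.sqrt(2.0))  # 1 - Phi(x)
    return math.exp(n * math.log1p(-upper_tail))


def gaussian_max_quantile(n, alpha):
    """Asymptotic (1 - alpha)-quantile a_n * w + b_n with w = -log(-log(1 - alpha))."""
    a_n, b_n = gumbel_norming(n)
    w = -math.log(-math.log(1.0 - alpha))
    return a_n * w + b_n


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        _, b_n = gumbel_norming(n)
        exact = prob_max_below(n, b_n)          # case w = 0
        limit = math.exp(-math.exp(0.0))        # Gumbel limit exp(-exp(-w)) at w = 0
        print(n, exact, limit, gaussian_max_quantile(n, 0.05), math.sqrt(math.log(n)))
```

Running the loop shows that the exact probability approaches the Gumbel limit only slowly (the error decays at a logarithmic rate in $n$), while the approximate quantile stays within a constant multiple of $\sqrt{\log n}$.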
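Proof of Lemma S.2. Let $n_k = \#G_k$ and write $\bar{\varepsilon}_i = p^{-1} \sum_{j=1}^p \varepsilon_{ij}$ along with $\bar{\mu}_i = p^{-1} \sum_{j=1}^p \mu_{ij}$. Since
\[ \mathbb{P}\Big( \big\{ \hat{G}_k^{[K_0]} : 1 \le k \le K_0 \big\} = \big\{ G_k : 1 \le k \le K_0 \big\} \Big) \to 1 \]
by (3.1), we can ignore the estimation error in the clusters $\hat{G}_k^{[K_0]}$ and replace them by the true classes $G_k$. For $i \in G_k$, we thus get
\[ \hat{\Delta}_i^{[K_0]} = \Delta_i^{[K_0]} + R_{i,A}^{[K_0]} + R_{i,B}^{[K_0]} - R_{i,C}^{[K_0]} + R_{i,D}^{[K_0]}, \]
where
\[ R_{i,A}^{[K_0]} = \Big( \frac{\kappa}{\hat{\kappa}} - 1 \Big) \frac{1}{\sqrt{p}} \sum_{j=1}^p \Big\{ \frac{\varepsilon_{ij}^2}{\sigma^2} - 1 \Big\} \Big/ \kappa \]
\[ R_{i,B}^{[K_0]} = \frac{1}{\hat{\kappa}} \Big( \frac{1}{\hat{\sigma}^2} - \frac{1}{\sigma^2} \Big) \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij}^2 \]
\[ R_{i,C}^{[K_0]} = \frac{2}{\hat{\kappa} \hat{\sigma}^2} \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \Big\{ \bar{\varepsilon}_i + \frac{1}{n_k} \sum_{i' \in G_k} \big( \varepsilon_{i'j} - \bar{\varepsilon}_{i'} \big) \Big\} \]
\[ R_{i,D}^{[K_0]} = \frac{1}{\hat{\kappa} \hat{\sigma}^2} \frac{1}{\sqrt{p}} \sum_{j=1}^p \Big\{ \bar{\varepsilon}_i + \frac{1}{n_k} \sum_{i' \in G_k} \big( \varepsilon_{i'j} - \bar{\varepsilon}_{i'} \big) \Big\}^2. \]
We now show that $\max_{i \in G_k} |R_{i,\ell}^{[K_0]}| = o_p(p^{-\xi})$ for any $k$ and $\ell = A, \ldots, D$. This implies that $\max_{1 \le i \le n} |R_{i,\ell}^{[K_0]}| = \max_{1 \le k \le K_0} \max_{i \in G_k} |R_{i,\ell}^{[K_0]}| = o_p(p^{-\xi})$ for $\ell = A, \ldots, D$, which in turn yields the statement of Lemma S.2. Throughout the proof, we use the symbol $\eta > 0$ to denote a sufficiently small constant which results from applying Lemma S.1.

By assumption, $\hat{\sigma}^2 = \sigma^2 + O_p(p^{-(1/2+\delta)})$ and $\hat{\kappa} = \kappa + O_p(p^{-\delta})$ for some $\delta > 0$. Applying Lemma S.1 and choosing $\xi > 0$ such that $\xi < \delta - \eta$, we obtain that
\[ \max_{i \in G_k} \big| R_{i,A}^{[K_0]} \big| \le \Big| \frac{\kappa}{\hat{\kappa}} - 1 \Big| \max_{i \in G_k} \Big| \frac{1}{\sqrt{p}} \sum_{j=1}^p \Big\{ \frac{\varepsilon_{ij}^2}{\sigma^2} - 1 \Big\} \Big/ \kappa \Big| = \Big| \frac{\kappa}{\hat{\kappa}} - 1 \Big| O_p(p^{\eta}) = O_p(p^{-(\delta - \eta)}) = o_p(p^{-\xi}) \]
and
\[ \max_{i \in G_k} \big| R_{i,B}^{[K_0]} \big| \le \Big| \frac{1}{\hat{\kappa}} \Big( \frac{1}{\hat{\sigma}^2} - \frac{1}{\sigma^2} \Big) \Big| \Big\{ \max_{i \in G_k} \Big| \frac{1}{\sqrt{p}} \sum_{j=1}^p \big( \varepsilon_{ij}^2 - \sigma^2 \big) \Big| + \sigma^2 \sqrt{p} \Big\} \le \Big| \frac{1}{\hat{\kappa}} \Big( \frac{1}{\hat{\sigma}^2} - \frac{1}{\sigma^2} \Big) \Big| \big\{ O_p(p^{\eta}) + \sigma^2 \sqrt{p} \big\} = o_p(p^{-\xi}). \]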
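We next show that
\[ \max_{i \in G_k} \big| R_{i,C}^{[K_0]} \big| = o_p\big( p^{-1/4} \big). \tag{S.11} \]
To do so, we work with the decomposition $R_{i,C}^{[K_0]} = \{ 2/(\hat{\kappa} \hat{\sigma}^2) \} \{ R_{i,C,1}^{[K_0]} + R_{i,C,2}^{[K_0]} - R_{i,C,3}^{[K_0]} \}$, where
\[ R_{i,C,1}^{[K_0]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \bar{\varepsilon}_i \]
\[ R_{i,C,2}^{[K_0]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \frac{1}{n_k} \sum_{i' \in G_k} \varepsilon_{i'j} \]
\[ R_{i,C,3}^{[K_0]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \frac{1}{n_k} \sum_{i' \in G_k} \bar{\varepsilon}_{i'}. \]
With the help of Lemma S.1, we obtain that
\[ \max_{i \in G_k} \big| R_{i,C,1}^{[K_0]} \big| \le \frac{1}{\sqrt{p}} \Big( \max_{i \in G_k} \Big| \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \Big| \Big)^2 = O_p\Big( \frac{p^{2\eta}}{\sqrt{p}} \Big). \tag{S.12} \]
Moreover,
\[ \max_{i \in G_k} \big| R_{i,C,2}^{[K_0]} \big| = O_p\big( n_k^{-1/4} \big), \tag{S.13} \]
since $p \ll n_k$ and
\[ R_{i,C,2}^{[K_0]} = \frac{1}{n_k} \frac{1}{\sqrt{p}} \sum_{j=1}^p \big\{ \varepsilon_{ij}^2 - \sigma^2 \big\} + \frac{\sigma^2 \sqrt{p}}{n_k} + \frac{1}{n_k} \sum_{i' \in G_k, \, i' \ne i} \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \varepsilon_{i'j} \]
with
\[ \max_{i \in G_k} \Big| \frac{1}{n_k} \frac{1}{\sqrt{p}} \sum_{j=1}^p \big\{ \varepsilon_{ij}^2 - \sigma^2 \big\} \Big| = O_p\Big( \frac{p^{\eta}}{n_k} \Big), \tag{S.14} \]
\[ \max_{i \in G_k} \Big| \frac{1}{n_k} \sum_{i' \in G_k, \, i' \ne i} \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \varepsilon_{i'j} \Big| = O_p\big( n_k^{-1/4} \big). \tag{S.15} \]
(S.14) is an immediate consequence of Lemma S.1. (S.15) follows upon observing that for any constant $C_0 > 0$,
\[ \mathbb{P}\Big( \max_{i \in G_k} \Big| \frac{1}{n_k} \sum_{i' \in G_k, \, i' \ne i} \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \varepsilon_{i'j} \Big| > \frac{C_0}{n_k^{1/4}} \Big) \le \sum_{i \in G_k} \mathbb{P}\Big( \Big| \frac{1}{n_k} \sum_{i' \in G_k, \, i' \ne i} \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \varepsilon_{i'j} \Big| > \frac{C_0}{n_k^{1/4}} \Big) \]
\[ \le \sum_{i \in G_k} E\Big\{ \frac{1}{n_k} \sum_{i' \in G_k, \, i' \ne i} \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \varepsilon_{i'j} \Big\}^4 \Big/ \Big\{ \frac{C_0}{n_k^{1/4}} \Big\}^4 \]
\[ = \sum_{i \in G_k} \Big\{ \frac{1}{n_k^4 p^2} \sum_{i_1', \ldots, i_4' \in G_k \setminus \{i\}} \; \sum_{j_1, \ldots, j_4 = 1}^p E\big[ \varepsilon_{ij_1} \cdots \varepsilon_{ij_4} \, \varepsilon_{i_1' j_1} \cdots \varepsilon_{i_4' j_4} \big] \Big\} \Big/ \Big\{ \frac{C_0}{n_k^{1/4}} \Big\}^4 \le \frac{C}{C_0^4}, \]
the last inequality resulting from the fact that the mean $E[\varepsilon_{ij_1} \cdots \varepsilon_{ij_4} \, \varepsilon_{i_1' j_1} \cdots \varepsilon_{i_4' j_4}]$ can only be non-zero if some of the index pairs $(i_\ell', j_\ell)$ for $\ell = 1, \ldots, 4$ are identical. Finally, with the help of Lemma S.1, we get that
\[ \max_{i \in G_k} \big| R_{i,C,3}^{[K_0]} \big| \le \Big| \frac{1}{n_k} \sum_{i' \in G_k} \bar{\varepsilon}_{i'} \Big| \max_{i \in G_k} \Big| \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \Big| = O_p\Big( \frac{p^{\eta}}{\sqrt{n_k p}} \Big). \tag{S.16} \]
Combining (S.12), (S.13) and (S.16), we arrive at the statement (S.11) on the remainder $R_{i,C}^{[K_0]}$.

We finally show that
\[ \max_{i \in G_k} \big| R_{i,D}^{[K_0]} \big| = O_p\Big( \frac{p^{2\eta}}{\sqrt{p}} \Big). \tag{S.17} \]
For the proof, we write $R_{i,D}^{[K_0]} = \{ 1/(\hat{\kappa} \hat{\sigma}^2) \} \{ R_{i,D,1}^{[K_0]} + R_{i,D,2}^{[K_0]} \}$, where
\[ R_{i,D,1}^{[K_0]} = \frac{1}{\sqrt{p}} \Big( \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \Big)^2 \]
\[ R_{i,D,2}^{[K_0]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \Big\{ \frac{1}{n_k} \sum_{i' \in G_k} \big( \varepsilon_{i'j} - \bar{\varepsilon}_{i'} \big) \Big\}^2. \]
With the help of Lemma S.1, we obtain that
\[ \max_{i \in G_k} \big| R_{i,D,1}^{[K_0]} \big| = O_p\Big( \frac{p^{2\eta}}{\sqrt{p}} \Big). \tag{S.18} \]
Moreover, straightforward calculations yield that
\[ \max_{i \in G_k} \big| R_{i,D,2}^{[K_0]} \big| = O_p\Big( \frac{\sqrt{p}}{n_k} \Big). \tag{S.19} \]
(S.17) now follows upon combining (S.18) and (S.19).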
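The proof above invokes Lemma S.1 repeatedly to bound maxima of normalized sums by a small power of the sample size. The following is a minimal Monte Carlo sketch of that statement, offered as an illustration rather than as part of the argument. The choices made here are assumptions for the example only: centred exponential draws (which have all moments finite), $q = 0.5$, $\eta = 0.25$, and the hypothetical function names `max_normalized_sum` and `simulate`.

```python
import math
import random


def max_normalized_sum(S, T, rng):
    """Return max over s = 1..S of |T^{-1/2} * sum_t Z_st| for centred Exp(1) draws."""
    worst = 0.0
    for _ in range(S):
        # Z_st = Exp(1) - 1 has mean zero and finite moments of every order
        total = sum(rng.expovariate(1.0) - 1.0 for _ in range(T))
        worst = max(worst, abs(total) / math.sqrt(T))
    return worst


def simulate(T, q=0.5, eta=0.25, seed=0):
    """Compare the maximal normalized sum over S = T^q sequences with the bound T^eta."""
    rng = random.Random(seed)
    S = max(1, int(round(T ** q)))
    return max_normalized_sum(S, T, rng), T ** eta


if __name__ == "__main__":
    for T in (500, 2000, 8000):
        stat, bound = simulate(T)
        print(T, stat, bound, stat <= bound)
```

In runs of this kind the maximal normalized sum grows only like a multiple of $\sqrt{\log S}$, so it falls below $T^{\eta}$ once $T$ is moderately large, in line with the lemma. A single seeded run is of course only an illustration of an asymptotic statement.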
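Proof of Lemma S.3. We make use of the following three results:

(R1) Let $\{W_i : 1 \le i \le n\}$ be independent random variables with a standard normal distribution and define $a_n = 1/\sqrt{2 \log n}$ together with
\[ b_n = \sqrt{2 \log n} - \frac{\log \log n + \log(4\pi)}{2 \sqrt{2 \log n}}. \]
Then for any $w \in \mathbb{R}$,
\[ \lim_{n \to \infty} \mathbb{P}\Big( \max_{1 \le i \le n} W_i \le a_n w + b_n \Big) = \exp(-\exp(-w)). \]
In particular, for $w(\alpha \pm \varepsilon) = -\log(-\log(1 - \alpha \pm \varepsilon))$, we get
\[ \lim_{n \to \infty} \mathbb{P}\Big( \max_{1 \le i \le n} W_i \le a_n w(\alpha \pm \varepsilon) + b_n \Big) = 1 - \alpha \pm \varepsilon. \]

The next result is known as Khintchine's Theorem.

(R2) Let $F_n$ be distribution functions and $G$ a non-degenerate distribution function. Moreover, let $\alpha_n > 0$ and $\beta_n \in \mathbb{R}$ be such that
\[ F_n(\alpha_n x + \beta_n) \to G(x) \]
for any continuity point $x$ of $G$. Then there are constants $\alpha_n' > 0$ and $\beta_n' \in \mathbb{R}$ as well as a non-degenerate distribution function $G_*$ such that
\[ F_n(\alpha_n' x + \beta_n') \to G_*(x) \]
at any continuity point $x$ of $G_*$ if and only if
\[ \alpha_n^{-1} \alpha_n' \to \alpha_*, \quad \frac{\beta_n' - \beta_n}{\alpha_n} \to \beta_* \quad \text{and} \quad G_*(x) = G(\alpha_* x + \beta_*). \]

The final result exploits strong approximation theory and is a direct consequence of the so-called KMT Theorems; see Komlós et al. (1975, 1976):

(R3) Write $\Delta_i^{[K_0]} = p^{-1/2} \sum_{j=1}^p X_{ij}$ with $X_{ij} = \{ \varepsilon_{ij}^2/\sigma^2 - 1 \}/\kappa$ and let $F$ denote the distribution function of $X_{ij}$. It is possible to construct i.i.d. random variables $\{ \tilde{X}_{ij} : 1 \le i \le n, 1 \le j \le p \}$ with the distribution function $F$ and independent standard normal random variables $\{ Z_{ij} : 1 \le i \le n, 1 \le j \le p \}$ such that $\tilde{\Delta}_i^{[K_0]} = p^{-1/2} \sum_{j=1}^p \tilde{X}_{ij}$ and $\Delta_i^{*} = p^{-1/2} \sum_{j=1}^p Z_{ij}$ have the following property:
\[ \mathbb{P}\Big( \big| \tilde{\Delta}_i^{[K_0]} - \Delta_i^{*} \big| > C p^{\frac{1}{2+\delta} - \frac{1}{2}} \Big) \le p^{1 - \frac{\theta/2}{2+\delta}} \]
for some arbitrarily small but fixed $\delta > 0$ and some constant $C > 0$ that does not depend on $i$, $p$ and $n$.

We now proceed as follows: We show that for any $w \in \mathbb{R}$,
\[ \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le a_n w + b_n \Big) \to \exp(-\exp(-w)). \tag{S.20} \]
This in particular implies that
\[ \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le w_n(\alpha \pm \varepsilon) \Big) \to 1 - \alpha \pm \varepsilon, \tag{S.21} \]
where $w_n(\alpha \pm \varepsilon) = a_n w(\alpha \pm \varepsilon) + b_n$ with $a_n$, $b_n$ and $w(\alpha \pm \varepsilon)$ as defined in (R1). The proof of (S.20) is postponed until the arguments for Lemma S.3 are complete.

The statement (S.21) in particular holds in the special case that $\varepsilon_{ij} \sim N(0, \sigma^2)$. In this case, $q(\alpha)$ is the $(1-\alpha)$-quantile of $\max_{1 \le i \le n} \Delta_i^{[K_0]}$. Hence, we have
\[ \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le w_n(\alpha - \varepsilon) \Big) \to 1 - \alpha - \varepsilon \]
\[ \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le q(\alpha) \Big) = 1 - \alpha \]
\[ \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le w_n(\alpha + \varepsilon) \Big) \to 1 - \alpha + \varepsilon, \]
which implies that
\[ w_n(\alpha - \varepsilon) \le q(\alpha) \le w_n(\alpha + \varepsilon) \tag{S.22} \]
for sufficiently large $n$.

Since $p^{-\xi}/a_n = p^{-\xi} \sqrt{2 \log n} = o(1)$ by (C3), we can use (S.20) together with (R2) to obtain that
\[ \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le w_n(\alpha \pm \varepsilon) \pm p^{-\xi} \Big) \to 1 - \alpha \pm \varepsilon. \tag{S.23} \]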
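As $w_n(\alpha - \varepsilon) - p^{-\xi} \le q(\alpha) - p^{-\xi} \le q(\alpha) + p^{-\xi} \le w_n(\alpha + \varepsilon) + p^{-\xi}$ for sufficiently large $n$, it holds that
\[ P_{\alpha,\varepsilon}^{\ll} := \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le w_n(\alpha - \varepsilon) - p^{-\xi} \Big) \le P_{\alpha}^{\ll} = \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le q(\alpha) - p^{-\xi} \Big) \]
\[ \le P_{\alpha}^{\gg} = \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le q(\alpha) + p^{-\xi} \Big) \le P_{\alpha,\varepsilon}^{\gg} := \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le w_n(\alpha + \varepsilon) + p^{-\xi} \Big) \]
for large $n$. Moreover, since $P_{\alpha,\varepsilon}^{\ll} \to 1 - \alpha - \varepsilon$ and $P_{\alpha,\varepsilon}^{\gg} \to 1 - \alpha + \varepsilon$ for any fixed $\varepsilon > 0$ by (S.23), we can conclude that $P_{\alpha}^{\ll} = (1 - \alpha) + o(1)$ and $P_{\alpha}^{\gg} = (1 - \alpha) + o(1)$, which is the statement of Lemma S.3.

It remains to prove (S.20): Using the notation from (R3) and the shorthand $w_n = a_n w + b_n$, we can write
\[ \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le w_n \Big) = \mathbb{P}\Big( \max_{1 \le i \le n} \tilde{\Delta}_i^{[K_0]} \le w_n \Big) = \prod_{i=1}^n \mathbb{P}\big( \tilde{\Delta}_i^{[K_0]} \le w_n \big) = \prod_{i=1}^n \pi_i \tag{S.24} \]
with $\pi_i = \mathbb{P}( \tilde{\Delta}_i^{[K_0]} \le w_n )$. The probabilities $\pi_i$ can be decomposed into two parts as follows:
\[ \pi_i = \mathbb{P}\Big( \Delta_i^{*} \le w_n + \big\{ \Delta_i^{*} - \tilde{\Delta}_i^{[K_0]} \big\} \Big) = \pi_i^{\le} + \pi_i^{>}, \]
where
\[ \pi_i^{\le} = \mathbb{P}\Big( \Delta_i^{*} \le w_n + \big\{ \Delta_i^{*} - \tilde{\Delta}_i^{[K_0]} \big\}, \; \big| \Delta_i^{*} - \tilde{\Delta}_i^{[K_0]} \big| \le C p^{\frac{1}{2+\delta} - \frac{1}{2}} \Big) \]
\[ \pi_i^{>} = \mathbb{P}\Big( \Delta_i^{*} \le w_n + \big\{ \Delta_i^{*} - \tilde{\Delta}_i^{[K_0]} \big\}, \; \big| \Delta_i^{*} - \tilde{\Delta}_i^{[K_0]} \big| > C p^{\frac{1}{2+\delta} - \frac{1}{2}} \Big). \]
With the help of (R3) and the assumption that $n \ll p^{(\theta/4) - 1}$, we can show that
\[ \prod_{i=1}^n \pi_i = \prod_{i=1}^n \pi_i^{\le} + R_n, \tag{S.25} \]
where $R_n$ is a non-negative remainder term with
\[ R_n \le \sum_{i=1}^n \pi_i^{>} \le n \, p^{1 - \frac{\theta/2}{2+\delta}} = o(1). \]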
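Moreover, the probabilities $\pi_i^{\le}$ can be bounded by
\[ \pi_i^{\le} \le \mathbb{P}\Big( \Delta_i^{*} \le w_n + C p^{\frac{1}{2+\delta} - \frac{1}{2}} \Big) \]
\[ \pi_i^{\le} \ge \mathbb{P}\Big( \Delta_i^{*} \le w_n - C p^{\frac{1}{2+\delta} - \frac{1}{2}} \Big) - p^{1 - \frac{\theta/2}{2+\delta}}, \]
the second line making use of (R3). From this, we obtain that
\[ \underline{\Pi}_n + o(1) \le \prod_{i=1}^n \pi_i^{\le} \le \overline{\Pi}_n, \tag{S.26} \]
where
\[ \overline{\Pi}_n = \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{*} \le w_n + C p^{\frac{1}{2+\delta} - \frac{1}{2}} \Big) \]
\[ \underline{\Pi}_n = \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{*} \le w_n - C p^{\frac{1}{2+\delta} - \frac{1}{2}} \Big). \]
By combining (S.24)–(S.26), we arrive at the intermediate result that
\[ \underline{\Pi}_n + o(1) \le \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le w_n \Big) \le \overline{\Pi}_n + o(1). \tag{S.27} \]
Since $p^{\frac{1}{2+\delta} - \frac{1}{2}}/a_n = p^{\frac{1}{2+\delta} - \frac{1}{2}} \sqrt{2 \log n} = o(1)$, we can use (R1) together with (R2) to show that
\[ \underline{\Pi}_n \to \exp(-\exp(-w)) \quad \text{and} \quad \overline{\Pi}_n \to \exp(-\exp(-w)). \tag{S.28} \]
Plugging (S.28) into (S.27) immediately yields that
\[ \mathbb{P}\Big( \max_{1 \le i \le n} \Delta_i^{[K_0]} \le w_n \Big) \to \exp(-\exp(-w)), \]
which completes the proof.

Proof of Lemma S.4. We use the notation $n_S = \#S$ along with $\bar{\varepsilon}_i = p^{-1} \sum_{j=1}^p \varepsilon_{ij}$, $\bar{\mu}_i = p^{-1} \sum_{j=1}^p \mu_{ij}$ and $\bar{d}_i = p^{-1} \sum_{j=1}^p d_{ij}$. For any $i \in S$ and $S \in \mathcal{C}$, we can write
\[ \hat{\Delta}_i^{[K]} = \frac{1}{\kappa \sigma^2 \sqrt{p}} \sum_{j=1}^p d_{ij}^2 + R_{i,A}^{[K]} + R_{i,B}^{[K]} + R_{i,C}^{[K]} + R_{i,D}^{[K]} - R_{i,E}^{[K]} + R_{i,F}^{[K]} + R_{i,G}^{[K]}, \]
where
\[ R_{i,A}^{[K]} = \Big( \frac{1}{\hat{\kappa} \hat{\sigma}^2} - \frac{1}{\kappa \sigma^2} \Big) \frac{1}{\sqrt{p}} \sum_{j=1}^p d_{ij}^2 \]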
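\[ R_{i,B}^{[K]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \Big\{ \frac{\varepsilon_{ij}^2}{\sigma^2} - 1 \Big\} \Big/ \kappa \]
\[ R_{i,C}^{[K]} = \Big( \frac{\kappa}{\hat{\kappa}} - 1 \Big) \frac{1}{\sqrt{p}} \sum_{j=1}^p \Big\{ \frac{\varepsilon_{ij}^2}{\sigma^2} - 1 \Big\} \Big/ \kappa \]
\[ R_{i,D}^{[K]} = \frac{1}{\hat{\kappa}} \Big( \frac{1}{\hat{\sigma}^2} - \frac{1}{\sigma^2} \Big) \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij}^2 \]
\[ R_{i,E}^{[K]} = \frac{2}{\hat{\kappa} \hat{\sigma}^2} \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \Big\{ \bar{\varepsilon}_i + \frac{1}{n_S} \sum_{i' \in S} \big( \varepsilon_{i'j} - \bar{\varepsilon}_{i'} \big) \Big\} \]
\[ R_{i,F}^{[K]} = \frac{1}{\hat{\kappa} \hat{\sigma}^2} \frac{1}{\sqrt{p}} \sum_{j=1}^p \Big\{ \bar{\varepsilon}_i + \frac{1}{n_S} \sum_{i' \in S} \big( \varepsilon_{i'j} - \bar{\varepsilon}_{i'} \big) \Big\}^2 \]
\[ R_{i,G}^{[K]} = \frac{2}{\hat{\kappa} \hat{\sigma}^2} \frac{1}{\sqrt{p}} \sum_{j=1}^p \Big\{ \varepsilon_{ij} - \bar{\varepsilon}_i - \frac{1}{n_S} \sum_{i' \in S} \big( \varepsilon_{i'j} - \bar{\varepsilon}_{i'} \big) \Big\} d_{ij}. \]
We now show that $\max_{S \in \mathcal{C}} \max_{i \in S} |R_{i,\ell}^{[K]}| = o_p(p^{1/2 - \xi})$ for $\ell = A, \ldots, G$. This immediately yields the statement of Lemma S.4. Throughout the proof, $\eta > 0$ denotes a sufficiently small constant that results from applying Lemma S.1.

With the help of Lemma S.1 and our assumptions on $\hat{\sigma}^2$ and $\hat{\kappa}$, it is straightforward to see that
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_{i,\ell}^{[K]} \big| \le \max_{1 \le i \le n} \big| R_{i,\ell}^{[K]} \big| = o_p\big( p^{\frac{1}{2} - \xi} \big) \]
for $\ell = A, B, C, D$ with some sufficiently small $\xi > 0$. We next show that
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_{i,E}^{[K]} \big| = O_p(p^{\eta}). \tag{S.29} \]
To do so, we write $R_{i,E}^{[K]} = \{ 2/(\hat{\kappa} \hat{\sigma}^2) \} \{ R_{i,E,1}^{[K]} + R_{i,E,2}^{[K]} - R_{i,E,3}^{[K]} \}$, where
\[ R_{i,E,1}^{[K]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \bar{\varepsilon}_i \]
\[ R_{i,E,2}^{[K]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \frac{1}{n_S} \sum_{i' \in S} \varepsilon_{i'j} \]
\[ R_{i,E,3}^{[K]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \frac{1}{n_S} \sum_{i' \in S} \bar{\varepsilon}_{i'}. \]
Lemma S.1 yields that $\max_{S \in \mathcal{C}} \max_{i \in S} |R_{i,E,1}^{[K]}| \le \max_{1 \le i \le n} |R_{i,E,1}^{[K]}| = O_p(p^{2\eta}/\sqrt{p})$. Moreover, it holds that
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_{i,E,2}^{[K]} \big| = O_p(p^{\eta}), \]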
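since
\[ R_{i,E,2}^{[K]} = \frac{1}{n_S} \frac{1}{\sqrt{p}} \sum_{j=1}^p \big\{ \varepsilon_{ij}^2 - \sigma^2 \big\} + \frac{\sigma^2 \sqrt{p}}{n_S} + \frac{1}{n_S} \sum_{i' \in S, \, i' \ne i} \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \varepsilon_{i'j} \]
and
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \Big| \frac{1}{n_S} \frac{1}{\sqrt{p}} \sum_{j=1}^p \big\{ \varepsilon_{ij}^2 - \sigma^2 \big\} \Big| \le \max_{1 \le i \le n} \Big| \frac{1}{\sqrt{p}} \sum_{j=1}^p \big\{ \varepsilon_{ij}^2 - \sigma^2 \big\} \Big| = O_p(p^{\eta}) \]
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \Big| \frac{1}{n_S} \sum_{i' \in S, \, i' \ne i} \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \varepsilon_{i'j} \Big| \le \max_{1 \le i < i' \le n} \Big| \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \varepsilon_{i'j} \Big| = O_p(p^{\eta}), \]
which follows upon applying Lemma S.1. Finally,
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_{i,E,3}^{[K]} \big| \le \frac{1}{\sqrt{p}} \Big\{ \max_{1 \le i \le n} \Big| \frac{1}{\sqrt{p}} \sum_{j=1}^p \varepsilon_{ij} \Big| \Big\}^2 = O_p\Big( \frac{p^{2\eta}}{\sqrt{p}} \Big), \]
which can again be seen by applying Lemma S.1. Putting everything together, we arrive at (S.29). Similar arguments show that
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_{i,F}^{[K]} \big| = O_p(p^{\eta}) \tag{S.30} \]
as well.

To analyze the term $R_{i,G}^{[K]}$, we denote the signal vector of the group $G_k$ by $m_k = (m_{1,k}, \ldots, m_{p,k})^{\top}$ and write
\[ \frac{1}{n_S} \sum_{i \in S} \mu_{ij} = \sum_{k=1}^{K_0} \lambda_{S,k} m_{j,k} \]
with $\lambda_{S,k} = \#(S \cap G_k)/n_S$. With this notation, we get
\[ R_{i,G}^{[K]} = \Big\{ \frac{2}{\hat{\kappa} \hat{\sigma}^2} \Big\} \Big\{ R_{i,G,1}^{[K]} - R_{i,G,2}^{[K]} - R_{i,G,3}^{[K]} - R_{i,G,4}^{[K]} + R_{i,G,5}^{[K]} \Big\}, \]
where
\[ R_{i,G,1}^{[K]} = \frac{1}{\sqrt{p}} \sum_{j=1}^p \mu_{ij} \varepsilon_{ij} \]
\[ R_{i,G,2}^{[K]} = \sum_{k=1}^{K_0} \lambda_{S,k} \frac{1}{\sqrt{p}} \sum_{j=1}^p m_{j,k} \varepsilon_{ij} \]
\[ R_{i,G,3}^{[K]} = \frac{1}{\sqrt{p}} \, \bar{\varepsilon}_i \sum_{j=1}^p d_{ij} \]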
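\[ R_{i,G,4}^{[K]} = \frac{1}{n_S} \sum_{i' \in S} \frac{1}{\sqrt{p}} \sum_{j=1}^p \mu_{ij} \varepsilon_{i'j} \]
\[ R_{i,G,5}^{[K]} = \frac{1}{n_S} \sum_{i' \in S} \sum_{k=1}^{K_0} \lambda_{S,k} \frac{1}{\sqrt{p}} \sum_{j=1}^p m_{j,k} \varepsilon_{i'j}. \]
With the help of Lemma S.1, it can be shown that
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_{i,G,\ell}^{[K]} \big| = O_p(p^{\eta}) \]
for $\ell = 1, \ldots, 5$. For example, it holds that
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_{i,G,4}^{[K]} \big| \le \max_{1 \le i, i' \le n} \Big| \frac{1}{\sqrt{p}} \sum_{j=1}^p \mu_{ij} \varepsilon_{i'j} \Big| = O_p(p^{\eta}). \]
As a result, we obtain that
\[ \max_{S \in \mathcal{C}} \max_{i \in S} \big| R_{i,G}^{[K]} \big| = O_p(p^{\eta}). \tag{S.31} \]
This completes the proof.

Proof of Lemma S.5. Let $S \in \mathcal{C}$. In particular, suppose that $S \cap G_{k_1} \ne \emptyset$ and $S \cap G_{k_2} \ne \emptyset$ for some $k_1 \ne k_2$. We show the following claim: there exists some $i \in S$ such that
\[ \frac{1}{\sqrt{p}} \sum_{j=1}^p d_{ij}^2 \ge c \sqrt{p}, \tag{S.32} \]
where $c = (\delta_0/2)^2$ with $\delta_0$ defined in assumption (C2). From this, the statement of Lemma S.5 immediately follows.

For the proof of (S.32), we denote the Euclidean distance between vectors $v = (v_1, \ldots, v_p)^{\top}$ and $w = (w_1, \ldots, w_p)^{\top}$ by $d(v, w) = ( \sum_{j=1}^p |v_j - w_j|^2 )^{1/2}$. Moreover, as in Lemma S.4, we use the notation
\[ \frac{1}{n_S} \sum_{i \in S} \mu_{ij} = \sum_{k=1}^{K_0} \lambda_{S,k} m_{j,k}, \]
where $n_S = \#S$, $\lambda_{S,k} = \#(S \cap G_k)/n_S$ and $m_k = (m_{1,k}, \ldots, m_{p,k})^{\top}$ is the signal vector of the class $G_k$.

Take any $i \in S \cap G_{k_1}$. If
\[ \Big( \sum_{j=1}^p d_{ij}^2 \Big)^{1/2} = d\Big( \mu_i, \sum_{k=1}^{K_0} \lambda_{S,k} m_k \Big) = d\Big( m_{k_1}, \sum_{k=1}^{K_0} \lambda_{S,k} m_k \Big) \ge \frac{\delta_0 \sqrt{p}}{2}, \]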
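the proof is finished, as (S.32) is satisfied for $i$. Next consider the case that
\[ d\Big( m_{k_1}, \sum_{k=1}^{K_0} \lambda_{S,k} m_k \Big) < \frac{\delta_0 \sqrt{p}}{2}. \]
By assumption (C2), it holds that $d(m_k, m_{k'}) \ge \delta_0 \sqrt{p}$ for $k \ne k'$. Hence, by the triangle inequality,
\[ \delta_0 \sqrt{p} \le d(m_{k_1}, m_{k_2}) \le d\Big( m_{k_1}, \sum_{k=1}^{K_0} \lambda_{S,k} m_k \Big) + d\Big( \sum_{k=1}^{K_0} \lambda_{S,k} m_k, m_{k_2} \Big) < \frac{\delta_0 \sqrt{p}}{2} + d\Big( \sum_{k=1}^{K_0} \lambda_{S,k} m_k, m_{k_2} \Big), \]
implying that
\[ d\Big( \sum_{k=1}^{K_0} \lambda_{S,k} m_k, m_{k_2} \Big) > \frac{\delta_0 \sqrt{p}}{2}. \]
This shows that the claim (S.32) is fulfilled for any $i \in S \cap G_{k_2}$.

Proof of Theorem 4.2

By Theorem 4.1,
\[ \mathbb{P}\big( \hat{K}_0 > K_0 \big) = \mathbb{P}\big( \hat{\mathcal{H}}^{[K]} > q(\alpha) \text{ for all } K \le K_0 \big) \]
\[ = \mathbb{P}\big( \hat{\mathcal{H}}^{[K_0]} > q(\alpha) \big) - \mathbb{P}\big( \hat{\mathcal{H}}^{[K_0]} > q(\alpha), \; \hat{\mathcal{H}}^{[K]} \le q(\alpha) \text{ for some } K < K_0 \big) \]
\[ = \mathbb{P}\big( \hat{\mathcal{H}}^{[K_0]} > q(\alpha) \big) + o(1) = \alpha + o(1) \]
and
\[ \mathbb{P}\big( \hat{K}_0 < K_0 \big) = \mathbb{P}\big( \hat{\mathcal{H}}^{[K]} \le q(\alpha) \text{ for some } K < K_0 \big) \le \sum_{K=1}^{K_0 - 1} \mathbb{P}\big( \hat{\mathcal{H}}^{[K]} \le q(\alpha) \big) = o(1). \]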
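Moreover,
\[ \mathbb{P}\Big( \big\{ \hat{G}_k : 1 \le k \le \hat{K}_0 \big\} \ne \big\{ G_k : 1 \le k \le K_0 \big\} \Big) = \mathbb{P}\Big( \big\{ \hat{G}_k : 1 \le k \le \hat{K}_0 \big\} \ne \big\{ G_k : 1 \le k \le K_0 \big\}, \; \hat{K}_0 = K_0 \Big) \]
\[ + \; \mathbb{P}\Big( \big\{ \hat{G}_k : 1 \le k \le \hat{K}_0 \big\} \ne \big\{ G_k : 1 \le k \le K_0 \big\}, \; \hat{K}_0 \ne K_0 \Big) = \alpha + o(1), \]
since
\[ \mathbb{P}\Big( \big\{ \hat{G}_k : 1 \le k \le \hat{K}_0 \big\} \ne \big\{ G_k : 1 \le k \le K_0 \big\}, \; \hat{K}_0 = K_0 \Big) = \mathbb{P}\Big( \big\{ \hat{G}_k^{[K_0]} : 1 \le k \le K_0 \big\} \ne \big\{ G_k : 1 \le k \le K_0 \big\}, \; \hat{K}_0 = K_0 \Big) \]
\[ \le \mathbb{P}\Big( \big\{ \hat{G}_k^{[K_0]} : 1 \le k \le K_0 \big\} \ne \big\{ G_k : 1 \le k \le K_0 \big\} \Big) = o(1) \]
by the consistency property (3.1) and
\[ \mathbb{P}\Big( \big\{ \hat{G}_k : 1 \le k \le \hat{K}_0 \big\} \ne \big\{ G_k : 1 \le k \le K_0 \big\}, \; \hat{K}_0 \ne K_0 \Big) = \mathbb{P}\big( \hat{K}_0 \ne K_0 \big) = \alpha + o(1). \]

Proof of Theorem 4.3

With the help of Lemma S.1, we can show that
\[ \hat{\rho}(i, i') = 2\sigma^2 + \frac{1}{p} \sum_{j=1}^p \big( \mu_{ij} - \mu_{i'j} \big)^2 + o_p(1) \tag{S.33} \]
uniformly over $i$ and $i'$. This together with (C2) allows us to prove the following claim:
\[ \text{With probability tending to } 1, \text{ the indices } i_1, \ldots, i_K \text{ belong to } K \text{ different classes in the case that } K \le K_0 \text{ and to } K_0 \text{ different classes in the case that } K > K_0. \tag{S.34} \]
Now let $K = K_0$. With the help of (S.33) and (S.34), the starting values $C_1^{[K_0]}, \ldots, C_{K_0}^{[K_0]}$ can be shown to have the property that
\[ \mathbb{P}\Big( \big\{ C_k^{[K_0]} : 1 \le k \le K_0 \big\} = \big\{ G_k : 1 \le k \le K_0 \big\} \Big) \to 1. \tag{S.35} \]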
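Together with Lemma S.1, (S.35) yields that
\[ \hat{\rho}\big( i, C_k^{[K_0]} \big) = \sigma^2 + \frac{1}{p} \sum_{j=1}^p \big( \mu_{ij} - m_{j,k} \big)^2 + o_p(1) \]
uniformly over $i$ and $k$. Combined with (C2), this in turn implies that the $k$-means algorithm converges already after the first iteration step with probability tending to $1$ and $\hat{G}_k^{[K_0]}$ are consistent estimators of the classes $G_k$ in the sense of (3.1).

Proof of (3.6)

Suppose that (C1)–(C3) along with (3.5) are satisfied. As already noted in Section 3.4, the $k$-means estimators $\{ \hat{G}_k^A : 1 \le k \le K \}$ can be shown to satisfy (3.4), that is,
\[ \mathbb{P}\big( \hat{G}_k^A \subseteq G_{k'} \text{ for some } 1 \le k' \le K_0 \big) \to 1 \tag{S.36} \]
for any $k = 1, \ldots, K$. This can be proven by very similar arguments as the consistency property (3.1). We thus omit the details. Let $E^A$ be the event that
\[ \hat{G}_k^A \subseteq G_{k'} \text{ for some } 1 \le k' \le K_0 \]
holds for all clusters $\hat{G}_k^A$ with $k = 1, \ldots, K$. $E^A$ can be regarded as the event that the partition $\{ \hat{G}_k^A : 1 \le k \le K \}$ is a refinement of the class structure $\{ G_k : 1 \le k \le K_0 \}$. By (S.36), the event $E^A$ occurs with probability tending to $1$. Now consider the estimator
\[ \hat{\sigma}_{\mathrm{RSS}}^2 = \frac{1}{n \lfloor p/2 \rfloor} \sum_{k=1}^K \sum_{i \in \hat{G}_k^A} \Big\| \hat{Y}_i^B - \frac{1}{\#\hat{G}_k^A} \sum_{i' \in \hat{G}_k^A} \hat{Y}_{i'}^B \Big\|^2. \]
Since the random variables $\hat{Y}_i^B$ are independent of the estimators $\hat{G}_k^A$, it is not difficult to verify the following: for any $\delta > 0$, there exists a constant $C_{\delta} > 0$ that does not depend on $\{ \hat{G}_k^A : 1 \le k \le K \}$ such that on the event $E^A$,
\[ \mathbb{P}\Big( \big| \hat{\sigma}_{\mathrm{RSS}}^2 - \sigma^2 \big| \ge \frac{C_{\delta}}{p} \; \Big| \; \big\{ \hat{G}_k^A : 1 \le k \le K \big\} \Big) \le \delta. \]
From this, the first statement of (3.6) easily follows. The second statement can be obtained by similar arguments.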
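The residual-based variance estimator discussed in the proof of (3.6) pools within-cluster squared deviations computed on one part of the data, using clusters estimated on the other part. The following is a minimal sketch of such a pooled estimator, not the authors' implementation. It assumes that the rows of `y_b` already hold the transformed observations of subsample B and that the normalizing constant is the number of rows times the number of B columns; the function name `rss_variance` and both argument names are hypothetical.

```python
def rss_variance(y_b, clusters):
    """Pooled within-cluster residual variance.

    y_b      : list of n rows, each a list with the observations of subsample B
    clusters : list of lists of row indices (partition estimated on subsample A)
    """
    n = len(y_b)
    p_b = len(y_b[0])
    rss = 0.0
    for members in clusters:
        size = len(members)
        # column-wise cluster mean, computed on subsample B only
        centre = [sum(y_b[i][j] for i in members) / size for j in range(p_b)]
        for i in members:
            rss += sum((y_b[i][j] - centre[j]) ** 2 for j in range(p_b))
    # assumed normalization: number of rows times number of B columns
    return rss / (n * p_b)


if __name__ == "__main__":
    # noise-free rows that are constant within each cluster give a zero estimate
    rows = [[1.0, 2.0], [1.0, 2.0], [5.0, 6.0], [5.0, 6.0]]
    print(rss_variance(rows, [[0, 1], [2, 3]]))
```

The sketch also makes the role of the event $E^A$ visible: if a cluster mixes rows from different classes, the class differences enter the residual sum of squares and inflate the estimate, whereas a partition that refines the class structure leaves only noise in the residuals.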
References

Komlós, J., Major, P. and Tusnády, G. (1975). An approximation of partial sums of independent RV's, and the sample DF. I. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 32 111–131.

Komlós, J., Major, P. and Tusnády, G. (1976). An approximation of partial sums of independent RV's, and the sample DF. II. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 34 33–58.