arxiv: v3 [cs.sc] 17 Jun 2015

Probabilistic analysis of Wiedemann's algorithm for minimal polynomial computation

Gavin Harrison, Drexel University
Jeremy Johnson, Drexel University
B. David Saunders, University of Delaware

doi: /j.jsc
© 2015, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

Abstract

Blackbox algorithms for linear algebra problems start with projection of the sequence of powers of a matrix to a sequence of vectors (Lanczos), a sequence of scalars (Wiedemann) or a sequence of smaller matrices (block methods). Such algorithms usually depend on the minimal polynomial of the resulting sequence being that of the given matrix. Here exact formulas are given for the probability that this occurs. They are based on the generalized Jordan normal form (direct sum of companion matrices of the elementary divisors) of the matrix. Sharp bounds follow from this for matrices of unknown elementary divisors. The bounds are valid for all finite field sizes and show that a small blocking factor can give high probability of success for all cardinalities and matrix dimensions.

1 Introduction

The minimal polynomial of an $n \times n$ matrix $A$ may be viewed as the minimal scalar generating polynomial of the linearly recurrent sequence of powers of $A$, $\bar{A} = (A^0, A^1, A^2, A^3, \ldots)$. Wiedemann's algorithm (Wiedemann, 1986) projects the matrix sequence to a scalar sequence $s = (s_0, s_1, s_2, \ldots)$, where $s_i = u^T A^i v$. The vectors $u, v$ are chosen at random. The algorithm continues by computing the minimal generating polynomial of $s$ which, with high probability, is the minimal polynomial of $A$.

Block Wiedemann algorithms (Coppersmith, 1995; Eberly et al., 2006; Kaltofen, 1995; Villard, 1997, 1999) fatten $u^T$ to a matrix $U$ having several rows and $v$ to a matrix $V$ having multiple columns, so that the projection is to a sequence of smaller matrices, $B = U\bar{A}V = (UA^0V, UA^1V, UA^2V, \ldots)$, where, for chosen block size $b$, $U, V$ are uniformly random matrices of shape $b \times n$ and $n \times b$, respectively.
A block Berlekamp/Massey algorithm is then used to compute the matrix minimal generating polynomial of $B$ (Kaltofen and Yuhasz, 2013; Giorgi et al., 2003), and from it the minimal scalar generating polynomial. All of the algorithms based on these random projections rely on preservation of some properties, including at least the minimal generating polynomial. In this paper we analyze the probability of preservation of the minimal polynomial under random projections for a matrix over a finite field.

Research supported by National Science Foundation Grants CCF and CCF
Let $A \in \mathbb{F}_q^{n \times n}$ and let $P_{q,b}(A)$ denote the probability that $\mathrm{minpoly}(A) = \mathrm{minpoly}(U\bar{A}V)$ for uniformly random $U \in \mathbb{F}_q^{b \times n}$ and $V \in \mathbb{F}_q^{n \times b}$. $P_{q,b}(A)$ is the focus of this paper and this notation will be used throughout. Our analysis proceeds by first giving exact formulas for $P_{q,b}(A)$ in terms of field cardinality $q$, projected dimension $b$, and the elementary divisors of $A$. Let $P_{q,b}(n) = \min(\{P_{q,b}(A) \mid A \in \mathbb{F}_q^{n \times n}\})$, a function of field cardinality, $q$, projected block size, $b$, and the matrix dimension, $n$. Building from our formula for $P_{q,b}(A)$, we give a means to compute $P_{q,b}(n)$ precisely and hence to derive a sharp lower bound. Our bound is less pessimistic than earlier ones such as (Kaltofen and Saunders, 1991; Kaltofen, 1995) which primarily apply when the field is large. Even for cardinality 2, we show that a modest block size (such as $b = 22$) assures high probability of preserving the minimal polynomial.

A key observation is that when the cardinality is small the number of low degree irreducible polynomials is also small. Wiedemann (1986) used this observation to make a bound for the probability of minimal polynomial preservation in the nonblocked algorithm. Here, we have exact formulas for $P_{q,b}(A)$ which are worst when the irreducibles in the elementary divisors of $A$ are as small as possible. Combining that with information on the number of low degree irreducibles, we obtain a sharp lower bound for the probability of minimal polynomial preservation for an arbitrary $n \times n$ matrix (when the elementary divisors are not known a priori).

Every square matrix, $A$, over a finite field $\mathbb{F}$ is similar over $\mathbb{F}$ to its generalized Jordan normal form, $J(A)$, a block diagonal direct sum of the Jordan blocks of its elementary divisors, which are powers of irreducible polynomials in $\mathbb{F}[x]$. $A$ and $J(A)$ have the same distribution of random projections. Thus we may focus attention on matrices in Jordan form.
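For very small cases the definition of $P_{q,b}(A)$ can be checked by exhaustive enumeration. The following is a minimal sketch for the unblocked case $b = 1$ over a prime field, not the authors' code: the names `linear_complexity` and `projection_success_probability` are hypothetical, the Berlekamp/Massey routine is the textbook scalar version, and the sketch assumes that the largest linear complexity over all pairs $(u, v)$ equals the degree of $\mathrm{minpoly}(A)$.

```python
from fractions import Fraction
from itertools import product


def linear_complexity(s, p):
    """Degree of the minimal generating polynomial of the finite sequence s
    over GF(p), p prime, via the scalar Berlekamp/Massey algorithm."""
    C, B = [1], [1]          # current and previous connection polynomials
    L, m, b = 0, 1, 1        # complexity, shift, previous discrepancy
    for n in range(len(s)):
        d = sum(C[i] * s[n - i] for i in range(min(len(C), L + 1))) % p
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, -1, p) % p      # modular inverse (Python 3.8+)
        T = C[:]
        C = C + [0] * (len(B) + m - len(C))
        for i, B_i in enumerate(B):
            C[i + m] = (C[i + m] - coef * B_i) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return L


def projection_success_probability(A, p):
    """Exact P_{p,1}(A) by enumerating all u, v in GF(p)^n (tiny n only).

    Assumption: deg minpoly(A) is the maximum linear complexity attained by
    some pair (u, v), so a projection succeeds iff it attains that maximum.
    """
    n = len(A)
    vectors = list(product(range(p), repeat=n))
    complexities = []
    for v in vectors:
        krylov = [list(v)]               # v, Av, ..., A^(2n-1) v
        for _ in range(2 * n - 1):
            w = krylov[-1]
            krylov.append([sum(A[i][j] * w[j] for j in range(n)) % p
                           for i in range(n)])
        for u in vectors:
            s = [sum(a * c for a, c in zip(u, w)) % p for w in krylov]
            complexities.append(linear_complexity(s, p))
    best = max(complexities)
    return Fraction(complexities.count(best), len(complexities))
```

Under these assumptions, the companion matrix of $x^2 + x + 1$ over $\mathbb{F}_2$ and the $2 \times 2$ Jordan block of $(x+1)^2$ over $\mathbb{F}_2$ can be passed in as nested lists and compared against the closed form $(1 - 1/q^{db})^2$ that the paper derives for a single Jordan block.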
After section 2 on basic definitions concerning matrix structure and linear recurrent sequences, the central result, theorem 16, is the culmination of section 3, where the probability of preserving the minimal polynomial for a matrix of given elementary divisors is analyzed. Examples immediately following theorem 16 illustrate the key issues. The exact formulation of the probability of minimal polynomial preservation in terms of matrix, field, and block sizes is our main result, theorem 20, in section 4. Its corollaries provide some simplified bounds. Section 4.2, specifically figure 1, illustrates practical applicability. We finish with concluding remarks, section 5.

2 Definitions and Jordan blocks

Let $\mathbb{F}^{m \times n}$ be the vector space of $m \times n$ matrices over $\mathbb{F}$, and $\mathbb{F}_{\infty}^{m \times n}$ the vector space of sequences of $m \times n$ matrices over $\mathbb{F}$. For a sequence $S = (S_0, S_1, S_2, \ldots) \in \mathbb{F}_{\infty}^{m \times n}$ and polynomial $f(x) = \sum_{i=0}^{d} f_i x^i \in \mathbb{F}[x]$, define $f(S)$ as the sequence whose $k$th term is $\sum_{i=0}^{d} f_i S_{i+k}$. This action is a multiplicative group action of $\mathbb{F}[x]$ on $\mathbb{F}_{\infty}^{m \times n}$, because $(fg)(S) = f(g(S))$ for $f, g \in \mathbb{F}[x]$ and $f(S + \alpha T) = f(S) + \alpha f(T)$ for $S, T \in \mathbb{F}_{\infty}^{m \times n}$ and $\alpha \in \mathbb{F}$. Further, if $f(S) = 0$ we say $f$ annihilates $S$. In this case, $S$ is completely determined by $f$ and its leading $d$ coefficient matrices $S_0, S_1, \ldots, S_{d-1}$. Then $S$ is said to be linearly generated, and $f(x)$ is also called a generator of $S$. Moreover, for given $S$, the set of polynomials that generate $S$ is an ideal of $\mathbb{F}[x]$. Its unique monic generator is called the minimal generating polynomial, or just minimal polynomial, of $S$ and is denoted $\mathrm{minpoly}(S)$. In particular, the ideal of the whole of $\mathbb{F}[x]$ is generated by 1 and, acting on sequences, generates only the zero sequence. For a square matrix $A$, the minimal polynomial of the sequence $\bar{A} = (I, A, A^2, \ldots)$ is also called the minimal polynomial of $A$ ($\mathrm{minpoly}(A) = \mathrm{minpoly}(\bar{A})$).

We will consider the natural transforms of sequences by matrix multiplication on either
side. For $U \in \mathbb{F}^{b \times m}$, $US = (US_0, US_1, US_2, \ldots)$ over $\mathbb{F}^{b \times n}$, and for $V \in \mathbb{F}^{n \times b}$, $SV = (S_0V, S_1V, S_2V, \ldots)$ over $\mathbb{F}^{m \times b}$. For any polynomial $g$, it follows from the definitions that $g(USV) = U g(S) V$. It is easy to see that the generators of $S$ also generate $US$ and $SV$, so that $\mathrm{minpoly}(US) \mid \mathrm{minpoly}(S)$, and $\mathrm{minpoly}(USV) \mid \mathrm{minpoly}(SV) \mid \mathrm{minpoly}(S)$.

More specifically, we are concerned with random projections, $U\bar{A}V$, of a square matrix $A$, where $U, V$ are uniformly random, $U \in \mathbb{F}^{b \times n}$, $V \in \mathbb{F}^{n \times b}$. By uniformly random, we mean that each of the (finitely many) matrices of the given shape is equally likely.

Lemma 1. Let $A, B$ be similar square matrices over $\mathbb{F}_q$ and let $b$ be any block size. Then $P_{q,b}(A) = P_{q,b}(B)$. In particular, $P_{q,b}(A) = P_{q,b}(J)$ where $J$ is the generalized Jordan form of $A$.

Proof. Suppose $A$ and $B$ are similar, so that $B = WAW^{-1}$, for a nonsingular matrix $W$. The $(U, V)$ projection of $WAW^{-1}$ is the $(UW, W^{-1}V)$ projection of $A$. But when $U, V$ are uniformly random variables, then so are $UW$ and $W^{-1}V$, since the multiplications by $W$ and $W^{-1}$ are bijections.

Thus, without loss of generality, in the rest of the paper we will restrict attention to matrices in generalized Jordan normal form. We describe our notation for Jordan forms next.

The companion matrix of a monic polynomial $f(x) = f_0 + f_1 x + \cdots + f_{d-1} x^{d-1} + x^d$ of degree $d$ is

$$C_f = \begin{pmatrix} 0 & 0 & \cdots & 0 & -f_0 \\ 1 & 0 & \cdots & 0 & -f_1 \\ 0 & 1 & \cdots & 0 & -f_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & -f_{d-1} \end{pmatrix}
\quad\text{and}\quad
J_{f^e} = \begin{pmatrix} C_f & & & \\ I & C_f & & \\ & \ddots & \ddots & \\ & & I & C_f \end{pmatrix}$$

is the Jordan block corresponding to $f^e$, a $de \times de$ matrix. It is standard knowledge that the minimal polynomial of $J_{f^e}$ is $f^e$. When $e = 1$, $J_f = C_f$. In particular, we use these basic linear algebra facts: for irreducible $f$, (1) $f^{e-1}(J_{f^e})$ is zero everywhere except in the lowest leftmost block, where it is a nonsingular polynomial in $C_f$ (see, for example, Robinson (1970)), and (2) the Krylov matrix $K_{C_f}(v) = (v, C_f v, C_f^2 v, \ldots, C_f^{d-1} v)$ is nonsingular unless $v = 0$.

Generalized Jordan normal forms are (block diagonal) direct sums of primary components,

$$J = \bigoplus_i \bigoplus_j J_{f_i^{e_{i,j}}},$$

where the $f_i$ are distinct irreducibles and the $e_{i,j}$ are positive exponents, nonincreasing with respect to $j$.
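The companion matrix and Jordan block conventions above (ones on the subdiagonal, identity blocks below the diagonal) can be made concrete with a small sketch over a prime field $\mathbb{F}_p$. The helper names `companion`, `jordan_block`, `matmul`, and `monic_poly_at` are hypothetical and only meant to illustrate fact (1) on one example.

```python
def companion(f, p):
    """Companion matrix C_f over GF(p) of the monic polynomial
    x^d + f[d-1] x^(d-1) + ... + f[0], given its low-order coefficients f."""
    d = len(f)
    C = [[0] * d for _ in range(d)]
    for i in range(1, d):
        C[i][i - 1] = 1               # ones on the subdiagonal
    for i in range(d):
        C[i][d - 1] = (-f[i]) % p     # last column holds -f_0, ..., -f_{d-1}
    return C


def jordan_block(f, e, p):
    """Generalized Jordan block J_{f^e}: C_f on the block diagonal,
    identity blocks on the block subdiagonal."""
    d, C = len(f), companion(f, p)
    J = [[0] * (d * e) for _ in range(d * e)]
    for k in range(e):
        for i in range(d):
            for j in range(d):
                J[k * d + i][k * d + j] = C[i][j]
            if k > 0:
                J[k * d + i][(k - 1) * d + i] = 1
    return J


def matmul(X, Y, p):
    """Matrix product over GF(p)."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*Y)]
            for row in X]


def monic_poly_at(f, M, p):
    """Evaluate the monic polynomial with low-order coefficients f at the
    square matrix M over GF(p), using Horner's rule."""
    n = len(M)
    R = [[int(i == j) for j in range(n)] for i in range(n)]  # leading coeff 1
    for c in reversed(f):
        R = matmul(R, M, p)
        for i in range(n):
            R[i][i] = (R[i][i] + c) % p
    return R
```

For $f = x^2 + x + 1$ over $\mathbb{F}_2$ and $e = 2$, evaluating $f$ at `jordan_block([1, 1], 2, 2)` gives a matrix that is zero outside its lower left $2 \times 2$ block, and whose square is zero, which is the behaviour fact (1) describes.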
Every matrix is similar to a generalized Jordan normal form, unique up to order of blocks.

3 Probability Computation, Matrix of Given Structure

Recall our definition that, for $A \in \mathbb{F}_q^{n \times n}$, $P_{q,b}(A)$ denotes the probability that the minimal polynomial is preserved under projection to $b \times b$, i.e., $\mathrm{minpoly}(A) = \mathrm{minpoly}(U\bar{A}V)$ for
uniformly random $U \in \mathbb{F}_q^{b \times n}$ and $V \in \mathbb{F}_q^{n \times b}$. For the results of this paper the characteristic of the field is not important. However, the cardinality $q$ is a key parameter in the results. For simplicity, we are restricting to projection to square blocks. It is straightforward to adjust these formulas to the case of rectangular blocking.

By lemma 1, we may assume that the given matrix is in generalized Jordan form, which is a block diagonal matrix. The projections of a block diagonal matrix are sums of independent projections of the blocks. In other words, for the $U, V$ projection of $A = \bigoplus_i A_i$ let $U_i, V_i$ be the blocks of columns of $U$ and rows of $V$ conformal with the block sizes of the $A_i$. Then $U\bar{A}V = \sum_i U_i \bar{A}_i V_i$. In addition to this observation the particular structure of the Jordan form is utilized.

In subsection 3.1 we show that the probability $P_{q,b}(A)$ may be expressed in terms of $P_{q,b}(J(f))$ for the primary components, $J(f) = \bigoplus_j J_{f^{e_j}}$, associated with the distinct irreducible factors of the minimal polynomial of $A$. This is further reduced to the probability for a direct sum of companion matrices $C_f$. Finally, the probability for $C_f$ is calculated by reducing it to the probability that a sum of rank 1 matrices over the extension field $\mathbb{F}_q[x]/\langle f(x) \rangle$ is zero. In consequence we obtain a formula for $P_{q,b}(A)$ in theorem 16. Examples illustrating theorem 16 are given in subsection 3.3.

3.1 Reduction to Primary Components

Let $A = \bigoplus_i \bigoplus_j J_{f_i^{e_{i,j}}} \in \mathbb{F}_q^{n \times n}$, where the $f_i \in \mathbb{F}_q[x]$ are distinct irreducible polynomials and the $e_{i,j}$ are positive exponents, nonincreasing with respect to $j$. In this section, we show that

$$P_{q,b}(A) = \prod_i P_{q,b}\Bigl(\bigoplus_j J_{f_i^{e_{i,j}}}\Bigr).$$

Lemma 2. Let $S$ and $T$ be linearly generated matrix sequences. Then $\mathrm{minpoly}(S + T) \mid \mathrm{lcm}(\mathrm{minpoly}(S), \mathrm{minpoly}(T))$.

Proof. Let $f = \mathrm{minpoly}(S)$, $g = \mathrm{minpoly}(T)$ and $d = \gcd(f, g)$. The lemma follows from the observation that $(fg/d)(S + T) = (fg/d)(S) + (fg/d)(T) = (g/d)(f(S)) + (f/d)(g(T)) = 0$.

As an immediate corollary we get equality when $f$ and $g$ are relatively prime.

Corollary 3.
Let $S$ and $T$ be linearly generated matrix sequences with $f = \mathrm{minpoly}(S)$ and $g = \mathrm{minpoly}(T)$ such that $\gcd(f, g) = 1$. Then $\mathrm{minpoly}(S + T) = fg$.

Proof. By the previous lemma, $\mathrm{minpoly}(S + T) = f_1 g_1$ with $f_1 \mid f$ and $g_1 \mid g$. We show that $f_1 = f$ and $g_1 = g$. Under our assumptions, $0 = f g_1 (S + T) = f g_1(S) + f g_1(T) = f g_1(T)$, so that $f g_1$ is a generator of $T$. But if $g_1$ is a proper divisor of $g$, then $f g_1$ is not in the ideal generated by $g$, a contradiction. Similarly $f_1$ must equal $f$.

Theorem 4. Let $A = \bigoplus_i \bigoplus_j J_{f_i^{e_{i,j}}} \in \mathbb{F}_q^{n \times n}$, where the $f_i$ are distinct irreducibles and the $e_{i,j}$ are positive exponents, nonincreasing with respect to $j$. Then

$$P_{q,b}(A) = \prod_i P_{q,b}\Bigl(\bigoplus_j J_{f_i^{e_{i,j}}}\Bigr).$$
Proof. Let $S = U\bar{A}V$, and $S_i = U_i\, \overline{\bigoplus_j J_{f_i^{e_{i,j}}}}\, V_i$, where $U_i, V_i$ are blocks of $U, V$ conforming to the dimensions of the blocks of $A$. Then $S = \sum_i S_i$. Let $g_i = \mathrm{minpoly}(S_i)$. Because $g_i \mid f_i^{e_{i,1}}$ and all $f_i$ are distinct irreducibles, $\gcd(g_i, g_j) = 1$ when $i \neq j$. Therefore, by corollary 3, $\mathrm{minpoly}(S) = \prod_i g_i$. Therefore $\mathrm{minpoly}(S) = \mathrm{minpoly}(A)$ if and only if $\mathrm{minpoly}(S_i) = f_i^{e_{i,1}}$ for all $i$, and $P_{q,b}(A) = \prod_i P_{q,b}\bigl(\bigoplus_j J_{f_i^{e_{i,j}}}\bigr)$.

3.2 Probability for a Primary Component

Next we calculate $P_{q,b}\bigl(\bigoplus_i J_{f^{e_i}}\bigr)$, where $f \in \mathbb{F}_q[x]$ is an irreducible polynomial and the $e_i$ are positive integers. We begin with the case of a single Jordan block before moving on to the case of a direct sum of several blocks.

Consider the Jordan block $J \in \mathbb{F}_q^{n \times n}$ determined by an irreducible power, $f^e$. $P_{q,b}(J)$ is independent of $e$. Thus, $P_{q,b}(J_{f^e}) = P_{q,b}(C_f)$. This fact and $P_{q,b}(C_f)$ are the subject of the next theorem.

Theorem 5. Given a finite field $\mathbb{F}_q$, an irreducible polynomial $f(x) \in \mathbb{F}_q[x]$ of degree $d$, an exponent $e$, and a block size $b$, let $J = J_{f^e} \in \mathbb{F}_q^{de \times de}$ be the Jordan block of $f^e$ and let $\bar{J}$ be the sequence $(I, J, J^2, \ldots)$. For $U \in \mathbb{F}_q^{b \times de}$ and $V \in \mathbb{F}_q^{de \times b}$ the following properties of minimal polynomials hold.

1. If the entries of $V$ are uniformly random in $\mathbb{F}_q$, then $\mathrm{Prob}(f^e = \mathrm{minpoly}(\bar{J}V)) = 1 - 1/q^{db}$. Note that the probability is independent of $e$.
2. If $V$ is fixed and the entries of $U$ are uniformly random in $\mathbb{F}_q$, then $\mathrm{Prob}(\mathrm{minpoly}(\bar{J}V) = \mathrm{minpoly}(U\bar{J}V)) \geq 1 - 1/q^{db}$, with equality if $V \neq 0$.
3. If $U$ and $V$ are both uniformly random, then $P_{q,b}(J) = \mathrm{Prob}(f^e = \mathrm{minpoly}(U\bar{J}V)) = (1 - 1/q^{db})^2 = P_{q,b}(C_f)$.

Proof. For parts 1 and 2, let $M$ be the lower left $d \times d$ block of $f^{e-1}(J)$. $M$ is nonzero and all other parts of $f^{e-1}(J)$ are zero. Note that $\mathbb{F}_q[C_f]$, the set of polynomials in the companion matrix $C_f$, is isomorphic to $\mathbb{F}_q[x]/\langle f \rangle$. Since $M$ is nonzero and a polynomial in $C_f$, it is nonsingular. Since for any polynomial $g$ and matrix $A$ one has $g(\bar{A}) = \bar{A}\,g(A)$, the lower left blocks of the sequence $f^{e-1}(\bar{J})$ form the sequence $(M, C_f M, C_f^2 M, \ldots) = \bar{C}_f M$.

Part 1.
$f^{e-1}(\bar{J})V$ is zero except in its lower $d$ rows, which are $\bar{C}_f M V_1$, where $V_1$ is the top $d$ rows of $V$. This sequence is nonzero with minimal polynomial $f$ unless $V_1 = 0$, which has probability $1/q^{db}$.

Part 2. If $V = 0$ the inequality is trivially true. For $V \neq 0$, since $f^{e-1}(\bar{J})$ is zero except in its lower left $d \times d$ corner, $U f^{e-1}(\bar{J}) V = U_e \bar{C}_f M V_1$, where $V_1$ is the top $d$ rows of $V$ and $U_e$ is the rightmost $d$ columns of $U$. Since $M$ is nonsingular, $MV_1$ is uniformly random and the question is reduced to the case of projecting a companion matrix.

Let $C = C_f$ for irreducible $f$ of degree $d$. For nonzero $V \in \mathbb{F}^{d \times b}$, $\bar{C}V$ is nonzero and has minpoly $f$. We must show that if $U \in \mathbb{F}^{b \times d}$ is nonzero then $U\bar{C}V$ also has minpoly $f$. Let
$v$ be a nonzero column of $V$. The Krylov matrix $K_C(v) = (v, Cv, C^2v, \ldots, C^{d-1}v)$ has as its columns the first $d$ vectors of the sequence $\bar{C}v$. Since $v$ is nonzero, this Krylov matrix is nonsingular and $u^T K_C(v) = 0$ implies $u = 0$. Thus, for any nonzero vector $u$, we have $u^T \bar{C} v \neq 0$ so that, for nonzero $U$, the sequence $U\bar{C}_f V$ is nonzero and has minimal polynomial $f$ as needed. Of the $q^{db}$ possible $U$, only $U = 0$ fails to preserve the minimal polynomial.

Part 3. By parts 1 and 2, we have $(1 - 1/q^{db})$ probability of preservation of the minimal polynomial $f^e$, first at the right reduction by $V$ to the sequence $\bar{J}V$ and then again the same probability at the reduction by $U$ to the block sequence $U\bar{J}V$. Therefore, $P_{q,b}(J) = (1 - 1/q^{db})^2$.

Reduction to a Direct Sum of Companion Matrices

Consider the primary component $J = \bigoplus_i J_{f^{e_i}}$, for irreducible $f$, and let $e = \max(e_i)$. We reduce the question of projections preserving the minimal polynomial for $J$ to the corresponding question for direct sums of the companion matrix $C_f$, which is then addressed in the next section.

Lemma 6. Let $J = \bigoplus_i J_{f^{e_i}}$, where $f \in \mathbb{F}_q[x]$ is irreducible, and the $e_i$ are positive integers. Let $e = \max(e_i)$. Let $s$ be the number of $e_i$ equal to $e$. Then

$$P_{q,b}(J) = P_{q,b}\Bigl(\bigoplus_{i=1}^{s} C_f\Bigr).$$

Proof. The minimal polynomial of $J$ is $f^e$ and that of $f^{e-1}(J)$ is $f$. A projection $U\bar{J}V$ preserves the minimal polynomial $f^e$ if and only if $f^{e-1}(U\bar{J}V)$ has minimal polynomial $f$. For all $e_i < e$ we have $f^{e-1}(J_{f^{e_i}}) = 0$, so it suffices to consider direct sums of Jordan blocks for a single (highest) power $f^e$. Let $J_e = J_{f^e}$ be the Jordan block for $f^e$, and let $A = \bigoplus_{i=1}^{s} J_e$. A projection $U\bar{A}V$ is successful if it has the same minimal polynomial as $A$. This is the same as saying the minimal polynomial of $f^{e-1}(U\bar{A}V)$ is $f$. We have

$$f^{e-1}(U\bar{A}V) = U f^{e-1}(\bar{A}) V = \sum_{i=1}^{s} U_i f^{e-1}(\bar{J}_e) V_i = \sum_{i=1}^{s} U_{i,e}\, \bar{C}_f\, \tilde{V}_{i,1}.$$

For the last expression, $U_{i,e}$ is the rightmost block of $U_i$ and $\tilde{V}_{i,1}$ is $M$ times the top block of $V_i$. The equality follows from the observation in the proof of theorem 5 that $f^{e-1}(\bar{J})$ is the sequence that has $\bar{C}_f M$ ($M$ nonsingular) in the lower left block and zero elsewhere.
Thus, $P_{q,b}(J) = P_{q,b}\bigl(\bigoplus_{i=1}^{s} C_f\bigr)$.

Probability for a Direct Sum of Companion Matrices

Let $f$ be irreducible of degree $d$. To determine the probability that a block projection of $A = \bigoplus_{i=1}^{t} C_f$ preserves the minimal polynomial of $A$, we need to determine the probability that $\sum_{i=1}^{t} U_i \bar{C}_f V_i = 0$. We show that this is equivalent to the probability that a sum of rank one matrices over $K = \mathbb{F}_q[x]/\langle f(x) \rangle$ is zero, and we establish a recurrence relation for this probability in corollary 14. This may be considered the heart of the paper.
Lemma 7. Let $A = \bigoplus_{i=1}^{t} C_f \in \mathbb{F}_q^{n \times n}$, where $f \in \mathbb{F}_q[x]$ is irreducible of degree $d$. $P_{q,b}(A)$ is equal to the probability that $S = U\bar{A}V = \sum_{i=1}^{t} U_i \bar{C}_f V_i \neq 0$, where $U \in \mathbb{F}_q^{b \times n}$ and $V \in \mathbb{F}_q^{n \times b}$ are chosen uniformly randomly, and $U_i, V_i$ are blocks of $U, V$, respectively, conforming to the dimensions of the blocks of $A$.

Proof. Because $\mathrm{minpoly}(S) \mid \mathrm{minpoly}(A)$ and $\mathrm{minpoly}(A) = f$, we have $\mathrm{minpoly}(S) \mid f$. Because $f$ is irreducible, it has just two divisors: $f$ and 1. The divisor 1 generates only the zero sequence. Therefore, if $S = 0$ then $\mathrm{minpoly}(S) = 1$. Otherwise, $\mathrm{minpoly}(S) = f$. Thus $P_{q,b}(A)$ equals the probability that $S \neq 0$.

The connection between sums of sequences $U\bar{C}_f V$ and sums of rank one matrices over the extension field $K$ is obtained through the observation that for column vectors $u, v$, one has $u^T \bar{C}_f v = u^T \rho(v)$, where $\rho$ is the regular matrix representation of $K$, i.e. $\rho(v)u = vu$ in $K$. The vectors $u$ and $v$ can be interpreted as elements of $K$ by associating them with the polynomials $u(x) = \sum_{i=0}^{d-1} u_i x^i$ and $v(x) = \sum_{i=0}^{d-1} v_i x^i$. Moreover, if $\{1, x, x^2, \ldots, x^{d-1}\}$ is chosen as a basis for $K$ over $\mathbb{F}$, then $\rho(x) = C_f$ and $\rho(v) = \sum_{i=0}^{d-1} v_i \rho(x)^i = \sum_{i=0}^{d-1} v_i C_f^i$. Letting $C = C_f$, the initial segment of $u^T \bar{C}_f v$ is $u^T (v, Cv, C^2v, \ldots, C^{d-1}v)$, which is $u^T K_C(v)$, where $K_C(v)$ is the Krylov matrix whose columns are $C^i v$. The following lemma shows that $K_C(v) = \rho(v)$ and establishes the connection $u^T \bar{C}_f v = u^T \rho(v)$.

Lemma 8. Let $f$ be an irreducible polynomial and $K = \mathbb{F}[x]/\langle f \rangle$ be the extension field defined by $f$. Let $\rho$ be the regular representation of $K$ and $C = C_f$ the companion matrix of $f$. Then $\rho(v) = \sum_{j=0}^{d-1} v_j C^j = K_C(v)$.

Proof. Let $e_j$ be the vector with a one in the $j$th location and zeros elsewhere. Then, abusing notation, $\rho(v)e_j = v(x)x^j \pmod{f}$ and $K_C(v)e_j = C^j v = x^j v(x) \pmod{f}$. Since this is true for arbitrary $j$ the lemma is proved.

Let $U$ and $V$ be $b \times d$ and $d \times b$ matrices over $\mathbb{F}$. Let $u_i$ be the $i$th row of $U$ and $v_j$ be the $j$th column of $V$.
The sequence $U\bar{C}V$ of $b \times b$ matrices can be viewed as a $b \times b$ matrix of sequences whose $(i, j)$ element is equal, by the discussion above, to $u_i\, \rho(v_j)^T$. This matrix can be mapped to the $b \times b$ matrix over $K$ whose $(i, j)$ element is the product $u_i v_j = \rho(v_j) u_i$. This is the outer product $UV^T$, with $U$ and $V$ viewed as a column vector over $K$ and a row vector over $K$ respectively. Hence it is a rank one matrix over $K$ provided neither $U$ nor $V$ is zero. Since any rank one matrix is an outer product, this mapping can be inverted. There is a one to one association of sequences $U\bar{C}V$ with rank one matrices over $K$.

To show that this mapping relates rank to the probability that the block projection $U\bar{A}V$ preserves the minimal polynomial of $A$, we must show that if $\sum_{k=1}^{t} U_k \bar{C}_f V_k = 0$ then the corresponding sum of $t$ rank one matrices over $K$ is the zero matrix and vice versa. This will be shown using the fact that the transpose $\rho(v)^T$ is similar to $\rho(v)$. While it is well known that a matrix is similar to its transpose, we provide a proof in the following lemma, which constructs the similarity transformation and shows that the same similarity transformation works independent of $v$.

Lemma 9. Given an irreducible monic polynomial $f \in \mathbb{F}_q[x]$ of degree $d$, there exists a symmetric nonsingular matrix $P$ such that $P^{-1}\rho(v)P = \rho(v)^T$, for all $v \in \mathbb{F}_q^d$.
Proof. We begin with $C_f$. Every matrix is similar to its transpose by a symmetric transform (Taussky and Zassenhaus, 1959). Let $P$ be a similarity transform such that $P^{-1} C_f P = C_f^T$. Then $P^{-1}\rho(v)P = \sum_{k=0}^{d-1} v_k P^{-1} C_f^k P = \sum_{k=0}^{d-1} v_k (C_f^k)^T = \rho(v)^T$.

It may be informative to have an explicit construction of such a transform $P$. It can be done with Hankel structure (equality on antidiagonals). Let $H_n(a_1, a_2, \ldots, a_n, a_{n+1}, \ldots, a_{2n-1})$ denote the $n \times n$ Hankel matrix with first row $(a_1, a_2, \ldots, a_n)$ and last row $(a_n, a_{n+1}, \ldots, a_{2n-1})$. For example $H_2(a, b, c) = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$. Then define $P$ as

$$P = -f_0 \oplus H_{d-1}(f_2, f_3, \ldots, f_{d-1}, 1, 0, \ldots, 0).$$

A straightforward computation verifies $C_f P = P C_f^T$.

Lemma 10. Given an irreducible monic polynomial $f \in \mathbb{F}_q[x]$ and its extension field $K = \mathbb{F}_q[x]/\langle f(x) \rangle$, there exists a one-to-one, onto mapping $\phi$ from the $b \times b$ projections of $\bar{C}_f$ to $K^{b \times b}$ that preserves zero sums, i.e. $\sum_{i=1}^{t} U_i \bar{C}_f V_i = 0$ iff $\phi\bigl(\sum_{i=1}^{t} U_i \bar{C}_f V_i\bigr) = \sum_{i=1}^{t} \phi(U_i \bar{C}_f V_i) = 0$.

Proof. The previous discussion shows that the mapping $U\bar{C}_f V \mapsto UV^T$ from $b \times b$ projections of $\bar{C}_f$ onto rank one matrices over $K$ is one-to-one. Let $u_{k,i}$ and $v_{k,j}$ be the $i$th row of $U_k$ and the $j$th column of $V_k$, respectively. Let $P$ be a matrix, whose existence follows from lemma 9, such that $P^{-1}\rho(v)P = \rho(v)^T$. Assume $\sum_{k=1}^{t} U_k \bar{C}_f V_k = 0$. Then, using lemma 8 and properties of $\rho$,

$$\begin{aligned}
\sum_{k=1}^{t} u_{k,i}^T \bar{C}_f v_{k,j} = 0
&\Rightarrow \sum_{k=1}^{t} u_{k,i}^T \rho(v_{k,j}) = 0 \\
&\Rightarrow \sum_{k=1}^{t} u_{k,i}^T P P^{-1} \rho(v_{k,j}) P P^{-1} = 0 \\
&\Rightarrow \sum_{k=1}^{t} u_{k,i}^T P \rho(v_{k,j})^T P^{-1} = 0 \\
&\Rightarrow \sum_{k=1}^{t} (u_{k,i}^T P) \rho(v_{k,j})^T = 0 \\
&\Rightarrow \sum_{k=1}^{t} \tilde{u}_{k,i} v_{k,j} = 0, \quad\text{where } \tilde{u}_{k,i} = (u_{k,i}^T P).
\end{aligned}$$

Let $\tilde{U}_k$ be the vector whose $i$th row is $\tilde{u}_{k,i}$; then the corresponding sum of outer products is $\sum_{k=1}^{t} \tilde{U}_k V_k^T = 0$. Because $P$ is invertible, the argument can be done in reverse, and for any zero sum of rank one matrices over $K$ we can construct the corresponding sum of projections equal to zero.

Thus the probability that $\sum_{i=1}^{t} U_i \bar{C}_f V_i = 0$ is the probability that randomly selected $t$-term outer products over $K$ sum to zero. The next lemma on rank one updates provides basic results leading to these probabilities.

Lemma 11.
Let $r, s \geq 0$ be given and consider rank one updates to $A = I_r \oplus 0_s$. For conformally blocked column vectors $u = (u_1^T, u_2^T)^T$, $v = (v_1^T, v_2^T)^T \in \mathbb{F}^r \times \mathbb{F}^s$, we have that $\mathrm{rank}(A + uv^T) = r - 1$ if and only if $u_1^T v_1 = -1$ and $u_2, v_2$ are both zero, and $\mathrm{rank}(A + uv^T) = r + 1$ if and only if $u_2, v_2$ are both nonzero.

Proof. Without loss of generality (orthogonal change of basis) we may restrict attention to the case that $u_1 = \alpha e_r$ and $u_2 = \beta e_{r+1}$, where $e_i$ is the $i$th unit vector, $\alpha = 0$ if
$u_1 = 0$ and $\alpha = 1$ otherwise, and similarly for $\beta$ vis-à-vis $u_2$. Suppose that in this basis $v = (w_1, \ldots, w_r, z_{r+1}, \ldots, z_n)^T$. Then

$$(I_r \oplus 0) + uv^T = \begin{pmatrix}
I_{r-1} & 0 & 0 & \cdots & 0 \\
\alpha w_1 \;\cdots\; \alpha w_{r-1} & 1 + \alpha w_r & \alpha z_{r+1} & \cdots & \alpha z_n \\
\beta w_1 \;\cdots\; \beta w_{r-1} & \beta w_r & \beta z_{r+1} & \cdots & \beta z_n \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}.$$

The rank of $I_r + u_1 v_1^T$ is $r - 1$ just in case $u_1^T v_1 = -1$ (Meyer, 2000). In our setting this condition is that $\alpha w_r = -1$. We see that, for a rank of $r - 1$, we must have that $\alpha w_r = -1$ and $\beta, z$ both zero. For rank $r + 1$ it is clearly necessary that both of $\beta, z$ are nonzero. It is also sufficient because for $z_i \neq 0$ the order $r + 1$ minor $I_{r-1} \oplus \begin{pmatrix} 1 + \alpha w_r & \alpha z_i \\ \beta w_r & \beta z_i \end{pmatrix}$ has determinant $\beta z_i \neq 0$. These conditions translate into the statements of the lemma before the change of basis.

Corollary 12. Let $A \in \mathbb{F}_q^{n \times n}$ be of rank $r$, and let $u, v$ be uniformly random in $\mathbb{F}_q^n$. Then,

1. the probability that $\mathrm{rank}(A + uv^T) = r - 1$ is $D(r) = \dfrac{q^{r-1}(q^r - 1)}{q^{2n}}$,
2. the probability that $\mathrm{rank}(A + uv^T) = r + 1$ is $U(r) = \dfrac{(q^{n-r} - 1)^2}{q^{2(n-r)}}$,
3. the probability that $\mathrm{rank}(A + uv^T) = r$ is $N(r) = 1 - D(r) - U(r) \geq \dfrac{2q^n - 1}{q^{2n}}$, with equality when $r = 0$.

Proof. There exist nonsingular $R, S$ such that $RAS = I_r \oplus 0$ and $R(A + uv^T)S = I_r \oplus 0 + (Ru)(S^T v)^T$. Since $Ru$ and $S^T v$ are uniformly random when $u, v$ are, we may assume without loss of generality that $A = I_r \oplus 0$.

For part 1, by lemma 11, the rank of $I_r \oplus 0 + uv^T$ is less than $r$ only if both $u, v$ are zero in their last $n - r$ rows and $u^T v = -1$. For $u, v \in \mathbb{F}_q^r$, $u^T v = -1$ only when $u \neq 0$ and we have, for the first $i$ such that $u_i \neq 0$, that $v_i = -u_i^{-1}\bigl(1 + \sum_{j \neq i} u_j v_j\bigr)$. Counting, there are $q^r - 1$ possible $u$ and then $q^{r-1}$ $v$'s satisfying the conditions. The stated probability follows.

For part 2, by the preceding lemma, the rank is increased only if the last $n - r$ rows of $u$ and $v$ are both nonzero. The probability of this is $\dfrac{(q^{n-r} - 1)^2}{q^{2(n-r)}}$.

For the part 3 inequality, if the sign is changed and 1 is added to both sides, the inequality becomes $D(r) + U(r) \leq \left(\dfrac{q^n - 1}{q^n}\right)^2$. Note that $U(r) = \left(\dfrac{q^n - q^r}{q^n}\right)^2$ and $D(r) \leq \left(\dfrac{q^r - 1}{q^n}\right)^2$. Let
$a = \dfrac{q^n - q^r}{q^n}$ and $b = \dfrac{q^r - 1}{q^n}$. Note that $a$ and $b$ are nonnegative. Thus, it is obvious that $a^2 + b^2 \leq (a + b)^2$. That is,

$$U(r) + D(r) \leq \left(\frac{q^n - q^r}{q^n}\right)^2 + \left(\frac{q^r - 1}{q^n}\right)^2 \leq \left(\frac{q^n - 1}{q^n}\right)^2.$$

Therefore, $N(r) = 1 - D(r) - U(r) \geq \dfrac{2q^n - 1}{q^{2n}}$.

Definition 13. For $u_i, v_i$ uniformly random in $\mathbb{F}_q^n$, and $A = \sum_{i=1}^{t} u_i v_i^T \in \mathbb{F}_q^{n \times n}$, let $Q_{q,n,t}(r)$ denote the probability that $\mathrm{rank}(A) = r$.

Corollary 14. Let $A = \sum_{i=1}^{t} u_i v_i^T$, for uniformly random $u_i, v_i \in \mathbb{F}_q^n$, and let $D(r)$, $U(r)$, and $N(r)$ be defined as described in corollary 12. Let $Q_t(r) = Q_{q,n,t}(r)$ (definition 13). Then, $Q_t(r)$ satisfies the recurrence relation

$$Q_t(r) = \begin{cases} 0, & \text{if } r < 0 \text{ or } r > \min(t, n), \\ 1, & \text{if } r = 0 \text{ and } t = 0, \\ \phi_{t-1}(r), & \text{otherwise}, \end{cases}$$

where $\phi_t(r) = Q_t(r-1)U(r-1) + Q_t(r)N(r) + Q_t(r+1)D(r+1)$, and $U(r)$, $N(r)$, $D(r)$ are defined as they are in corollary 12.

Proof. The general recurrence is evident from the fact that a rank one update can change the rank by at most one, and that $Q_0(0) = 1$. The rank of the sum of $t$ rank one matrices cannot be greater than either $t$ or $n$, nor less than zero.

These probabilities apply as well to the preimage of our mapping (block projections of direct sums of companion matrices), which leads to the next theorem.

Theorem 15. Let $f \in \mathbb{F}_q[x]$ be an irreducible polynomial of degree $d$, and let $A = \bigoplus_{i=1}^{s} C_f \in \mathbb{F}_q^{n \times n}$. Then,

$$P_{q,b}(A) = 1 - Q_s(0) \geq 1 - Q_1(0),$$

where $Q_s(r) = Q_{q^d,b,s}(r)$ (definition 13).

Proof. By lemmas 7 and 10, the probability that a $b \times b$ projection of $A$ fails is precisely $Q_s(0)$. For the inequality, in all cases $Q_s(1) \leq 1 - Q_s(0)$. Therefore,

$$\begin{aligned}
Q_{s+1}(0) &= Q_s(0)\,\frac{2q^{db} - 1}{q^{2db}} + Q_s(1)\,\frac{q^d - 1}{q^{2db}} \\
&\leq Q_s(0)\,\frac{2q^{db} - 1}{q^{2db}} + (1 - Q_s(0))\,\frac{q^d - 1}{q^{2db}} \\
&= Q_s(0)\,\frac{2q^{db} - q^d}{q^{2db}} + \frac{q^d - 1}{q^{2db}}.
\end{aligned}$$

Let $g(x) = x\,\dfrac{2q^{db} - q^d}{q^{2db}} + \dfrac{q^d - 1}{q^{2db}}$. Since $q, d, b$ are positive integers, $g(x)$ is linear with positive slope. The probability $Q_s(0)$ has range $[0, 1]$ and we have $Q_{s+1}(0) \leq g(Q_s(0)) \leq g(1) = \dfrac{2q^{db} - 1}{q^{2db}} = Q_1(0)$. Therefore, $Q_1(0) \geq Q_s(0)$, for all $s \geq 1$.
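The recurrence of corollary 14 and the formula of theorem 15 lend themselves to a direct computation. The following is a minimal sketch in exact rational arithmetic, assuming the transition probabilities $D$, $U$, $N$ of corollary 12; the function names `rank_distribution` and `preservation_probability` are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def D(q, n, r):
    """Probability that a random rank one update lowers the rank from r.
    Written as (q^r - 1) / q^(2n - r + 1), which equals
    q^(r-1) (q^r - 1) / q^(2n) and is 0 for r = 0."""
    return Fraction(q**r - 1, q**(2 * n - r + 1))


def U(q, n, r):
    """Probability that a random rank one update raises the rank from r."""
    return Fraction(q**(n - r) - 1, q**(n - r)) ** 2


def N(q, n, r):
    """Probability that a random rank one update leaves the rank at r."""
    return 1 - D(q, n, r) - U(q, n, r)


def rank_distribution(q, n, t):
    """[Q_{q,n,t}(0), ..., Q_{q,n,t}(n)]: rank distribution of a sum of t
    uniformly random rank one updates u_i v_i^T over a field of size q."""
    Q = [Fraction(1)] + [Fraction(0)] * n          # Q_0: the zero matrix
    for _ in range(t):
        new = []
        for r in range(n + 1):
            up = Q[r - 1] * U(q, n, r - 1) if r >= 1 else 0
            stay = Q[r] * N(q, n, r)
            down = Q[r + 1] * D(q, n, r + 1) if r + 1 <= n else 0
            new.append(up + stay + down)
        Q = new
    return Q


def preservation_probability(q, d, b, s):
    """1 - Q_{q^d, b, s}(0): the probability of theorem 15 for a direct sum
    of s copies of C_f, with f irreducible of degree d over GF(q)."""
    return 1 - rank_distribution(q**d, b, s)[0]
```

With these helpers, a product such as `preservation_probability(7, 2, b, 1) * preservation_probability(7, 1, b, 3)` evaluates an expression of the form $(1 - Q_{7^2,b,1}(0))(1 - Q_{7,b,3}(0))$ for a chosen block size `b`. Exact fractions keep the sketch simple, at the cost of large integers when $q^{db}$ is big.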
Theorem 15 generalizes theorem 5. That is, $P_{q,b}(C_f) = 1 - Q_{q^d,b,1}(0) = (1 - 1/q^{db})^2$, where $f \in \mathbb{F}_q[x]$ is an irreducible polynomial of degree $d$. Theorem 15 makes clear that $P_{q,b}\bigl(\bigoplus_{i=1}^{s} C_f\bigr)$ is minimized when there is a single block, $s = 1$. The following theorem summarizes the exact computation of the probability that the minimal polynomial of a matrix is preserved under projection, in terms of the elementary divisor structure of the matrix.

Theorem 16. Let $A \in \mathbb{F}_q^{n \times n}$ be similar to $J = \bigoplus_i \bigoplus_j J_{f_i^{e_{i,j}}}$, where the $f_i$ are distinct irreducibles of degree $d_i$, and the $e_{i,j}$ are positive exponents, nonincreasing with respect to $j$. Let $s_i$ be the number of $e_{i,j}$ equal to $e_{i,1}$. Then,

$$P_{q,b}(A) = P_{q,b}(J) = \prod_i P_{q,b}\Bigl(\bigoplus_j J_{f_i^{e_{i,j}}}\Bigr) = \prod_i P_{q,b}\Bigl(\bigoplus_{k=1}^{s_i} C_{f_i}\Bigr) = \prod_i \bigl(1 - Q_{q^{d_i},b,s_i}(0)\bigr).$$

Proof. By lemma 1, $P_{q,b}(A) = P_{q,b}(J)$. By theorem 4, $P_{q,b}(J) = \prod_i P_{q,b}\bigl(\bigoplus_j J_{f_i^{e_{i,j}}}\bigr)$. By lemma 6, $P_{q,b}\bigl(\bigoplus_j J_{f_i^{e_{i,j}}}\bigr) = P_{q,b}\bigl(\bigoplus_{k=1}^{s_i} C_{f_i}\bigr)$. Finally, by theorem 15, $P_{q,b}\bigl(\bigoplus_{k=1}^{s_i} C_{f_i}\bigr) = 1 - Q_{q^{d_i},b,s_i}(0)$. Therefore, $P_{q,b}(A) = \prod_i \bigl(1 - Q_{q^{d_i},b,s_i}(0)\bigr)$.

3.3 Examples

This section uses theorem 16 to compute $P_{q,b}(A)$ for several example matrices, and compares the probability for matrices with related but not identical invariant factor lists. The examples are five matrices $A_1, A_2, A_3, A_4, A_5$ (matrix entries not recoverable), where $A_i \in \mathbb{F}_7^{5 \times 5}$.

Let $f(x)$ and $g(x)$ be the irreducible polynomials $(x^2 + 3x + 6)$ and $(x + 4)$ in $\mathbb{F}_7[x]$. Let $F(A)$ denote the list of invariant factors of $A$ ordered largest to smallest. Thus,

$F(A_1) = \{f(x)g(x),\ g(x),\ g(x)\}$,
$F(A_2) = \{f(x)^2 g(x)\}$,
$F(A_3) = \{f(x)g(x),\ f(x)\}$,
$F(A_4) = \{f(x)g(x)^2,\ g(x)\}$,
$F(A_5) = \{(x + 2)(x + 3)(x + 4)(x + 5)(x + 6)\}$.
By theorem 16,

$$\begin{aligned}
P_{7,b}(A_1) &= P_{7,b}(C_f)\,P_{7,b}(C_g \oplus C_g \oplus C_g) = (1 - Q_{7^2,b,1}(0))(1 - Q_{7,b,3}(0)), \\
P_{7,b}(A_2) &= P_{7,b}(J_{f^2})\,P_{7,b}(C_g) = (1 - Q_{7^2,b,1}(0))(1 - Q_{7,b,1}(0)), \\
P_{7,b}(A_3) &= P_{7,b}(C_f \oplus C_f)\,P_{7,b}(C_g) = (1 - Q_{7^2,b,2}(0))(1 - Q_{7,b,1}(0)), \\
P_{7,b}(A_4) &= P_{7,b}(C_f)\,P_{7,b}(J_{g^2} \oplus C_g) = (1 - Q_{7^2,b,1}(0))(1 - Q_{7,b,1}(0)), \\
P_{7,b}(A_5) &= \prod_{i=1}^{5} P_{7,b}(C_{x+i+1}) = \prod_{i=1}^{5} (1 - Q_{7,b,1}(0)).
\end{aligned}$$

Table 1: $P_{7,b}(A_i)$ vs $b$, with rows $P_{7,b}(A_1), \ldots, P_{7,b}(A_5)$ and columns $b = 1, 2, 3, 4$ (table entries not recoverable).

By part 3 of theorem 5, $(1 - Q_{7^2,b,1}(0)) = (1 - 1/7^{2b})^2$ and $(1 - Q_{7,b,1}(0)) = (1 - 1/7^b)^2$. Using the recurrence relation in corollary 14, we may compute $Q_{7,b,3}(0)$ and $Q_{7^2,b,2}(0)$. Table 1 shows the resulting probabilities. Observe that $P_{7,b}(A_i)$ increases as $b$ increases.

These five examples illustrate the effect of varying matrix structure and block size on $P_{q,b}(A_i)$. By theorem 15, $P_{7,b}(C_g \oplus C_g \oplus C_g) > P_{7,b}(C_g)$ and $P_{7,b}(C_f \oplus C_f) > P_{7,b}(C_f)$. By theorem 16, $P_{7,b}(J_{f^2}) = P_{7,b}(C_f)$ and $P_{7,b}(J_{g^2} \oplus C_g) = P_{7,b}(C_g)$. Therefore, $P_{7,b}(A_1) > P_{7,b}(A_2)$ and similarly $P_{7,b}(A_3) > P_{7,b}(A_2) = P_{7,b}(A_4)$. Finally, since $(1 - 1/7^b)^2 < 1$ and $(1 - 1/7^b)^2 < (1 - 1/7^{2b})^2$, $P_{7,b}(C_{h_1} \oplus C_{h_2}) < P_{7,b}(C_g)$ and $P_{7,b}(C_h) < P_{7,b}(C_f)$, for any distinct linear $h_1(x), h_2(x)$ and any linear $h(x)$ in $\mathbb{F}_7[x]$. Therefore, $P_{7,b}(A_5)$ has the minimal probability amongst the examples and in fact has the minimal probability for any $5 \times 5$ matrix. The worst case bound is explored further in the following section.

4 Probability Bounds: Matrix of Unknown Structure

Given the probabilities determined in section 3 of minimal polynomial preservation under projection, it is intuitively clear that the lowest probability of success would occur when there are many elementary divisors and the degrees of the irreducibles are as small as possible. This is true and is precisely stated in theorem 20 below. First we need several lemmas concerning direct sums of Jordan blocks.
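For $A \in \mathbb{F}_q^{n \times n}$, as before, $P_{q,b}(A)$ denotes the probability that $\mathrm{minpoly}(A) = \mathrm{minpoly}(U\bar{A}V)$, where $U \in \mathbb{F}_q^{b \times n}$ and $V \in \mathbb{F}_q^{n \times b}$ are uniformly random.

Lemma 17. Let $f$ be an irreducible polynomial over $\mathbb{F}_q$, let $e_1 = \cdots = e_s > e_{s+1} \geq \cdots \geq e_t$ be a sequence of exponents for $f$, and let $b$ be the projection block size. Then

$$P_{q,b}(J_{f^{e_1 + \cdots + e_t}}) \leq P_{q,b}(J_{f^{e_1}} \oplus \cdots \oplus J_{f^{e_t}}) = P_{q,b}(J_{f^{e_1}} \oplus \cdots \oplus J_{f^{e_s}}).$$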
Proof. This follows from part 3 of theorem 5, and theorems 15 and 16, since $P_{q,b}(J_{f^{e_1 + \cdots + e_t}}) = 1 - Q_1(0) \leq 1 - Q_s(0) = P_{q,b}(J_{f^{e_1}} \oplus \cdots \oplus J_{f^{e_t}})$.

Lemma 18. Let $f$ be an irreducible polynomial over $\mathbb{F}_q$ of degree $d$, let $f_1, \ldots, f_e$ be distinct irreducible polynomials of degree $d$ over $\mathbb{F}_q$, and let $b$ be the projection block size. Then $P_{q,b}(J_{f_1} \oplus \cdots \oplus J_{f_e}) \leq P_{q,b}(J_{f^e})$.

Proof. This follows from theorem 4 and part 3 of theorem 5, since $P_{q,b}(J_{f_1} \oplus \cdots \oplus J_{f_e}) = \prod_{i=1}^{e} P_{q,b}(J_{f_i})$ and $P_{q,b}(J_{f^e}) = P_{q,b}(J_f) = (1 - 1/q^{db})^2 < 1$.

Lemma 19. Let $f_1$ and $f_2$ be irreducible polynomials over $\mathbb{F}_q$ of degree $d_1$ and $d_2$ respectively and let $b$ be any projection block size. Then, if $d_1 \leq d_2$, $P_{q,b}(J_{f_1}) \leq P_{q,b}(J_{f_2})$.

Proof. This follows again from part 3 of theorem 5 since $(1 - 1/q^{d_1 b})^2 \leq (1 - 1/q^{d_2 b})^2$.

Recall the definition: $P_{q,b}(n) = \min(\{P_{q,b}(A) \mid A \in \mathbb{F}_q^{n \times n}\})$. This is the worst case probability that an $n \times n$ matrix has its minimal polynomial preserved by uniformly random projection to a $b \times b$ sequence. In view of the above lemmata, for the lowest probability of success we must look to matrices with the maximal number of elementary divisors. Define $L_q(m)$ to be the number of monic irreducible polynomials of degree $m$ in $\mathbb{F}_q[x]$. By the well known formula of Gauss (1981),

$$L_q(m) = \frac{1}{m} \sum_{d \mid m} \mu(m/d)\, q^d,$$

where $\mu$ is the Möbius function. Asymptotically $L_q(m)$ converges to $q^m/m$. By definition, $\mu(a) = (-1)^k$ for square free $a$ with $k$ distinct prime factors and $\mu(a) = 0$ otherwise. The degree of the product of all the monic irreducible polynomials of degree $d$ is then $d L_q(d)$. When we want to have a maximal number of irreducible factors in a product of degree $n$, we will use $L_q(1), L_q(2), \ldots, L_q(m-1)$ etc., until the contribution of $L_q(m)$ no longer fits within the degree $n$. In that case we finish with as many of the degree $m$ irreducibles as will fit. For this purpose we adopt the notation

$$L_q(n, m) := \min\left(L_q(m), \left\lfloor \frac{r}{m} \right\rfloor\right), \quad\text{for } r = n - \sum_{d=1}^{m-1} d L_q(d).$$

Theorem 20. Let $\mathbb{F} = \mathbb{F}_q$ be the field of cardinality $q$.
For the $m$ such that $\sum_{d=1}^{m-1} d L_q(d) \leq n < \sum_{d=1}^{m} d L_q(d)$,

$$P_{q,b}(n) = \prod_{d=1}^{m} (1 - 1/q^{db})^{2L_q(n,d)}.$$

Let $r = n - \sum_{d=1}^{m} d L_q(n, d)$. When $r \equiv 0 \pmod{m}$, the minimum occurs for those matrices whose elementary divisors are irreducible (not powers thereof), distinct, and with degree as small as possible. When $r \not\equiv 0 \pmod{m}$ the minimum occurs when the elementary divisors involve exactly the same irreducibles as in the $r \equiv 0 \pmod{m}$ case, but with some elementary divisors being powers so that the total degree is brought to $n$.
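The worst case formula of theorem 20 can be evaluated greedily: use all irreducibles of degree 1, then degree 2, and so on, until the remaining degree no longer accommodates every irreducible of the current degree. The following is a minimal sketch under that reading of the theorem; `mobius`, `num_irreducibles`, and `worst_case_probability` are hypothetical names, and exact fractions make it practical only for small $n$.

```python
from fractions import Fraction


def mobius(a):
    """Moebius function: (-1)^k for square free a with k distinct prime
    factors, 0 otherwise."""
    result, p = 1, 2
    while p * p <= a:
        if a % p == 0:
            a //= p
            if a % p == 0:
                return 0          # p^2 divides the original argument
            result = -result
        p += 1
    if a > 1:
        result = -result          # one remaining prime factor
    return result


def num_irreducibles(q, m):
    """L_q(m): number of monic irreducible polynomials of degree m over
    GF(q), by Gauss's formula."""
    total = sum(mobius(m // d) * q**d for d in range(1, m + 1) if m % d == 0)
    return total // m


def worst_case_probability(q, b, n):
    """P_{q,b}(n) as the product over d of (1 - 1/q^(db))^(2 L_q(n, d))."""
    prob, remaining, d = Fraction(1), n, 1
    while True:
        available = num_irreducibles(q, d)
        used = min(available, remaining // d)      # L_q(n, d)
        prob *= (1 - Fraction(1, q**(d * b))) ** (2 * used)
        remaining -= d * used
        if used < available:                       # degree d is the last, m
            return prob
        d += 1
```

For $q \geq n$ only linear factors are used, so the result reduces to $(1 - 1/q^b)^{2n}$. For large $n$ a floating point variant working with logarithms would be the more practical choice.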
Proof. Let $A \in \mathbb{F}_q^{n \times n}$ and let $f_1^{e_1}, \ldots, f_t^{e_t}$ be irreducible powers equal to the elementary divisors of $A$. If $P_{q,b}(A)$ is minimal, then by lemmas 17, 18, 19 we can assume that the $f_i$ are distinct and have as small degrees as possible. Since $\sum_{d=1}^{m-1} d L_q(d) \leq n < \sum_{d=1}^{m} d L_q(d)$, this assumption implies that all irreducibles of degree less than $m$ have been exhausted. If additional polynomials of degree $m$ can be added to obtain an $n \times n$ matrix, this will lead to the minimal probability since adding any irreducibles of higher degree will, by theorem 5, reduce the total probability by a lesser amount. In this case all of the exponents $e_i$ will be equal to one. If $r$ is not 0, then an $n \times n$ matrix can be obtained by increasing some of the exponents, $e_i$, without changing the probability. This, again by theorem 5, will lead to a smaller probability than those obtained by removing smaller degree polynomials and adding a polynomial of degree $m$ or higher.

4.1 Approximations

Theorem 20 can be simplified using the approximations $L_q(m) \approx q^m/m$ and $(1 - 1/a)^a \approx 1/e$.

Corollary 21. For field cardinality $q$, matrix dimension $n$, and projection block dimension $b$,

$$P_{q,b}(n) \approx e^{-2H_m/q^{b-1}},$$

where $H_m$ is the $m$th harmonic number.

Also, for large primes, the formula of theorem 20 simplifies quite a bit because there are plenty of small degree irreducibles. In the next corollary we consider (a) the case in which there are $n$ linear irreducibles and (b) a situation in which the worst case probability will be defined by linear and quadratic irreducibles.

Corollary 22. For field cardinality $q$, matrix dimension $n$, and projection block dimension $b$, if $q \geq n$ then

$$P_{q,b}(n) = (1 - 1/q^b)^{2n} \approx e^{-2n/q^b}.$$

If $n > q \geq n^{1/2}$ then

$$P_{q,b}(n) = (1 - 1/q^b)^{2q}(1 - 1/q^{2b})^{n-q} \approx e^{-(2/q^{b-1} + (n-q)/q^{2b})}.$$

4.2 Example Bound Calculations and Comparison to Previous Bounds

When $b = 1$ and we are only concerned with projection on one side, the first formula of corollary 22 simplifies to $(1 - 1/q)^n = (1 - n/q + \ldots)$.
The bound given by Kaltofen and Pan (Kaltofen and Pan, 1991; Kaltofen and Saunders, 1991) for the probability of minpoly($u\bar{A}v$) = minpoly($\bar{A}v$) is the first two terms of this expansion, though developed with a very different proof. For small primes, Wiedemann (1986) (proposition 3) treats the case $b = 1$, and he fixes the projection on one side because he is interested in linear system solving and thus in the sequence $\bar{A}b$, for fixed $b$. For small $q$, his formula, $1/(6 \log_q(n))$, computed with some approximation, is nonetheless quite close to our exact formula. However, as $q$ approaches $n$ the discrepancy with our exact formula increases. At the large/small crossover, $q = n$,
Kaltofen/Pan's lower bound is 0, Wiedemann's is 1/6, and ours is $1/e$. The Kaltofen/Pan probability bound improves as $q$ grows larger from $n$. The Wiedemann bound becomes more accurate as $q$ goes down from $n$. But the area $q \approx n$ is of some practical importance. In integer matrix algorithms where the finite field used is a choice of the algorithm, sometimes practical considerations of efficient field arithmetic encourage the use of primes in the vicinity of $n$. For instance, exact arithmetic in double precision and using BLAS (Dumas et al., 2008) works well with $q$ […]. Sparse matrices of order $n$ in that range are tractable. Our bound may help justify the use of such primes.

Figure 1: Probability of Failure to Preserve Minimal Polynomial ($1 - P_{q,b}(10^8)$) vs Block Size and Field Cardinality

But the primary value we see in our analysis here is the understanding it gives of the value of blocking, $b > 1$. Figure 1 shows the bounds for the worst case probability that a random projection will preserve the minimal polynomial of a matrix $A$ over $F_q$ for various fields and projection block sizes. It shows that the probability of finding the minimal polynomial correctly under projection converges rapidly to 1 as the projected block size increases.

5 Conclusion

We have drawn a precise connection between the elementary divisors of a matrix and the probability that a random projection, as done in the (blocked or unblocked) Wiedemann algorithms, preserves the minimal polynomial. We provide sharp formulas both for the case where the elementary divisor structure of the matrix is known (theorem 4 and theorem 16)
and for the worst case (theorem 20). As indicated in figure 1 for the worst case, a blocking size of 22 assures probability of success greater than $1 - 10^{-6}$ for all finite fields and all matrix dimensions up to $10^8$. The probability decreases very slowly as matrix dimension grows and, in fact, further probability computations show that the one in a million bound on failure applies to blocking size 22 with much larger matrix dimensions as well.

Looking forward, it would be worthwhile to extend the analysis to apply to the determination of additional invariant factors. Blocking is known to be useful for finding and exploiting them. For example, some rank and Frobenius form algorithms are based on block Wiedemann (Eberly, 2000a,b). Also, we have not addressed preconditioners. Preconditioners such as diagonal, Toeplitz, and butterfly (Chen et al., 2002) either apply only for large fields or have only large field analyses. One can generally use an extension field to get the requisite cardinality, but the computational cost is high. Block algorithms hold much promise here, and analysis to support them over small fields will be valuable.

References

Chen, L., Eberly, W., Kaltofen, E., Turner, W., Saunders, B. D., Villard, G., 2002. Efficient matrix preconditioners for black box linear algebra. LAA.

Coppersmith, D., 1995. Solving homogeneous linear equations over GF(2) via block Wiedemann algorithm. Mathematics of Computation 62 (205).

Dumas, J.-G., Giorgi, P., Pernet, C., 2008. Dense linear algebra over word-size prime fields: the FFLAS and FFPACK packages. ACM Trans. Math. Softw. 35 (3).

Eberly, W., 2000a. Asymptotically efficient algorithms for the Frobenius form. Technical report, Department of Computer Science, University of Calgary.

Eberly, W., 2000b. Black box Frobenius decompositions over small fields. In: Proc. of ISSAC '00. ACM Press, pp.

Eberly, W., Giesbrecht, M., Giorgi, P., Storjohann, A., Villard, G., 2006. Solving sparse rational linear systems. In: Proc. of ISSAC '06. ACM Press, pp.

Gauss, C. F., Untersuchungen über höhere Arithmetik, second edition, reprinted. Chelsea.
Giorgi, P., Jeannerod, C.-P., Villard, G., On the complexity of polynomial matrix computations. In: Proc. of ISSAC '03. pp.

Kaltofen, E., 1995. Analysis of Coppersmith's block Wiedemann algorithm for the parallel solution of sparse linear systems. Mathematics of Computation 64 (210).

Kaltofen, E., Pan, V., 1991. Processor efficient parallel solution of linear systems over an abstract field. In: Third Annual ACM Symposium on Parallel Algorithms and Architectures. ACM Press, pp.

Kaltofen, E., Saunders, B. D., 1991. On Wiedemann's method of solving sparse linear systems. In: Proc. AAECC-9. Vol. 539 of Lect. Notes Comput. Sci. Springer Verlag, pp.
Kaltofen, E., Yuhasz, G., Oct. On the matrix Berlekamp-Massey algorithm. ACM Trans. Algorithms 9 (4), 33:1-33:24.

Meyer, C. D. (Ed.), Matrix analysis and applied linear algebra. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA.

Robinson, D. W., The generalized Jordan canonical form. The American Mathematical Monthly 77 (4).

Taussky, O., Zassenhaus, H., On the similarity transformation between a matrix and its transpose. Pacific J. Math. 9 (3).

Villard, G., 1997. Further analysis of Coppersmith's block Wiedemann algorithm for the solution of sparse linear systems. In: International Symposium on Symbolic and Algebraic Computation. ACM Press, pp.

Villard, G., 1999. Block solution of sparse linear systems over GF(q): the singular case. SIGSAM Bulletin 32 (4).

Wiedemann, D., 1986. Solving sparse linear equations over finite fields. IEEE Trans. Inform. Theory 32.