UNCONDITIONAL CLASS GROUP TABULATION TO 2 40 Anton Mosunov (University of Waterloo) Michael J. Jacobson, Jr. (University of Calgary) June 11th, 2015
AGENDA Background Motivation Previous work Class number tabulation Out-of-core multiplication Class group tabulation Performance Future work
BACKGROUND Consider a binary quadratic form of discriminant = b 2 4ac < 0, Denote it by. ax 2 + bxy + cy 2 The substitution, yields another (a 0,b 0,c 0 ) (a, b, c) a, b, c, x, y 2 Z. x = x 0 + y 0 y = x 0 + y 0 form. If =1, then the backward substitution exists. In that case, we call these two forms equivalent. we can define an operation of composition under which the set of all equivalence classes forms a finite abelian group, i.e. (a, b, c) (a 0,b 0,c 0 )=(a 00,b 00,c 00 )
GOAL This finite abelian group for a fixed discriminant is called the class group, and is denoted by Cl( ). The cardinality of Cl( ) is called the class number, and is denoted by h( ). Goal. Tabulate class groups for every fundamental discriminant < 2 40.
MOTIVATION Not much known about class groups. It is hard to compute them, in a sense that there is no polynomial time algorithm for the class group computation. Want to provide an extensive computational evidence in support of the Cohen-Lenstra heuristics and the Littlewood s bounds. Certain cryptosystems, such as the Buchmann-Williams key exchange protocol, rely on them. Want to have enough evidence that they hold.
PREVIOUS WORK In late 90s, Buell tabulated to 2.2 10 9, using algorithm for enumerating reduced forms. After computing all h( ), he produced Cl( ) by resolving structures of each p-group. In 2006, Ramachandran tabulated to using Buchmann- Jacobson-Teske algorithm. The algorithm computes right away. However, it is conditional, and requires verification. We follow Buell s approach. In order to compute all h( ), we use the algorithm due to Hart, Tornaria and Watkins, who used an outof-core polynomial multiplication technique to tabulate all congruent 10 12 numbers to. 2 10 11 Cl( )
CLASS NUMBER TABULATION Why do class numbers help us to resolve the group structure faster? Consider the factorization of h( ): h( )=p e 1 1 pe 2 2... pe k k If e i =1, then the -group is cyclic, so we can ignore it. p i Up to 2 40, 85.13% of h( ) have non-square factors. p For more than 57% of them this factor exceeds h( ).
CLASS NUMBER TABULATION Let r(q) = For the Hurwitz class number relations hold: (a) (b) (c) 1X n=0 # 3 (q) =1+2 1X n=0 1X n=0 q n(n+1) 2 =1+q + q 3 + q 6 + q 10 +... 1X n=0 q n2 =1+2q +2q 4 +2q 9 +2q 16 +... H( ) H( 16n 8)q n = r 2 (q 2 )# 3 (q) H( 16n 4)q n = 1 2 r(q2 )# 2 3(q) 1X H( 8n 3)q n = 1 3 r3 (q) n=0 the following
CLASS NUMBER TABULATION Compute (a) and (b) to 2 36, and (c) to 2 37. This allows us to produce all, except 1(mod 8). For the Hurwitz class number H( ) the following relations hold: (a) (b) (c) 1X H( 16n 8)q n = r 2 (q 2 )# 3 (q) 1X H( 16n 4)q n = 1 2 r(q2 )# 2 3(q) n=0 n=0 1X H( 8n 3)q n = 1 3 r3 (q) n=0
OUT-OF-CORE MULTIPLICATION We want to compute the product of two polynomials, h(x) =f(x) g(x) each of length 2 36. Each coefficient is of size 4 bytes, so in total we require at least 3 4 2 36 bytes = 768 Gb of memory. Need to store intermediate results on hard disk. Need multithreaded environment. For our computations, we utilize an out-of-core Fast Fourier Transform with Chinese Remainder Theorem (non-trivial).
CLASS GROUP TABULATION We used the Buchmann-Jacobson-Teske algorithm to compute the structure of a group. The algorithm requires h( ), or the lower bound h : h apple h( ) apple 2h For 1(mod 8), we computed with the Bach s averaging method (conditional). To unconditionally verify our results, we used the Eichler-Selberg trace formula (previously used by Ramachandran). h
PERFORMANCE We were using Westgrid s Hungabee supercomputer. For each multiplication, we requested 64 Intel Xeon 2.67GHz processors with 8Gb of memory per core. (a) to 2 36 terminated in 8 h 48 min (859 Gb) (b) to 2 36 terminated in 11 h 13 min (893.4 Gb) (c) to 2 37 terminated in 25 h 35 min (1855 Gb)
PERFORMANCE For class group tabulation, we requested 1008 processors for and 64 processors for : 1(mod 8) 6 1(mod 8) 6 1(mod 8) 1(mod 8) CPU time Real time # of processors 265d 4h 31m 4d 3h 27m 64 1657h 22h 12m 39h 29m 1008 got computed over 6.25 times faster; 6 1(mod 8) With class number tabulation and verification, 6 1(mod 8) got computed over 4.72 times faster.
SOME EXOTIC GROUPS DISCOVERED 3-rank = 4: = 1074734433547,Cl ( ) = C(3 3 ) C(3 3 ) C(3) C(3) 4-rank = 5: = 940830993327,Cl( ) = C(2 4 3 3 ) C(2 4 ) C(2 2 ) C(2 2 ) C(2 2 ) Doubly non-cyclic (19-rank = 2, 29-rank = 2): = 915336787039,Cl( ) = C(19 29) C(19 29) Trebly non-cyclic (5-rank = 2, 7-rank = 2, 17-rank = 2): = 658234953151,Cl ( ) = C(5 7 17) C(5 7 17) Quadruply non-cyclic (4-rank = 2, 3-rank = 2, 5-rank = 2, 13-rank = 2) = 616825494863,Cl( ) = C(4 3 5 13) C(4 3 5 13)
CONCLUSION Ramachandran s approach allows to compute class groups right away. However, for 64 processors it would take at least 4 months (vs. 1 month). Moreover, the result is dependent on Extended Riemann Hypothesis, and requires an additional verification step. Out-of-core multiplication approach is unconditional, and with 64 processors allows to produce 2/3 of all class numbers to 2 40 in less than 2 days! We gathered an unconditional numerical evidence in support of Littlewood s bounds and the Cohen-Lenstra heuristics.
FUTURE WORK Find a better way of tabulating class numbers for discriminants 1(mod 8) unconditionally. We believe that Sutherland s p-group resolution algorithms can also speed up our computations. The tabulation of class groups with positive discriminant is currently work in progress.
SOURCES The data is available at lmfdb.org. The source code is available at github.com/amosunov. The paper is soon to appear in Mathematics of Computation. Also available at arxiv.org:1502.07953.
THANK YOU VERY MUCH FOR YOUR ATTENTION