Constructive Algebra in Functional Programming and Type Theory

Master of Science Thesis in the Programme Computer Science - Algorithms, Languages and Logic

Anders Mörtberg, May 2010
Examiner: Prof. Thierry Coquand
Chalmers University of Technology
University of Gothenburg
Department of Computer Science and Engineering
SE-412 96 Göteborg, Sweden
Telephone +46 (0)31-772 1000

Cover: An exact sequence defining that the module M is finitely presented. This is related to the notion of coherent rings presented in chapter 3.

Abstract

This thesis considers abstract algebra from a constructive point of view. The central concept of study is coherent rings: algebraic structures in which it is possible to solve homogeneous systems of linear equations. Three different algebraic theories are considered: Bézout domains, Prüfer domains and polynomial rings. The first two of these are non-noetherian analogues of classical notions. The polynomial rings are presented from a constructive point of view with a treatment of Gröbner bases. The goal of the thesis is to study the proofs that these theories are coherent and to explore how the proofs can be implemented in functional programming and type theory.

Acknowledgments

First of all I would like to thank Thierry Coquand for all his help and support during the work on this thesis. I would also like to thank Bassel Mannaa for interesting discussions and help with the implementation. The comments presented during the opposition were also very helpful. Finally I would like to thank everyone who has read and given constructive and helpful comments on this thesis.

Contents

1 Introduction
  1.1 Background
  1.2 Method
  1.3 Previous work
  1.4 Outline
2 Introduction to ring theory
  2.1 Rings
  2.2 Ideals
  2.3 Discrete and strongly discrete rings
  2.4 Noetherian rings and Dedekind domains
3 Coherent rings
  3.1 Definition and properties
  3.2 Coherence and strongly discrete rings
4 Bézout domains
  4.1 Definition
  4.2 Euclidean domains
  4.3 Coherence of Bézout domains
  4.4 Bézout domains and strong discreteness
  4.5 GCD domains and fields of fractions
5 Prüfer domains
  5.1 Definition
  5.2 Principal localization matrices
  5.3 Invertible ideals and coherence of Prüfer domains
  5.4 Ideal arithmetic
  5.5 Examples of Prüfer domains
  5.6 Prüfer domains and strong discreteness
6 Polynomial rings
  6.1 Monomials and monomial orderings
  6.2 Polynomial rings
  6.3 Properties of ideals in $k[x_1, \dots, x_n]$
  6.4 Gröbner bases
  6.5 Coherence of $k[x_1, \dots, x_n]$
  6.6 Strong discreteness of $k[x_1, \dots, x_n]$

7 Conclusions
  7.1 Implementation
  7.2 Discussion
  7.3 Further work

Chapter 1
Introduction

1.1 Background

"It is important to keep in mind that constructive algebra is algebra; in fact it is a generalization of algebra in that we do not assume the law of excluded middle." [16]

Why is it that elementary algebra is so full of algorithms while advanced algebra is so full of nonconstructive arguments? Elementary algebra has factorization of polynomials, equation solving and matrix inversion. Advanced algebra on the other hand has notions such as arbitrary ideals of rings, prime and maximal ideals and Noetherian assumptions on rings [4]. For example, the existence of both maximal and prime ideals is usually proved using Zorn's lemma. Zorn's lemma relies on the axiom of choice, which in turn implies the law of excluded middle [19].

Modern abstract algebra began with the introduction of algebraic structures at the end of the 19th century. In the first half of the 20th century nonconstructive methods dominated. In 1967 Errett Bishop published a book called Foundations of Constructive Analysis which aimed to show that analysis could be approached constructively. This, together with increasingly powerful computers, led to a renaissance of constructive mathematics [16].

One of the main reasons to study constructive algebra is that it can give rise to new algorithms and ways to explore algebra using computers. The notion of computation is at the core of constructive mathematics. A constructive proof of the existence of a mathematical object gives a way to construct the object, while a nonconstructive proof just proves the existence of such an object without necessarily giving a way to construct it. For example, a constructive proof that a polynomial can be factorized as a product of irreducible polynomials must provide the factorization, while a nonconstructive proof only has to prove the existence of such a factorization without giving any witness of it.
Another reason to study constructive algebra is that it makes it possible to represent advanced algebra in type theory and thus to verify the correctness of mathematical proofs using computers. The reasons for this are the proofs-as-programs correspondence and the Brouwer-Heyting-Kolmogorov interpretation of intuitionistic logic, which together give a way of representing mathematical propositions as types and proofs as programs. Note that it is also possible to verify classical mathematics using computers, but the point is that in constructive mathematics the proofs correspond to algorithms.

This thesis explores the question of how advanced algebra can be made constructive by considering classical structures where the assumption of Noetherianity has been dropped. This will be defined and discussed further in the introduction to ring theory in chapter 2. In linear algebra one of the main questions is how to solve homogeneous systems of linear equations, but there the central notion is that of vector spaces, which relies on the assumption that all nonzero elements have a multiplicative inverse. One main aim of this thesis is to look at what happens if this assumption is dropped. The proofs of the results should be constructive and be implemented in a functional programming language and eventually also verified in a constructive proof system.

1.2 Method

The results presented in this thesis have been implemented in the pure lazy functional programming language Haskell. The reason for using Haskell is that it has a powerful type system; the features that make it suitable for implementing algebraic theories are mainly polymorphism and the type class system. In order to specify the axioms of algebraic structures the automated testing tool QuickCheck [2] is used, since the axioms are natural to represent as QuickCheck properties and implementations of specific instances of the algebraic structures can easily be tested.

The final goal of the thesis is to represent the work in type theory, as an implementation in a logical proof system based on intuitionistic type theory, e.g. Agda¹ or Coq². Due to time limitations this has not been done yet.

1.3 Previous work

There has been some previous work on computational algebra systems in Haskell. The HaskellForMaths³ project by David Amos implements many important algorithms from combinatorics, group theory and commutative algebra.
This project does not have a representation of algebraic structures and instead it uses the standard Haskell type classes. It also has an implementation of multivariate polynomials and the Buchberger algorithm. A project that focuses on representing algebraic structures in Haskell is the numeric-prelude project⁴. This library contains many different structures like groups, rings, fields, modules, vector spaces, etc.

In type theory there are many examples of libraries for constructive algebra. The main interest of this project has been in implementations in Agda and Coq. The standard library of Agda contains representations of some basic algebraic structures, but as far as I know there have been no larger projects in constructive algebra developed in Agda. The situation in Coq is quite different.

¹ http://wiki.portal.chalmers.se/agda/
² http://www.lix.polytechnique.fr/coq/
³ http://hackage.haskell.org/package/haskellformaths
⁴ http://www.haskell.org/haskellwiki/numeric_prelude

In [12] a framework for representing algebraic structures in Coq is presented. This is done as part of a project to give a formalized proof of the fundamental theorem of algebra. This is a part of the Constructive Coq Repository at Nijmegen⁵, which is a large library containing formalized mathematics focusing mostly on constructive real numbers. Another implementation of algebraic structures in Coq is the Mathematical Components project⁶. This is based on the ssreflect extension to Coq which was used in the formal proof of the Four-Color Theorem by Georges Gonthier [14]. In [11] possible ways to represent algebraic structures as part of this project are discussed together with problems related to the complexity of the representation.

Both of the references on implementations of algebraic theories in Coq have many further references to other work on representing constructive mathematics in type theory, but none of them implements either Bézout domains or Prüfer domains. Polynomial rings, on the other hand, have been represented in Coq together with a verified implementation of the Buchberger algorithm for computing Gröbner bases [17].

All of the major computer algebra systems like Maple, Mathematica and Matlab implement algorithms for solving systems of linear equations and computing Gröbner bases. These systems are based on less general algorithms than the algorithms presented in this thesis. Instead they focus on more specialized algorithms in order to be able to do as much optimization as possible.

The results on Bézout domains and Prüfer domains are based on the work presented in the PhD thesis of Maimouna Salou [18]. As part of it many of the proofs have been represented in the Axiom computer algebra system.

A project that studies generalized linear algebra is the homalg project⁷. It aims to translate as much homological algebra as possible into computer programs. This project is implemented using object-oriented programming.
All of the web pages referred to in this section were visited in May 2010.

1.4 Outline

In chapter 2 ring theory is introduced for a reader without a background in abstract algebra. The most basic definitions that are necessary in order to read the thesis are introduced. This chapter can be read very briefly by a reader who is already familiar with ring theory. The discussions on implementation of the concepts in functional programming and type theory are probably interesting even if the reader already knows the subject.

Chapter 3 presents coherent rings from a constructive point of view. Traditionally these are considered in terms of module theory, but here they are described in terms of solving equations à la linear algebra.

The following two chapters discuss Bézout domains and Prüfer domains, which are constructive analogues of principal ideal domains and Dedekind domains. Just as principal ideal domains are a subset of Dedekind domains in classical mathematics, Bézout domains are a subset of Prüfer domains.

⁵ http://c-corn.cs.ru.nl/
⁶ http://www.msr-inria.inria.fr/projects/math-components
⁷ http://homalg.math.rwth-aachen.de/

The high point of these chapters is the proofs that these classes of rings are coherent and thus that it is possible to solve homogeneous systems of equations over them.

In chapter 6 polynomial rings are presented; these are rings of polynomials with coefficients from an underlying ring. The theory of Gröbner bases is presented from a constructive point of view together with the famous Buchberger algorithm used to compute them. At the end of the chapter there is a proof that these rings are also coherent.

Finally the results of the implementation are presented together with some examples and a discussion on limitations and further work.

Chapter 2
Introduction to ring theory

This chapter should serve as a short introduction to the concepts of ring theory that are necessary in order to understand the thesis. It does not claim to be a complete and thorough presentation of all concepts of basic ring theory. For a good introduction to general abstract algebra see [9] and for an introduction to some of the more advanced concepts see [1]. If the reader already has a good understanding of ring theory this chapter can be read briefly. Most important are the notes about how to define the concepts in functional programming and type theory.

2.1 Rings

The most fundamental concept of this thesis is the concept of rings. These can be defined compactly in terms of groups and monoids, but here the definition is a bit more verbose in order to give a summary of all the properties of rings in one place.

Definition 2.1. A ring is a set $R$ equipped with two binary operations called addition and multiplication, written $+, \cdot : R \times R \to R$ respectively. The axioms that the triple $(R, +, \cdot)$ must satisfy are:

1. Closure under addition: $\forall a, b \in R.\ a + b \in R$
2. Associativity of addition: $\forall a, b, c \in R.\ (a + b) + c = a + (b + c)$
3. Existence of additive identity: $\exists 0 \in R.\ \forall a \in R.\ 0 + a = a + 0 = a$
4. Existence of additive inverse: $\forall a \in R.\ \exists b \in R.\ a + b = b + a = 0$
5. Commutativity of addition: $\forall a, b \in R.\ a + b = b + a$
6. Closure under multiplication: $\forall a, b \in R.\ a \cdot b \in R$
7. Associativity of multiplication: $\forall a, b, c \in R.\ (a \cdot b) \cdot c = a \cdot (b \cdot c)$
8. Existence of multiplicative identity: $\exists 1 \in R.\ \forall a \in R.\ 1 \cdot a = a \cdot 1 = a$
9. Left distributivity of multiplication over addition: $\forall a, b, c \in R.\ a \cdot (b + c) = (a \cdot b) + (a \cdot c)$

10. Right distributivity of multiplication over addition: $\forall a, b, c \in R.\ (a + b) \cdot c = (a \cdot c) + (b \cdot c)$

First some conventions. Multiplication is often not written explicitly, so $a \cdot b$ is written $ab$. Subtraction is written $a - b$, which means $a + (-b)$ where $(-b)$ is the additive inverse of $b$. The set of nonzero elements of a ring $R$ is written $R \setminus \{0\}$.

Examples of rings include $(\mathbb{Z}, +, \cdot)$ where $+$ and $\cdot$ denote the ordinary addition and multiplication for the integers. Other examples are $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ with the ordinary definitions of addition and multiplication.

Note on implementation. In Haskell this can be represented as a type class:

    class Ring a where
      (<+>) :: a -> a -> a
      (<*>) :: a -> a -> a
      neg   :: a -> a
      zero  :: a
      one   :: a

The ring axioms can also be represented in Haskell. The axioms are represented as functions which should be used to test that an implementation satisfies the laws. For example the property that multiplication is left distributive over addition can be specified as:

    propleftdist :: (Ring a, Eq a) => a -> a -> a -> Bool
    propleftdist a b c = a <*> (b <+> c) == (a <*> b) <+> (a <*> c)

In type theory it would be possible to represent the axioms at the type level using dependent records. Then the structure would also contain the axioms and the user would have to prove that a structure satisfies them in order to construct an instance. This is better since the implementation would be proved correct and not just randomly tested.

Definition 2.2. A commutative ring is a ring $(R, +, \cdot)$ satisfying the axiom that multiplication is commutative:

$\forall a, b \in R.\ a \cdot b = b \cdot a$

Note on implementation. Commutative rings can be represented as an empty type class in Haskell since they do not introduce any new operations to the structure. But there is one more axiom that also has to be represented:

    class Ring a => CommutativeRing a

    propmulcomm :: (CommutativeRing a, Eq a) => a -> a -> Bool
    propmulcomm a b = a <*> b == b <*> a

This thesis will only consider commutative rings. All of the above examples of rings are commutative.
An example of a ring that is not commutative is the ring of $n \times n$ matrices, written $M_n(R)$, where $R$ is the ring of the elements. Here addition and multiplication are the standard operations on matrices and it is easy to construct an example to show that matrix multiplication is non-commutative.
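The non-commutativity of matrix multiplication can be illustrated with a minimal sketch built on the Ring class from the note on implementation. The type M2, the sample matrices x and y, and the camel-case property names are assumptions made for this illustration only and are not taken from the implementation described in the thesis. Since the operator <*> clashes with the one exported by the Prelude, the Prelude version is hidden here.

```haskell
import Prelude hiding ((<*>))

infixl 6 <+>
infixl 7 <*>

-- Ring operations as in the note on implementation.
class Ring a where
  (<+>) :: a -> a -> a
  (<*>) :: a -> a -> a
  neg   :: a -> a
  zero  :: a
  one   :: a

instance Ring Integer where
  (<+>) = (+)
  (<*>) = (*)
  neg   = negate
  zero  = 0
  one   = 1

-- Hypothetical type for this sketch: a 2x2 matrix stored row by row.
data M2 a = M2 a a a a deriving (Eq, Show)

instance Ring a => Ring (M2 a) where
  M2 a b c d <+> M2 e f g h = M2 (a <+> e) (b <+> f) (c <+> g) (d <+> h)
  M2 a b c d <*> M2 e f g h =
    M2 (a <*> e <+> b <*> g) (a <*> f <+> b <*> h)
       (c <*> e <+> d <*> g) (c <*> f <+> d <*> h)
  neg (M2 a b c d) = M2 (neg a) (neg b) (neg c) (neg d)
  zero = M2 zero zero zero zero
  one  = M2 one zero zero one

-- Left distributivity, stated as a Boolean property.
propLeftDist :: (Ring a, Eq a) => a -> a -> a -> Bool
propLeftDist a b c = a <*> (b <+> c) == (a <*> b) <+> (a <*> c)

-- Commutativity of multiplication; holds for Integer, fails for M2 Integer.
propMulComm :: (Ring a, Eq a) => a -> a -> Bool
propMulComm a b = a <*> b == b <*> a

-- Two sample matrices whose products differ depending on the order.
x, y :: M2 Integer
x = M2 0 1 0 0
y = M2 0 0 1 0
```

With these definitions propMulComm x y evaluates to False, while propLeftDist still holds for the same matrices, so the matrices form a ring that is not commutative.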

The previous examples have all been infinite, but there are also many finite rings. A fundamental class of finite rings is the rings of integers modulo $n$, written $\mathbb{Z}_n$.¹ The elements of $\mathbb{Z}_n$ correspond to the congruence classes of the elements $a \in \mathbb{Z}$ modulo $n$, for example $\mathbb{Z}_3 = \{0, 1, 2\}$ since there are three congruence classes modulo 3. Addition and multiplication are defined by using the addition and multiplication of $\mathbb{Z}$ and then computing modulo $n$.

Note on implementation. To implement $\mathbb{Z}_n$ the power of dependent types would be desirable. The compiler would then be able to distinguish elements from different rings and verify that they are, for instance, not multiplied together. The reason is that the type of $\mathbb{Z}_n$ depends on the value of $n$. It is possible to represent integers at the type level in Haskell, but it is a bit cumbersome and having real dependent types is preferable.

Another example of rings is polynomial rings, written $R[x_1, \dots, x_n]$ where $R$ is the ring of the coefficients and $x_1, \dots, x_n$ are variables. It is easy to see that these rings are commutative if $R$ is. A concrete example of an element of $\mathbb{Z}[x, y]$ is $3x + 7y^2$. This special class of rings has many applications and is considered in more detail in chapter 6.

Definition 2.3. An integral domain is a commutative ring satisfying:

$\forall a, b \in R.\ (ab = 0 \Rightarrow a = 0 \lor b = 0)$

All of $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ form integral domains with the usual definitions of addition and multiplication. For $\mathbb{Z}_n$ things are a bit more complicated. For example $\mathbb{Z}_6$ is not an integral domain since $2 \cdot 3 = 0$ modulo 6. In fact $\mathbb{Z}_n$ is an integral domain iff $n$ is prime.

Note on implementation. Representing integral domains is a bit more difficult. Just like commutative rings they do not introduce any new operations, but how should the property be tested? One way to do it is to test if $ab = 0$ and then check that either $a$ or $b$ is zero, and also test the axioms for commutative rings. If $ab \neq 0$ it should only be checked that the axioms for commutative rings are satisfied.
Here type theory would be superior to functional programming, because the probability of generating a random counterexample can be fairly small, since the product of most elements probably will not be zero. So having a proof that this holds would be much better.

The fact that $\mathbb{Z}_n$ is an integral domain iff $n$ is prime is another motivation that this structure is best captured by type theory. Using a representation of integers at the type level in Haskell it is possible to define primality testing, but this is hard and terribly slow. So having a language designed to do computation at the type level would be much better.

Definition 2.4. A field is an integral domain in which every nonzero element has a multiplicative inverse:

$\forall a \in R \setminus \{0\}.\ \exists b \in R.\ ab = 1$

¹ Often written $\mathbb{Z}/n\mathbb{Z}$.

This is often written using standard division notation, so $a/b$ or $\frac{a}{b}$ actually means $a \cdot b^{-1}$ where $b^{-1}$ is the multiplicative inverse of $b$. Some of the infinite rings that have been presented so far are fields, namely $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$. Some finite fields have also been presented. These are, just as for integral domains, $\mathbb{Z}_n$ where $n$ is a prime number.

Note on implementation. The representation of fields is very similar to the representation of rings. The new operation that fields add is the ability to compute multiplicative inverses. This can be implemented and specified as:

    class IntegralDomain a => Field a where
      inv :: a -> a

    propmulinv :: (Field a, Eq a) => a -> Property
    propmulinv a = a /= zero ==> inv a <*> a == one

This is the final definition of this section. It is possible to make these definitions more fine-grained by having several intermediate structures like semirings with and without a one, or by starting from monoids and groups to construct rings. The reason not to do this is simply that the concepts presented here are sufficiently complex for the following chapters. For a general abstract algebra library the approach of having more structures would be much more sensible.

2.2 Ideals

The concept of ideals is very important in commutative algebra. They are generalizations of many concepts of the integers like even number or prime number. Since this thesis only considers commutative rings, all ideals are two-sided, that is, left ideals are equal to right ideals. For non-commutative rings it would have been possible to define left and right ideals instead.

Definition 2.5. For a commutative ring $(R, +, \cdot)$ an ideal $I$ is a subset $I \subseteq R$ such that:

1. Closure under addition: $\forall a, b \in I.\ a + b \in I$
2. Closure under multiplication by an element of $R$: $\forall a \in I.\ \forall b \in R.\ ab \in I$

In short the ideals can be described as the additive subgroups of $R$ which are closed under multiplication by any element of $R$. The two canonical ideals are the zero ideal $\{0\}$ and the whole ring $R$.
A more interesting example is the even integers, $2\mathbb{Z}$, which form an ideal of $\mathbb{Z}$ since the sum of two even numbers is even and the result of multiplying any number with an even number is even.

This defines arbitrary ideals of rings and is not suited for constructive algebra. The interesting ideals are instead the ideals which are finitely generated.

Definition 2.6. An ideal $I$ of a ring $R$ is finitely generated if there exists a finite subset $X \subseteq I$ such that all elements of $I$ can be written as a linear combination of the elements of $X = \{x_1, \dots, x_n\}$ with coefficients in $R$. That is:

$\forall a \in I.\ \exists r_1, \dots, r_n \in R.\ a = x_1 r_1 + \dots + x_n r_n$

The ideal generators are not written using the standard set notation but with $\langle \dots \rangle$. Examples of finitely generated ideals are both of the canonical ideals, where the zero ideal is generated by $\langle 0 \rangle$ and the whole ring is generated by $\langle 1 \rangle$. The even integers are generated by $\langle 2 \rangle$, but they are also generated by $\langle 2, 4 \rangle$. So $\langle 2 \rangle$ and $\langle 2, 4 \rangle$ generate the same subset of $\mathbb{Z}$ and are thus equal. One important property of the ideals in $\mathbb{Z}$ is that they can all be generated by one element. This property of ideals has a special name.

Definition 2.7. A principal ideal is an ideal generated by only one element.

Rings like $\mathbb{Z}$ in which all ideals are principal are classically called principal ideal domains. But constructively this definition is not suitable. Instead we would only want to consider rings in which all finitely generated ideals are principal. These rings are called Bézout domains and are considered in chapter 4.

Note on implementation. Finitely generated ideals can be represented by their list of generators. In Haskell this can be written as:

    data CommutativeRing a => Ideal a = Id [a]

In type theory it would also be possible to consider the ideals that are not finitely generated, since in type theory it would be possible to represent ideals by their logical properties.

Now some operations on ideals and fundamental properties of ideals will be considered.

Definition 2.8. The sum of two ideals $I$ and $J$ is the set of all $x + y$ where $x \in I$ and $y \in J$. So if $I = \langle x_1, \dots, x_n \rangle$ and $J = \langle y_1, \dots, y_m \rangle$ then

$I + J = \langle x_1, \dots, x_n, y_1, \dots, y_m \rangle$

Definition 2.9. The product of two ideals $I$ and $J$ is the ideal $IJ$ generated by all products $xy$ where $x \in I$ and $y \in J$.

The intersection of two ideals is also an ideal. But there is no general method for computing the set of generators for the intersection of two ideals in arbitrary rings. In chapter 3 it will be established that if the intersection of two finitely generated ideals is finitely generated, the ring is coherent.
In fact the ideals form a complete lattice with respect to inclusion with the sum and intersection operations. A lattice is a partially ordered set where every pair of elements has a least upper bound and a greatest lower bound. This lattice need not be distributive, that is, the operators need not distribute over each other, but in chapter 5 it will be proved that one way to define Prüfer domains is by this property.

Now an example of ideal operations. Consider $\langle 4 \rangle$ and $\langle 6 \rangle$ in $\mathbb{Z}$, that is, the sets of all multiples of 4 and of 6. The sum is $\langle 4 \rangle + \langle 6 \rangle = \langle 4, 6 \rangle = \langle 2 \rangle$ since $2 = 4 \cdot 2 + 6 \cdot (-1)$. The product is $\langle 4 \rangle \langle 6 \rangle = \langle 24 \rangle$ and the intersection $\langle 4 \rangle \cap \langle 6 \rangle$ is the ideal generated by the lowest common multiple, which will be proved in the next chapter. Thus the intersection of $\langle 4 \rangle$ and $\langle 6 \rangle$ is $\langle 12 \rangle$.
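The example can be replayed with a small sketch of the three ideal operations, specialised to finitely generated ideals of the integers. The names addId, mulId, principal and intersectId are assumptions for this illustration, and the intersection relies on the statement above that it is generated by the lowest common multiple, which only applies to $\mathbb{Z}$ and not to arbitrary rings.

```haskell
-- Finitely generated ideals of the integers, given by a list of generators.
newtype Ideal = Id [Integer] deriving (Eq, Show)

-- Sum of ideals: concatenate the generators (Definition 2.8).
addId :: Ideal -> Ideal -> Ideal
addId (Id xs) (Id ys) = Id (xs ++ ys)

-- Product of ideals: all pairwise products of generators (Definition 2.9).
mulId :: Ideal -> Ideal -> Ideal
mulId (Id xs) (Id ys) = Id [x * y | x <- xs, y <- ys]

-- In Z a finitely generated ideal is generated by the gcd of its generators.
principal :: Ideal -> Integer
principal (Id xs) = foldr gcd 0 xs

-- Intersection in Z: generated by the lcm of the two principal generators.
intersectId :: Ideal -> Ideal -> Ideal
intersectId i j = Id [lcm (principal i) (principal j)]
```

For instance, principal (addId (Id [4]) (Id [6])) gives 2, mulId (Id [4]) (Id [6]) gives Id [24], and intersectId (Id [4]) (Id [6]) gives Id [12], matching the computation in the text.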

2.3 Discrete and strongly discrete rings

This section will consider some rings that are especially relevant for constructive mathematics.

Definition 2.10. A ring is called discrete if equality is decidable.

All of the rings studied in the thesis will be discrete. But there are many examples of rings that are not discrete. For example $\mathbb{R}$ is not discrete since it is not possible to decide if two irrational numbers are equal in finite time. Another example of rings that do not need to be discrete are formal power series rings: rings of polynomials with an infinite number of terms.

Definition 2.11. A ring is called strongly discrete if ideal membership is decidable.

This property is very strong. Many of the rings we have seen so far are strongly discrete; this is in fact tightly connected to whether division is decidable in the ring. In section 3.2 we will see that strong discreteness and coherence give us the possibility of solving arbitrary systems of the type $AX = B$.

2.4 Noetherian rings and Dedekind domains

This section will establish some classical notions that will play an important rôle throughout the thesis.

Definition 2.12. A ring is called Noetherian if every ideal is finitely generated.

This notion is not suitable for constructive mathematics since it relies on quantification over arbitrary subsets of the ring. It is not possible to formulate in first-order logic [4]. One of the goals of this thesis is to consider structures that are non-noetherian analogues of classical notions.

Definition 2.13. A Dedekind domain is an integral domain in which every fractional ideal is invertible.

An ideal is invertible if there exists another ideal such that the product of the ideals is principal. Dedekind domains are Noetherian and are thus not suitable for constructive mathematics. Instead one considers Prüfer domains, which are non-noetherian analogues of Dedekind domains. These and invertible ideals will be considered in chapter 5.

Chapter 3
Coherent rings

All rings in this section are integral domains. One of the main aims of the following sections will be to prove that different rings are coherent. That means that it is possible to solve homogeneous systems of linear equations in them.

3.1 Definition and properties

An elementary application of linear algebra is to solve systems of linear equations in many variables. But in linear algebra the central concept of study is vector spaces, which implies that the underlying structures are fields. So when computing the solution of a system of equations one is free to use the assumption that the elements are invertible. But what happens if you drop the assumption of invertibility and just look at integral domains? This is one of the motivations to study coherence.

Definition 3.1. A ring $R$ is coherent if every finitely generated ideal is finitely presented. This means that given a matrix $M \in R^{1 \times n}$ there exists a matrix $L \in R^{n \times m}$ for some $m \in \mathbb{N}$ such that $ML = 0$ and

$MX = 0 \iff \exists Y \in R^{m \times 1}.\ X = LY$

This means that it is possible to compute a set of generators for solutions of equations in a coherent ring. In other words, the module of solutions for $MX = 0$ is finitely generated.

Note on implementation. The property of coherence is quite hard to capture in Haskell. The content that can be captured is that it is possible to compute the matrix $L$ given $M$ such that $ML = 0$. This can be represented as:

    class IntegralDomain a => Coherent a where
      solve :: Vector a -> Matrix a

    propcoherent :: (Coherent a, Eq a) => Vector a -> Bool
    propcoherent m = issolution m (solve m)

Here issolution just checks that all elements in the product $ML$ are zero. The logical aspects of coherence are harder to represent. But in type theory this would be possible and it would be interesting to see what this would give compared to what the Haskell approach gives.
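A minimal sketch of the check described above, specialised to the integers and using plain lists instead of the Vector and Matrix types of the implementation (whose definitions are not shown in this text). The representation of a matrix as a list of columns and the helper solve2, which only handles an equation in two unknowns, are assumptions made for the illustration.

```haskell
type Vector = [Integer]    -- a row vector (a_1, ..., a_n)
type Matrix = [[Integer]]  -- a list of columns, each of length n

-- Check that M L = 0: every column of L must be a solution of M X = 0.
isSolution :: Vector -> Matrix -> Bool
isSolution m l = all (\col -> sum (zipWith (*) m col) == 0) l

-- Generators of the solutions of a*x + b*y = 0 over the integers.
-- If a = b = 0 every pair is a solution; otherwise one generator suffices.
solve2 :: Integer -> Integer -> Matrix
solve2 0 0 = [[1, 0], [0, 1]]
solve2 a b = [[b `div` g, negate (a `div` g)]]
  where g = gcd a b
```

For the row $M = (4, 6)$, solve2 4 6 returns the single column $(3, -2)$ and isSolution [4, 6] (solve2 4 6) holds. As in the property propcoherent, this only tests that $ML = 0$; it does not test that the columns generate every solution.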

Not only is it possible to solve a single equation in a coherent ring, but in fact it is possible to compute generators for the solutions of any homogeneous system of equations in a coherent ring. The proof of this proposition is a translation of the proof of proposition 1.2 in [18].

Proposition 3.2. In a coherent ring it is possible to solve a system $MX = 0$ where $M \in R^{m \times n}$ and $X \in R^{n \times 1}$.

Proof. Let $M_i \in R^{1 \times n}$ be the rows of $M$. By coherence it is possible to solve $M_1 X = 0$ and get $L_1 \in R^{n \times p_1}$ such that

$M_1 X = 0 \iff \exists Y \in R^{p_1 \times 1}.\ X = L_1 Y$

Now replace $X$ in $M_2 X = 0$ by $L_1 Y$ and get $M_2 L_1 Y = 0$. By coherence we now obtain a new matrix $L_2 \in R^{p_1 \times p_2}$ such that

$M_1 X = M_2 X = 0 \iff \exists Y \in R^{p_1 \times 1}.\ X = L_1 Y \text{ and } M_2 L_1 Y = 0 \iff \exists Z \in R^{p_2 \times 1}.\ X = L_1 L_2 Z$

By iterating this method the solution $X = L_1 L_2 \cdots L_m Z$ with $L_i \in R^{p_{i-1} \times p_i}$, $p_0 = n$ and $Z \in R^{p_m \times 1}$ can be computed.

Now we will consider the intersection of finitely generated ideals in coherent rings. This gives another way of characterizing coherent rings in terms of the intersection of ideals. For the more general formulation of this in terms of modules, that is, vector spaces over arbitrary rings and not just fields, see theorem 2.4 on page 82 in [16].

Proposition 3.3. The intersection of two finitely generated ideals in a coherent ring $R$ is finitely generated.

Proof. Let $I = \langle a_1, \dots, a_n \rangle$ and $J = \langle b_1, \dots, b_m \rangle$ be two finitely generated ideals in $R$. Consider the system $AX - BY = 0$ where $A$ is the $1 \times n$ matrix $(a_1, \dots, a_n)$ and $B$ is the $1 \times m$ matrix $(b_1, \dots, b_m)$. Since the ring is coherent it is possible to compute a finite number of generators $(X_1, Y_1), \dots, (X_p, Y_p)$ of the solution. This means that

$AX_1 = BY_1, \quad \dots, \quad AX_p = BY_p$

To say that $\alpha \in I \cap J$ means that $\alpha \in I \land \alpha \in J$. This means that there exist $x_i$ and $y_i$ such that $\alpha = a_1 x_1 + \dots + a_n x_n$ and $\alpha = b_1 y_1 + \dots + b_m y_m$, so that $a_1 x_1 + \dots + a_n x_n = b_1 y_1 + \dots + b_m y_m$, which is exactly what the generators above give. Thus one set of generators for the intersection is $AX_1, \dots, AX_p$ and another set of generators is $BY_1, \dots, BY_p$.
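As a small worked instance of the proof of Proposition 3.3, consider the ideals $\langle 4 \rangle$ and $\langle 6 \rangle$ in $\mathbb{Z}$ from chapter 2. The choice of this example is an illustration and is not taken from the proof itself; it recovers the generator 12 of the intersection from the solutions of a single homogeneous equation.

```latex
\begin{align*}
I &= \langle 4 \rangle, \quad J = \langle 6 \rangle, \quad A = (4), \quad B = (6) \\
AX - BY = 0 &\iff 4x - 6y = 0 \iff 2x = 3y \\
(X_1, Y_1) &= (3, 2) \quad \text{generates all solutions, since } \gcd(2, 3) = 1 \\
I \cap J &= \langle A X_1 \rangle = \langle 4 \cdot 3 \rangle = \langle 12 \rangle
          = \langle 6 \cdot 2 \rangle = \langle B Y_1 \rangle
\end{align*}
```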
In fact the converse of this statement also holds. The following proposition is the most important in this section, and all of the coherence proofs will rely on it.

Proposition 3.4. If $R$ is an integral domain such that the intersection of two finitely generated ideals is finitely generated then $R$ is coherent.

Proof. The proof is by induction on the length of the system to solve. First consider

$ax = 0$

If $a \neq 0$ the only solution is the trivial solution, by cancellation; if $a = 0$ every $x$ is a solution and the solutions are generated by $1$. Now assume that it is possible to solve a system in $n - 1$ variables and consider the case with $n \geq 2$ variables:

$a_1 x_1 + \dots + a_n x_n = 0$

If $a_1 = 0$ one set of solutions to the system is generated by $(1, 0, \dots, 0)$, but it is also possible to use the induction hypothesis and get the generators $(v_{i2}, \dots, v_{in})$ for the system with $x_2, \dots, x_n$, and the solutions of the system with $n$ unknowns are generated by $(0, v_{i2}, \dots, v_{in})$ and $(1, 0, \dots, 0)$.

If $a_1 \neq 0$ the set $(0, v_{i2}, \dots, v_{in})$ of solutions can be found by the induction hypothesis again. Further, by hypothesis it is possible to find $t_1, \dots, t_p$ such that

$\langle a_1 \rangle \cap \langle -a_2, \dots, -a_n \rangle = \langle t_1, \dots, t_p \rangle$

where $t_i = a_1 w_{i1} = -a_2 w_{i2} - \dots - a_n w_{in}$. So if $a_1 x_1 + \dots + a_n x_n = 0$ then $a_1 x_1 = -a_2 x_2 - \dots - a_n x_n$. We also have $u_i$ such that

$a_1 x_1 = -a_2 x_2 - \dots - a_n x_n = \sum_{i=1}^{p} u_i t_i$

This implies that $a_1 x_1 = \sum_{i=1}^{p} u_i t_i = \sum_{i=1}^{p} u_i a_1 w_{i1}$, which by the cancellation property gives that $x_1 = \sum_{i=1}^{p} u_i w_{i1}$. Similarly

$-a_2 x_2 - \dots - a_n x_n = \sum_{i=1}^{p} u_i t_i = \sum_{i=1}^{p} u_i (-a_2 w_{i2} - \dots - a_n w_{in})$

Some reorganization gives

$a_2 \left( x_2 - \sum_{i=1}^{p} u_i w_{i2} \right) + \dots + a_n \left( x_n - \sum_{i=1}^{p} u_i w_{in} \right) = 0$

This gives that $(w_{i1}, \dots, w_{in})$ and $(0, v_{i2}, \dots, v_{in})$ generate the module of solutions.

This gives a method for proving that rings are coherent. Now the only thing to prove is how to compute the intersection of finitely generated ideals, and then this will imply that the ring is coherent. This also shows that coherent rings can be characterized only in terms of the intersection of finitely generated ideals.

Note on implementation. One thing worth emphasizing here is the dependence on the witnesses of the intersection.
That is, given two finitely generated ideals $I = \langle x_1,\dots,x_n \rangle$ and $J = \langle y_1,\dots,y_m \rangle$, the function that computes the intersection must also give a set of witnesses. If the intersection is $I \cap J = \langle z_1,\dots,z_l \rangle$ then the function should give $a_{ij}$ and $b_{ij}$ such that, for each $k$,

\[ z_k = a_{k1} x_1 + \dots + a_{kn} x_n = b_{k1} y_1 + \dots + b_{km} y_m. \]

Note that this only gives the witnesses in one direction, that is, if $x \in I \cap J$ then $x \in I$ and $x \in J$.

3.2 Coherence and strongly discrete rings

The property of strong discreteness is very strong (as indicated by the name). If the ring is strongly discrete and coherent, not only is it possible to solve systems like $MX = 0$, it is also possible to solve general systems of the kind $MX = A$.

For ideal membership to be decidable constructively there has to be a method to test if $x \in \langle x_1,\dots,x_n \rangle$, which should also give a witness if this is the case. The witness should be a list of $w_i$ such that $\sum_i w_i x_i = x$. In other words, it should be possible to write $x$ as a linear combination of the $x_i$ with coefficients $w_i$.

Proposition 3.5. If $R$ is a strongly discrete coherent integral domain then it is possible to solve arbitrary linear systems. Given $MX = A$ it is possible to compute $X_0$ and $L$ such that $ML = 0$ and

\[ MX = A \leftrightarrow \exists Y.\ X = LY + X_0. \]

Proof. The solution $L$ to the system $MX = 0$ can be computed by proposition 3.2. The particular solution $X_0$ can be computed using the same method as in that proof. The base case when $M$ has only one row is clear since $R$ is strongly discrete. That is, if $M = (x_1,\dots,x_n)$ and $A = (a)$ then the decidability of ideal membership gives that if $a \in \langle x_1,\dots,x_n \rangle$ one gets witnesses $w_i$ such that $x_1 w_1 + \dots + x_n w_n = a$.

This chapter establishes the properties of the rings to be studied throughout the thesis. One of the goals of the rest of the thesis is to give constructive proofs that different rings are coherent. This means that in the end there will be many examples of rings in which it is possible to solve systems of linear equations.
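As a small worked instance of propositions 3.4 and 3.5, take $R = \mathbb{Z}$ (which is strongly discrete) and the single equation $4x_1 + 6x_2 = 0$. The only ingredient assumed here is the familiar fact that $\langle 4 \rangle \cap \langle -6 \rangle = \langle 12 \rangle$ in $\mathbb{Z}$; the numbers are chosen purely for illustration.

```latex
\begin{align*}
  &\text{Induction hypothesis: } 6x_2 = 0 \text{ has only the trivial solution, so there are no generators } (0, v_{i2}). \\
  &\langle 4 \rangle \cap \langle -6 \rangle = \langle 12 \rangle, \qquad
    t_1 = 12 = 4 \cdot 3 = -6 \cdot (-2), \qquad (w_{11}, w_{12}) = (3, -2). \\
  &4x_1 = -6x_2 = 12u \;\Longrightarrow\; x_1 = 3u, \qquad
    6\,(x_2 - u \cdot (-2)) = 0 \;\Longrightarrow\; x_2 = -2u. \\
  &\text{Hence every solution is } (x_1, x_2) = u\,(3, -2). \\[4pt]
  &\text{For } 4x_1 + 6x_2 = 10: \quad 10 \in \langle 4, 6 \rangle = \langle 2 \rangle, \quad
    X_0 = (-5, 5), \quad X = (3, -2)\,y + (-5, 5).
\end{align*}
```

The last line has the shape $X = LY + X_0$ of proposition 3.5, with $L = (3,-2)^T$ the generator of the homogeneous solutions.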

Chapter 4 Bézout domains

This chapter considers a class of integral domains called Bézout domains. These are non-Noetherian analogues of principal ideal domains. Examples of Bézout domains are $\mathbb{Z}$ and $k[x]$. The goal of this chapter is to give some motivation and examples of Bézout domains and then prove that they are coherent. Finally there will be some discussion of what is required for them to be strongly discrete.

4.1 Definition

One interesting class of integral domains is the principal ideal domains, which are integral domains in which all ideals are principal. This means that all principal ideal domains are Noetherian, but as mentioned in chapter 2 Noetherianity is not suitable for constructive mathematics. Thus principal ideal domains are not suitable either, and we instead introduce Bézout domains.

Definition 4.1. An integral domain $R$ is a Bézout domain iff every finitely generated ideal is principal.

Note on implementation. This definition means that it is possible to compute $t$ such that $\langle a_1,\dots,a_n \rangle = \langle t \rangle$. The method should also compute witnesses that $\langle a_1,\dots,a_n \rangle \subseteq \langle t \rangle$ and $\langle a_1,\dots,a_n \rangle \supseteq \langle t \rangle$. This can be represented in Haskell as

    class IntegralDomain a => BezoutDomain a where
      toprincipal :: Ideal a -> (Ideal a,[a],[a])

Here the first list is the witness that there exist $u_i$ such that

\[ t = a_1 u_1 + \dots + a_n u_n. \]

The second list is the witness that there exists $v_i$ for each $a_i$ such that

\[ a_i = t v_i. \]

4.2 Euclidean domains

Many examples of Bézout domains are Euclidean domains. These are rings with additional structure, namely a Euclidean function, which allows a generalization of the Euclidean algorithm on integers.

Definition 4.2. A Euclidean domain is an integral domain $R$ with a function $f : R \to \mathbb{N}$ with the property that for $a, b \in R$ with $b \neq 0$ there exist $q, r \in R$ such that $a = bq + r$ where either $r = 0$ or $f(r) < f(b)$.

Euclidean domains are thus integral domains in which it is possible to perform division with remainder. Examples include $\mathbb{Z}$ with the absolute value function and the ring of polynomials $k[x]$ with the degree function.

In order to show that Euclidean domains are Bézout domains some lemmas are needed. In Euclidean domains the Euclidean algorithm for computing the greatest common divisor of two elements can be applied. There is a generalized version which computes the greatest common divisor of more than two elements.

Lemma 4.3. The generalized greatest common divisor (ggcd) of $n$ elements can be computed by recursively applying the algorithm for computing the gcd of two elements. More specifically:

\[ \mathrm{ggcd}(a_1,\dots,a_n) = \gcd(a_1,\gcd(a_2,\dots \gcd(a_{n-1},a_n)\dots)) \]

Proof. Consider the case with $a, b, c \in R$. Let $d = \mathrm{ggcd}(a,b,c)$; then $d \mid b$ and $d \mid c$ by the definition of gcd, so $\gcd(b,c) = kd$ for some $k$. We also have $d \mid a$, so $a = jd$ for some $j$. Now let $u = \gcd(k,j)$. By the construction of $u$ we have $ud \mid a, b, c$, but since $d$ is the greatest common divisor $u$ must be a unit. Thus $d = \gcd(a,\gcd(b,c))$. The other cases follow directly by induction.

In Euclidean domains it is also possible to run the extended Euclidean algorithm: given $a, b \in R$ it is possible to compute $x, y \in R$ such that $ax + by = \gcd(a,b)$. This algorithm can also be generalized.

Lemma 4.4. Given $a_1,\dots,a_n \in R$ it is possible to compute $x_1,\dots,x_n \in R$ such that

\[ a_1 x_1 + \dots + a_n x_n = \mathrm{ggcd}(a_1,\dots,a_n). \]

Proof. Given $a, b, c \in R$ we want to compute $x, y, z \in R$ such that $ax + by + cz = \mathrm{ggcd}(a,b,c)$.
We can compute $m, n \in R$ such that $bm + cn = \gcd(b,c)$. Then there are $l, k \in R$ such that $ak + \gcd(b,c)\,l = \gcd(a,\gcd(b,c))$, which by lemma 4.3 is equal to $\mathrm{ggcd}(a,b,c)$. So we get that $ak + bml + cnl = \mathrm{ggcd}(a,b,c)$ and thus $x = k$, $y = ml$ and $z = nl$. The equations for more than three variables follow by induction.

Now it is possible to show that all Euclidean domains are Bézout domains. This gives a rich source of examples of Bézout domains.

Proposition 4.5. Euclidean domains are Bézout domains.

Proof. Follows directly from lemma 4.4. Given $a_1,\dots,a_n$ we can compute $t = \mathrm{ggcd}(a_1,\dots,a_n)$ such that $t = a_1 x_1 + \dots + a_n x_n$, and thus we have that $\langle a_1,\dots,a_n \rangle = \langle \mathrm{ggcd}(a_1,\dots,a_n) \rangle$.
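The constructions of lemmas 4.3 and 4.4 and proposition 4.5 can be sketched for the concrete Euclidean domain $\mathbb{Z}$. This is a minimal sketch on plain `Integer` lists rather than on the `Ideal` type of the thesis; the names `egcd`, `ggcdExt` and `toPrincipalZ` are hypothetical and chosen only for this illustration.

```haskell
-- Extended Euclidean algorithm on Integer:
-- egcd a b = (g, x, y) with g = gcd a b >= 0 and a*x + b*y == g.
egcd :: Integer -> Integer -> (Integer, Integer, Integer)
egcd a 0 = (abs a, signum a, 0)
egcd a b = (g, y, x - q * y)
  where
    (q, r)    = a `quotRem` b
    (g, x, y) = egcd b r

-- Generalised version (lemma 4.4), nested as in lemma 4.3:
-- ggcdExt [a1..an] = (t, [x1..xn]) with a1*x1 + ... + an*xn == t.
ggcdExt :: [Integer] -> (Integer, [Integer])
ggcdExt []       = (0, [])
ggcdExt (a : bs) = (g, x : map (y *) xs)
  where
    (g', xs)  = ggcdExt bs      -- gcd of the tail with its coefficients
    (g, x, y) = egcd a g'       -- g = a*x + g'*y

-- Hypothetical Integer analogue of toprincipal: the generator t,
-- the list us with t = sum (a_i * u_i), and the list vs with a_i = t * v_i.
toPrincipalZ :: [Integer] -> (Integer, [Integer], [Integer])
toPrincipalZ bs = (t, us, vs)
  where
    (t, us) = ggcdExt bs
    vs | t == 0    = map (const 0) bs   -- every generator is zero
       | otherwise = map (`div` t) bs   -- exact, since t divides each a_i
```

For example, `toPrincipalZ [12, 18, 30]` returns a generator `6` together with coefficient lists satisfying both witness equations. The second list relies on exact division, which is available here because divisibility is decidable in $\mathbb{Z}$.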

Note on implementation. The implementation of this proof also has to compute the witnesses. One direction follows directly from the proof since the extended Euclidean algorithm is used. The other direction is also direct since division is decidable and $t$ divides all $a_i$.

4.3 Coherence of Bézout domains

The coherence of Bézout domains can be proved by considering the intersection of ideals. Since all finitely generated ideals are principal it is sufficient to consider only principal ideals.

Proposition 4.6. Given two ideals $I = \langle a \rangle$ and $J = \langle b \rangle$ the intersection is $I \cap J = \langle \mathrm{lcm}(a,b) \rangle$, where $\mathrm{lcm}(a,b)$ is the lowest common multiple of $a$ and $b$.

Proof. The equality is proved by considering both inclusions.

$\subseteq$: If $f \in \langle a \rangle \cap \langle b \rangle$ then $f \in \langle a \rangle$ and $f \in \langle b \rangle$, so $a \mid f$ and $b \mid f$. But there exists a lowest common multiple of $a$ and $b$, $\mathrm{lcm}(a,b)$, such that $a \mid \mathrm{lcm}(a,b)$ and $b \mid \mathrm{lcm}(a,b)$, so $f$ must be a multiple of the lowest common multiple and thus $f \in \langle \mathrm{lcm}(a,b) \rangle$.

$\supseteq$: If $f \in \langle \mathrm{lcm}(a,b) \rangle$ then $\mathrm{lcm}(a,b) \mid f$. Since $\mathrm{lcm}(a,b)$ is a multiple of both $a$ and $b$, $f$ must be a multiple of $a$ and $b$ as well. This means that $f \in \langle a \rangle \cap \langle b \rangle$.

This is what we need in order to know that Bézout domains are coherent.

Theorem 4.7. Bézout domains are coherent.

Proof. Direct consequence of propositions 3.4 and 4.6.

Note on implementation. The implementation of this proof relies on the computation of the witnesses. Again it is sufficient to consider only the case where the ideals are principal. Given $a$ and $b$ we should compute $\mathrm{lcm}(a,b)$, $u$ and $v$ such that

\[ \mathrm{lcm}(a,b) = au, \qquad \mathrm{lcm}(a,b) = bv, \]

using the knowledge that we can compute $\gcd(a,b) = g$, $u_1$, $u_2$, $v_1$ and $v_2$ such that

\[ g = a u_1 + b u_2, \qquad a = g v_1, \qquad b = g v_2. \]

To compute $\mathrm{lcm}(a,b)$ use that

\[ \mathrm{lcm}(a,b) = \frac{ab}{\gcd(a,b)}. \]

So $\mathrm{lcm}(a,b)$ can be computed as

\[ \mathrm{lcm}(a,b) = g v_1 v_2 = g \left( \frac{a}{g} \cdot \frac{b}{g} \right) = \frac{ab}{g} \tag{$*$} \]

Now that the lowest common multiple has been computed, the witnesses have to be computed. But this is easy since by $(*)$ we get that

\[ u = \frac{b}{g} = v_2, \qquad v = \frac{a}{g} = v_1. \]

4.4 Bézout domains and strong discreteness

Recall that a ring is called strongly discrete if ideal membership is decidable. In order to decide ideal membership for Bézout domains we need to be able to decide divisibility.

Proposition 4.8. A Bézout domain $R$ is strongly discrete if divisibility is decidable in $R$.

Proof. To test if $x \in \langle x_1,\dots,x_n \rangle$ we need to find $w_i$ such that $x = \sum w_i x_i$. Since $R$ is a Bézout domain we can find $g$ such that $\langle g \rangle = \langle x_1,\dots,x_n \rangle$. Now, since divisibility is decidable, test if $g \mid x$; if this is the case we know that $x \in \langle g \rangle$. The test should also give us $q$ such that $x = qg$, and since $g \in \langle x_1,\dots,x_n \rangle$ we have $u_i$ such that $g = \sum u_i x_i$. The witness that $x \in \langle x_1,\dots,x_n \rangle$ can now be computed by

\[ x = qg = q \sum u_i x_i = \sum (q u_i) x_i \]

and thus $w_i = q u_i$.

This means that we can solve arbitrary linear systems $MX = A$ over Bézout domains with decidable divisibility, and in particular over $\mathbb{Z}$ and $k[x]$.

4.5 GCD domains and fields of fractions

A structure studied in classical mathematics is the unique factorization domain (UFD): an integral domain in which every nonzero non-unit element can be written as a product of irreducible elements, uniquely up to order and units. For example $\mathbb{Z}$ is a UFD by the fundamental theorem of arithmetic. But as for principal ideal domains (PIDs) this relies on Noetherianity. In classical mathematics we have the following chain of inclusions.

\[ \text{Euclidean domains} \subset \text{PIDs} \subset \text{UFDs} \subset \text{Integral domains} \]

But without the Noetherian assumption we get Bézout domains instead of PIDs. One can show that an integral domain in which any two nonzero elements have a greatest common divisor is a non-Noetherian analogue of the UFDs. These rings are called greatest common divisor (GCD) domains and complete the corresponding chain of inclusions in constructive mathematics.
\[ \text{Euclidean domains} \subset \text{Bézout domains} \subset \text{GCD domains} \subset \text{Integral domains} \]

The inclusion of Bézout domains in GCD domains is easy to see. But note that not all GCD domains are coherent.
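The witness computations from sections 4.3 and 4.4 can be sketched over $\mathbb{Z}$ as follows. This is a minimal sketch assuming plain `Integer` arithmetic; `lcmWitness`, `bezout` and `member` are hypothetical names for this illustration and not functions of the thesis implementation.

```haskell
-- lcmWitness a b = (l, u, v) with l == a*u and l == b*v, using
-- u = b/g and v = a/g for g = gcd a b, as in the note to theorem 4.7.
lcmWitness :: Integer -> Integer -> (Integer, Integer, Integer)
lcmWitness a b
  | g == 0    = (0, 0, 0)          -- a == b == 0
  | otherwise = (a * u, u, v)
  where
    g = gcd a b
    u = b `div` g
    v = a `div` g

-- Extended Euclid: egcd a b = (g, x, y) with a*x + b*y == g and g >= 0.
egcd :: Integer -> Integer -> (Integer, Integer, Integer)
egcd a 0 = (abs a, signum a, 0)
egcd a b = (g, y, x - q * y)
  where
    (q, r)    = a `quotRem` b
    (g, x, y) = egcd b r

-- bezout xs = (g, us) with g == sum (zipWith (*) us xs).
bezout :: [Integer] -> (Integer, [Integer])
bezout = foldr step (0, [])
  where
    step a (g', us) = let (g, x, y) = egcd a g' in (g, x : map (y *) us)

-- Ideal membership with witness (proposition 4.8):
-- member x xs = Just ws with x == sum (zipWith (*) ws xs), or Nothing.
member :: Integer -> [Integer] -> Maybe [Integer]
member x xs
  | g == 0    = if x == 0 then Just (map (const 0) xs) else Nothing
  | r == 0    = Just (map (q *) us)   -- w_i = q * u_i
  | otherwise = Nothing
  where
    (g, us) = bezout xs
    (q, r)  = x `quotRem` g
```

For instance `member 10 [4, 6]` yields coefficients `ws` with `4 * ws !! 0 + 6 * ws !! 1 == 10`, while `member 5 [4, 6]` is `Nothing` because 2 does not divide 5.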

One reason to look at GCD domains is that they provide a good setting in which to implement the field of fractions of an integral domain. As the name indicates it is a field in which the integral domain can be embedded.

Definition 4.9. The field of fractions of an integral domain $R$ is the set of equivalence classes of pairs $(a,s)$ where $a, s \in R$ and $s \neq 0$ under the equivalence relation:

\[ (a,s) \sim (b,t) \text{ iff } at = bs \]

An integral domain can be embedded by the map $a \mapsto (a,1)$. Addition and multiplication can be defined as

\[ (a,s) + (b,t) = (at + bs, st), \qquad (a,s)(b,t) = (ab, st). \]

The inverse of $(a,s)$ where $a, s \neq 0$ is $(s,a)$. It is easy to verify that this satisfies the conditions for a field.

The equivalence classes can be viewed as fractions. Thus the construction of fields of fractions is a generalization of the construction of $\mathbb{Q}$ from $\mathbb{Z}$ to arbitrary integral domains. Another special construction is the field of fractions of $k[x]$ ($k$ a discrete field), which is called the field of rational functions and is denoted by $k(x)$. This is the field whose elements are fractions of polynomials in one variable, which will be important in section 5.5.2 when looking at an example of a Prüfer domain that is not a Bézout domain. This can be generalized to multivariate polynomials, and one then gets the field of rational functions of a polynomial ring $k[x_1,\dots,x_n]$, denoted by $k(x_1,\dots,x_n)$.

The reason that GCD domains are a good setting for constructing the field of fractions is that it is possible to restrict the representatives to pairs $(a,b)$ with $\gcd(a,b) = 1$. So all fractions will be in normal form. This generalizes the fact that $\frac{2}{4}$ can be simplified to $\frac{1}{2}$ in $\mathbb{Q}$ to arbitrary fields of fractions.

Note on implementation. To implement GCD domains just follow the usual method without forgetting the witnesses.
Given nonzero $a, b \in R$ we should compute $\gcd(a,b)$, $x$ and $y$ such that

\[ a = \gcd(a,b)\,x, \qquad b = \gcd(a,b)\,y, \qquad 1 = \gcd(x,y). \]

This makes it very easy to represent the field of fractions over a GCD domain. It can be represented by a pair whose second element is always nonzero. The reduction to normal form of a pair $(a,b)$ works by computing $\gcd(a,b)$: if it is 1 the pair is already in normal form, and otherwise $(x,y)$ is returned. Using this it is trivial to implement $\mathbb{Q}$ with the help of a suitable implementation of $\mathbb{Z}$, and if one also has an implementation of $k[x]$ it is trivial to implement $k(x)$.

This method could be used as a basis for implementing the rational numbers in type theory. For instance this is how it is implemented in the C-CORN library mentioned in the section on previous work.
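The normal-form representation described above can be sketched for $\mathbb{Z}$, giving a small model of $\mathbb{Q}$. This is a minimal sketch, not the thesis implementation: the type `Frac` and the functions `mkFrac`, `addF`, `mulF`, `invF` and `embed` are hypothetical names, and keeping the denominator positive is an extra normalization by a unit, assumed here so that structural equality coincides with equality of fractions.

```haskell
-- A fraction over Z in normal form: coprime components, positive denominator.
data Frac = Frac Integer Integer
  deriving (Eq, Show)

-- Reduce a pair (a, b) with b /= 0 to normal form by dividing out gcd a b.
mkFrac :: Integer -> Integer -> Frac
mkFrac _ 0 = error "mkFrac: zero denominator"
mkFrac a b = Frac (signum b * (a `div` g)) (abs b `div` g)
  where
    g = gcd a b   -- positive because b /= 0

-- Embedding a |-> (a, 1) of the integral domain into its field of fractions.
embed :: Integer -> Frac
embed a = Frac a 1

-- (a,s) + (b,t) = (at + bs, st), renormalised.
addF :: Frac -> Frac -> Frac
addF (Frac a s) (Frac b t) = mkFrac (a * t + b * s) (s * t)

-- (a,s)(b,t) = (ab, st), renormalised.
mulF :: Frac -> Frac -> Frac
mulF (Frac a s) (Frac b t) = mkFrac (a * b) (s * t)

-- The inverse of (a,s) with a /= 0 is (s,a); zero has no inverse.
invF :: Frac -> Maybe Frac
invF (Frac a s)
  | a == 0    = Nothing
  | otherwise = Just (mkFrac s a)
```

With these definitions `mkFrac 2 4` and `mkFrac 1 2` are the same value, so the derived `Eq` instance decides equality in the field of fractions without comparing cross products.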

In this chapter we have seen that Bézout domains are coherent and given some examples of them. But the assumption that all finitely generated ideals are principal is quite strong, and there are many examples of coherent rings in which this assumption does not hold. The next chapter will look at a larger class of rings, containing the Bézout domains, that are also coherent. These rings are called Prüfer domains, and there will be some examples of Prüfer domains that are not Bézout domains, over which it is not at first clear that systems of equations can be solved.

Chapter 5 Prüfer domains

This chapter describes another class of coherent rings called Prüfer domains. First one of the many characterizations of Prüfer domains will be presented, followed by some constructions leading up to the coherence proof. Next there will be some examples of what can be done in terms of ideal arithmetic in Prüfer domains, and finally there will be some examples of Prüfer domains that are not Bézout domains. This whole chapter follows [8, 18].

5.1 Definition

The classical definition of Prüfer domains is that they are non-Noetherian generalizations of Dedekind domains. Just as Bézout domains inherit many properties from principal ideal domains, Prüfer domains inherit many properties from Dedekind domains.

A first observation is that Prüfer domains allow many different characterizations, concerning aspects like localization, structural and arithmetical properties of ideals, and polynomial rings. The characterization that will be considered in this thesis is a simple first-order condition.

Definition 5.1. An integral domain $R$ is a Prüfer domain iff

\[ \forall x\, y.\ \exists u\, v\, w.\ ux = vy \wedge (1 - u)y = wx \tag{$*$} \]

Note on implementation. As for the other algebraic structures this can be represented in the Haskell type class system as

    class IntegralDomain a => PruferDomain a where
      calcuvw :: a -> a -> (a,a,a)

    propcalcuvw :: (PruferDomain a, Eq a) => a -> a -> Bool
    propcalcuvw x y = u <*> x == v <*> y && (one <-> u) <*> y == w <*> x
      where (u,v,w) = calcuvw x y

In type theory this could be represented as a dependent record with the properties as part of the structure.

A commutative ring satisfying $(*)$ is called arithmetical. In fact many properties can be proved at the level of arithmetical rings (discussed further in [8, 18]).

As mentioned above, Bézout domains are a subset of Prüfer domains. The following proposition gives a way of finding examples of Prüfer domains.

Proposition 5.2. Bézout domains are Prüfer domains.

Proof. Since we have a Bézout domain we can compute $g$, $a$ and $b$ such that

\[ g = \gcd(x,y), \qquad x = ag, \qquad y = bg. \]

We can also compute $c$ and $d$ such that

\[ ca + db = 1. \]

Now let $u = db$. Then we shall find $v$ such that

\[ dbx = vy, \]

which simplifies to

\[ dbag = vbg. \]

Take

\[ v = ad. \]

Now we want to compute $w$ such that

\[ wx = (1 - u)y = (1 - db)y = cay = cagb. \]

Since $x = ag$ we get that

\[ w = bc. \]

Now we have found $u$, $v$ and $w$ satisfying the Prüfer condition and thus the proof is complete.

Now that we have found a source of examples of Prüfer domains we will continue by looking at some useful constructions possible in Prüfer domains, which will lead to the coherence proof.

5.2 Principal localization matrices

A key concept in the proof of coherence for Prüfer domains is that of a principal localization matrix.

Definition 5.3. A principal localization matrix for a finitely generated ideal $\langle x_1,\dots,x_n \rangle$ is a matrix $A = (a_{ij})$ such that

\[ \begin{cases} \sum_{i=1}^{n} a_{ii} = 1 \\ a_{lj} x_i = a_{li} x_j & \forall i,j,l \in \{1,\dots,n\} \end{cases} \]

Before considering how to compute the principal localization matrix in a Prüfer domain, first consider how to do it in the simpler case of Bézout domains.
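The construction in the proof of proposition 5.2 can be made concrete over $\mathbb{Z}$. The following is a minimal sketch on plain `Integer` values, so it uses ordinary `*` and `-` instead of the ring operators of the thesis; `egcd`, `calcuvwZ` and `propcalcuvwZ` are hypothetical names, and the choice of triple for $x = y = 0$ is an assumption made here because the proof divides by $g$.

```haskell
-- Extended Euclid: egcd x y = (g, c, d) with x*c + y*d == g and g >= 0.
egcd :: Integer -> Integer -> (Integer, Integer, Integer)
egcd a 0 = (abs a, signum a, 0)
egcd a b = (g, y, x - q * y)
  where
    (q, r)    = a `quotRem` b
    (g, x, y) = egcd b r

-- Hypothetical Integer version of calcuvw following proposition 5.2:
-- with x = a*g, y = b*g and c*a + d*b = 1, take u = d*b, v = a*d, w = b*c.
calcuvwZ :: Integer -> Integer -> (Integer, Integer, Integer)
calcuvwZ x y
  | g == 0    = (1, 0, 0)            -- x == y == 0: any triple satisfies (*)
  | otherwise = (d * b, a * d, b * c)
  where
    (g, c, d) = egcd x y
    a = x `div` g                    -- exact divisions, only used when g /= 0
    b = y `div` g

-- The Prüfer condition (*) of definition 5.1 for the computed triple.
propcalcuvwZ :: Integer -> Integer -> Bool
propcalcuvwZ x y = u * x == v * y && (1 - u) * y == w * x
  where
    (u, v, w) = calcuvwZ x y
```

Checking `propcalcuvwZ` on a range of inputs, for example all pairs with entries between -6 and 6, exercises the zero and negative cases that the proof leaves implicit.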

Proposition 5.4. Let $R$ be a Bézout domain and let $I = \langle x_1,\dots,x_n \rangle$ be an ideal in $R$. Then $I$ has a principal localization matrix.

Proof. Since $R$ is a Bézout domain we can compute $g$, $u_i$ and $v_i$ such that

\[ \langle g \rangle = \langle x_1,\dots,x_n \rangle, \qquad g = \sum_{i=1}^{n} x_i u_i, \qquad x_i = g v_i \quad \forall i \in \{1,\dots,n\}. \]

Let $a_{ij} = u_i v_j$. This gives, for all $i,j,l \in \{1,\dots,n\}$,

\[ a_{lj} x_i = u_l v_j g v_i = u_l v_i g v_j = a_{li} x_j. \]

We also have

\[ g = \sum_{i=1}^{n} x_i u_i = \sum_{i=1}^{n} g v_i u_i = g \sum_{i=1}^{n} a_{ii}. \]

So $1 = \sum_{i=1}^{n} a_{ii}$ and thus $(a_{ij})$ is a principal localization matrix for $I$.

The next step is to generalize this to Prüfer domains and thus show that principal localization matrices for ideals are computable in Prüfer domains.

Proposition 5.5. Let $R$ be a Prüfer domain and let $I = \langle x_1,\dots,x_n \rangle$ be a finitely generated ideal of $R$. Then $I$ has a principal localization matrix.

Proof. First note an alternative, equivalent condition on Prüfer domains:

\[ \forall x\, y.\ \exists u\, v\, w\, t.\ u + t = 1 \wedge ux = vy \wedge wx = ty \]

Now the proof proceeds by induction on $n$. For the case $n = 2$ let the matrix be

\[ \begin{pmatrix} u & v \\ w & t \end{pmatrix} \]

with $u, v, w, t$ computed for the pair $(x_2, x_1)$. This satisfies the requirements for being a principal localization matrix.

For $n > 2$ assume that the statement holds for $n - 1$, so that there is a principal localization matrix $B = (b_{ij})_{1 \leq i,j \leq n-1}$ such that

\[ \sum_{i=1}^{n-1} b_{ii} = 1, \qquad b_{lj} x_i = b_{li} x_j. \]

For each pair $(x_i, x_n)$ where $i \in \{1,\dots,n-1\}$ it is possible to compute $(u_i, v_i, w_i, t_i)$ such that

\[ u_i x_n = v_i x_i, \qquad w_i x_n = t_i x_i, \qquad u_i + t_i = 1. \]

Using this we will proceed by showing how $a_{ii}$ and $a_{nn}$, then $a_{ij}$ where $i \neq j \in \{1,\dots,n-1\}$, and finally $a_{ni}$ and $a_{in}$ can be computed using $B$.