Introduction to lambda calculus
Part 6

Antti-Juhani Kaijanaho
2017-02-16 (minor changes made afterward; last changed 2017-02-28 09:44:01+02:00)

1 Untyped lambda calculus

2 Typed lambda calculi

2.1 Dynamically typed lambda calculus with integers

2.2 A model of Lisp

2.3 Simply typed lambda calculus

2.4 A monomorphic model of ML and Haskell

At their core, ML (including Standard ML and OCaml) and Haskell are Lisps with a static type system. A fundamental part of them is a (parametrically) polymorphic type discipline due (independently of each other) to Hindley (1969) and Milner (1978).
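\[
\begin{array}{rcl}
D &\in& \mathrm{TypeName} \\
T, U &\in& \mathrm{Type} \\
T, U &::=& D \mid \mathrm{Num} \mid T \to U \\
x, y, z &\in& \mathrm{Var} \\
c &\in& \mathrm{Const} \\
A, B, C &\in& \mathrm{Constructor} \\
t, u &\in& \mathrm{Term} \\
t, u &::=& x \mid C \mid c \mid t + u \mid t\,u \mid \lambda x\, t \mid \mathrm{let}\ x = t\ \mathrm{in}\ u \mid \mathrm{case}\ t\ \mathrm{of}\ m \\
m &\in& \mathrm{Match} \\
m &::=& \langle\text{empty}\rangle \mid p \to t\ \ m \\
p &\in& \mathrm{Pattern} \\
p &::=& C \mid p\,x
\end{array}
\]

\begin{gather}
\Gamma, x : T \vdash x : T \tag{16} \\
\Gamma, C : T \vdash C : T \tag{17} \\
\Gamma \vdash c : \mathrm{Num} \tag{18} \\
\dfrac{\Gamma \vdash t : \mathrm{Num} \quad \Gamma \vdash u : \mathrm{Num}}{\Gamma \vdash t + u : \mathrm{Num}} \tag{19} \\
\dfrac{\Gamma \vdash t : U \to T \quad \Gamma \vdash u : U}{\Gamma \vdash t\,u : T} \tag{20} \\
\dfrac{\Gamma, x : U \vdash t : T}{\Gamma \vdash \lambda x\, t : U \to T} \tag{21} \\
\dfrac{\Gamma, x : T \vdash t : T \quad \Gamma, x : T \vdash u : U}{\Gamma \vdash \mathrm{let}\ x = t\ \mathrm{in}\ u : U} \tag{22} \\
\dfrac{\Gamma \vdash t : T \quad \Gamma \vdash m : T \to U}{\Gamma \vdash \mathrm{case}\ t\ \mathrm{of}\ m : U} \tag{23} \\
\dfrac{\Gamma \vdash C : T_1 \to \cdots \to T_n \to T \quad \Gamma, x_1 : T_1, \ldots, x_n : T_n \vdash u : U \quad \Gamma \vdash m : T \to U}{\Gamma \vdash C\,x_1 \cdots x_n \to u\ \ m : T \to U} \tag{24} \\
\Gamma \vdash \langle\text{empty}\rangle : T \to U \tag{25}
\end{gather}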
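As a small worked instance of these rules, here is a derivation for a term chosen purely for illustration (it does not appear elsewhere in these notes). The leaves use axiom (16), assuming as usual that the order of bindings in the environment does not matter; the steps below them use rule (19) once and rule (21) twice:

\[
\dfrac{\dfrac{\dfrac{x : \mathrm{Num}, y : \mathrm{Num} \vdash x : \mathrm{Num} \quad x : \mathrm{Num}, y : \mathrm{Num} \vdash y : \mathrm{Num}}{x : \mathrm{Num}, y : \mathrm{Num} \vdash x + y : \mathrm{Num}}\ (19)}{x : \mathrm{Num} \vdash \lambda y\, x + y : \mathrm{Num} \to \mathrm{Num}}\ (21)}{\vdash \lambda x\, \lambda y\, x + y : \mathrm{Num} \to \mathrm{Num} \to \mathrm{Num}}\ (21)
\]

Reading the tree from the bottom up, each use of rule (21) moves one bound variable into the environment, and rule (19) forces both operands of the addition to have the type Num.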
2.5 Hindley–Milner type discipline

The monomorphic system discussed above has a problem: terms do not have unique types. Consider the following type derivation¹ schema using the monomorphic ML/Haskell type system:

\[
\dfrac{x : T \vdash x : T}{\vdash (\lambda x\, x) : T \to T} \tag{26}
\]

This is not a type derivation, as T is not a type but a metavariable standing for types. We can make a valid type derivation by replacing T uniformly in it by some actual type. For example:

\[
\dfrac{x : \mathrm{Num} \vdash x : \mathrm{Num}}{\vdash (\lambda x\, x) : \mathrm{Num} \to \mathrm{Num}}
\]

Or:

\[
\dfrac{x : \mathrm{NumList} \vdash x : \mathrm{NumList}}{\vdash (\lambda x\, x) : \mathrm{NumList} \to \mathrm{NumList}}
\]

Or:

\[
\dfrac{x : (\mathrm{Num} \to \mathrm{Num}) \vdash x : (\mathrm{Num} \to \mathrm{Num})}{\vdash (\lambda x\, x) : (\mathrm{Num} \to \mathrm{Num}) \to (\mathrm{Num} \to \mathrm{Num})}
\]

But there is something inherently right about the derivation schema (26): it captures the essence of the identity function, namely that it takes anything and results in that same anything. We can try to capture this idea by adding to our type language the notion of a type variable (which we usually write using lowercase Greek letters).

Similarly, there are many functions (such as the one computing the length of a list) that do not care what the element type of a list is. But remember, the list of numbers was defined earlier as the algebraic type NumList having the constructors Nil : NumList and Cons : Num → NumList → NumList. We can write the length function for these lists easily enough:

    let length = λl case l of
                       Cons x xs → 1 + (length xs)
                       Nil → 0
    in length

and it is straightforward to determine that, given the above constructors of NumList in the type environment, this term has the type NumList → Num. But if Cons and Nil were instead defined for StringList, this exact same term, with no change, would have the type StringList → Num.

Since a constructor just packs pointers to its arguments into a memory object, there is no real need for these constructors to care what the element type is, so long as the element type information is preserved. Thus, we might as well also allow the use of type variables as parameters to type names, so that we can write List α instead of just NumList.
¹ Earlier I called these type inference trees, but I will be using the term type inference for something else soon. The term type derivation is better here.
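The point that the length function never inspects the elements can be illustrated with a small sketch in Python. This is only an analogy under stated assumptions: Python is dynamically typed, and the classes `Nil` and `Cons` and the function `length` below are hypothetical stand-ins for the constructors and the term discussed above, not part of the calculus.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Nil:
    """The empty list constructor; it takes no arguments."""


@dataclass
class Cons:
    """A list cell; it only stores references to its two arguments."""
    head: Any
    tail: Any  # either Nil or another Cons


def length(l):
    # Mirrors: case l of  Cons x xs -> 1 + (length xs);  Nil -> 0
    if isinstance(l, Cons):
        return 1 + length(l.tail)
    return 0


nums = Cons(1, Cons(2, Cons(3, Nil())))   # plays the role of a NumList
strs = Cons("a", Cons("b", Nil()))        # plays the role of a StringList
print(length(nums), length(strs))
```

The same `length` works on both lists because it only follows `tail` references and never touches `head`, which is the observation that motivates writing List α.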
We now arrive at the following changed abstract syntax. There is no need to make any changes to the term language, and so I will not repeat its abstract syntax.

\[
\begin{array}{rcl}
\alpha, \beta, \gamma, \delta, \ldots &\in& \mathrm{TypeVar} \\
D &\in& \mathrm{TypeName} \\
d &\in& \mathrm{ParametrizedTypeName} \\
d &::=& D \mid d\,T \\
T, U &\in& \mathrm{Type} \\
T, U &::=& d \mid \alpha \mid \mathrm{Num} \mid T \to U
\end{array}
\]

I will assume in the following that all constructors have sensible types. In a more thorough development I would introduce the idea of a kind system (a type system for types), but I will ignore that wrinkle here.

2.5.1 Principal types

There is a natural partial order between types: α → α is quite obviously more general than Num → Num and more general than (β → β) → (β → β). To make this more precise, notice that we can produce both of the latter by replacing type variables by more complex types:

\[
\begin{aligned}
(\alpha \to \alpha)[\alpha := \mathrm{Num}] &= \mathrm{Num} \to \mathrm{Num} \\
(\alpha \to \alpha)[\alpha := \beta \to \beta] &= (\beta \to \beta) \to (\beta \to \beta)
\end{aligned}
\]

In the general case, there may be more than one type variable; the same idea works so long as the substitution is understood to be simultaneous for all type variables involved. For example:

\[
(\alpha \to \beta)[\alpha := \beta \to \mathrm{Int}, \beta := \mathrm{Int}] = (\beta \to \mathrm{Int}) \to \mathrm{Int}
\]

Formally we define a type substitution as any function σ : TypeVar → Type such that the set { α ∈ TypeVar | σ(α) ≠ α } is finite. The bracket-substitution notation is defined as

\[
[\alpha_1 := T_1, \ldots, \alpha_n := T_n](\beta) =
\begin{cases}
T_i & \text{if } \beta = \alpha_i, \\
\beta & \text{otherwise,}
\end{cases}
\tag{27}
\]

and the application of a substitution σ to a type (written by appending the substitution to the type) is defined by structural recursion:

\begin{align}
D\sigma &= D \tag{28} \\
(d\,T)\sigma &= (d\sigma)(T\sigma) \tag{29} \\
\alpha\sigma &= \sigma(\alpha) \tag{30} \\
\mathrm{Num}\,\sigma &= \mathrm{Num} \tag{31} \\
(T \to U)\sigma &= T\sigma \to U\sigma \tag{32}
\end{align}

Example 18 Let us compute the following substitution:

\[
(\alpha \to \beta)[\alpha := \beta \to \mathrm{Int}, \beta := \mathrm{Int}]
\]
\[
\begin{aligned}
{} &= \alpha[\alpha := \beta \to \mathrm{Int}, \beta := \mathrm{Int}] \to \beta[\alpha := \beta \to \mathrm{Int}, \beta := \mathrm{Int}] && \text{by (32)} \\
&= [\alpha := \beta \to \mathrm{Int}, \beta := \mathrm{Int}](\alpha) \to [\alpha := \beta \to \mathrm{Int}, \beta := \mathrm{Int}](\beta) && \text{by (30)} \\
&= (\beta \to \mathrm{Int}) \to [\alpha := \beta \to \mathrm{Int}, \beta := \mathrm{Int}](\beta) && \text{by (27)} \\
&= (\beta \to \mathrm{Int}) \to \mathrm{Int} && \text{by (27)}
\end{aligned}
\]

Exercise 14 Compute the substitution (α → α)[α := β → β].

Now, we will define that a type T is at most as general as a type U, written as T ≤ U, if there is some type substitution σ for which T = Uσ holds. Further, we will define that a type substitution σ is a renaming if for all type variables α it holds that σ(α) is a type variable and if for all distinct type variables α and β it holds that σ(α) ≠ σ(β). Now, types T and U are identical up to renaming, written T ≡ U, if there is a renaming σ such that Tσ = U.

Example 19 α → β ≡ β → α, because (α → β)[α := β, β := α] = β → α and [α := β, β := α] is a renaming.

Exercise 15 Prove:
1. T ≤ T holds for all types T.
2. If T ≤ U and U ≤ T, then T ≡ U, for all types T and U.
3. If T₁ ≤ T₂ and T₂ ≤ T₃, then T₁ ≤ T₃, for all types T₁, T₂, and T₃.

Now, it is possible to prove that for any set of types 𝒯 there is at least one generalization of those types, that is, a type that is at least as general as every type in 𝒯. Further, it is possible to prove that there is a unique (up to renaming) least general generalization, that is, some generalization of 𝒯 that is at most as general as any other generalization of 𝒯. Thus, for any typable term there is a unique (up to renaming) principal type, which is the least general generalization of all types that the monomorphic theory provides.

The way to find the principal type of a term is essentially this:
1. Start with a type judgment stating that the type (with respect to an appropriate type environment) of the term is some type variable that does not occur elsewhere in the type judgment (a fresh type variable).
2. Build a type derivation as usual, with the following two provisos:
   - Each time you need to invent a type, use a fresh type variable.
   - Each time you need to determine whether two types are identical, proceed as if they are, but record this identity requirement in a separate list of type equations.
3. Finally, solve the set of type equations.

Example 20 Let us proceed with this process, up to but not including solving the equations, for determining the principal type of λf λx f(f x). Start with an empty environment and a fresh type variable:

\[
\vdash \lambda f\, \lambda x\, f(f\,x) : \alpha
\]

We use rule (21), using fresh type variables β and γ for U and T, respectively, and record a type equation to make the rule come out right:

\[
\dfrac{f : \beta \vdash \lambda x\, f(f\,x) : \gamma}{\vdash \lambda f\, \lambda x\, f(f\,x) : \alpha}
\qquad\qquad
\alpha = \beta \to \gamma
\]
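We use rule (21) again, using fresh type variables δ and ε for U and T, respectively, and record a type equation to make the rule come out right:

\[
\dfrac{\dfrac{f : \beta, x : \delta \vdash f(f\,x) : \varepsilon}{f : \beta \vdash \lambda x\, f(f\,x) : \gamma}}{\vdash \lambda f\, \lambda x\, f(f\,x) : \alpha}
\qquad\qquad
\begin{aligned}
\alpha &= \beta \to \gamma \\
\gamma &= \delta \to \varepsilon
\end{aligned}
\]

Next, we use rule (20), using the fresh type variable ϕ as U:

\[
\dfrac{\dfrac{\dfrac{f : \beta, x : \delta \vdash f : \phi \to \varepsilon \quad f : \beta, x : \delta \vdash f\,x : \phi}{f : \beta, x : \delta \vdash f(f\,x) : \varepsilon}}{f : \beta \vdash \lambda x\, f(f\,x) : \gamma}}{\vdash \lambda f\, \lambda x\, f(f\,x) : \alpha}
\qquad\qquad
\begin{aligned}
\alpha &= \beta \to \gamma \\
\gamma &= \delta \to \varepsilon
\end{aligned}
\]

We can now finish one branch of this derivation tree (the one ending in f : ϕ → ε) by using axiom (16) and recording a type equation. The derivation tree keeps its shape; the equation list becomes:

\[
\begin{aligned}
\alpha &= \beta \to \gamma \\
\gamma &= \delta \to \varepsilon \\
\beta &= \phi \to \varepsilon
\end{aligned}
\]

The other branch requires another use of rule (20) and the use of a fresh type variable, this time ζ, for U:

\[
\dfrac{\dfrac{\dfrac{f : \beta, x : \delta \vdash f : \phi \to \varepsilon \quad \dfrac{f : \beta, x : \delta \vdash f : \zeta \to \phi \quad f : \beta, x : \delta \vdash x : \zeta}{f : \beta, x : \delta \vdash f\,x : \phi}}{f : \beta, x : \delta \vdash f(f\,x) : \varepsilon}}{f : \beta \vdash \lambda x\, f(f\,x) : \gamma}}{\vdash \lambda f\, \lambda x\, f(f\,x) : \alpha}
\qquad\qquad
\begin{aligned}
\alpha &= \beta \to \gamma \\
\gamma &= \delta \to \varepsilon \\
\beta &= \phi \to \varepsilon
\end{aligned}
\]

We can close the two open branches by using axiom (16) and recording the appropriate equations. Again the tree keeps its shape, and the final equation list is:

\[
\begin{aligned}
\alpha &= \beta \to \gamma \\
\gamma &= \delta \to \varepsilon \\
\beta &= \phi \to \varepsilon \\
\beta &= \zeta \to \phi \\
\delta &= \zeta
\end{aligned}
\]

Example 21 Let us proceed with this process, up to but not including solving the equations, for determining the principal type of let x = Cons 1 x in x where, almost like before, Cons : α → List α → List α. We start with the type environment Γ = Cons : α → List α → List α and a fresh type variable (which now cannot be α):

\[
\Gamma \vdash (\mathrm{let}\ x = \mathrm{Cons}\ 1\ x\ \mathrm{in}\ x) : \beta
\]

Use rule (22) and a fresh type variable γ:

\[
\dfrac{\Gamma, x : \gamma \vdash (\mathrm{Cons}\ 1\ x) : \gamma \quad \Gamma, x : \gamma \vdash x : \beta}{\Gamma \vdash (\mathrm{let}\ x = \mathrm{Cons}\ 1\ x\ \mathrm{in}\ x) : \beta}
\]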
Use rule (20) and a fresh type variable δ:

\[
\dfrac{\dfrac{\Gamma, x : \gamma \vdash (\mathrm{Cons}\ 1) : \delta \to \gamma \quad \Gamma, x : \gamma \vdash x : \delta}{\Gamma, x : \gamma \vdash (\mathrm{Cons}\ 1\ x) : \gamma} \quad \Gamma, x : \gamma \vdash x : \beta}{\Gamma \vdash (\mathrm{let}\ x = \mathrm{Cons}\ 1\ x\ \mathrm{in}\ x) : \beta}
\]

Use rule (20) and a fresh type variable ε:

\[
\dfrac{\dfrac{\dfrac{\Gamma, x : \gamma \vdash \mathrm{Cons} : \varepsilon \to \delta \to \gamma \quad \Gamma, x : \gamma \vdash 1 : \varepsilon}{\Gamma, x : \gamma \vdash (\mathrm{Cons}\ 1) : \delta \to \gamma} \quad \Gamma, x : \gamma \vdash x : \delta}{\Gamma, x : \gamma \vdash (\mathrm{Cons}\ 1\ x) : \gamma} \quad \Gamma, x : \gamma \vdash x : \beta}{\Gamma \vdash (\mathrm{let}\ x = \mathrm{Cons}\ 1\ x\ \mathrm{in}\ x) : \beta}
\]

From here on the derivation tree keeps its shape; only the equation list grows. Now, close the leftmost branch by using axiom (17) and recording the appropriate type equation:

\[
\alpha \to \mathrm{List}\ \alpha \to \mathrm{List}\ \alpha = \varepsilon \to \delta \to \gamma
\]

Close the leftmost open branch by using axiom (18) and recording the appropriate type equation:

\[
\begin{aligned}
\alpha \to \mathrm{List}\ \alpha \to \mathrm{List}\ \alpha &= \varepsilon \to \delta \to \gamma \\
\varepsilon &= \mathrm{Num}
\end{aligned}
\]

Close the remaining open branches by using axiom (16) and recording the appropriate type equations:

\[
\begin{aligned}
\alpha \to \mathrm{List}\ \alpha \to \mathrm{List}\ \alpha &= \varepsilon \to \delta \to \gamma \\
\varepsilon &= \mathrm{Num} \\
\gamma &= \delta \\
\gamma &= \beta
\end{aligned}
\]

2.5.2 Solving type equations

The problem of solving these sorts of equations comes up in many different subfields of classical computer science and is known as the syntactic unification problem. The first algorithm was proposed in the field of automated reasoning by Robinson (1965); a rather clearer nondeterministic algorithm was described by Martelli and Montanari (1982). I will describe the latter.²

² Martelli and Montanari (1982) also show how this algorithm can be refined to be easy to implement efficiently on a deterministic computer. I will not discuss it here, as I am dealing with concepts, not efficient implementation.
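Before turning to the solving step, it may help to see how equation sets like those of examples 20 and 21 can be produced mechanically. The following Python sketch is an illustration under stated assumptions, not the method of these notes: the tuple encoding of terms, the names `generate` and `fresh`, and the variable names t0, t1, ... are all hypothetical. A type variable is a plain string, and a constructed type is a tuple such as `("Num",)` or `("->", U, T)`. Variables and constructors share one environment dictionary, and the case construct is left out.

```python
import itertools


def generate(env, term, ty, fresh, eqs):
    """Append to eqs the equations needed for the judgment  env |- term : ty."""
    kind = term[0]
    if kind in ("var", "con"):            # axioms (16) and (17)
        eqs.append((env[term[1]], ty))
    elif kind == "const":                 # axiom (18)
        eqs.append((ty, ("Num",)))
    elif kind == "add":                   # rule (19)
        eqs.append((ty, ("Num",)))
        generate(env, term[1], ("Num",), fresh, eqs)
        generate(env, term[2], ("Num",), fresh, eqs)
    elif kind == "app":                   # rule (20): invent U
        u = next(fresh)
        generate(env, term[1], ("->", u, ty), fresh, eqs)
        generate(env, term[2], u, fresh, eqs)
    elif kind == "lam":                   # rule (21): invent U and T
        u, t = next(fresh), next(fresh)
        eqs.append((ty, ("->", u, t)))
        generate({**env, term[1]: u}, term[2], t, fresh, eqs)
    elif kind == "let":                   # rule (22): invent T for x
        t = next(fresh)
        inner = {**env, term[1]: t}
        generate(inner, term[2], t, fresh, eqs)
        generate(inner, term[3], ty, fresh, eqs)
    else:
        raise ValueError(f"unknown term: {term!r}")
    return eqs


# The term of example 20: λf λx f(f x), judged to have the fresh type t0.
term = ("lam", "f", ("lam", "x",
        ("app", ("var", "f"), ("app", ("var", "f"), ("var", "x")))))
fresh = (f"t{i}" for i in itertools.count(1))
eqs = generate({}, term, "t0", fresh, [])
for lhs, rhs in eqs:
    print(lhs, "=", rhs)
```

With t0, ..., t6 read as α, β, γ, δ, ε, ϕ, ζ, the five equations printed correspond to the equation list reached at the end of example 20. The sketch records an equation wherever the procedure says to "proceed as if the types are identical", and draws from `fresh` wherever a type has to be invented.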
The input to this algorithm is a set of type equations; the output is either a solution to the set of type equations or an indication of failure. The solution will be in the form of a set of type equations, each equation having a type variable on the left hand side and a type on the right hand side, and no type variable being the left hand side of more than one type equation of the solution.

The algorithm proceeds by picking nondeterministically some equation matching the left hand side of one of the following rules and replacing it with the equations on the right hand side of that same rule, mutatis mutandis. If the right hand side is empty, the equation is simply deleted.

\begin{align}
T = \alpha &\;\Rightarrow\; \alpha = T \quad \text{(where $T$ is not a type variable)} \tag{33} \\
\alpha = \alpha &\;\Rightarrow\; \tag{34} \\
\mathrm{Num} = \mathrm{Num} &\;\Rightarrow\; \tag{35} \\
D = D &\;\Rightarrow\; \tag{36} \\
T_1 \to U_1 = T_2 \to U_2 &\;\Rightarrow\; T_1 = T_2,\ U_1 = U_2 \tag{37} \\
d\,T = d'\,U &\;\Rightarrow\; d = d',\ T = U \tag{38}
\end{align}

There is one other rule, the variable elimination rule, which proceeds as follows. Choose an equation of the form α = T, where α does not occur in T but does occur in at least one other equation in the set. Do not change this equation, but modify every other equation by applying the substitution [α := T] to both of its sides.

This algorithm terminates when no equation matches any rule. If the resulting equation set has none of the following error conditions, the set is a solution:³

- There is an equation of the form α = T, where α occurs in T but is not T. This indicates that there is an infinitely long type involved, which is not allowed. This error condition is often called an occurs check failure.
- There is an equation of the form T₁ → T₂ = U or U = T₁ → T₂, where U is neither a function type nor a type variable. This indicates a type error where something that is not a function (whatever gave rise to U) is being used as a function.
- There is an equation of the form D = T or T = D, where T is neither D nor a type variable. This indicates a type error involving the algebraic type D.
- There is an equation of the form Num = T or T = Num, where T is neither Num nor a type variable.
  This indicates that something that is not a number (whatever gave rise to T) is being used as a number.

Note that none of the rules deletes an error condition, so an implementation may terminate with an error as soon as it finds one, even if there are rule-matching equations remaining.

Example 22 In example 20, we derived that the type of λf λx f(f x) is α, as constrained by the following set of type equations:

\[
\begin{aligned}
\alpha &= \beta \to \gamma \\
\gamma &= \delta \to \varepsilon \\
\beta &= \phi \to \varepsilon \\
\beta &= \zeta \to \phi \\
\delta &= \zeta
\end{aligned}
\]

³ Corrected on 2017-02-21. Thanks to a student for pointing out the errors.
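The rules (33) to (38) together with variable elimination can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the refined algorithm of Martelli and Montanari: it replaces the nondeterministic choice by a fixed processing order, it treats every constructed type uniformly as a tuple `(head, arg1, ..., argn)` (so rules (35) to (38) collapse into one decomposition step), and the names `unify`, `occurs`, `substitute`, `fails`, and `UnificationError` are hypothetical. Type variables are plain strings. The equation set used is the one just stated for λf λx f(f x).

```python
class UnificationError(Exception):
    """Raised when an error condition is found in the equation set."""


def occurs(var, ty):
    """True if the type variable var occurs anywhere in the type ty."""
    if isinstance(ty, str):
        return var == ty
    return any(occurs(var, arg) for arg in ty[1:])


def substitute(var, repl, ty):
    """Apply the substitution [var := repl] to the type ty."""
    if isinstance(ty, str):
        return repl if ty == var else ty
    return (ty[0],) + tuple(substitute(var, repl, arg) for arg in ty[1:])


def unify(equations):
    """Solve a list of (lhs, rhs) equations; return a dict variable -> type."""
    todo = list(equations)
    solved = {}
    while todo:
        lhs, rhs = todo.pop()
        if lhs == rhs:                        # rules (34)-(36): delete
            continue
        if not isinstance(lhs, str) and isinstance(rhs, str):
            lhs, rhs = rhs, lhs               # rule (33): variable to the left
        if isinstance(lhs, str):              # variable elimination
            if occurs(lhs, rhs):
                raise UnificationError(f"occurs check failed: {lhs} = {rhs}")
            todo = [(substitute(lhs, rhs, a), substitute(lhs, rhs, b))
                    for a, b in todo]
            solved = {v: substitute(lhs, rhs, t) for v, t in solved.items()}
            solved[lhs] = rhs
        elif lhs[0] == rhs[0] and len(lhs) == len(rhs):
            todo.extend(zip(lhs[1:], rhs[1:]))  # rules (37) and (38)
        else:
            raise UnificationError(f"cannot unify {lhs} with {rhs}")
    return solved


def fn(a, b):
    """Build the function type a -> b."""
    return ("->", a, b)


def fails(equations):
    """True if unify reports an error condition for these equations."""
    try:
        unify(equations)
    except UnificationError:
        return True
    return False


equations = [
    ("α", fn("β", "γ")),
    ("γ", fn("δ", "ε")),
    ("β", fn("ϕ", "ε")),
    ("β", fn("ζ", "ϕ")),
    ("δ", "ζ"),
]
solution = unify(equations)
print(solution["α"])   # ('->', ('->', 'ε', 'ε'), ('->', 'ε', 'ε'))
```

Which variable survives in the answer depends on the processing order; with the order used here the result for α is (ε → ε) → (ε → ε). A different order may yield a different but equivalent answer, identical up to renaming. `fails([("α", fn("α", "β"))])` returns `True`, because that equation triggers the occurs check.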
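Let us solve this equation set. Each row below shows the equation set obtained from the previous row by the step named on the right; the first row is the initial set.

\[
\begin{array}{ll}
\{\ \alpha = \beta \to \gamma,\ \ \gamma = \delta \to \varepsilon,\ \ \beta = \phi \to \varepsilon,\ \ \beta = \zeta \to \phi,\ \ \delta = \zeta\ \} & \\
\{\ \alpha = \beta \to \gamma,\ \ \gamma = \zeta \to \varepsilon,\ \ \beta = \phi \to \varepsilon,\ \ \beta = \zeta \to \phi,\ \ \delta = \zeta\ \} & \text{eliminate } \delta \\
\{\ \alpha = \beta \to (\zeta \to \varepsilon),\ \ \gamma = \zeta \to \varepsilon,\ \ \beta = \phi \to \varepsilon,\ \ \beta = \zeta \to \phi,\ \ \delta = \zeta\ \} & \text{eliminate } \gamma \\
\{\ \alpha = (\zeta \to \phi) \to (\zeta \to \varepsilon),\ \ \gamma = \zeta \to \varepsilon,\ \ \zeta \to \phi = \phi \to \varepsilon,\ \ \beta = \zeta \to \phi,\ \ \delta = \zeta\ \} & \text{eliminate } \beta \text{ (4th eq)} \\
\{\ \alpha = (\zeta \to \phi) \to (\zeta \to \varepsilon),\ \ \gamma = \zeta \to \varepsilon,\ \ \zeta = \phi,\ \ \phi = \varepsilon,\ \ \beta = \zeta \to \phi,\ \ \delta = \zeta\ \} & \text{by (37)} \\
\{\ \alpha = (\zeta \to \varepsilon) \to (\zeta \to \varepsilon),\ \ \gamma = \zeta \to \varepsilon,\ \ \zeta = \varepsilon,\ \ \phi = \varepsilon,\ \ \beta = \zeta \to \varepsilon,\ \ \delta = \zeta\ \} & \text{eliminate } \phi \\
\{\ \alpha = (\varepsilon \to \varepsilon) \to (\varepsilon \to \varepsilon),\ \ \gamma = \varepsilon \to \varepsilon,\ \ \zeta = \varepsilon,\ \ \phi = \varepsilon,\ \ \beta = \varepsilon \to \varepsilon,\ \ \delta = \varepsilon\ \} & \text{eliminate } \zeta
\end{array}
\]

which is a solution. Thus, the principal type of λf λx f(f x) is (ε → ε) → (ε → ε), that is, (ε → ε) → ε → ε.

Example 23 In example 21, we derived that the type of let x = Cons 1 x in x (in a type environment defining Cons : α → List α → List α) is β, as constrained by the following set of type equations:

\[
\begin{aligned}
\alpha \to \mathrm{List}\ \alpha \to \mathrm{List}\ \alpha &= \varepsilon \to \delta \to \gamma \\
\varepsilon &= \mathrm{Num} \\
\gamma &= \delta \\
\gamma &= \beta
\end{aligned}
\]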
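Let us solve this type equation set. Each row shows the equation set obtained from the previous row by the step named on the right; the first row is the initial set.

\[
\begin{array}{ll}
\{\ \alpha \to \mathrm{List}\ \alpha \to \mathrm{List}\ \alpha = \varepsilon \to \delta \to \gamma,\ \ \varepsilon = \mathrm{Num},\ \ \gamma = \delta,\ \ \gamma = \beta\ \} & \\
\{\ \alpha \to \mathrm{List}\ \alpha \to \mathrm{List}\ \alpha = \varepsilon \to \delta \to \beta,\ \ \varepsilon = \mathrm{Num},\ \ \beta = \delta,\ \ \gamma = \beta\ \} & \text{eliminate } \gamma \text{ (4th eq)} \\
\{\ \alpha \to \mathrm{List}\ \alpha \to \mathrm{List}\ \alpha = \varepsilon \to \delta \to \delta,\ \ \varepsilon = \mathrm{Num},\ \ \beta = \delta,\ \ \gamma = \delta\ \} & \text{eliminate } \beta \\
\{\ \alpha \to \mathrm{List}\ \alpha \to \mathrm{List}\ \alpha = \mathrm{Num} \to \delta \to \delta,\ \ \varepsilon = \mathrm{Num},\ \ \beta = \delta,\ \ \gamma = \delta\ \} & \text{eliminate } \varepsilon \\
\{\ \alpha = \mathrm{Num},\ \ \mathrm{List}\ \alpha \to \mathrm{List}\ \alpha = \delta \to \delta,\ \ \varepsilon = \mathrm{Num},\ \ \beta = \delta,\ \ \gamma = \delta\ \} & \text{by (37)} \\
\{\ \alpha = \mathrm{Num},\ \ \mathrm{List}\ \alpha = \delta,\ \ \varepsilon = \mathrm{Num},\ \ \beta = \delta,\ \ \gamma = \delta\ \} & \text{by (37) (see below)} \\
\{\ \alpha = \mathrm{Num},\ \ \delta = \mathrm{List}\ \alpha,\ \ \varepsilon = \mathrm{Num},\ \ \beta = \delta,\ \ \gamma = \delta\ \} & \text{by (33)} \\
\{\ \alpha = \mathrm{Num},\ \ \delta = \mathrm{List}\ \mathrm{Num},\ \ \varepsilon = \mathrm{Num},\ \ \beta = \delta,\ \ \gamma = \delta\ \} & \text{eliminate } \alpha
\end{array}
\]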
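\[
\begin{array}{ll}
\{\ \alpha = \mathrm{Num},\ \ \delta = \mathrm{List}\ \mathrm{Num},\ \ \varepsilon = \mathrm{Num},\ \ \beta = \mathrm{List}\ \mathrm{Num},\ \ \gamma = \mathrm{List}\ \mathrm{Num}\ \} & \text{eliminate } \delta
\end{array}
\]

which is a solution. Note that when applying rule (37) to the equation List α → List α = δ → δ, the resulting two equations are identical and, in a set, are not repeated.

Thus we have the principal type of let x = Cons 1 x in x: it is List Num.

Note that the solution that the algorithm gives can be interpreted as a type substitution. It is sometimes called the most general unifier or mgu of the original set of type equations.

Exercise 16 Determine the principal type of λx x + 1 using this technique.

2.5.3 Type schemes

This puzzle is not quite solved yet. Suppose that we have defined an ordered pair as the parametric algebraic type Pair α β having the constructor Pair : α → β → Pair α β. I will call an environment containing this constructor Γ. Now, let us type Pair (Pair 1 2) 3:⁴

\[
\dfrac{\dfrac{\Gamma \vdash \mathrm{Pair} : \delta \to \mathrm{Num} \to \gamma \quad \dfrac{\dfrac{\Gamma \vdash \mathrm{Pair} : \mathrm{Num} \to \mathrm{Num} \to \delta \quad \Gamma \vdash 1 : \mathrm{Num}}{\Gamma \vdash \mathrm{Pair}\ 1 : \mathrm{Num} \to \delta} \quad \Gamma \vdash 2 : \mathrm{Num}}{\Gamma \vdash \mathrm{Pair}\ 1\ 2 : \delta}}{\Gamma \vdash \mathrm{Pair}\ (\mathrm{Pair}\ 1\ 2) : \mathrm{Num} \to \gamma} \quad \Gamma \vdash 3 : \mathrm{Num}}{\Gamma \vdash \mathrm{Pair}\ (\mathrm{Pair}\ 1\ 2)\ 3 : \gamma}
\]

\[
\left\{
\begin{aligned}
\alpha \to \beta \to \mathrm{Pair}\ \alpha\ \beta &= \delta \to \mathrm{Num} \to \gamma \\
\alpha \to \beta \to \mathrm{Pair}\ \alpha\ \beta &= \mathrm{Num} \to \mathrm{Num} \to \delta
\end{aligned}
\right\}
\]

Now, let us solve the resulting set of type equations. Each row shows the equation set obtained from the previous row by the step named on the right; the first row is the initial set.

\[
\begin{array}{ll}
\{\ \alpha \to \beta \to \mathrm{Pair}\ \alpha\ \beta = \delta \to \mathrm{Num} \to \gamma,\ \ \alpha \to \beta \to \mathrm{Pair}\ \alpha\ \beta = \mathrm{Num} \to \mathrm{Num} \to \delta\ \} & \\
\{\ \alpha = \delta,\ \ \beta \to \mathrm{Pair}\ \alpha\ \beta = \mathrm{Num} \to \gamma,\ \ \alpha \to \beta \to \mathrm{Pair}\ \alpha\ \beta = \mathrm{Num} \to \mathrm{Num} \to \delta\ \} & \text{by (37) (1st eq)} \\
\{\ \alpha = \delta,\ \ \beta = \mathrm{Num},\ \ \mathrm{Pair}\ \alpha\ \beta = \gamma,\ \ \alpha \to \beta \to \mathrm{Pair}\ \alpha\ \beta = \mathrm{Num} \to \mathrm{Num} \to \delta\ \} & \text{by (37) (2nd eq)} \\
\{\ \alpha = \delta,\ \ \beta = \mathrm{Num},\ \ \mathrm{Pair}\ \alpha\ \beta = \gamma,\ \ \alpha = \mathrm{Num},\ \ \beta \to \mathrm{Pair}\ \alpha\ \beta = \mathrm{Num} \to \delta\ \} & \text{by (37) (4th eq)}
\end{array}
\]

⁴ For simplicity, I will not invent a type variable where the only possible type is obvious, that is, when typing numeric constants.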
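\[
\begin{array}{ll}
\{\ \alpha = \delta,\ \ \beta = \mathrm{Num},\ \ \mathrm{Pair}\ \alpha\ \beta = \gamma,\ \ \alpha = \mathrm{Num},\ \ \mathrm{Pair}\ \alpha\ \beta = \delta\ \} & \text{by (37) (5th eq)} \\
\{\ \mathrm{Num} = \delta,\ \ \beta = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \beta = \gamma,\ \ \alpha = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \beta = \delta\ \} & \text{eliminate } \alpha \text{ (4th eq)} \\
\{\ \mathrm{Num} = \delta,\ \ \beta = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num} = \gamma,\ \ \alpha = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num} = \delta\ \} & \text{eliminate } \beta \\
\{\ \delta = \mathrm{Num},\ \ \beta = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num} = \gamma,\ \ \alpha = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num} = \delta\ \} & \text{by (33) (1st eq)} \\
\{\ \delta = \mathrm{Num},\ \ \beta = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num} = \gamma,\ \ \alpha = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num} = \mathrm{Num}\ \} & \text{eliminate } \delta \text{ (1st eq)}
\end{array}
\]

The algorithm terminates with an error condition due to the fifth equation in the final set. Thus Pair (Pair 1 2) 3 has no type at all; but surely it should!

The problem here is that the α and β in the environment's Pair : α → β → Pair α β can have only one type each, but this problem term uses Pair polymorphically. The solution adopted in the Hindley–Milner type discipline is to allow variables and constructors to have a truly generic type in the type environment, one that signals that, each time the variable or constructor is used, it may have a different instantiation of its principal type. These truly generic types are called type schemes in the literature: types together with a list of type variables that may be instantiated differently in different situations. We write a type scheme using a notation borrowed from logic: ∀α α → α means that the α may stand for different types in different uses of the variable or constructor. Thus, we add to the abstract syntax:

\[
\begin{array}{rcl}
\Sigma &\in& \mathrm{TypeScheme} \\
\Sigma &::=& \forall \alpha_1, \ldots, \alpha_n\ T
\end{array}
\]

Note that a type scheme is not recursive, and the quantifier can thus appear only once.

We now allow type environments to give variables and constructors a type scheme instead of a type; the idea is that all constructor definitions and all toplevel variable definitions will generalize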
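all type variables in the principal type. We now give the following two typing rules that instantiate the type variables at each use:

\begin{gather}
\Gamma, x : \forall \alpha_1, \ldots, \alpha_n\ T \vdash x : T[\alpha_1 := \beta_1, \ldots, \alpha_n := \beta_n] \tag{39} \\
\Gamma, C : \forall \alpha_1, \ldots, \alpha_n\ T \vdash C : T[\alpha_1 := \beta_1, \ldots, \alpha_n := \beta_n] \tag{40}
\end{gather}

Note that in both rules, the β₁, …, βₙ must be truly fresh: they must be type variables that do not yet occur anywhere in the type derivation (not just in the type judgment). Otherwise the typing may fail.

In practice we use these rules to generate type equations. Thus, if we have a type judgment

\[
x : \forall \alpha_1, \ldots, \alpha_n\ T \vdash x : U
\]

we generate from it the following type equation (where β₁, …, βₙ are fresh type variables):

\[
T[\alpha_1 := \beta_1, \ldots, \alpha_n := \beta_n] = U
\]

Now, using the environment Pair : ∀α, β α → β → Pair α β, we may successfully type Pair (Pair 1 2) 3:

\[
\dfrac{\dfrac{\Gamma \vdash \mathrm{Pair} : \delta \to \mathrm{Num} \to \gamma \quad \dfrac{\dfrac{\Gamma \vdash \mathrm{Pair} : \mathrm{Num} \to \mathrm{Num} \to \delta \quad \Gamma \vdash 1 : \mathrm{Num}}{\Gamma \vdash \mathrm{Pair}\ 1 : \mathrm{Num} \to \delta} \quad \Gamma \vdash 2 : \mathrm{Num}}{\Gamma \vdash \mathrm{Pair}\ 1\ 2 : \delta}}{\Gamma \vdash \mathrm{Pair}\ (\mathrm{Pair}\ 1\ 2) : \mathrm{Num} \to \gamma} \quad \Gamma \vdash 3 : \mathrm{Num}}{\Gamma \vdash \mathrm{Pair}\ (\mathrm{Pair}\ 1\ 2)\ 3 : \gamma}
\]

\[
\left\{
\begin{aligned}
\varepsilon \to \phi \to \mathrm{Pair}\ \varepsilon\ \phi &= \delta \to \mathrm{Num} \to \gamma \\
\zeta \to \eta \to \mathrm{Pair}\ \zeta\ \eta &= \mathrm{Num} \to \mathrm{Num} \to \delta
\end{aligned}
\right\}
\]

Solve the set of equations. Each row shows the equation set obtained from the previous row by the step named on the right; the first row is the initial set.

\[
\begin{array}{ll}
\{\ \varepsilon \to \phi \to \mathrm{Pair}\ \varepsilon\ \phi = \delta \to \mathrm{Num} \to \gamma,\ \ \zeta \to \eta \to \mathrm{Pair}\ \zeta\ \eta = \mathrm{Num} \to \mathrm{Num} \to \delta\ \} & \\
\{\ \varepsilon = \delta,\ \ \phi \to \mathrm{Pair}\ \varepsilon\ \phi = \mathrm{Num} \to \gamma,\ \ \zeta \to \eta \to \mathrm{Pair}\ \zeta\ \eta = \mathrm{Num} \to \mathrm{Num} \to \delta\ \} & \text{by (37) (1st eq)} \\
\{\ \varepsilon = \delta,\ \ \phi = \mathrm{Num},\ \ \mathrm{Pair}\ \varepsilon\ \phi = \gamma,\ \ \zeta \to \eta \to \mathrm{Pair}\ \zeta\ \eta = \mathrm{Num} \to \mathrm{Num} \to \delta\ \} & \text{by (37) (2nd eq)} \\
\{\ \varepsilon = \delta,\ \ \phi = \mathrm{Num},\ \ \mathrm{Pair}\ \varepsilon\ \phi = \gamma,\ \ \zeta = \mathrm{Num},\ \ \eta \to \mathrm{Pair}\ \zeta\ \eta = \mathrm{Num} \to \delta\ \} & \text{by (37) (4th eq)}
\end{array}
\]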
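\[
\begin{array}{ll}
\{\ \varepsilon = \delta,\ \ \phi = \mathrm{Num},\ \ \mathrm{Pair}\ \varepsilon\ \phi = \gamma,\ \ \zeta = \mathrm{Num},\ \ \eta = \mathrm{Num},\ \ \mathrm{Pair}\ \zeta\ \eta = \delta\ \} & \text{by (37) (5th eq)} \\
\{\ \varepsilon = \delta,\ \ \phi = \mathrm{Num},\ \ \mathrm{Pair}\ \delta\ \phi = \gamma,\ \ \zeta = \mathrm{Num},\ \ \eta = \mathrm{Num},\ \ \mathrm{Pair}\ \zeta\ \eta = \delta\ \} & \text{eliminate } \varepsilon \\
\{\ \varepsilon = \delta,\ \ \phi = \mathrm{Num},\ \ \mathrm{Pair}\ \delta\ \phi = \gamma,\ \ \zeta = \mathrm{Num},\ \ \eta = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \eta = \delta\ \} & \text{eliminate } \zeta \\
\{\ \varepsilon = \delta,\ \ \phi = \mathrm{Num},\ \ \mathrm{Pair}\ \delta\ \phi = \gamma,\ \ \zeta = \mathrm{Num},\ \ \eta = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num} = \delta\ \} & \text{eliminate } \eta \\
\{\ \varepsilon = \delta,\ \ \phi = \mathrm{Num},\ \ \mathrm{Pair}\ \delta\ \mathrm{Num} = \gamma,\ \ \zeta = \mathrm{Num},\ \ \eta = \mathrm{Num},\ \ \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num} = \delta\ \} & \text{eliminate } \phi \\
\{\ \varepsilon = \delta,\ \ \phi = \mathrm{Num},\ \ \gamma = \mathrm{Pair}\ \delta\ \mathrm{Num},\ \ \zeta = \mathrm{Num},\ \ \eta = \mathrm{Num},\ \ \delta = \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num}\ \} & \text{by (33) (3rd and 6th eq)}
\end{array}
\]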
\[
\begin{array}{ll}
\{\ \varepsilon = \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num},\ \ \phi = \mathrm{Num},\ \ \gamma = \mathrm{Pair}\ (\mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num})\ \mathrm{Num},\ \ \zeta = \mathrm{Num},\ \ \eta = \mathrm{Num},\ \ \delta = \mathrm{Pair}\ \mathrm{Num}\ \mathrm{Num}\ \} & \text{eliminate } \delta
\end{array}
\]

The algorithm terminates and the result is a solution (most general unifier). Thus, we have the principal type of Pair (Pair 1 2) 3: it is Pair (Pair Num Num) Num.

2.5.4 Further developments

The classical Hindley–Milner system supports generalization of local variables. It turns out this feature is rarely used, and thus Vytiniotis, Peyton Jones, and Schrijvers (2010) argue that implicit generalization of local variables should be abandoned. I have followed their advice above.

Odersky, Sulzmann, and Wehr (1999) proposed a general framework, called HM(X), for Hindley–Milner style type disciplines, based on an extensible constraint system that is a generalization of the type equation approach I have discussed. The basic Hindley–Milner system remains a special case of this framework. A number of modern type discipline developments (many in actual use in Haskell at this time) have been formulated and reported using this framework.

References

Hindley, R. (1969). The Principal Type-Scheme of an Object in Combinatory Logic. In: Transactions of the American Mathematical Society 146, pp. 29–60. doi: 10.2307/1995158.

Martelli, Alberto and Ugo Montanari (1982). An Efficient Unification Algorithm. In: ACM Transactions on Programming Languages and Systems 4.2, pp. 258–282. doi: 10.1145/357162.357169.

Milner, Robin (1978). A Theory of Type Polymorphism in Programming. In: Journal of Computer and System Sciences 17.3, pp. 348–375. doi: 10.1016/0022-0000(78)90014-4.

Odersky, Martin, Martin Sulzmann, and Martin Wehr (1999). Type Inference with Constrained Types. In: Theory and Practice of Object Systems 5.1, pp. 35–55. doi: 10.1002/(SICI)1096-9942(199901/03)5:1<35::AID-TAPO4>3.0.CO;2-4. url: https://infoscience.epfl.ch/record/64388.

Robinson, J. A. (1965). A Machine-Oriented Logic Based on the Resolution Principle. In: Journal of the ACM 12.1, pp. 23–41. doi: 10.1145/321250.321253.

Vytiniotis, Dimitrios, Simon Peyton Jones, and Tom Schrijvers (2010). Let Should Not Be Generalized. In: Proceedings of the 5th ACM SIGPLAN Workshop on Types in Language Design and Implementation, pp. 39–50. doi: 10.1145/1708016.1708023.