Typed Lambda Calculi
Nikos Tzeveλekos
Queen Mary, University of London
What is the Lambda Calculus

A minimal formalism expressing computation with functionals:

$$
s ::= x \mid \lambda x.s \mid ss
$$

All you need is lambda. (Simon Peyton-Jones)

The aim of the course is to provide an introduction to the lambda calculus along with a selection of results on its operational and denotational semantics.
Material

Mostly following:
  A. D. Ker. Lambda Calculus and Types (2009). http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.157.3171

Background reading:
  Henk Barendregt. The Lambda Calculus: Its Syntax and Semantics. North-Holland, revised edition (1984).
  J. Roger Hindley. Basic Simple Type Theory. CUP (1997).

Freely downloadable:
  J.-Y. Girard. Proofs and Types (1990).
  Barendregt and Barendsen. An Introduction to the Lambda Calculus (1994).
  R. Loader. Notes on Simply Typed Lambda Calculus (1998).
  Peter Selinger. Lecture notes on the Lambda Calculus (2007).
  ...
What is the Lambda Calculus for?

It depends on whom you ask:
  Church, Curry (1920s): it is the theory of functions.
  Turing (1930s): it is the definition of effective computability.
  Brouwer, Heyting, Kolmogorov, ...: it is a representation of logical proofs (Curry-Howard correspondence; 1920s to present day).
  McCarthy, Scott, ...: it is a basis for the definition of functional programming languages, e.g. Haskell (1950s-60s).
  Lambek, lots of other category theorists, ...: it is the internal language of Cartesian Closed Categories (1970s).

It is also a vital part of any computer scientist's vocabulary. There are a number of varieties of λ-calculi, suited to different tasks.
Intuitively, What is the Lambda Calculus?

It is a formal language to reason about functions, the terms of which sometimes look halfway between a computer program and a mathematical formula. For example,

    λy.(y + 3)   and   λy.(y * y)

represent the "add 3" function and the squaring function respectively. Compare with:

    φ(y) = y + 3   and   φ(y) = y * y
Functions

The key fact is that a function becomes an object which can be passed as a parameter just like a number can. Here is a function which returns a function as output:

    λx.(if x > 0 then (λy.(y + 3)) else (λy.(y * y)))
Function Application

To use a function we apply it to an argument, e.g.

    (λy.(y + 3)) 10 = 13

where (λy.(y + 3)) is the function and 10 is the argument. Compare with:

    φ(y) = y + 3        φ(10) = 13

A more complicated example:

    ((λx.(if x > 0 then (λy.(y + 3)) else (λy.(y * y)))) (-2)) 3 = (λy.(y * y)) 3 = 9

Perhaps you can already see how the λ-calculus helps bridge the gap between mathematics and computer programs.
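The examples above can be tried out directly in a language with first-class functions. The following is a minimal Python sketch; the names `add3`, `square` and `choose` are illustrative only and do not come from the slides.

```python
# Functions as values, mirroring the λ-terms on the slide.
add3 = lambda y: y + 3      # λy.(y + 3)
square = lambda y: y * y    # λy.(y * y)

# λx.(if x > 0 then (λy.(y + 3)) else (λy.(y * y)))
# A function that returns another function as its output.
choose = lambda x: add3 if x > 0 else square

print(add3(10))       # 13
print(choose(-2)(3))  # 9, because -2 > 0 is false, so the squaring function is returned
```

Applying `choose(-2)` first and then applying the result to `3` follows the same two steps as the slide's more complicated example.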
Equality

What do we mean by = in the previous examples? We mean equality modulo computation/reduction!

    ((λx.(if x > 0 then (λy.(y + 3)) else (λy.(y * y)))) (-2)) 3
  → (if -2 > 0 then (λy.(y + 3)) else (λy.(y * y))) 3
  → (λy.(y * y)) 3
  → 3 * 3
  → 9
Currying

What about functions with many arguments? For example,

    φ(x, y) = x + y

can be represented as

    λx.(λy.(x + y))

so

    ((λx.(λy.(x + y))) 2) 5 → (λy.(2 + y)) 5 → 7

We usually write λx.(λy.s) as λxy.s. This technique, which allows us to get by with functions of only one argument, is due to Schönfinkel (1924), but is named after H. B. Curry.
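As a rough illustration of currying, the two-argument function and its curried form can be compared in Python. This is only a sketch; `add_pair`, `add` and `add2` are hypothetical names chosen for the example.

```python
# Uncurried: φ(x, y) = x + y takes both arguments at once.
def add_pair(x, y):
    return x + y

# Curried: λx.(λy.(x + y)) takes one argument and returns a function.
add = lambda x: lambda y: x + y

add2 = add(2)      # corresponds to λy.(2 + y)
print(add2(5))     # 7
print(add(2)(5) == add_pair(2, 5))  # True
```

The partially applied `add2` is itself an ordinary one-argument function, which is exactly why one-argument functions suffice.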
Some examples

λx.x takes an input and returns it as the output; in other words, it is the identity function.

λx.xx takes an input and applies it to itself (?!)

λx.(λy.x) takes an input x and returns the constant function which takes an input, ignores it, and always returns x.

We usually write the latter function as λxy.x. We can now see that it is just the curried version of a projection function.
Terms of the Untyped λ-calculus

Let's start!
Let us assume a countably infinite set V of variable identifiers. Terms will be finite strings made up of variable identifiers and the symbols: λ ( ) .

Definition. The set of terms Λ of the untyped λ-calculus is defined inductively by the rules:

$$
\text{(var)}\ \frac{}{x \in \Lambda}\ \ x \in V
\qquad
\text{(app)}\ \frac{s \in \Lambda \quad t \in \Lambda}{(st) \in \Lambda}
\qquad
\text{(abs)}\ \frac{s \in \Lambda}{(\lambda x.s) \in \Lambda}\ \ x \in V
$$

E.g. (λx.x), (xy), ((xy)z), (x(yz)), ((λx.x)(λy.(zz))), (x(λy.(((λz.z)x)y))), ...
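The three formation rules correspond closely to an algebraic datatype with three constructors. Below is a minimal Python sketch of that idea; the class names `Var`, `App`, `Lam` and the function `show` are assumptions made for illustration, not notation from the course.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:          # rule (var): a variable identifier x
    name: str


@dataclass(frozen=True)
class App:          # rule (app): the application (st)
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:          # rule (abs): the abstraction (λx.s)
    var: str
    body: "Term"


Term = Union[Var, App, Lam]


def show(t):
    """Render a term fully parenthesised, exactly as the rules build it."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, App):
        return f"({show(t.fun)}{show(t.arg)})"
    return f"(λ{t.var}.{show(t.body)})"


example = App(Lam("x", Var("x")), Lam("y", App(Var("z"), Var("z"))))
print(show(example))  # ((λx.x)(λy.(zz)))
```

Because terms are defined inductively, functions on terms such as `show` are naturally written by recursion with one case per rule.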
Omitting parentheses

1. Outermost parentheses can be omitted: st means (st).
2. A sequence of applications associates to the left: xyz means ((xy)z).
3. Nested abstractions associate to the right and can be gathered under a single λ. The body of an abstraction is as much as possible of the rest of the term (unless otherwise bracketed). E.g. λxy.yx means (λx.(λy.(yx))).

Examples:
  ((((xy)z)t)u) can be written xyztu, but (x(y(z(tu)))) has to be written x(y(z(tu)))
  (λx.(λy.(λz.(x(yz))))) is written λxyz.x(yz)
  (λx.(λy.(λz.((xy)z)))) is written λxyz.xyz
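The three conventions can be read as a recipe for a pretty-printer. The sketch below is one possible Python rendering, assuming single-character variable names and the hypothetical `Var`/`App`/`Lam` classes defined in the block; it always brackets an argument that is not a variable, which is slightly more conservative than the conventions strictly require.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


Term = Union[Var, App, Lam]


def pretty(t):
    """Print a term using the parenthesis-omitting conventions."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Lam):
        # Convention 3: gather nested abstractions under a single λ.
        names = []
        while isinstance(t, Lam):
            names.append(t.var)
            t = t.body
        return "λ" + "".join(names) + "." + pretty(t)
    # Convention 2: application associates to the left, so only an
    # abstraction in function position needs brackets.
    f = pretty(t.fun)
    if isinstance(t.fun, Lam):
        f = "(" + f + ")"
    a = pretty(t.arg)
    if not isinstance(t.arg, Var):
        a = "(" + a + ")"
    return f + a


v = {c: Var(c) for c in "xyztu"}
left = App(App(App(App(v["x"], v["y"]), v["z"]), v["t"]), v["u"])
right = App(v["x"], App(v["y"], App(v["z"], App(v["t"], v["u"]))))
print(pretty(left))   # xyztu
print(pretty(right))  # x(y(z(tu)))
print(pretty(Lam("x", Lam("y", Lam("z", App(v["x"], App(v["y"], v["z"])))))))  # λxyz.x(yz)
```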
Free Variables

Each term s has a set of free variables, FV(s), defined by:

$$
\begin{aligned}
FV(x) &= \{x\} \\
FV(st) &= FV(s) \cup FV(t) \\
FV(\lambda x.s) &= FV(s) \setminus \{x\}
\end{aligned}
$$

E.g. FV(x(λy.(λz.z)xy)) = {x}.

A term with no free variables is called closed. The set of closed terms is denoted Λ⁰. Closed terms are sometimes called combinators. Some standard combinators:

    i := λx.x,   k := λxy.x,   Ω := (λx.xx)(λx.xx).

Variables appearing in s under a λ are called bound (λ is a binder).
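The definition of FV translates clause by clause into a recursive function. The following Python sketch assumes the same hypothetical `Var`/`App`/`Lam` representation of terms; `fv` and `is_closed` are illustrative names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


Term = Union[Var, App, Lam]


def fv(t):
    """Set of free variables of a term, one case per clause of the definition."""
    if isinstance(t, Var):
        return {t.name}                  # FV(x) = {x}
    if isinstance(t, App):
        return fv(t.fun) | fv(t.arg)     # FV(st) = FV(s) ∪ FV(t)
    return fv(t.body) - {t.var}          # FV(λx.s) = FV(s) \ {x}


def is_closed(t):
    """A term is closed (a combinator) when it has no free variables."""
    return not fv(t)


x, y, z = Var("x"), Var("y"), Var("z")
# x(λy.(λz.z)xy), where the body of λy is ((λz.z)x)y
term = App(x, Lam("y", App(App(Lam("z", z), x), y)))
print(fv(term))                                  # {'x'}
print(is_closed(Lam("x", Lam("y", x))))          # True: k = λxy.x is a combinator
```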
Alpha Conversion

Definition. Two terms s and t are said to be α-convertible, written s ≡α t or just s ≡ t, if one can derive the same term from both purely by renaming bound variables to fresh variables.

Examples: λx.x ≡ λy.y, x(λx.x) ≡ x(λy.y), ...

We consider terms which are α-convertible to be identical at the syntactic level: to us they are the same term!

We adopt the variable convention (aka the Barendregt convention): in any definition, theorem or proof in which only finitely or countably many terms appear, we silently α-convert them so that the bound variables of each term are not the same as the bound variables of any other term, or the free variables of any term.
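Alpha Conversion formally

Formally, we define variable swapping on terms recursively as follows.

$$
\begin{aligned}
(y\ x)\cdot z &\equiv \begin{cases} y & \text{if } z \equiv x \\ x & \text{if } z \equiv y \\ z & \text{otherwise} \end{cases} \\
(y\ x)\cdot st &\equiv ((y\ x)\cdot s)\,((y\ x)\cdot t) \\
(y\ x)\cdot \lambda z.s &\equiv \lambda((y\ x)\cdot z).((y\ x)\cdot s)
\end{aligned}
$$

E.g. (y x)·λxy.x(λz.yz) ≡ λyx.y(λz.xz)

Then, ≡α is the relation on terms defined by:

$$
\frac{}{x \equiv_\alpha x}
\qquad
\frac{s \equiv_\alpha s' \quad t \equiv_\alpha t'}{st \equiv_\alpha s't'}
\qquad
\frac{(y\ x)\cdot s \equiv_\alpha (y\ x')\cdot s'}{\lambda x.s \equiv_\alpha \lambda x'.s'}\ \ y \notin FV(ss')
$$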
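Swapping and the three rules for α-equivalence can be sketched as two short recursive functions. The Python below is an illustration under stated assumptions: terms use the hypothetical `Var`/`App`/`Lam` classes, and the fresh name y is chosen to avoid every name occurring in either term, which is a stronger (and simpler) condition than the side condition on the slide.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


Term = Union[Var, App, Lam]


def swap_name(y, x, z):
    """(y x)·z on a single variable name."""
    return y if z == x else x if z == y else z


def swap(y, x, t):
    """Swap the names x and y everywhere in t, binders included."""
    if isinstance(t, Var):
        return Var(swap_name(y, x, t.name))
    if isinstance(t, App):
        return App(swap(y, x, t.fun), swap(y, x, t.arg))
    return Lam(swap_name(y, x, t.var), swap(y, x, t.body))


def names(t):
    """Every variable name occurring in t (free, bound, or binding)."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, App):
        return names(t.fun) | names(t.arg)
    return {t.var} | names(t.body)


def fresh(avoid):
    """Pick a name v0, v1, ... not in the given set (an arbitrary naming scheme)."""
    i = 0
    while f"v{i}" in avoid:
        i += 1
    return f"v{i}"


def alpha_eq(s, t):
    """Decide α-convertibility following the three rules."""
    if isinstance(s, Var) and isinstance(t, Var):
        return s.name == t.name
    if isinstance(s, App) and isinstance(t, App):
        return alpha_eq(s.fun, t.fun) and alpha_eq(s.arg, t.arg)
    if isinstance(s, Lam) and isinstance(t, Lam):
        y = fresh(names(s) | names(t))
        return alpha_eq(swap(y, s.var, s.body), swap(y, t.var, t.body))
    return False


x, y, z = Var("x"), Var("y"), Var("z")
print(alpha_eq(Lam("x", x), Lam("y", y)))                    # True
print(alpha_eq(App(x, Lam("x", x)), App(x, Lam("y", y))))    # True
print(alpha_eq(Lam("x", y), Lam("y", y)))                    # False
```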
Substitution

What happens when (λx.s)t?
The effect of applying a function to an argument is defined by the scheme:

    (λx.s)t → s[t/x]

for any terms s and t and variable x, where s[t/x] means "in s, substitute t for every free occurrence of the variable x". Formally:

$$
\begin{aligned}
y[t/x] &\equiv \begin{cases} y & \text{if } x \not\equiv y \\ t & \text{if } x \equiv y \end{cases} \\
(su)[t/x] &\equiv (s[t/x])(u[t/x]) \\
(\lambda y.s)[t/x] &\equiv \lambda y.(s[t/x]) \quad \text{assuming } y \not\equiv x \text{ and } y \notin FV(t)
\end{aligned}
$$

We can assume that y ≢ x and y ∉ FV(t) because of the variable convention. This kind of substitution is usually termed capture-avoiding.

For example:

    x[λxy.y/x] ≡ λxy.y
    (λx.y)[λyz.z/y] ≡ λxyz.z
    (λx.x)[λxy.y/x] ≡ (λz.z)[λxy.y/x] ≡ λz.z ≡ λx.x
    ...
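A program cannot silently invoke the variable convention, so an implementation has to rename binders explicitly when a clash would occur. The following Python sketch shows one common way to do this; it assumes the hypothetical `Var`/`App`/`Lam` term classes, and `fresh` uses an arbitrary naming scheme (v0, v1, ...) that is not part of the course material.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


Term = Union[Var, App, Lam]


def fv(t):
    """Free variables of t."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, App):
        return fv(t.fun) | fv(t.arg)
    return fv(t.body) - {t.var}


def fresh(avoid):
    """Pick a name v0, v1, ... not in the given set."""
    i = 0
    while f"v{i}" in avoid:
        i += 1
    return f"v{i}"


def subst(s, t, x):
    """Capture-avoiding substitution s[t/x]."""
    if isinstance(s, Var):
        return t if s.name == x else s
    if isinstance(s, App):
        return App(subst(s.fun, t, x), subst(s.arg, t, x))
    y, body = s.var, s.body
    if y == x:
        return s                          # x is rebound here: no free x below
    if y in fv(t):
        # The binder y would capture a free y of t, so rename it first.
        z = fresh(fv(t) | fv(body) | {x})
        body = subst(body, Var(z), y)
        y = z
    return Lam(y, subst(body, t, x))


x, y, z = Var("x"), Var("y"), Var("z")
snd = Lam("x", Lam("y", y))                     # λxy.y
print(subst(x, snd, "x") == snd)                # True: x[λxy.y/x] ≡ λxy.y
print(subst(Lam("x", x), snd, "x"))             # unchanged: (λx.x)[λxy.y/x] ≡ λx.x
print(subst(Lam("y", x), y, "x"))               # binder renamed to avoid capturing y
```

In the last example a naive substitution would produce λy.y, changing the meaning of the term; the renaming step yields a term α-convertible to λz.y instead.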
Notions of Reduction

A notion of reduction over Λ, also called a redex rule, is just a binary relation on Λ, i.e. a subset of Λ × Λ. From a notion of reduction R we can derive the following relations on Λ:

The one-step R-reduction is the relation →R defined by:

$$
\frac{}{s \to_R t}\ \ \langle s,t\rangle \in R
\qquad
\frac{s \to_R t}{su \to_R tu}
\qquad
\frac{s \to_R t}{us \to_R ut}
\qquad
\frac{s \to_R t}{\lambda x.s \to_R \lambda x.t}
$$

We read s →R t as "s R-reduces to t in one step".

  →R⁼ is the reflexive closure of →R.
  →R⁺ is the transitive closure of →R.
  ↠R is the reflexive, transitive closure of →R. Read s ↠R t as "s R-reduces to t".
  =R is the reflexive, symmetric, transitive closure of →R. Read s =R t as "s R-converts to t" (or "s is R-convertible to t").
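Notions of Reduction (rules)

$$
\begin{array}{lcccc}
 & \to_R^{=} & \to_R^{+} & \twoheadrightarrow_R & =_R \\[1ex]
\text{(R)} &
\dfrac{}{s \to_R^{=} t}\ \langle s,t\rangle \in R &
\dfrac{}{s \to_R^{+} t}\ \langle s,t\rangle \in R &
\dfrac{}{s \twoheadrightarrow_R t}\ \langle s,t\rangle \in R &
\dfrac{}{s =_R t}\ \langle s,t\rangle \in R \\[2ex]
\text{(l-app)} &
\dfrac{s \to_R^{=} t}{su \to_R^{=} tu} &
\dfrac{s \to_R^{+} t}{su \to_R^{+} tu} &
\dfrac{s \twoheadrightarrow_R t}{su \twoheadrightarrow_R tu} &
\dfrac{s =_R t}{su =_R tu} \\[2ex]
\text{(r-app)} &
\dfrac{s \to_R^{=} t}{us \to_R^{=} ut} &
\dfrac{s \to_R^{+} t}{us \to_R^{+} ut} &
\dfrac{s \twoheadrightarrow_R t}{us \twoheadrightarrow_R ut} &
\dfrac{s =_R t}{us =_R ut} \\[2ex]
\text{(abs)} &
\dfrac{s \to_R^{=} t}{\lambda x.s \to_R^{=} \lambda x.t} &
\dfrac{s \to_R^{+} t}{\lambda x.s \to_R^{+} \lambda x.t} &
\dfrac{s \twoheadrightarrow_R t}{\lambda x.s \twoheadrightarrow_R \lambda x.t} &
\dfrac{s =_R t}{\lambda x.s =_R \lambda x.t} \\[2ex]
\text{(refl)} &
\dfrac{}{s \to_R^{=} s} &
 &
\dfrac{}{s \twoheadrightarrow_R s} &
\dfrac{}{s =_R s} \\[2ex]
\text{(trans)} &
 &
\dfrac{s \to_R^{+} u \quad u \to_R^{+} t}{s \to_R^{+} t} &
\dfrac{s \twoheadrightarrow_R u \quad u \twoheadrightarrow_R t}{s \twoheadrightarrow_R t} &
\dfrac{s =_R u \quad u =_R t}{s =_R t} \\[2ex]
\text{(sym)} &
 &
 &
 &
\dfrac{t =_R s}{s =_R t}
\end{array}
$$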
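Alternative characterisations

Lemma. Let R be a notion of reduction over Λ and let s, t be terms.

s →R⁺ t if and only if for some n ≥ 1 and terms s_0, s_1, ..., s_n,

$$
s \equiv s_0 \to_R s_1 \to_R \cdots \to_R s_n \equiv t
$$

s ↠R t if and only if for some n ≥ 0 and terms s_0, s_1, ..., s_n,

$$
s \equiv s_0 \to_R s_1 \to_R \cdots \to_R s_n \equiv t
$$

s =R t if and only if for some n ≥ 0 and s_0, s_1, ..., s_n, t_0, t_1, ..., t_{n-1},

$$
s \equiv s_0 \twoheadleftarrow_R t_0 \twoheadrightarrow_R s_1 \twoheadleftarrow_R t_1 \twoheadrightarrow_R \cdots \twoheadleftarrow_R t_{n-1} \twoheadrightarrow_R s_n \equiv t
$$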
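Beta-reduction

The notion of reduction β is the relation:

$$
\beta = \{\, \langle \overbrace{(\lambda x.s)t}^{\text{redex}},\ \overbrace{s[t/x]}^{\text{contractum}} \rangle \mid s,t \in \Lambda \,\}
$$

E.g. one-step β-reduction is explicitly given by:

$$
\frac{}{(\lambda x.s)t \to_\beta s[t/x]}
\qquad
\frac{s \to_\beta t}{su \to_\beta tu}
\qquad
\frac{s \to_\beta t}{us \to_\beta ut}
\qquad
\frac{s \to_\beta t}{\lambda x.s \to_\beta \lambda x.t}
$$

For example:

    (λx.x)(λyz.z) →β λyz.z
    λx.x((λy.y)x) →β λx.xx
    (λx.xx)((λx.xx)y) ↠β yy(yy)      (reduction paths?)
    (λz.zz)((λx.x)y) ↠β yy ↞β (λz.z)((λx.xx)y)
    so (λz.zz)((λx.x)y) =β (λz.z)((λx.xx)y)

Recall Ω := (λx.xx)(λx.xx), thus Ω →β Ω →β Ω →β Ω →β ...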
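The one-step rules can be turned into a small reducer. The Python sketch below picks one particular strategy, leftmost-outermost (normal order), which the slides do not prescribe; the rules themselves allow any redex to be contracted. The `Var`/`App`/`Lam` classes, the helper names and the step limit in `normalise` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


Term = Union[Var, App, Lam]


def fv(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, App):
        return fv(t.fun) | fv(t.arg)
    return fv(t.body) - {t.var}


def fresh(avoid):
    i = 0
    while f"v{i}" in avoid:
        i += 1
    return f"v{i}"


def subst(s, t, x):
    """Capture-avoiding substitution s[t/x]."""
    if isinstance(s, Var):
        return t if s.name == x else s
    if isinstance(s, App):
        return App(subst(s.fun, t, x), subst(s.arg, t, x))
    y, body = s.var, s.body
    if y == x:
        return s
    if y in fv(t):
        z = fresh(fv(t) | fv(body) | {x})
        body, y = subst(body, Var(z), y), z
    return Lam(y, subst(body, t, x))


def step(t):
    """One leftmost-outermost β-step, or None if t is in β-normal form."""
    if isinstance(t, App):
        if isinstance(t.fun, Lam):                    # t is a redex (λx.s)u
            return subst(t.fun.body, t.arg, t.fun.var)
        r = step(t.fun)                               # reduce in function position
        if r is not None:
            return App(r, t.arg)
        r = step(t.arg)                               # reduce in argument position
        return None if r is None else App(t.fun, r)
    if isinstance(t, Lam):                            # reduce under the binder
        r = step(t.body)
        return None if r is None else Lam(t.var, r)
    return None


def normalise(t, limit=100):
    """Iterate step; return None if no normal form is reached within limit steps."""
    for _ in range(limit):
        r = step(t)
        if r is None:
            return t
        t = r
    return None


x, y, z = Var("x"), Var("y"), Var("z")
delta = Lam("x", App(x, x))                           # λx.xx
omega = App(delta, delta)                             # Ω
print(step(App(Lam("x", x), Lam("y", Lam("z", z)))))  # the term λyz.z
print(normalise(App(delta, App(delta, y))) == App(App(y, y), App(y, y)))  # True
print(step(omega) == omega)                           # True: Ω reduces to itself
```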
Fixed Point Combinators

Definition. For terms f and u, u is said to be a fixed point of f if fu =β u.

Theorem (First Recursion Theorem). Let f be a term. Then there is a term u which is a fixed point of f. In fact, u can be computed within the λ-calculus itself.

Proof. A fixed point combinator is a closed term s such that for all terms u:

    u(su) =β su

In other words, su is a fixed point of u, for every u. Two well-known FPCs are:

    y := λf.(λx.f(xx))(λx.f(xx))
    θ := (λxy.y(xxy))(λxy.y(xxy))

due to Curry and Turing respectively.
Recursion

Fixed point combinators allow us to encode recursion within the lambda calculus:

    yg ↠β g((λx.g(xx))(λx.g(xx))) ←β g(yg),   so yg =β g(yg)
    θg ↠β g(θg)

Let us define (assuming we have specified notation for booleans and arithmetic):

    fact := y(λf.(λx. if x = 0 then 1 else x * f(x - 1)))

Then, fact satisfies the property:

    fact =β λx. if x = 0 then 1 else x * fact(x - 1)

In particular:

    fact x =β if x = 0 then 1 else x * fact(x - 1)
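The factorial definition can be reproduced in Python, with one caveat: Python evaluates arguments eagerly, so a direct transcription of y would loop forever on the self-application xx. The sketch below therefore swaps in the η-expanded variant, often called Z, which delays the self-application behind an extra λv. The name `Z` and this substitution are assumptions of the example, not part of the slides.

```python
# Z := λf.(λx.f(λv.xxv))(λx.f(λv.xxv))
# A call-by-value fixed point combinator: like y, but with xx wrapped in λv. ... v
# so that the self-application is only unfolded when the recursive call is made.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# fact := Z(λf.(λx. if x = 0 then 1 else x * f(x - 1)))
fact = Z(lambda f: lambda x: 1 if x == 0 else x * f(x - 1))

print(fact(0))  # 1
print(fact(5))  # 120
```

No function here refers to itself by name: the recursion comes entirely from the fixed point combinator, as in the definition of fact above.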
Exercises

1. List all the free variables in:
   (a) λxy.(λu.uvxy)z
   (b) λxy.z(λu.uvxy)
   (c) λwx.z(λu.uvwx)
   (d) λvw.z(λz.uvvw)
   (e) λyx.z(λu.uwyx)
   Which of these five terms are identified by α-conversion (i.e. which are actually the same as each other)?

2. Perform the following substitutions:
   (a) (λx.yx)[yz/x]
   (b) (λy.xy)[yx/x]
   (c) (λz.(λx.yx)xz)[zx/x]

3. Show that, for any variable x distinct from y, any terms s and t, and term u not containing x as a free variable, s[t/x][u/y] ≡ s[u/y][t[u/y]/x]. You will need to use induction on the structure of the term s.

4. (a) Find a term s such that, for all terms t and u, stu =β ut.
   (b) Show that there exists a term s such that, for all terms t, st =β ss.
   [Hint: do not use the First Recursion Theorem. Use common sense. What does s do with its argument(s) in each case?]

5. Let mgs be the combinator mgs := TT...T (26 copies of T), where T is:
       λabc...xyz.z(midlands graduate school rulez)
   Prove that mgs is a fixed point combinator.

6. Construct:
   (a) for each closed term s, a closed term t_1 such that t_1 =β t_1 s.
   (b) for each closed term s, a closed term t_2 such that t_2(λx.x)ss =β t_2 s.
   [Hint: try to render these equations to some form u =β (...)u and then use the First Recursion Theorem.]

7. Show that there is no term f such that for all terms s and t, f(st) =β s.
   [Hint: Use the First Recursion Theorem. You may assume that =β is consistent, that is, there exist terms s, t such that s ≠β t.]

8. Show that every fixed point combinator can be characterised as a fixed point of a term G. Find G.