Lambda Calculus

killingform

July 25, 2018

These notes (unﬁnished, obviously) were of a course I took on lambda cal-

culus, proof, and category theory.

Contents

I Introduction 2

1. What is a function? . . . . . . . . . . . . . . . . . . . . . . . . . 2

II Untyped λ-calculus 3

1. Terms and variables . . . . . . . . . . . . . . . . . . . . . . . . . 3

2. Change of bound variables and α-congruence . . . . . . . . . . . 4

3. Equality, =, and β-equivalence . . . . . . . . . . . . . . . . . . . 5

4. Combinatory calculus . . . . . . . . . . . . . . . . . . . . . . . . 7

1

I Introduction

1. What is a function?

There are two main roots of the theory of functions:

(a) analytic geometry (Descartes).

(b) the development of calculus (Leibniz, 1673, in a letter to Bernoulli).

Euler and Dirichlet concerned themselves more with numerical functions, and

in the 19th century, Bolzano, Cauchy, Weierstrass, and Fourier all had the

understanding of an analytic representation of functions:

ϕ(x) = . . . ,

for some formula. It was Fourier who ﬁrst broke it! The modern deﬁnition is

due to set theory, namely Dedekind and Cantor. Bourbaki’s deﬁnition is that a

function from A → B is a triple (A, R, B) such that:

(i) A and B are sets.

(ii) R ⊆ A × B (the “graph”).

(iii) ∀x ∈ A∃!y ∈ B . (x, y) ∈ R.

Intuitively, “R(x, y)” ≡ “f (x) = y”. We now say that (A, R, B) = (A

0

, B

0

, R

0

)

if and only if A = A

0

, B = B

0

, and R = R

0

⊆ A × B. So the modern functions

are equivalence classes of such triples:

f = g ⇐⇒ ∀x ∈ A . f(x) = g(x).

Consider the expressions:

2 + 3 = 5, 5 = 5.

The left is dynamic, however the right is tautological.

In λ-calculus, we remove A, B from the triple. So if we wrote f as:

f : A → B : x 7→ f(x),

λ-calculus only concerns itself with the rule of transformation, x 7→ f(x). In

the way, we write f now as:

λx.f(x),

the λ indicating that x is the scope of the function, and that we just replace x

in the rule f(x) with whatever we apply λx.f (x) to. A few explicit examples:

λx.x

2

, λx.2x, λx. sin x.

To apply a function f in Λ-calculus to something a, we write f ‘a. This conven-

tion is due to Bertrand Russell. A few more examples:

(λx.x

2

)‘a = x

2

[a/x] = a

2

, (λx.2x)‘b = 2x [b/x] = 2b,

where the notation x

2

[a/x] says “substitute a for x in x

2

”.

Since we are merely just doing substitution, we have for anything that x‘y

or x‘x even makes sense! This is the untyped λ-calculus.

2

II Untyped λ-calculus

1. Terms and variables

We deﬁne the untyped λ-calculus by structural induction, deﬁning the terms.

The class of variables, x

!

, x

2

, x

4

, . . . , is denoted by Var (we assume it is count-

able). We deﬁne the terms as follows:

1. Every variable is a term (this is the base case of our structural induction).

2. If M, N are terms, so is M‘N.

3. If M is a term, x ∈ Var, then λx.M is a term.

4. Nothing else is a term.

In this way, we have deﬁned everything we consider a “term” (a function).

In BNF (Backus-Naur Form), this is written:

MN ::= x | M ‘N | λx.M

If you like sets a lot (I do), you can think of λx.ϕ(x) as the set {x | ϕ(x)}. In

this way, we view x‘y as meaning x ∈ y.

Denote by F V (M) the set of free variables of a term M, deﬁned by in-

duction:

1. If x ∈ Var, then F V (x) := {x}.

2. F V (M‘N) = F V (M) ∪ F V (N).

3. F V (λx.M) = F V (M) \ {x}.

Similarly, we can deﬁne the bound variables inductively as follows:

1. If x ∈ Var, then BV (x) := ∅.

2. BV (M‘N) − BV (M) ∪ BV (N).

3. BV (λx.M) = BV (M) ∪ {x}.

We say that a term M is closed if and only if F V (M) = ∅. Another name for

closed terms is combinators. Some examples of combinators:

K = λx.λy.x, S = λx.λy.λz.(x‘z)‘(y‘z), I = λx.x.

We should probably discuss some notational kinks and quirks that we will use

frequently. Firstly, we associate function application (‘) to the left, that is:

((a‘b)‘c = a‘b‘c,

and more so, we might even drop the indicator ‘, so it further becomes:

(a‘b)‘c = a‘b‘c = abc.

3

With repeated bindings λx., we act from the inside out, sometimes dropping

the consecutive lambdas to save space and eliminate (or sometimes create) con-

fusion:

λx.(λy.(λz.M)) = λxyz.M.

So for example, our combinators from before with these conventions appear as:

K = λxy.x, S = λxyz.(xz)(yz), I = λx.x.

An example, we can ‘reduce’ the term SKK as follows:

SKKx = ((SK)K)x

= (Kx)(Kx)

= x

= Ix,

where we see that if we had the axiom of extensionality:

∀x . (fx = gx)

f = g

,

then we could conclude that SKK = I. Otherwise, the most we can say is that

SKKx = Ix.

2. Change of bound variables and α-congruence

The idea of α-congruence is that:

λx.t =

α

λy.t[y/x],

where y is some new/fresh variable. That is, the bound variable can be changed,

and we say that the functions are the same still—this is α-congruence. A priori,

we can’t say that these are equal without considering α-congruence as the bound

variables are diﬀerent. They are not constructively the same, but we would like

to say they are the same function at the end of the day.

Deﬁne t[M/x] for x ∈ Var and terms t, M, inductively by, y ∈ Var:

1. y[M/x] =

(

M, x = y;

y, otherwise.

2. (M ‘N)[y/x] = (M[y/x])‘(N[y/x]).

3. (λx.M )[y/x] =

(

λy.M[y/x], x = z;

λz.M[y/x], x 6= z.

If we apply α-congruence to λ-calculus, we get that terms are equivalent up

to change of bound variables.

4

3. Equality, =, and β-equivalence

The phenomenon of α-congruence brings to mind that we need to rigorously

deﬁne what it means to be equal.

Equality (=) is an equivalence relation, so we have the usual reﬂexivity,

symmetry, and transitivity:

M = M,

M = N

N = M

,

M = N, N = P

M = P

.

In addition, we have the following:

M = N

M‘R = N‘R

,

M = N

R‘M = R‘N

,

M = N

λx.M = λx.N

(ξ),

where the last one there is the ξ-rule. In addition, we also have the α-congruence

rule, discussed above:

λx.M = λy.M[y/x], (α)

and what is known as β-reduction:

(λx.M)‘N = M[N/x], (β)

which says that equality respects application. Finally, we have the η-rule:

λx.(fx) = f, for x 6∈ F V (f). (η)

Proposition 1. The η-rule is equivalent to extensionality.

Proof. Suppose η and note:

f‘x = g‘x =⇒

(ξ)

λx.(f‘x) = λx.(g‘x) =⇒

(η)

f = g.

Conversely, suppose EXT, and note:

(λx.(f‘x))‘x = f‘x, (β)

for some x 6∈ F V (f), and so by EXT: f = λx.(f‘x).

A few examples of β-reduction:

(λx.xx)‘a =

β

aa, (λx.xx)‘(λx.xx) =

β

(λx.xx)‘(λx.xx),

and as my professor said after showing us this, “We’re no longer in Kansas,

Toto”.

Theorem 1 (Fixed point theorem). For every F there exists an X such that

F ‘X = X.

Proof. Let W := λx.F (xx) and X := WW . Note:

X = W W = (λx.F (xx))W =

β

F (W W) = F (X).

5

This gets us close to Russell’s paradox, indeed only a little bit extra is needed.

Consider the λ-term:

R := λx.NOT(xx),

where “NOT” is just the negation function (we will see this at a later date).

Then we try β-reducing RR:

RR = (λx.NOT(xx))R = NOT(RR).

Clearly not in Kansas.

More generally, deﬁne the Y -combinator:

Y := λf.(λx.f(xx))(λx.f(xx)),

so we see that:

Y f = (λx.f(xx))(λx.f (xx))

= f(λx.f(xx))(λx.f (xx))

= f(Y f).

If we try to “type” this (by assigning domains and codomains to our function),

we get:

Y : A

A

→ A : f 7→ Y (f ).

Historical Remark 1. It was Church who ﬁrst used the notation λx.f(x),

and he said this was because they initially they used ˆx.f(x) which then changed

slightly to be ∧x.f(x), and then ﬁnally λx.f(x) in order to accommodate print-

ers.

Another story about the early years of λ-calculus is that Church and Kleene

sought to ﬁnd functions for any computable functions. The issue they came

across was that they couldn’t ﬁgure out how to λ-tize the predecessor function:

n 7→ n − 1.

Church had nearly almost convinced himself that it was not possible, but then

his student Kleene took a trip to the dentist. There, he had a Eureka moment,

and he ﬁgured out the trick to deﬁning the predecessor function in λ-calculus!

Apart from silly stories, we can recant a little about the history of λ-calculus.

Both λ-calculus and combinatory logic (logic studied via combinators) sprung

to life in the 1920s. Moses Sch¨onﬁnkel was probably the ﬁrst to study combi-

natory logic. A Russian logician, he gave a talk to Hilbert and the gang on 7

December 1920 about combinatory logic. Heinrich Behmann revised and pub-

lished Sch¨onﬁnkel’s notes in 1924 (also, Jean van Heijenoort’s “From Frege to

G¨odel” has it). He sadly only had one other paper published, and then he re-

turned to Moscow. In 1927, he was reportedly mentally ill and in the conﬁnes

of a sanatorium. He died poor in Moscow in 1942, all his papers burned by his

neighbours for heating.

6

Sch¨onﬁnkel wrote that:

A × B → C ≡ A → C

B

,

something which we will see later is a feature of a Cartesian closed category!

Interestingly, Sch¨onﬁnkel was a pupil of Samuel Shatunovsky, a mathemati-

cian who had correspondence with Brouwer on cosntructivist ideas. Even more

interestingly, Shatunovsky did not have a formal school certiﬁcate so he never

really had a regular research career in mathematics.

Curry dabbled in combinatory logic during 1922, and von Neumann’s thesis

on set theory had vague combinator-like structures in it (he also had close

correspondence with the logic group at G¨ottingen who were likely to pass on

Sch¨onﬁnkel’s wisdom). Curry’s invention of combinatory logic likely was entirely

a re-invention, independent to Europe, as Curry was in the USA.

Curry focused his attention on substitution, which at the time was not a well-

analysed feature of logic. Indeed, Russell and Whitehead just forgot to deﬁne

it in Principia Mathematica, and Hilbert and Ackermann’s textbook had an

incorrect statement for it concerning predicate logic. Curry studied in particular

how to break substitution up into smaller steps, currying. This lead him to

combinatory logic.

It was only in 1928 that Alonzo Church invented the λ-calculus. Church’s

untyped λ-calculus appeared in print in 1932 with unrestricted quantiﬁcation, no

law of the excluded middle. However, it was found quite quickly after that it had

a contradiction. He ﬁxed it, and at the time hoped that G¨odel’s incompleteness

theorems for Russell and Whitehead’s system did not extend to this system (see:

History of Lambda-calculus and Combinatory Logic, Cardone & Hindley).

4. Combinatory calculus

Recall that a combinator is a closed λ-term. We have the following examples of

combinators:

S = λxyz.(xz)(yz), K = λxy.x, I = λx.x, B = λxyz.x(yz), C = λxyz.xzy.

We have already seen that SK = I (assuming extensionality), and indeed B

and C can also be obtained from S and K alone.

Exercise 1. Find, using only S and K, an expression equivalent to B and C.

It turns out that any lambda term can be expressed in terms of the combi-

nators S and K, and hence to any computable function (Church thesis). There

exists algorithms for determining the SK-expression, however we will not cover

that as it is fun ﬁnding your own ways.

Historical Remark 2. These days, combinatory logic is increasingly popular

with the rise of functional programming where it is actively applied.

Note that B can be seen as composition of functions if you alter the deﬁnition

a little bit:

Bfg := f ◦ g, f ◦ g = λx.f(gx).

7

Exercise 2. Show that The set of λ-terms with application and identity I form

a monoid. Do you need η? This might actually be for function composition

using the B combinator, not using application. Need to try.

One can make combinators into a form of logic by viewing K as “true”, and

then K

∗

:= λxy.y as false. You are encouraged to ﬁnd all your favourite boolean

operators.

Note that both Church and Curry hated η and extensionality since it has no

computational meaning.

Exercise 3. Let M be a λ-term. Associate M to a directed multigraph, possibly

with loops, called G(M), where:

(i) The vertices is the set {N | M β-reduces to N}.

(ii) The edges are each possible β-reduction.

Example 1. If Ω := (λx.xx)(λx.xx), then G(Ω) is:

Ω

while G(KxΩ) is:

Ω x

Find λ-terms M so that G(M) is of the following shapes:

(a) • •

(b) • • •

(c)

•

•

•

8

(d)

•

•

•

An equational theory is said to be inconsistent if and only if every equation

is derivable (that is, every model is a singleton).

Proposition 2. λ-calculus with K = S is inconsistent.

Proof. Suppose that K = S, so Kabc = Sabc for all a, b, c. Hence we have

ac = (ac)(bc), so in particular, II = (II)(bI), which reduces to I = bI for all

b. Let b = Kf, and then we have I = KfI ⇐⇒ I = f for any term f. If we

choose a diﬀerent b = Kg, we similarly get I = g, and hence f = g for any two

terms.

Remark that there are other possible equations one might add to guarantee

inconsistency. Church gives one, forming a theory where we quotient out by

terms without a normal form (with respect to β-reduction). If we deﬁne:

T := {M = N | M, N λ-terms without normal form},

then T is inconsistent. A hint to show this is to let M = λx.((xK)Ω) and

N = λx.((xS)Ω). If M = N , show that S = K!

9