Set Theory, Part 2: Constructing the Ordinals

This is a continuation of my earlier set theory post. In this post, I’ll describe the next three axioms of ZF and construct the ordinal numbers.

1. The Previous Axioms

As review, here are the natural descriptions of the five axioms we covered in the previous post.

Axiom 1 (Extensionality) Two sets are equal if they have the same elements.

Axiom 2 (Empty Set Exists) There exists an empty set {\varnothing} which contains no elements

Axiom 3 (Pairing) Given two elements {x} and {y}, there exists a set {\{x,y\}} containing only those two elements. (It is permissible to have {x=y}, meaning that if {x} is a set then so is {\{x\}}.)

Axiom 4 (Union) Given a set {a}, we can create {\cup a}, the union of the elements of {a}. For example, if {a = \{ \{1,2\}, \{3,4\} \}}, then {z = \{1,2,3,4\}} is a set.

Axiom 5 (Power Set) Given any set {x}, the power set {\mathcal P(x)} is a set.

I’ll comment briefly on what these let us do now. Let {V_0 = \varnothing}, and recursively define {V_{n+1} = \mathcal P(V_n)}. So for example,

\displaystyle  \begin{aligned} 	V_0 &= \varnothing \\ 	V_1 &= \{\varnothing\} \\ 	V_2 &= \{ \varnothing, \{\varnothing\} \} \\ 	V_3 &= \Big\{ \varnothing, \{\varnothing\}, 	\{\{\varnothing\}\}, \big\{\varnothing, \{\varnothing\} \big\}\Big\} \\ 	&\phantom=\vdots \end{aligned}

Now let’s drop the formalism for a moment and go on a brief philosophical musing. Suppose we have a universe {V_\omega} (I’ll explain later what {\omega} is) where the only sets are those which appear in some {V_n}. You might then see, in fact, that the sets in {V_\omega} actually obey all five axioms above. What we’ve done is provide a model for which the five axioms are consistent.

But this is a pretty boring model right now for the following reason: even though there are infinitely many sets, there are no infinite sets. In a moment I’ll tell you how we can add new axioms to make infinite sets. But first let me tell you how we construct the natural numbers.

2. The Axiom of Foundation

We’re about to wade into the territory of the infinite, so first I need an axiom to protect us from really bad stuff from happening. What I’m going to do is forbid infinite {\in} chains.

Axiom 6 (Foundation) Loosely, there is no infinite chain of sets

\displaystyle  x_0 \ni x_1 \ni x_2 \ni \dots.

You can see why this seems reasonable: if I take a random set, I can hop into one of its elements. That’s itself a set, so I can jump into that guy and keep going down. In the finite universe {V_\omega}, you can see that eventually I’ll hit {\varnothing}, the very bottom of the universe. I want the same to still be true even if my sets are infinite.

This isn’t the actual statement of the axiom. The way to say this in machine code is that for any nonempty set {x}, there exists a {y \in x} such that {z \notin y} for any {z \in x}. We can’t actually write about something like {x_0 \ni x_1 \ni \dots} in machine code (yet). Nevertheless this suffices for our axioms.

There’s an important consequence of this.

Theorem 1 {x \notin x} for any set {x}.

Proof: For otherwise we would have {x \ni x \ni x \ni \dots} which violates Foundation. \Box

3. The Natural Numbers

Note: in set theory, {0} is considered a natural number.

Now for the fun part. If we want to encode math statements into the language of set theory, the first thing we’d want to do is encode the numbers {0}, {1}, {\dots} in there so that we can actually do arithmetic. How might we do that?

What we’re going to do is construct a sequence of sets of sizes {0}, {1}, {\dots} and let these correspond to the natural numbers. What sets should we choose? Well, there’s only one set of size {0}, so we begin by writing

\displaystyle  0 \overset{\text{def}}{=} \varnothing.

I’ll give away a little more than I should and then write

\displaystyle  1 \overset{\text{def}}{=} \{\varnothing\}, \quad 2 \overset{\text{def}}{=} \{\varnothing, \{\varnothing\} \}.

Now let’s think about {3}. If we want to construct a three-element set and we already have a two-element set, then we just need to add another element to {1} that’s not already in there. In other words, to construct {3} I just need to pick an {x} such that {x \notin 2}, then let {3 = \{x\} \cup 2}. (Or in terms of our axioms, {3 = \cup \left\{ 2, \{x\} \right\}}.) Now what’s an easy way to pick {x} such that {x \notin 2}? Answer: pick {x=2}. By the earlier theorem, we always {2 \notin 2}.

Now the cat’s out of the bag! We define

\displaystyle  \begin{aligned} 	0 &= \varnothing \\ 	1 &= \left\{ 0 \right\} \\ 	2 &= \left\{ 0, 1 \right\} \\ 	3 &= \left\{ 0, 1, 2 \right\} \\ 	4 &= \left\{ 0, 1, 2, 3 \right\} \\ 	&\phantom= \vdots \end{aligned}

And there you have it: the nonnegative integers. You can have some fun with this definition and write things like

\displaystyle  \left\{ x \in 8 \mid x \text{ is even} \right\} = \left\{ 0, 2, 4, 6 \right\}

now. Deep down, everything’s a set.

4. Finite Ordinals

We’re currently able to do some counting now, because we’ve defined the sequence of sets

\displaystyle  0, 1, 2, \dots

by {0 = \varnothing} and {n+1 = \{0,\dots,n\}}. This sequence is related by

\displaystyle  0 \in 1 \in 2 \in \dots.

Some properties of these “numbers” I’ve made are:

  • They are well-ordered by {\in} (which corresponds exactly with the {<} which we're familiar with; that's a good motivation for choosing this construction, as the well-ordering property is one of the most important properties of {\mathbb N}, and using {\in} for this purpose lets us do this ordering painlessly). That means if I take the elements of {n}, then I can sort them in a transitive chain like I've done above: for any {x} and {y}, either {x \in y} or {y \in x}. For example, the elements of {4} are {0}, {1}, {2}, {3} and {0 \in 1 \in 2 \in 3}. It also means that any subset has a “minimal'' element, which would just be the first element of the chain. Here is the complete definition.
  • The set {n} is transitive. What this means that it is a “closed universe” in the sense that if I look at an element {a} of {n}, all the elements of {a} are also in {n}. For example, if I take the element {3} of {5 = \{0,1,2,3,4\}}, all the elements of {3} are in {5}. Looking deeply into {n} won’t find me anything I didn’t see at the top level.

    In other words, a set {S} is transitive if for every {T \in S}, {T \subseteq S}.

A set which is both transitive and well-ordered by {\in} is called an ordinal, and the numbers {0,1,2,\dots} are precisely the finite ordinals. But now I’d like to delve into infinite numbers. Can I define some form of “infinity”?

5. Infinite Ordinals

To tell you what a set is, I only have to tell you who its elements are. And so I’m going to define the set

\displaystyle  \omega = \left\{ 0, 1, 2, \dots \right\}.

And now our counting looks like this.

\displaystyle  0, 1, 2, \dots, \omega.

We just tacked on an infinity at the end by scooping up all the natural numbers and collecting them in a set. You can do that? Sure you can! All I’ve done is written down the elements of the set, and you can check that {\omega} is indeed an ordinal.

Well, okay, there’s one caveat. We don’t actually know whether the {\omega} I’ve written down is a set. Pairing and Union lets us collect any finite collection of sets into a single set, but it doesn’t let us collect things into an infinite set. In fact, you can’t prove from the axioms that {\omega} is a set.

For this I’m going to present another two axioms. These are much more technical to describe, so I’ll lie to you about what their statements are. If you’re interested in the exact statements, consult the lecture notes linked at the bottom of this post.

Axiom 7 (Replacement) Loosely, let {f} be a function on a set {X}. Then the image of {f} is a set:

\displaystyle  \exists Y : \quad \forall y, \; y \in Y \iff \exists x : f(x) = y.

Axiom 8 (Infinity) There exists a set {\omega = \{0,1,2,\dots\}}.

With these two axioms, we can now write down the first infinite ordinal {\omega}. So now our list of ordinals is

\displaystyle  0, 1, 2, \dots, \omega.

But now in the same way we constructed {3} from {2}, we can construct a set

\displaystyle  \omega + 1 \overset{\text{def}}{=} 	\omega \cup \{\omega\} 	= \left\{ 0, 1, \dots, \omega \right\}

and then

\displaystyle  \omega + 2 \overset{\text{def}}{=} 	(\omega+1) \cup \{\omega+1\} 	= \left\{ 0, 1, \dots, \omega, \omega+1 \right\}

and so on — all of these are also transitive and well-ordered by {\in}. So now our list of ordinals is

\displaystyle  0, 1, 2, \dots, \omega, \omega+1, \omega+2, \dots.

Well, can we go on further? What we’d like is to define an “{\omega+\omega} or {2 \cdot \omega}”, which would entail capturing all of the above elements into a set. Well, I claim we can do so. Consider a function {f} defined on {\omega} which sends {n} to {\omega+n}. So {f(0) = \omega}, {f(3) = \omega+3}, {f(2014) = \omega+2014}. (Strictly, I have to prove that set-encoding of this function, namely {\{(0,\omega), (1,\omega+1), \dots\}}, is actually a set. But that’s put that aside for now.) Then Replacement tells me that I have a set

\displaystyle  	\left\{ f(0), f(1), \dots \right\} 	= \left\{ \omega, \omega+1, \omega+2, \dots \right\}

From here we can union this set with {\omega} to achieve the set {\left\{ 0,1,2,\dots,\omega,\omega+1,\omega+2,\dots \right\}}. And we can keep turning this wheel repeatedly, yielding the ordinal numbers.

\displaystyle  \begin{aligned} 	0, & 1, 2, 3, \dots, \omega \\ 	& \omega+1, \omega+2, \dots, \omega+\omega \\ 	& 2\omega+1, 2\omega+2, \dots, 3\omega \\ 	& \vdots \\ 	& \omega^2 + 1, \omega^2+2, \dots \\ 	& \vdots \\ 	& \omega^3, \dots, \omega^4, \dots, \omega^\omega \\ 	& \vdots, \\ 	& \omega^{\omega^{\omega^{\dots}}} \\ \end{aligned}

I won’t say much more about these ordinal numbers since the post is already getting pretty long, but I’ll mention that the ordinals might not correspond to a type of counting that you’re used to, in the sense that there is a bijection between the sets {\omega} to {\omega+1}. It might seem like different numbers should have different “sizes”. For this you stumble into the cardinal numbers: it turns out that a cardinal number is just defined as an ordinal number which is not in bijection with any smaller ordinal number.

6. A Last Few Axioms

I’ll conclude this exposition on ZFC with a few last axioms. First is the axiom called Comprehension, though it actually can be proven from the Replacement axiom.

Axiom 9 (Comprehension) Let {\phi} be some property, and let {S} be a set. Then the notion

\displaystyle  \left\{ x \in S \mid \phi(x) \right\}

is a set. More formally,

\displaystyle  \exists X \forall x \in X: (x \in S) \land (\phi(x)).

Notice that the comprehension must be restricted: we can only take subsets of existing sets. From this one can deduce that there is no set of all sets; if we had such a set {V}, then we could use Comprehension to generate {\{x \in V : x \notin x\}}, recovering Russel’s Paradox.

So anyways, this means that we can take list comprehensions.

Finally, I’ll touch a little on why the Axiom of Choice is actually important. You’ve probably heard the phrasing that “you can pick a sock out of every drawer” or some cute popular math phrasing like that. Here’s the precise statement.

Axiom 10 (Choice) Let {\mathcal F} be a set such that {\varnothing \notin \mathcal F}. Then we can define a function {g} on {\mathcal F} such that {g(y) \in y} for every {y \in \mathcal F}. The function {g} is called a choice function; given a bunch of sets {\mathcal F}, it chooses an element {g(y)} out of every {y}. In other words, for any {\mathcal F} with {\varnothing \notin \mathcal F}, there exists a set

\displaystyle  \left\{ (y,g(y)) \mid y \in \mathcal F \right\}.

with {g(y) \in y} for every {y}.

In light of the discussion in these posts, what’s significant is not that we can conceive such a function (how hard can it be to take an element out of a nonempty set?) but that the resulting structure is actually a set. The whole point of having the ZF axioms is that we have to be careful about how we are allowed to make new sets out of old ones, so that we don’t run ourselves into paradoxes. The Axiom of Choice reflects that this is a subtle issue.

So there you have it, the axioms of ZFC and what they do.

Thanks to Peter Koellner, Harvard, for his class Math 145a, which taught me this material. My notes for this course can be downloaded from my math website.

Thanks to J Wu for pointing out a typo in Replacement and noting that I should emphasize how {\in} leads to the well-ordering for the ordinals.