# Models of ZFC

Model theory is really meta, so you will have to pay attention here.

Roughly, a “model of ${\mathsf{ZFC}}$” is a set with a binary relation that satisfies the ${\mathsf{ZFC}}$ axioms, just as a group is a set with a binary operation that satisfies the group axioms. Unfortunately, unlike with groups, it is very hard for me to give interesting examples of models, for the simple reason that we are literally trying to model the entire universe.

## 1. Models

Prototypical example for this section: ${(\omega, \in)}$ obeys ${\mathrm{PowerSet}}$, ${V_\kappa}$ is a model for ${\kappa}$ inaccessible (later).

Definition 1 A model ${\mathscr M}$ consists of a set ${M}$ and a binary relation ${E \subseteq M \times M}$. (The ${E}$ relation is the “${\in}$” for the model.)

Remark 2 I’m only considering set-sized models where ${M}$ is a set. Experts may be aware that I can actually play with ${M}$ being a class, but that would require too much care for now.

If you have a model, you can ask certain things about it. For example, you can ask “does it satisfy ${\mathrm{EmptySet}}$?”. Let me give you an example of what I mean, and then make it rigorous.

Example 3 (A Stupid Model) Let’s take ${\mathscr M = (M,E) = \left( \omega, \in \right)}$. This is not a very good model of ${\mathsf{ZFC}}$, but let’s see if we can make sense of some of the first few axioms.

1. ${\mathscr M}$ satisfies ${\mathrm{Extensionality}}$, which is the sentence

$\displaystyle \forall x \forall y \; \big( \forall a \, ( a \in x \iff a \in y ) \big) \implies x = y.$

This just follows from the fact that ${E}$ is actually ${\in}$.

2. ${\mathscr M}$ satisfies ${\mathrm{EmptySet}}$, which is the sentence

$\displaystyle \exists a : \forall x \; \neg (x \in a).$

Namely, take ${a = \varnothing \in \omega}$.

3. ${\mathscr M}$ does not satisfy ${\mathrm{Pairing}}$, since ${\{1,3\}}$ is not in ${\omega}$, even though ${1, 3 \in \omega}$.
4. Miraculously, ${\mathscr M}$ satisfies ${\mathrm{Union}}$, since for any ${n \in \omega}$, ${\cup n}$ is ${n-1}$ (unless ${n=0}$, in which case ${\cup 0 = 0}$). The Union axiom states that

$\displaystyle \forall a \exists z \quad \forall x \; (x \in z) \iff (\exists y : x \in y \in a).$

An important thing to notice is that the “${\forall a}$” ranges only over the sets in the model of the universe, ${\mathscr M}$.

Example 4 (Important: This Stupid Model Satisfies ${\mathrm{PowerSet}}$) Most incredibly of all: ${\mathscr M = (\omega, \in)}$ satisfies ${\mathrm{PowerSet}}$. This is a really important example. You might think this is ridiculous. Look at ${2 = \{0,1\}}$. The power set of this is ${\{0, 1, 2, \{1\}\}}$ which is not in the model, right?

Well, let’s look more closely at ${\mathrm{PowerSet}}$. It states that:

$\displaystyle \forall x \exists a \forall y (y \in a \iff y \subseteq x).$

What happens if we set ${x = 2 = \{0,1\}}$? Well, actually, we claim that ${a = 3 = \{0,1,2\}}$ works. The key point is “for all ${y}$” — this only ranges over the objects in ${\mathscr M}$. In ${\mathscr M}$, the only subsets of ${2}$ are ${0 = \varnothing}$, ${1 = \{0\}}$ and ${2 = \{0,1\}}$. The “set” ${\{1\}}$ in the “real world” (in ${V}$) is not a set in the model ${\mathscr M}$.

In particular, you might say that in this strange new world, we have ${2^n = n+1}$, since ${n = \{0,1,\dots,n-1\}}$ really does have only ${n+1}$ subsets.
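Here is a small Python sketch of the claim that ${n+1}$ acts as the power set of ${n}$ inside ${(\omega, \in)}$. It only inspects a finite window of ${\omega}$ (the cutoff `N = 8` and all function names are my own choices for illustration), which is enough here because any ${y}$ beyond the window is neither an element of ${n+1}$ nor a subset of ${n}$.

```python
def von_neumann(n):
    """Return the von Neumann naturals 0, 1, ..., n-1 as frozensets."""
    naturals = []
    for _ in range(n):
        # Each new natural is the set of all naturals built so far.
        naturals.append(frozenset(naturals))
    return naturals

def is_power_set_in_model(a, x, universe):
    """Check 'for all y in the universe: y in a  <=>  y is a subset of x'."""
    return all((y in a) == (y <= x) for y in universe)

N = 8                      # size of the finite window (an arbitrary choice)
window = von_neumann(N)    # window[k] is the natural number k

# Inside the window, n + 1 plays the role of the power set of n.
results = [is_power_set_in_model(window[n + 1], window[n], window)
           for n in range(N - 1)]

# The "real" subset {1} of 2 is simply not an object of the model.
real_subset = frozenset({window[1]})
real_subset_in_window = real_subset in window
```

Every entry of `results` comes out true, while `real_subset_in_window` is false: the quantifier never gets to see ${\{1\}}$.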

Example 5 (Sentences with Parameters) The sentences we ask of our model are allowed to have “parameters” as well. For example, if ${\mathscr M = (\omega, \in)}$ as before then ${\mathscr M}$ satisfies the sentence

$\displaystyle \forall x \in 3 (x \in 5).$

## 2. Sentences and Satisfaction

With this intuitive notion, we can define what it means for a model to satisfy a sentence.

Definition 6 Note that any sentence ${\phi}$ can be written in one of the following five forms:

• ${x \in y}$
• ${x = y}$
• ${\neg \psi}$ (“not ${\psi}$”) for some shorter sentence ${\psi}$
• ${\psi_1 \lor \psi_2}$ (“${\psi_1}$ or ${\psi_2}$”) for some shorter sentences ${\psi_1}$, ${\psi_2}$
• ${\exists x \psi}$ (“exists ${x}$”) for some shorter sentence ${\psi}$.

Ques 7 What happened to ${\land}$ (and) and ${\forall}$ (for all)? (Hint: use ${\neg}$.)

Often (almost always, actually) we will proceed by so-called “induction on formula complexity”, meaning that we define or prove something by induction using this. Note that we require all formulas to be finite.

Now suppose we have a sentence ${\phi}$, like ${a = b}$ or ${\exists a \forall x \neg (x \in a)}$, plus a model ${\mathscr M = (M,E)}$. We want to ask whether ${\mathscr M}$ satisfies ${\phi}$.

To give meaning to this, we have to designate certain variables as parameters. For example, if I asked you “Does ${a=b}$?” the first question you would ask is what ${a}$ and ${b}$ are. So ${a}$, ${b}$ would be parameters: I have to give them values for this sentence to make sense.

On the other hand, if I asked you “Does ${\exists a \forall x \neg (x \in a)}$?” then you would just say “yes”. In this case, ${x}$ and ${a}$ are not parameters. In general, parameters are those variables whose meaning is not given by some ${\forall}$ or ${\exists}$.

In what follows, we will let ${\phi(x_1, \dots, x_n)}$ denote a formula ${\phi}$ whose parameters are ${x_1, \dots, x_n}$. Note that possibly ${n=0}$; for example, all ${\mathsf{ZFC}}$ axioms have no parameters.

Ques 8 Try to guess the definition of satisfaction before reading it below. (It’s not very hard to guess!)

Definition 9 Let ${\mathscr M=(M,E)}$ be a model. Let ${\phi(x_1, \dots, x_n)}$ be a sentence, and let ${b_1, \dots, b_n \in M}$. We will define a relation

$\displaystyle \mathscr M \vDash \phi[b_1, \dots, b_n]$

and say ${\mathscr M}$ satisfies the sentence ${\phi}$ with parameters ${b_1, \dots, b_n}$.

The relation is defined by induction on formula complexity as follows:

• If ${\phi}$ is “${x_1=x_2}$” then ${\mathscr M \vDash \phi[b_1, b_2] \iff b_1 = b_2}$.
• If ${\phi}$ is “${x_1\in x_2}$” then ${\mathscr M \vDash \phi[b_1, b_2] \iff b_1 \; E \; b_2}$.
(This is what we mean by “${E}$ interprets ${\in}$”.)
• If ${\phi}$ is “${\neg \psi}$” then ${\mathscr M \vDash \phi[b_1, \dots, b_n] \iff \mathscr M \not\vDash \psi[b_1, \dots, b_n]}$.
• If ${\phi}$ is “${\psi_1 \lor \psi_2}$” then ${\mathscr M \vDash \phi[b_1, \dots, b_n]}$ means ${\mathscr M \vDash \psi_i[b_1, \dots, b_n]}$ for some ${i=1,2}$.
• Most important case: suppose ${\phi}$ is ${\exists x \psi(x,x_1, \dots, x_n)}$. Then ${\mathscr M \vDash \phi[b_1, \dots, b_n]}$ if and only if

$\displaystyle \exists b \in M \text{ such that } \mathscr M \vDash \psi[b, b_1, \dots, b_n].$

Note that ${\psi}$ has one extra parameter.

Notice where the information of the model actually gets used. We only ever use ${E}$ in interpreting ${x_1 \in x_2}$; unsurprising. But we only ever use the set ${M}$ when we are running over ${\exists}$ (and hence ${\forall}$). That’s well-worth keeping in mind: The behavior of a model essentially comes from ${\exists}$ and ${\forall}$, which search through the entire model ${M}$.
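For finite models the inductive definition can be run directly. The following is a minimal sketch, not a standard library: the tuple encoding of formulas, the function names, and the toy model ${(3, \in)}$ are all assumptions of mine. It follows the five cases of the definition one for one.

```python
def satisfies(M, E, phi, env=None):
    """Decide M |= phi under the assignment env (variable name -> element of M)."""
    env = env or {}
    op = phi[0]
    if op == "in":                       # atomic: x E y
        return (env[phi[1]], env[phi[2]]) in E
    if op == "eq":                       # atomic: x = y
        return env[phi[1]] == env[phi[2]]
    if op == "not":
        return not satisfies(M, E, phi[1], env)
    if op == "or":
        return satisfies(M, E, phi[1], env) or satisfies(M, E, phi[2], env)
    if op == "exists":                   # the only place M itself is consulted
        _, var, body = phi
        return any(satisfies(M, E, body, {**env, var: b}) for b in M)
    raise ValueError(f"unknown connective: {op!r}")

def forall(var, body):
    """Abbreviation: 'for all' is 'not exists not'."""
    return ("not", ("exists", var, ("not", body)))

# Toy model: the natural number 3 = {0, 1, 2}, where a E b means a < b.
M = [0, 1, 2]
E = {(a, b) for a in M for b in M if a < b}

# EmptySet: exists a, for all x, not (x in a)
empty_set = ("exists", "a", forall("x", ("not", ("in", "x", "a"))))
# "Some element contains everything else": exists a, for all x, (x in a or x = a)
has_top = ("exists", "a", forall("x", ("or", ("in", "x", "a"), ("eq", "x", "a"))))
# A formula with parameters, phi(x1, x2): x1 in x2
holds_0_in_2 = satisfies(M, E, ("in", "x1", "x2"), {"x1": 0, "x2": 2})
holds_2_in_0 = satisfies(M, E, ("in", "x1", "x2"), {"x1": 2, "x2": 0})
```

Note that `E` is only read in the `"in"` case and `M` only in the `"exists"` case, which is exactly the observation made above.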

And finally,

Definition 10 A model of ${\mathsf{ZFC}}$ is a model ${\mathscr M = (M,E)}$ satisfying all ${\mathsf{ZFC}}$ axioms.

We are especially interested in models of the form ${(M, \in)}$, where ${M}$ is a transitive set. (We want our universe to be transitive, otherwise we would have elements of sets which are not themselves in the universe, which is very strange.) Such a model is called a transitive model. If ${M}$ is a transitive set, the model ${(M, \in)}$ will be abbreviated to just ${M}$.

Definition 11 An inner model of ${\mathsf{ZFC}}$ is a transitive model satisfying ${\mathsf{ZFC}}$.

## 3. The Levy Hierarchy

Prototypical example for this section: ${\mathtt{isSubset}(x,y)}$ is absolute. The axiom ${\mathrm{EmptySet}}$ is ${\Sigma_1}$, ${\mathtt{isPowerSetOf}(X,x)}$ is ${\Pi_1}$.

A key point to remember is that the behavior of a model is largely determined by ${\exists}$ and ${\forall}$. It turns out we can say even more than this.

Consider a formula such as

$\displaystyle \mathtt{isEmpty}(x) : \neg \exists a (a \in x)$

which checks whether a given set ${x}$ is empty. Technically, this has an “${\exists}$” in it. But somehow this ${\exists}$ does not really search over the entire model, because it is bounded to search in ${x}$. That is, we might informally rewrite this as

$\displaystyle \neg (\exists a \in x)$

which doesn’t fit into the strict form, but points out that we are only looking over ${a \in x}$. We call such a quantifier a bounded quantifier.

We like sentences with bounded quantifiers because they designate properties which are absolute over transitive models. It doesn’t matter how strange your surrounding model ${M}$ is. As long as ${M}$ is transitive,

$\displaystyle M \vDash \mathtt{isEmpty}(\varnothing)$

will always hold. Similarly, the sentence

$\displaystyle \mathtt{isSubset}(x,y) : x \subseteq y \text { i.e. } \forall a \in x (a \in y)$

is absolute. Sentences with this property are called ${\Sigma_0}$ or ${\Pi_0}$.

The situation is different with a sentence like

$\displaystyle \mathtt{isPowerSetOf}(y,x) : \forall z \left( z \subseteq x \iff z \in y \right)$

which in English means “${y}$ is the power set of ${x}$”, or just ${y = \mathcal P(x)}$. The ${\forall z}$ is not bounded here. This weirdness is what allows things like

$\displaystyle \omega \vDash \text{``} \{0,1,2\} \text{ is the power set of }\{0,1\}\text{''}$

and hence

$\displaystyle \omega \vDash \mathrm{PowerSet}$

which was our stupid example earlier. The sentence ${\mathtt{isPowerSetOf}}$ consists of an unbounded ${\forall}$ followed by an absolute sentence, so we say it is ${\Pi_1}$.

More generally, the Levy hierarchy keeps track of how bounded our quantifiers are. Specifically,

• Formulas which have only bounded quantifiers are ${\Delta_0 = \Sigma_0 = \Pi_0}$.
• Formulas of the form ${\exists x_1 \dots \exists x_k \psi}$ where ${\psi}$ is ${\Pi_n}$ are considered ${\Sigma_{n+1}}$.
• Formulas of the form ${\forall x_1 \dots \forall x_k \psi}$ where ${\psi}$ is ${\Sigma_n}$ are considered ${\Pi_{n+1}}$.

(A formula which is both ${\Sigma_n}$ and ${\Pi_n}$ is called ${\Delta_n}$, but we won’t use this except for ${n=0}$.)
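The bookkeeping in this definition amounts to counting blocks of alternating unbounded quantifiers. Here is a small sketch of that count; the string encoding (`'E'` for ${\exists}$, `'A'` for ${\forall}$) and the function name are my own conventions. It classifies a formula purely by the shape it is written in, so a formula might be equivalent to one of lower complexity than the answer returned.

```python
def levy_class(prefix):
    """Classify a prenex formula by its unbounded quantifier prefix.

    `prefix` is a string over 'E' (exists) and 'A' (for all), read left to
    right, standing in front of a formula with only bounded quantifiers.
    """
    if any(q not in "EA" for q in prefix):
        raise ValueError("prefix must consist of 'E' and 'A' only")
    if not prefix:
        return "Delta_0"
    # Count maximal blocks of like quantifiers: 'EEA' has two blocks.
    blocks = 1 + sum(1 for p, q in zip(prefix, prefix[1:]) if p != q)
    family = "Sigma" if prefix[0] == "E" else "Pi"
    return f"{family}_{blocks}"
```

For instance, `levy_class("E")` gives `"Sigma_1"` and `levy_class("AEA")` gives `"Pi_3"`.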

Example 12 (Examples of ${\Delta_0}$ Sentences)

1. The sentences ${\mathtt{isEmpty}(x)}$, ${x \subseteq y}$, as discussed above.
2. The formula “${x}$ is transitive” can be expanded as a ${\Delta_0}$ sentence.
3. The formula “${x}$ is an ordinal” can be expanded as a ${\Delta_0}$ sentence.

Exercise 13 Write out the expansions for “${x}$ is transitive” and “${x}$ is ordinal” in a ${\Delta_0}$ form.

Example 14 (More Complex Formulas)

1. The axiom ${\mathrm{EmptySet}}$ is ${\Sigma_1}$; it is ${\exists a (\mathtt{isEmpty}(a))}$, and ${\mathtt{isEmpty}(a)}$ is ${\Delta_0}$.
2. The formula “${y = \mathcal P(x)}$” is ${\Pi_1}$, as discussed above.
3. The formula “${x}$ is countable” is ${\Sigma_1}$. One way to phrase it is “${\exists f}$ an injective map ${x \hookrightarrow \omega}$”, which necessarily has an unbounded “${\exists f}$”.
4. The axiom ${\mathrm{PowerSet}}$ is ${\Pi_3}$:

$\displaystyle \forall y \exists P \forall x (x\subseteq y \iff x \in P).$

## 4. Substructures, and Tarski-Vaught

Let ${\mathscr M_1 = (M_1, E_1)}$ and ${\mathscr M_2 = (M_2, E_2)}$ be models.

Definition 15 We say that ${\mathscr M_1 \subseteq \mathscr M_2}$ if ${M_1 \subseteq M_2}$ and ${E_1}$ agrees with ${E_2}$ on ${M_1}$ (that is, ${E_1 = E_2 \cap (M_1 \times M_1)}$); we say ${\mathscr M_1}$ is a substructure of ${\mathscr M_2}$.

That’s boring. The good part is:

Definition 16 We say ${\mathscr M_1 \prec \mathscr M_2}$, or ${\mathscr M_1}$ is an elementary substructure of ${\mathscr M_2}$, if for every sentence ${\phi(x_1, \dots, x_n)}$ and parameters ${b_1, \dots, b_n \in M_1}$, we have

$\displaystyle \mathscr M_1 \vDash \phi[b_1, \dots, b_n] \iff \mathscr M_2 \vDash \phi[b_1, \dots, b_n].$

In other words, ${\mathscr M_1}$ and ${\mathscr M_2}$ agree on every sentence possible. Note that the ${b_i}$ have to come from ${M_1}$; if the ${b_i}$ came from ${\mathscr M_2}$ then asking something of ${\mathscr M_1}$ wouldn’t make sense.

Let’s ask now: how would ${\mathscr M_1 \prec \mathscr M_2}$ fail to be true? If we look at the possible sentences, none of the atomic formulas, nor the “${\land}$” and “${\neg}$”, are going to cause issues.

The intuition you should be getting by now is that things go wrong once we hit ${\forall}$ and ${\exists}$. They won’t go wrong for bounded quantifiers. But unbounded quantifiers search the entire model, and that’s where things go wrong.

To give a “concrete example”: imagine ${\mathscr M_1}$ is MIT, and ${\mathscr M_2}$ is the state of Massachusetts. If ${\mathscr M_1}$ thinks there exist hackers at MIT, certainly there exist hackers in Massachusetts. Where things go wrong is something like:

$\displaystyle \mathscr M_2 \vDash \text{``} \exists x : x \text{ is a course numbered }> 50\text{''}.$

This is true for ${\mathscr M_2}$ because we can take the witness ${x = \text{Math 55}}$, say. But it’s false for ${\mathscr M_1}$, because at MIT all courses are numbered ${18.701}$ or something similar. The issue is that the witnesses for statements in ${\mathscr M_2}$ do not necessarily propagate down to witnesses for ${\mathscr M_1}$, even though they do propagate up from ${\mathscr M_1}$ to ${\mathscr M_2}$.

The Tarski-Vaught test says this is the only impediment: if every witness in ${\mathscr M_2}$ can be replaced by one in ${\mathscr M_1}$ then ${\mathscr M_1 \prec \mathscr M_2}$.

Lemma 17 (Tarski-Vaught) Let ${\mathscr M_1 \subseteq \mathscr M_2}$. Then ${\mathscr M_1 \prec \mathscr M_2}$ if and only if for every sentence ${\phi(x, x_1, \dots, x_n)}$ and parameters ${b_1, \dots, b_n \in M_1}$: if there is a witness ${\tilde b \in M_2}$ to ${\mathscr M_2 \vDash \phi[\tilde b, b_1, \dots, b_n]}$ then there is a witness ${b \in M_1}$ to ${\mathscr M_2 \vDash \phi[b, b_1, \dots, b_n]}$.

Proof: Easy after the above discussion. To formalize it, use induction on formula complexity. $\Box$
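To see a missing witness in a setting smaller than Massachusetts, here is a toy sketch in Python. Everything in it is an illustrative assumption on my part: the two finite structures, the tuple encoding of formulas, and the tiny checker `sat`, which handles only the connectives this example needs.

```python
def sat(M, E, phi, env):
    """Tiny satisfaction checker for formulas built from in / and / exists."""
    op = phi[0]
    if op == "in":
        return (env[phi[1]], env[phi[2]]) in E
    if op == "and":
        return sat(M, E, phi[1], env) and sat(M, E, phi[2], env)
    if op == "exists":
        return any(sat(M, E, phi[2], {**env, phi[1]: b}) for b in M)
    raise ValueError(f"unknown connective: {op!r}")

# M2 is the natural number 3 = {0, 1, 2} with a E b meaning a < b.
M2 = [0, 1, 2]
E2 = {(a, b) for a in M2 for b in M2 if a < b}
# M1 is the substructure on {0, 2}; E1 is E2 restricted to M1.
M1 = [0, 2]
E1 = {(a, b) for (a, b) in E2 if a in M1 and b in M1}

# psi(x, x1, x2): x1 in x  and  x in x2
psi = ("and", ("in", "x1", "x"), ("in", "x", "x2"))
phi = ("exists", "x", psi)           # phi(x1, x2): something lies between them
params = {"x1": 0, "x2": 2}

in_big = sat(M2, E2, phi, params)    # the witness x = 1 lives in M2
in_small = sat(M1, E1, phi, params)  # no witness is available in M1
# Elements of M1 that serve as witnesses when psi is evaluated in M2: none.
witnesses_from_M1 = [b for b in M1 if sat(M2, E2, psi, {**params, "x": b})]
```

So `in_big` is true and `in_small` is false: the two structures disagree about a sentence with parameters from ${M_1}$, and the only witness, ${1}$, sits outside ${M_1}$.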

## 5. Obtaining the Axioms of ${\mathsf{ZFC}}$

Extending the above ideas, one can obtain without much difficulty the following. The idea is that almost all the ${\mathsf{ZFC}}$ axioms are just ${\Sigma_1}$ claims about certain desired sets, and so verifying an axiom reduces to checking some appropriate “closure” condition: that the witness to the axiom is actually in the model.

For example, the ${\mathrm{EmptySet}}$ axiom is “${\exists a (\mathtt{isEmpty}(a))}$”, and so we’re happy as long as ${\varnothing \in M}$, which is of course true for any nonempty transitive set ${M}$.

Lemma 18 (Transitive Sets Inheriting ${\mathsf{ZFC}}$) Let ${M}$ be a nonempty transitive set. Then

1. ${M}$ satisfies ${\mathrm{Extensionality}}$, ${\mathrm{Foundation}}$, ${\mathrm{EmptySet}}$.
2. ${M \vDash \mathrm{Pairing}}$ if ${x,y \in M \implies \{x,y\} \in M}$.
3. ${M \vDash \mathrm{Union}}$ if ${x \in M \implies \cup x \in M}$.
4. ${M \vDash \mathrm{PowerSet}}$ if ${x \in M \implies \mathcal P(x) \cap M \in M}$.
5. ${M \vDash \mathrm{Replacement}}$ if for every ${x \in M}$ and every function ${F : x \rightarrow M}$ which is ${M}$-definable with parameters, the image ${\{F(y) \mid y \in x\}}$ of ${F}$ lies in ${M}$ as well.
6. ${M \vDash \mathrm{Infinity}}$ as long as ${\omega \in M}$.

Here, a set ${X \subseteq M}$ is ${M}$-definable with parameters if it can be realized as

$\displaystyle X = \left\{ x \in M \mid \phi[x, b_1, \dots, b_n] \right\}$

for some (fixed) choice of parameters ${b_1,\dots,b_n \in M}$. We allow ${n=0}$, in which case we say ${X}$ is ${M}$-definable without parameters. Note that ${X}$ need not itself be in ${M}$! As a trivial example, ${X = M}$ is ${M}$-definable without parameters (just take ${\phi[x]}$ to always be true), and certainly we do not have ${X \in M}$.

Exercise 19 Verify items 1-4 above.

Remark 20 Converses to the statements of Lemma 18 are true for all claims other than item 6.
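The closure conditions for Pairing, Union and PowerSet are easy to test by brute force on a small transitive set. The sketch below does this for the finite stage ${V_3}$ (three iterated power sets of ${\varnothing}$), encoded as nested `frozenset`s; the choice of ${V_3}$ and every function name are assumptions made purely for illustration.

```python
from itertools import combinations

def power_set(x):
    """All subsets of the frozenset x, as a frozenset of frozensets."""
    items = list(x)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def big_union(x):
    """The union of the elements of x."""
    return frozenset().union(*x)

def is_transitive(M):
    return all(y in M for x in M for y in x)

def closed_under_pairing(M):
    return all(frozenset({x, y}) in M for x in M for y in M)

def closed_under_union(M):
    return all(big_union(x) in M for x in M)

def closed_under_power_set(M):
    # The relevant set is P(x) intersected with M, as in the lemma.
    return all((power_set(x) & M) in M for x in M)

V3 = power_set(power_set(power_set(frozenset())))   # the finite stage V_3
report = {
    "transitive": is_transitive(V3),
    "pairing": closed_under_pairing(V3),
    "union": closed_under_union(V3),
    "power_set": closed_under_power_set(V3),
}
```

On ${V_3}$ the transitivity and Union checks pass while the Pairing and PowerSet checks fail; for example ${\{\varnothing, \{\{\varnothing\}\}\}}$ is a pair of elements of ${V_3}$ that only shows up one stage later.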

## 6. Mostowski Collapse

Up until now I have been only talking about transitive models, because they were easier to think about. Here’s a second, better reason we might only care about transitive models.

Lemma 21 (Mostowski Collapse) Let ${\mathscr X = (X,E)}$ be a model such that ${\mathscr X \vDash \mathrm{Extensionality}}$ and ${E}$ is well-founded (so in particular ${\mathscr X \vDash \mathrm{Foundation}}$). Then there exists an isomorphism ${\pi : \mathscr X \rightarrow M}$ for a transitive model ${M = (M,\in)}$.

This is also called the transitive collapse. In fact, both ${\pi}$ and ${M}$ are unique.

Proof: The idea behind the proof is very simple. Since ${E}$ is well-founded and extensional, ${X}$ has a unique ${E}$-minimal element ${x_\varnothing}$. Clearly, we want to send that to ${0 = \varnothing}$.

Then we take the next-smallest set under ${E}$, and send it to ${1 = \{\varnothing\}}$. We “keep doing this”; it’s not hard to see this does exactly what we want.

To formalize, define ${\pi}$ by transfinite recursion:

$\displaystyle \pi(x) \overset{\mathrm{def}}{=} \left\{ \pi(y) \mid y \; E \; x \right\}.$

This ${\pi}$, by construction, does the trick. $\Box$

The picture here is that of “collapsing” the elements of ${X}$ down to the bottom of ${V}$, hence the name.
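On a finite model the recursion defining ${\pi}$ can be carried out verbatim. The following sketch assumes the relation is handed over as a set of pairs and really is well-founded; the fruit-named model and the function names are hypothetical, chosen only to stress that the elements of ${X}$ need not be sets at all.

```python
def mostowski_collapse(X, E):
    """Return pi with pi[x] = { pi[y] : y E x }, as nested frozensets.

    Assumes E is a well-founded relation on the finite set X, given as a set
    of pairs (y, x) meaning "y E x"; otherwise the recursion never bottoms out.
    """
    pi = {}

    def collapse(x):
        if x not in pi:
            pi[x] = frozenset(collapse(y) for (y, z) in E if z == x)
        return pi[x]

    for x in X:
        collapse(x)
    return pi

# A hypothetical three-element model whose relation is not literally "in".
X = ["apple", "banana", "cherry"]
E = {("apple", "banana"), ("apple", "cherry"), ("banana", "cherry")}

pi = mostowski_collapse(X, E)
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
```

Here `pi` sends the three fruits to `zero`, `one` and `two` respectively, and `y E x` holds exactly when `pi[y] in pi[x]`.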

## 7. Adding an Inaccessible, Skolem Hulls, and Going Insane

Prototypical example for this section: ${V_\kappa}$

At this point you might be asking, well, where’s my model of ${\mathsf{ZFC}}$?

I unfortunately have to admit now: ${\mathsf{ZFC}}$ can never prove that there is a model of ${\mathsf{ZFC}}$ (unless ${\mathsf{ZFC}}$ is inconsistent, but that would be even worse). This is a consequence of Gödel’s second incompleteness theorem.

Nonetheless, with some very modest assumptions added, we can actually show that a model does exist. For example, assuming that there exists a strongly inaccessible cardinal ${\kappa}$ would do the trick: it turns out ${V_\kappa}$ will be such a model. Intuitively you can see why: ${\kappa}$ is so big that sets of rank lower than ${\kappa}$ can’t escape ${V_\kappa}$, whether we take their power sets or apply any other construction that ${\mathsf{ZFC}}$ allows.

More pessimistically, this shows that it’s impossible to prove in ${\mathsf{ZFC}}$ that such a ${\kappa}$ exists. Nonetheless, we now proceed under ${\mathsf{ZFC}^+}$ for convenience, which adds the existence of such a ${\kappa}$ as a final axiom. So we now have a model ${V_\kappa}$ to play with. Joy!

Great. Now we do something really crazy.

Theorem 22 (Countable Transitive Model) Assume ${\mathsf{ZFC}^+}$. Then there exists a transitive model ${M}$ of ${\mathsf{ZFC}}$ such that ${M}$ is a countable set.

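Proof: Throughout the proof, let ${M = V_\kappa}$ denote the model of ${\mathsf{ZFC}}$ supplied by ${\mathsf{ZFC}^+}$; the countable transitive model promised by the theorem will be extracted from it at the end. Start with the set ${X_0 = \varnothing}$. Then for every natural number ${n}$, we do the following to get ${X_{n+1}}$.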

• Start with ${X_{n+1}}$ containing every element of ${X_n}$.
• Consider a formula ${\phi(x, x_1, \dots, x_n)}$ and ${b_1, \dots, b_n}$ in ${X_n}$. Suppose that ${M}$ thinks there is a ${b \in M}$ for which

$\displaystyle M \vDash \phi[b, b_1, \dots, b_n].$

We then add in the element ${b}$ to ${X_{n+1}}$.

• We do this for every possible formula in the language of set theory. We also have to put in every possible set of parameters from the previous set ${X_n}$.

At every step ${X_n}$ is countable. Reason: there are countably many possible finite sets of parameters in ${X_n}$, and countably many possible formulas, so in total we only ever add in countably many things at each step. This exhibits an infinite nested sequence of countable sets

$\displaystyle X_0 \subseteq X_1 \subseteq X_2 \subseteq \dots$

None of these is a substructure of ${M}$, because each ${X_n}$ relies on witnesses in ${X_{n+1}}$. So we instead take the union:

$\displaystyle X = \bigcup_n X_n.$

This satisfies the Tarski-Vaught test, and is countable.

There is one minor caveat: ${X}$ might not be transitive. We don’t care, because we just take its Mostowski collapse. $\Box$
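The stage-by-stage construction in the proof is a closure computation, and its shape is easy to mimic in code. The sketch below is a deliberately crude stand-in: it closes a set under a short, explicitly given list of functions inside a finite structure, whereas the actual proof uses one witness-choosing function per formula inside an infinite model. The function `hull`, its arguments, and the mod 12 example are all my own illustrative assumptions.

```python
from itertools import product

def hull(A, skolem_functions):
    """Close A under the given functions, stage by stage.

    `skolem_functions` is a list of (arity, function) pairs standing in for
    the witness-choosing functions; X_0 = A and X_{n+1} adds every value of
    every function on arguments from X_n.
    """
    stage = set(A)
    while True:
        new = set(stage)
        for arity, f in skolem_functions:
            for args in product(stage, repeat=arity):
                new.add(f(*args))
        if new == stage:          # nothing new: the union has been reached
            return stage
        stage = new

# Toy stand-in: the ambient structure is the integers mod 12, and the only
# "Skolem function" is addition, so the hull of {4} is the subgroup it generates.
add_mod_12 = (2, lambda a, b: (a + b) % 12)
B = hull({4}, [add_mod_12])
```

In this toy case `B` is `{0, 4, 8}`. The loop terminates only because the ambient structure is finite; in the proof one takes the union of all countably many stages instead.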

Please take a moment to admire how insane this is. It hinges irrevocably on the fact that there are countably many sentences we can write down.

Remark 23 This proof relies heavily on the Axiom of Choice when we add in the element ${b}$ to ${X_{n+1}}$. Without Choice, there is no way of making these decisions all at once.

Usually, the right way to formalize the Axiom of Choice usage is, for every formula ${\phi(x, x_1, \dots, x_n)}$, to pre-commit (at the very beginning) to a function ${f_\phi(x_1, \dots, x_n)}$, such that given any ${b_1, \dots, b_n}$, the value ${f_\phi(b_1, \dots, b_n)}$ will be a suitable ${b}$ (if one exists). Personally, I think this is hiding the spirit of the proof, but it does make it clear how exactly Choice is being used.

These ${f_\phi}$’s have a name: Skolem functions.

The trick we used in the proof works in more general settings:

Theorem 24 (Downward Löwenheim-Skolem Theorem) Let ${\mathscr M = (M,E)}$ be a model, and ${A \subseteq M}$. Then there exists a set ${B}$ (called the Skolem hull of ${A}$) with ${A \subseteq B \subseteq M}$, such that ${(B,E) \prec \mathscr M}$, and

$\displaystyle \left\lvert B \right\rvert \le \max \left\{ \omega, \left\lvert A \right\rvert \right\}.$

In our case, what we did was simply take ${A}$ to be the empty set.

Ques 25 Prove this. (Exactly the same proof as before.)

## 8. FAQ’s on Countable Models

The most common one is “how is this possible?”, with runner-up “what just happened”.

Let me do my best to answer the first question. It seems like there are two things running up against each other:

1. ${M}$ is a transitive model of ${\mathsf{ZFC}}$, but its universe is countable.
2. ${\mathsf{ZFC}}$ tells us there are uncountable sets!

(This has confused so many people it has a name, Skolem’s paradox.)

The reason this works I actually pointed out earlier: countability is not absolute, it is a ${\Sigma_1}$ notion.

Recall that a set ${x}$ is countable if there exists an injective map ${x \hookrightarrow \omega}$. The first statement just says that in the universe ${V}$, there is an injective map ${F: M \hookrightarrow \omega}$. In particular, for any ${x \in M}$ (hence ${x \subseteq M}$, since ${M}$ is transitive), ${x}$ is countable in ${V}$. This is the content of the first statement.

But for ${M}$ to be a model of ${\mathsf{ZFC}}$, ${M}$ only has to think statements in ${\mathsf{ZFC}}$ are true. More to the point, the fact that ${\mathsf{ZFC}}$ tells us there are uncountable sets means

$\displaystyle M \vDash \exists x \text{ uncountable}.$

In other words,

$\displaystyle M \vDash \exists x \forall f \text{ If } f : x \rightarrow \omega \text{ then } f \text{ isn't injective}.$

The key point is the ${\forall f}$ searches only functions in our tiny model ${M}$. It is true that in the “real world” ${V}$, there are injective functions ${f : x \rightarrow \omega}$. But ${M}$ has no idea they exist! It is a brain in a vat: ${M}$ is oblivious to any information outside it.

So in fact, every ordinal which appears in ${M}$ is countable in the real world. It is just not countable in ${M}$. Since ${M \vDash \mathsf{ZFC}}$, ${M}$ is going to think there is some smallest uncountable cardinal, say ${\aleph_1^M}$. It will be the smallest (infinite) ordinal in ${M}$ with the property that there is no bijection in the model ${M}$ between ${\aleph_1^M}$ and ${\omega}$. However, we necessarily know that such a bijection is going to exist in the real world ${V}$.

Put another way, cardinalities in ${M}$ can look vastly different from those in the real world, because cardinality is measured by bijections, which I guess is inevitable, but leads to chaos.

## 9. Picturing Inner Models

Here is a picture of a countable transitive model ${M}$.

Note that ${M}$ and ${V}$ must agree on hereditarily finite sets (the elements of ${V_\omega}$), since every such set can be described explicitly by a formula. However, past ${V_\omega}$ the model and the true universe start to diverge.

The entire model ${M}$ is countable, so it only occupies a small portion of the universe, below the first uncountable cardinal ${\aleph_1^V}$ (where the superscript means “of the true universe ${V}$”). The ordinals in ${M}$ are precisely the ordinals of ${V}$ which happen to live inside the model, because the sentence “${\alpha}$ is an ordinal” is absolute. On the other hand, ${M}$ has only a portion of these ordinals, since it is only a lowly set, and a countable set at that. To denote the ordinals of ${M}$, we write ${\mathrm{On}^M}$, where the superscript means “the ordinals as computed in ${M}$”. Similarly, ${\mathrm{On}^V}$ will now denote the “set of true ordinals”.

Nonetheless, the model ${M}$ has its own version of the first uncountable cardinal ${\aleph_1^M}$. In the true universe, ${\aleph_1^M}$ is countable (below ${\aleph_1^V}$), but the necessary bijection witnessing this might not be inside ${M}$. That’s why ${M}$ can think ${\aleph_1^M}$ is uncountable, even if it is a countable ordinal in the original universe.

So our model ${M}$ is a brain in a vat. It happens to believe all the axioms of ${\mathsf{ZFC}}$, and so every statement that is true in ${M}$ could conceivably be true in ${V}$ as well. But ${M}$ can’t see the universe around it; it has no idea that what it believes is the uncountable ${\aleph_1^M}$ is really just an ordinary countable ordinal.

## 10. Exercises

Problem 1 Show that for any transitive model ${M}$, the set of ordinals in ${M}$ is itself some ordinal.

Problem 2 Assume ${\mathscr M_1 \subseteq \mathscr M_2}$. Show that

1. If ${\phi}$ is ${\Delta_0}$, then ${\mathscr M_1 \vDash \phi[b_1, \dots, b_n] \iff \mathscr M_2 \vDash \phi[b_1, \dots, b_n]}$.
2. If ${\phi}$ is ${\Sigma_1}$, then ${\mathscr M_1 \vDash \phi[b_1, \dots, b_n] \implies \mathscr M_2 \vDash \phi[b_1, \dots, b_n]}$.
3. If ${\phi}$ is ${\Pi_1}$, then ${\mathscr M_2 \vDash \phi[b_1, \dots, b_n] \implies \mathscr M_1 \vDash \phi[b_1, \dots, b_n]}$.

Problem 3 (Reflection) Let ${\kappa}$ be an inaccessible cardinal such that ${|V_\alpha| < \kappa}$ for all ${\alpha < \kappa}$. Prove that for any ${\delta < \kappa}$ there exists ${\delta < \alpha < \kappa}$ such that ${V_\alpha \prec V_\kappa}$; in other words, the set of ${\alpha}$ such that ${V_\alpha \prec V_\kappa}$ is unbounded in ${\kappa}$. This means that properties of ${V_\kappa}$ reflect down to properties of ${V_\alpha}$.

Problem 4 (Inaccessible Cardinals Produce Models) Let ${\kappa}$ be an inaccessible cardinal. Prove that ${V_\kappa}$ is a model of ${\mathsf{ZFC}}$.

# Set Theory, Part 2: Constructing the Ordinals

This is a continuation of my earlier set theory post. In this post, I’ll describe the next three axioms of ZF and construct the ordinal numbers.

## 1. The Previous Axioms

As review, here are the natural descriptions of the five axioms we covered in the previous post.

Axiom 1 (Extensionality) Two sets are equal if they have the same elements.

Axiom 2 (Empty Set Exists) There exists an empty set ${\varnothing}$ which contains no elements.

Axiom 3 (Pairing) Given two elements ${x}$ and ${y}$, there exists a set ${\{x,y\}}$ containing only those two elements. (It is permissible to have ${x=y}$, meaning that if ${x}$ is a set then so is ${\{x\}}$.)

Axiom 4 (Union) Given a set ${a}$, we can create ${\cup a}$, the union of the elements of ${a}$. For example, if ${a = \{ \{1,2\}, \{3,4\} \}}$, then ${\cup a = \{1,2,3,4\}}$ is a set.

Axiom 5 (Power Set) Given any set ${x}$, the power set ${\mathcal P(x)}$ is a set.

I’ll comment briefly on what these let us do now. Let ${V_0 = \varnothing}$, and recursively define ${V_{n+1} = \mathcal P(V_n)}$. So for example,

$\displaystyle \begin{aligned} V_0 &= \varnothing \\ V_1 &= \{\varnothing\} \\ V_2 &= \{ \varnothing, \{\varnothing\} \} \\ V_3 &= \Big\{ \varnothing, \{\varnothing\}, \{\{\varnothing\}\}, \big\{\varnothing, \{\varnothing\} \big\}\Big\} \\ &\phantom=\vdots \end{aligned}$
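If you want to play with these stages yourself, here is a minimal Python sketch that builds them as nested `frozenset`s. The helper names `power_set` and `V` are mine, and the sizes grow so quickly that going much beyond ${V_4}$ is hopeless (${V_5}$ already has ${2^{16}}$ elements).

```python
from itertools import combinations

def power_set(x):
    """All subsets of the frozenset x, as a frozenset of frozensets."""
    items = list(x)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """The n-th finite stage: V_0 is empty and V_{n+1} = P(V_n)."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage

sizes = [len(V(n)) for n in range(5)]   # sizes of V_0, ..., V_4
```

The list `sizes` comes out as `[0, 1, 2, 4, 16]`, matching the display above for the first four stages.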

Now let’s drop the formalism for a moment and go on a brief philosophical musing. Suppose we have a universe ${V_\omega}$ (I’ll explain later what ${\omega}$ is) where the only sets are those which appear in some ${V_n}$. You might then see, in fact, that the sets in ${V_\omega}$ actually obey all five axioms above. What we’ve done is provide a model for which the five axioms are consistent.

But this is a pretty boring model right now for the following reason: even though there are infinitely many sets, there are no infinite sets. In a moment I’ll tell you how we can add new axioms to make infinite sets. But first let me tell you how we construct the natural numbers.

## 2. The Axiom of Foundation

We’re about to wade into the territory of the infinite, so first I need an axiom to protect us from really bad stuff happening. What I’m going to do is forbid infinite ${\in}$ chains.

Axiom 6 (Foundation) Loosely, there is no infinite chain of sets

$\displaystyle x_0 \ni x_1 \ni x_2 \ni \dots.$

You can see why this seems reasonable: if I take a random set, I can hop into one of its elements. That’s itself a set, so I can jump into that guy and keep going down. In the finite universe ${V_\omega}$, you can see that eventually I’ll hit ${\varnothing}$, the very bottom of the universe. I want the same to still be true even if my sets are infinite.

This isn’t the actual statement of the axiom. The way to say this in machine code is that for any nonempty set ${x}$, there exists a ${y \in x}$ such that ${z \notin y}$ for any ${z \in x}$. We can’t actually write about something like ${x_0 \ni x_1 \ni \dots}$ in machine code (yet). Nevertheless this suffices for our axioms.

There’s an important consequence of this.

Theorem 1 ${x \notin x}$ for any set ${x}$.

Proof: For otherwise we would have ${x \ni x \ni x \ni \dots}$ which violates Foundation. $\Box$
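The machine-code form of Foundation can be checked by hand on small sets. Here is a sketch that hunts for the promised element ${y}$ in a hereditarily finite set encoded with `frozenset`s; the function name and the sample sets are my own, and of course a finite search like this says nothing about infinite sets.

```python
def foundation_witness(x):
    """Return some y in x with no element of y lying in x, or None."""
    for y in x:
        if not (y & x):          # y and x share no element
            return y
    return None

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})
```

For `three` the only possible witness is `zero`, and for `frozenset({one, two})` it is `one`. The function returns `None` only when its input is empty, a case the axiom does not speak about.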

## 3. The Natural Numbers

Note: in set theory, ${0}$ is considered a natural number.

Now for the fun part. If we want to encode math statements into the language of set theory, the first thing we’d want to do is encode the numbers ${0}$, ${1}$, ${\dots}$ in there so that we can actually do arithmetic. How might we do that?

What we’re going to do is construct a sequence of sets of sizes ${0}$, ${1}$, ${\dots}$ and let these correspond to the natural numbers. What sets should we choose? Well, there’s only one set of size ${0}$, so we begin by writing

$\displaystyle 0 \overset{\text{def}}{=} \varnothing.$

I’ll give away a little more than I should and then write

$\displaystyle 1 \overset{\text{def}}{=} \{\varnothing\}, \quad 2 \overset{\text{def}}{=} \{\varnothing, \{\varnothing\} \}.$

Now let’s think about ${3}$. If we want to construct a three-element set and we already have a two-element set, then we just need to add another element to ${2}$ that’s not already in there. In other words, to construct ${3}$ I just need to pick an ${x}$ such that ${x \notin 2}$, then let ${3 = \{x\} \cup 2}$. (Or in terms of our axioms, ${3 = \cup \left\{ 2, \{x\} \right\}}$.) Now what’s an easy way to pick ${x}$ such that ${x \notin 2}$? Answer: pick ${x=2}$. By the earlier theorem, we always have ${2 \notin 2}$.

Now the cat’s out of the bag! We define

$\displaystyle \begin{aligned} 0 &= \varnothing \\ 1 &= \left\{ 0 \right\} \\ 2 &= \left\{ 0, 1 \right\} \\ 3 &= \left\{ 0, 1, 2 \right\} \\ 4 &= \left\{ 0, 1, 2, 3 \right\} \\ &\phantom= \vdots \end{aligned}$

And there you have it: the nonnegative integers. You can have some fun with this definition and write things like

$\displaystyle \left\{ x \in 8 \mid x \text{ is even} \right\} = \left\{ 0, 2, 4, 6 \right\}$

now. Deep down, everything’s a set.
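The construction translates almost word for word into Python using `frozenset`s. This is only a sketch with names of my own choosing (`successor`, `natural`); it reads off “${x}$ is even” from the size of ${x}$, which works because the number ${k}$ has exactly ${k}$ elements.

```python
def successor(n):
    """The next natural number: n together with n itself as a new element."""
    return n | frozenset({n})

def natural(k):
    """The set-theoretic natural number k, built from the empty set."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n

eight = natural(8)
# Every element of 8 is itself a natural number, and its value is its size.
evens_in_eight = {x for x in eight if len(x) % 2 == 0}
```

The set `evens_in_eight` consists of exactly `natural(0)`, `natural(2)`, `natural(4)` and `natural(6)`.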

## 4. Finite Ordinals

We’re now able to do some counting, because we’ve defined the sequence of sets

$\displaystyle 0, 1, 2, \dots$

by ${0 = \varnothing}$ and ${n+1 = \{0,\dots,n\}}$. This sequence is related by

$\displaystyle 0 \in 1 \in 2 \in \dots.$

Some properties of these “numbers” I’ve made are:

• They are well-ordered by ${\in}$ (which corresponds exactly with the ${<}$ which we're familiar with; that's a good motivation for choosing this construction, as the well-ordering property is one of the most important properties of ${\mathbb N}$, and using ${\in}$ for this purpose lets us do this ordering painlessly). That means if I take the elements of ${n}$, then I can sort them in a chain like I've done above: for any distinct ${x}$ and ${y}$, either ${x \in y}$ or ${y \in x}$. For example, the elements of ${4}$ are ${0}$, ${1}$, ${2}$, ${3}$ and ${0 \in 1 \in 2 \in 3}$. It also means that any nonempty subset has a “minimal” element, which would just be the first element of the chain.
• The set ${n}$ is transitive. What this means is that it is a “closed universe” in the sense that if I look at an element ${a}$ of ${n}$, all the elements of ${a}$ are also in ${n}$. For example, if I take the element ${3}$ of ${5 = \{0,1,2,3,4\}}$, all the elements of ${3}$ are in ${5}$. Looking deeply into ${n}$ won’t find me anything I didn’t see at the top level.

In other words, a set ${S}$ is transitive if for every ${T \in S}$, ${T \subseteq S}$.
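Both properties are easy to test on small examples. The sketch below assumes the same `frozenset` model of sets; the function names are hypothetical. It only checks that ${\in}$ linearly orders the elements, which is enough here because a linear order on a finite set is automatically a well-order.

```python
from itertools import combinations


def von_neumann(n):
    """Return the set encoding of the natural number n as a frozenset."""
    current = frozenset()
    for _ in range(n):
        current = current | frozenset({current})
    return current


def is_transitive(s):
    """Every element of s is also a subset of s."""
    return all(t <= s for t in s)


def is_linearly_ordered_by_membership(s):
    """Any two distinct elements x, y of s satisfy x in y or y in x."""
    return all(x in y or y in x for x, y in combinations(s, 2))


five = von_neumann(5)
print(is_transitive(five), is_linearly_ordered_by_membership(five))  # True True

# A non-example: the set {1} contains 1 = {0}, but does not contain 0.
not_transitive = frozenset({von_neumann(1)})
print(is_transitive(not_transitive))  # False
```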

A set which is both transitive and well-ordered by ${\in}$ is called an ordinal, and the numbers ${0,1,2,\dots}$ are precisely the finite ordinals. But now I’d like to delve into infinite numbers. Can I define some form of “infinity”?

5. Infinite Ordinals

To tell you what a set is, I only have to tell you who its elements are. And so I’m going to define the set

$\displaystyle \omega = \left\{ 0, 1, 2, \dots \right\}.$

And now our counting looks like this.

$\displaystyle 0, 1, 2, \dots, \omega.$

We just tacked on an infinity at the end by scooping up all the natural numbers and collecting them in a set. You can do that? Sure you can! All I’ve done is written down the elements of the set, and you can check that ${\omega}$ is indeed an ordinal.

Well, okay, there’s one caveat. We don’t actually know whether the ${\omega}$ I’ve written down is a set. Pairing and Union let us collect any finite collection of sets into a single set, but they don’t let us collect things into an infinite set. In fact, you can’t prove from the axioms we have so far that ${\omega}$ is a set.

For this I’m going to present another two axioms. These are much more technical to describe, so I’ll lie to you about what their statements are. If you’re interested in the exact statements, consult the lecture notes linked at the bottom of this post.

Axiom 7 (Replacement) Loosely, let ${f}$ be a function on a set ${X}$. Then the image of ${f}$ is a set:

$\displaystyle \exists Y : \quad \forall y, \; y \in Y \iff \exists x \in X : f(x) = y.$

Axiom 8 (Infinity) There exists a set ${\omega = \{0,1,2,\dots\}}$.
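To get a feel for what Replacement says, here is a finite analogue in Python. This is only a sketch: on finite sets nothing deep is happening, and the helper name `replacement` is my own. The axiom earns its keep on infinite sets, which Python can't hold.

```python
def replacement(f, X):
    """Return the image {f(x) | x in X} as a frozenset."""
    return frozenset(f(x) for x in X)


X = frozenset({0, 1, 2, 3, 4})
Y = replacement(lambda x: x % 2, X)

print(sorted(Y))  # [0, 1]
```

The image can be smaller than ${X}$, since several inputs may share an output.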

With these two axioms, we can now write down the first infinite ordinal ${\omega}$. So now our list of ordinals is

$\displaystyle 0, 1, 2, \dots, \omega.$

But now in the same way we constructed ${3}$ from ${2}$, we can construct a set

$\displaystyle \omega + 1 \overset{\text{def}}{=} \omega \cup \{\omega\} = \left\{ 0, 1, \dots, \omega \right\}$

and then

$\displaystyle \omega + 2 \overset{\text{def}}{=} (\omega+1) \cup \{\omega+1\} = \left\{ 0, 1, \dots, \omega, \omega+1 \right\}$

and so on — all of these are also transitive and well-ordered by ${\in}$. So now our list of ordinals is

$\displaystyle 0, 1, 2, \dots, \omega, \omega+1, \omega+2, \dots.$

Well, can we go on further? What we’d like is to define an “${\omega+\omega}$ or ${2 \cdot \omega}$”, which would entail capturing all of the above elements into a set. Well, I claim we can do so. Consider a function ${f}$ defined on ${\omega}$ which sends ${n}$ to ${\omega+n}$. So ${f(0) = \omega}$, ${f(3) = \omega+3}$, ${f(2014) = \omega+2014}$. (Strictly, I have to prove that the set-encoding of this function, namely ${\{(0,\omega), (1,\omega+1), \dots\}}$, is actually a set. But let’s put that aside for now.) Then Replacement tells me that I have a set

$\displaystyle \left\{ f(0), f(1), \dots \right\} = \left\{ \omega, \omega+1, \omega+2, \dots \right\}$

From here we can union this set with ${\omega}$ to achieve the set ${\left\{ 0,1,2,\dots,\omega,\omega+1,\omega+2,\dots \right\}}$. And we can keep turning this wheel repeatedly, yielding the ordinal numbers.

$\displaystyle \begin{aligned} 0, & 1, 2, 3, \dots, \omega \\ & \omega+1, \omega+2, \dots, \omega+\omega \\ & 2\omega+1, 2\omega+2, \dots, 3\omega \\ & \vdots \\ & \omega^2 + 1, \omega^2+2, \dots \\ & \vdots \\ & \omega^3, \dots, \omega^4, \dots, \omega^\omega \\ & \vdots \\ & \omega^{\omega^{\omega^{\dots}}} \\ \end{aligned}$

I won’t say much more about these ordinal numbers since the post is already getting pretty long, but I’ll mention that the ordinals might not correspond to a type of counting that you’re used to, in the sense that there is a bijection between the sets ${\omega}$ and ${\omega+1}$. It might seem like different numbers should have different “sizes”. For this you stumble into the cardinal numbers: it turns out that a cardinal number is just defined as an ordinal number which is not in bijection with any smaller ordinal number.

6. A Last Few Axioms

I’ll conclude this exposition on ZFC with a few last axioms. First is the axiom called Comprehension, though it actually can be proven from the Replacement axiom.

Axiom 9 (Comprehension) Let ${\phi}$ be some property, and let ${S}$ be a set. Then the notion

$\displaystyle \left\{ x \in S \mid \phi(x) \right\}$

is a set. More formally,

$\displaystyle \exists X \forall x : \; x \in X \iff \left( (x \in S) \land \phi(x) \right).$

Notice that the comprehension must be restricted: we can only take subsets of existing sets. From this one can deduce that there is no set of all sets; if we had such a set ${V}$, then we could use Comprehension to generate ${\{x \in V : x \notin x\}}$, recovering Russell’s Paradox.

So anyways, this means that we can take list comprehensions.
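Here is a small sketch of restricted comprehension in Python, again assuming sets are modeled as `frozenset`s; `comprehension` is a hypothetical helper name. It also runs the Russell-style argument on one particular ${S}$: the set ${\{x \in S \mid x \notin x\}}$ exists, but it cannot be an element of ${S}$. Keep in mind that a Python `frozenset` can never contain itself, so this is only a restricted illustration.

```python
def comprehension(s, phi):
    """Return {x in s | phi(x)}; the result is always a subset of s."""
    return frozenset(x for x in s if phi(x))


empty = frozenset()
one = frozenset({empty})
S = frozenset({empty, one, frozenset({one})})

# The "Russell set" relative to S
R = comprehension(S, lambda x: x not in x)

print(R <= S)      # True: we only carved out a subset of S
print(R not in S)  # True: but R itself escapes S
```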

Finally, I’ll touch a little on why the Axiom of Choice is actually important. You’ve probably heard the phrasing that “you can pick a sock out of every drawer” or some cute popular math phrasing like that. Here’s the precise statement.

Axiom 10 (Choice) Let ${\mathcal F}$ be a set such that ${\varnothing \notin \mathcal F}$. Then we can define a function ${g}$ on ${\mathcal F}$ such that ${g(y) \in y}$ for every ${y \in \mathcal F}$. The function ${g}$ is called a choice function; given a bunch of sets ${\mathcal F}$, it chooses an element ${g(y)}$ out of every ${y}$. In other words, for any ${\mathcal F}$ with ${\varnothing \notin \mathcal F}$, there exists a set

$\displaystyle \left\{ (y,g(y)) \mid y \in \mathcal F \right\}.$

with ${g(y) \in y}$ for every ${y}$.

In light of the discussion in these posts, what’s significant is not that we can conceive such a function (how hard can it be to take an element out of a nonempty set?) but that the resulting structure is actually a set. The whole point of having the ZF axioms is that we have to be careful about how we are allowed to make new sets out of old ones, so that we don’t run ourselves into paradoxes. The Axiom of Choice reflects that this is a subtle issue.
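To see what shape a choice function has as a set of ordered pairs, here is a toy sketch in Python. A big caveat: for a finite family, or whenever there is an explicit rule like "take the minimum", no Axiom of Choice is needed at all. So this only illustrates the object ${\{(y, g(y))\}}$, not the strength of the axiom. The name `choice_function` and the use of `min` are my assumptions.

```python
def choice_function(family):
    """Return {(y, g(y)) | y in family} where g(y) is the least element of y."""
    if any(len(y) == 0 for y in family):
        raise ValueError("the family must not contain the empty set")
    return frozenset((y, min(y)) for y in family)


family = frozenset({
    frozenset({3, 1, 4}),
    frozenset({1, 5}),
    frozenset({9, 2, 6}),
})
g = choice_function(family)

# Every chosen element really lies in the set it was chosen from.
print(all(picked in y for y, picked in g))  # True
```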

So there you have it, the axioms of ZFC and what they do.

Thanks to Peter Koellner, Harvard, for his class Math 145a, which taught me this material. My notes for this course can be downloaded from my math website.

Thanks to J Wu for pointing out a typo in Replacement and noting that I should emphasize how ${\in}$ leads to the well-ordering for the ordinals.

# Set Theory, Part 1: An Intro to ZFC

Back in high school, I sometimes wondered what all the big deal about ZFC and the Axiom of Choice was, but I never really understood what I read in the corresponding Wikipedia page. In this post, I’ll try to explain what axiomatic set theory is trying to do in a way accessible to those with just a high school background.

1. Motivation

What we’re going to try to do is lay out something like a “machine code” for math: a way of making math completely rigorous, to the point where it can be verified by a machine. This would make sure that the foundation on which we do our high-level theorem proving is sound. As we’ll see in just a moment, this is actually a lot harder to do than it sounds: there are some traps if we try to play too loosely with our definitions.

First of all, since this is a set theory post, the first thing we need to do is define exactly what a set is. Well, we know what a set is, right? It’s just a collection of objects, and two sets are the same if they have the same objects. For example, we have our empty set ${\varnothing}$ that contains no objects. We can have a set like ${\{1, 2, 3\}}$, or maybe the set of natural numbers ${\mathbb N = \{0, 1, 2, \dots \}}$. (For the purposes of set theory, ${0}$ is usually considered a natural number.) Sets can even contain other sets, like ${\left\{ \mathbb Z, \mathbb Q, \mathbb N \right\}}$. Fine and dandy, right?

The trouble is that this definition actually isn’t good enough, and here’s why. If we just say “a set is any collection of objects”, then we can consider a really big set ${S}$, the set of all sets. So far no problem, right? ${S}$ has the oddity that it has to contain itself ${S \in S}$, but oh well, no big deal. In fact, for this section, let’s even give a name to these sets — we’ll call a set extraordinary if it happens to contain itself, and ordinary otherwise. (Note that this isn’t actually a math term, it’s just for my convenience.)

Now here comes the problem, called Russell’s Paradox. If I can look at the set of all sets ${S}$, I can also probably look at the set of all ordinary sets, which I’ll call ${X}$. Here’s the question: is ${X}$ ordinary?

• Well, if ${X}$ is ordinary, then ${X \in X}$. Hence ${X}$ is extraordinary by definition, which is impossible.
• Now suppose ${X}$ is extraordinary. Then ${X}$ is not ordinary, so that implies ${X \notin X}$. But that contradicts the assumption that ${X}$ is extraordinary!

So that’s all a contradiction! Just by considering the set of all ordinary sets, we’ve run into a trap.

Now if you’re not a set theorist, you could probably just brush this off, saying “oh well, I guess you can’t look at certain sets”. But if you’re a set theorist, this worries you, because you realize it means that you can’t just define a set as “a collection of objects”, because then everything would blow up. Something more is necessary.

2. The Language of Set Theory

We need a way to refer to sets other than “collection of objects”. So here’s what we’re going to do. We’ll start by defining a formal language of set theory, a way of writing logical statements. First of all we can throw in our usual logical operators, like ${\forall}$, for all; ${\exists}$, exists; ${=}$, equals; and ${X \implies Y}$, our “if ${X}$ then ${Y}$”. Let’s also add in ${\land}$, which means “and”, and ${\lor}$, which means “or”, as well as ${\neg}$, which means “not”. Now we can write down logical statements.

Since we’re doing set theory, there’s one more operator we’ll add in: the membership relation ${\in}$. And that’s all we’re going to use (for now).

So how do we express something like “the set ${\{1, 2\}}$”? The trick is that we’re not going to actually “construct” any sets, but rather refer to them indirectly, like so:

$\displaystyle \exists S : x \in S \iff \left( (x=1) \lor (x=2) \right).$

“There exists an ${S}$ such that ${x}$ is in ${S}$ if and only if either ${x}$ is ${1}$ or ${x}$ is ${2}$”. We don’t have to refer to sets as objects in and of themselves anymore; we now have a way to “create” our sets, by writing formulas for exactly what they contain. This is something a machine can parse.

Well, what are we going to do with things like ${1}$ and ${2}$, which are not sets? Here’s the answer: we’re going to make everything into a set. Natural numbers will be sets. Ordered pairs will be sets. Functions will be sets. In later posts, I’ll tell you exactly how we manage to do something like encode ${1}$ as a set. For now, all you need to know is that sets don’t just hold objects; they hold other sets.

So now it makes sense to talk about whether something is a set or not: ${\exists x}$ means “${x}$ is a set”, while ${\nexists x}$ means “${x}$ is not a set”. In other words, we’ve rephrased the problem of deciding whether something is a set to whether it exists, which makes it easier to deal with in our formal language. That means that our axiom system had better find some way to let us show a lot of things exist, without letting us prove the following formula:

$\displaystyle \exists X : x \in X \iff x \notin x.$

For if we prove this formula, then we have our “bad” set from Russell’s Paradox that caused us to go down the rabbit hole in the first place.

3. The Axioms of ZF

What the axioms of ZF do is tell a computer exactly what it’s allowed to do. Since the axioms want to be perfectly crystal clear, they’re all written in formal symbols. Humans don’t actually think in these formal symbols, unless they’re set theorists and have also gone insane. But the point of these symbols is that there is absolutely no confusion about what is true and what isn’t.

It’s worth noting that there are several versions of ZF, but they’re all equivalent to one another. Also, we’ll start representing sets with lowercase letters below.

First, we agreed before that two sets are the same if they share the same elements. We’d like to encode that in an axiom: ${x=y}$ if and only if for every ${a}$, ${a \in x \iff a \in y}$. (And now you’re starting to see how convenient it is that everything’s a set!)

This leads us to the first axiom of ZF, called Extensionality.

Axiom 1 (Extensionality) ${\forall x \forall y \forall a : \left( a \in x \iff a \in y \right) \implies x = y}$.

This is machine code for “if ${x}$ and ${y}$ have the property that ${a \in x \iff a \in y}$ for every ${a}$, then ${x = y}$”. Now our computer can recognize that two sets are equal if they share exactly the same elements. Great.

Unfortunately, our computer still doesn’t know that there even exist any sets. We have to tell it that. So what’s the first thing we tell it? There exists the empty set.

Axiom 2 (Empty Set Exists) ${\exists a : \forall x \; \neg (x \in a)}$.

Now you notice that I have ${\neg (x \in a)}$ rather than the ${x \notin a}$ that we’re all used to. That’s going to get tiring very soon, so we’ll make our first “shortcut”: whenever we write ${x \notin a}$, we really mean ${\neg (x \in a)}$. This makes it easier for us humans to read the formula, but we’re happy because in principle, we could just expand this shortcut and give the computer something to read.

Now our computer knows the empty set is a set. But does our computer know that there’s only one? In fact, yes! If we have two sets ${a}$ and ${b}$, and both of them have no elements, then our computer can prove ${a=b}$ by using Extensionality now. So in fact, there’s only one empty set. Now we’ll take another shortcut and give this set a name, “${\varnothing}$”.

The next three axioms provide us with some various ways of building more sets.

Axiom 3 (Pairing) Given two elements ${x}$ and ${y}$, there exists a set ${a}$ containing only those two elements. In machine code,

$\displaystyle \forall x \forall y \exists a \quad \forall z, \; z \in a \iff \left( (z=x) \lor (z=y) \right).$

Note that ${x}$ and ${y}$ do not have to be different! What that means is that our machine can now build a set by taking ${x = y = \varnothing}$. Pairing gives a set ${a}$ such that ${z \in a}$ if and only if ${z = \varnothing}$. In our human mind, what we’re thinking is “we’ve built the set ${a = \{\varnothing\}}$!” No need for the computer to know that, though.

We can then apply Pairing again to build some more sets, like ${\{\{\varnothing\}\}}$, or ${\{\varnothing, \{\varnothing\}\}}$. Can you see how? And now if I use Pairing on those two sets, I get ${\{ \{\varnothing, \{\varnothing\}\}, \{\{\varnothing\}\}\}}$. Yippee!
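The constructions above can be replayed in Python. This is a minimal sketch assuming sets are modeled as `frozenset`s; `pairing` is a name I made up for the operation the axiom grants.

```python
def pairing(x, y):
    """Return the set whose only elements are x and y (they may coincide)."""
    return frozenset({x, y})


empty = frozenset()

a = pairing(empty, empty)  # {∅}
b = pairing(a, a)          # {{∅}}
c = pairing(empty, a)      # {∅, {∅}}
d = pairing(c, b)          # {{∅, {∅}}, {{∅}}}

print(len(a), len(b), len(c), len(d))  # 1 1 2 2
```

Pairing never produces a set with more than two elements, which is why we will need another axiom soon.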

At this point I’m going to cheat and use ${0,1,2,\dots}$ for some examples as natural “objects”, since all those braces are getting hard to read. All you need to know at this point is that these guys are secretly sets too, and you’ll have to wait until next time for me to tell you what their elements are.

So Pairing tells you that if I have my hands on ${1}$, and I have my hands on ${2}$, then I can get my hands on the set ${\{1,2\}}$. Cool. Now suppose I told you: ${a = \{1,2\}}$ and ${b = \{3,4\}}$. Or rather, the computer told you

$\displaystyle \exists a : x \in a \iff \left( (x=1) \lor (x=2) \right)$

and

$\displaystyle \exists b : x \in b \iff \left( (x=3) \lor (x=4) \right).$

Now you want to construct the set ${\{1,2,3,4\}}$. You might try pairing, but that just gives you the set ${\{ \{1,2\}, \{3,4\} \}}$. It turns out we’re actually super clumsy: our axioms so far don’t even let us get our hands on ${1}$ and ${2}$, even though we have a set that contains them!

Maybe this is actually not that surprising. If I write down the formula

$\displaystyle \exists a : x \in a \iff \left( (x=1) \lor (x=2) \lor (x=\text{unicorn}) \right)$

but I know there are no unicorns, then I shouldn’t expect to actually be able to prove that ${\text{unicorn}}$ exists. However, I’d really like to ignore that and get the set ${\{1,2,3,4\}}$. The axiom that lets us do that is called Union.

Axiom 4 (Union) Given a set ${a}$, we can create ${\cup a}$, the union of the elements of ${a}$. For example, if ${a = \{ \{1,2\}, \{3,4\} \}}$, then ${z = \{1,2,3,4\}}$ is a set. Formally,

$\displaystyle \forall a \exists z \quad \forall x \; (x \in z) \iff (\exists y : x \in y \in a).$

By repeatedly using Pairing and Union, we can take any finite collection of elements and gather them into one set.
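Here is how Pairing followed by Union plays out in a quick Python sketch. I'm assuming `frozenset`s as sets and treating the integers as opaque "objects", just like the cheat in the text; the helper names are hypothetical.

```python
def pairing(x, y):
    """Return the set whose only elements are x and y."""
    return frozenset({x, y})


def union(a):
    """Return the set of all x such that x in y for some y in a."""
    return frozenset(x for y in a for x in y)


a = pairing(1, 2)  # {1, 2}
b = pairing(3, 4)  # {3, 4}

# Pairing alone gives {{1, 2}, {3, 4}}; Union flattens it one level.
print(sorted(union(pairing(a, b))))  # [1, 2, 3, 4]

# A three-element set: {1, 2, 3} is the union of {{1, 2}, {3}}
three = union(pairing(pairing(1, 2), pairing(3, 3)))
print(sorted(three))  # [1, 2, 3]
```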

Finally, we’ll also give our computer a way to build a power set. Recall that the power set of a set ${S}$, which I write ${\mathcal P(S)}$, consists of all its subsets. For example, in human world, we had ${\mathcal P(\{1,2\}) = \{\varnothing, \{1\}, \{2\}, \{1,2\}\}}$.

Axiom 5 (Power Set) We can construct ${\mathcal P(x)}$. Formally,

$\displaystyle \forall x \exists a \forall y (y \in a \iff y \subseteq x)$

where ${y \subseteq x}$ is short for ${\forall z (z \in y \implies z \in x)}$.
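For finite sets the power set can be computed directly. The sketch below assumes `frozenset`s as sets and uses `itertools.combinations` from the standard library; `power_set` is my own helper name.

```python
from itertools import combinations


def power_set(x):
    """Return the set of all subsets of the finite set x."""
    elements = list(x)
    return frozenset(
        frozenset(subset)
        for size in range(len(elements) + 1)
        for subset in combinations(elements, size)
    )


p = power_set(frozenset({1, 2}))
print(len(p))  # 4

# Each member y of the power set satisfies y ⊆ x, matching the axiom.
print(all(y <= frozenset({1, 2}) for y in p))  # True
```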

4. Closing

That lays down the first five axioms of ZF.

Though I don’t have the space to show you how to encode the integers here, I’ll give you an idea for some ways we can hack together “encodings” of things as sets. For example, what if we want an ordered pair ${(x,y)}$? I claim we can encode this as

$\displaystyle (x,y) \overset{\text{def}}{=} \left\{ \{x\}, \{x,y\} \right\}.$

Indeed, you can check that ${(x_1, y_1) = (x_2, y_2)}$ if and only if ${x_1 = x_2}$ and ${y_1 = y_2}$, which is what we wanted. You can see that this is indeed a set by some applications of Pairing.
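If you don't feel like checking this by hand, a brute-force test over a few small values is reassuring (though of course not a proof). This sketch assumes `frozenset`s as sets; `ordered_pair` is a hypothetical name for the encoding above.

```python
from itertools import product


def ordered_pair(x, y):
    """Encode (x, y) as the set {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


values = range(3)
ok = all(
    (ordered_pair(a, b) == ordered_pair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(values, repeat=4)
)
print(ok)  # True

# Order matters, unlike for the plain set {1, 2}.
print(ordered_pair(1, 2) != ordered_pair(2, 1))  # True

# When x = y the encoding collapses to {{x}}, which is still fine.
print(len(ordered_pair(1, 1)))  # 1
```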

Extending this, if we have a relation ${\sim}$, we can encode it as

$\displaystyle {\sim} \overset{\text{def}}{=} \left\{ (x,y) \mid x \sim y \right\}.$

A function ${f}$ is a special type of relation such that if ${(x,y_1) \in f}$ and ${(x,y_2) \in f}$, then ${y_1 = y_2}$. In other words, we represent functions by their “graphs” of ordered pairs ${(x, f(x))}$.
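Here is a small sketch of relations and functions as sets of pairs. For readability I'm using Python tuples in place of the set encoding of ordered pairs, which is an assumption of this example; `is_function` is a made-up helper.

```python
def is_function(relation):
    """Check that no input x is paired with two different outputs."""
    outputs = {}
    for x, y in relation:
        if x in outputs and outputs[x] != y:
            return False
        outputs[x] = y
    return True


# The relation < on {0, 1, 2, 3}, and the graph of x -> x^2 on the same set
less_than = frozenset((x, y) for x in range(4) for y in range(4) if x < y)
square = frozenset((x, x * x) for x in range(4))

print(is_function(square))     # True
print(is_function(less_than))  # False, since 0 < 1 and 0 < 2
```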

In the next post, I’ll explain some of the more involved axioms, and I’ll also finally tell you what the elements of ${1}$ and ${2}$ are. Specifically, I’ll be constructing the ordinal numbers.

Thanks to Peter Koellner, Harvard, for his class Math 145a, which taught me this material. My notes for this course can be downloaded from my math website.