
Careful readers of my blog might have heard about plans to have a second edition of Napkin out by the end of February. As it turns out I was overly ambitious, and (seeing that I am spending the next week …
In this post I’ll describe the structure theorem over PID’s which generalizes the following results:
Prototypical example for this section: $R = \mathbb Z$.
Before I can state the main theorem, I need to define a few terms for UFD’s, which behave much like $\mathbb Z$: our intuition from the case $R = \mathbb Z$ basically carries over verbatim. We don’t even need to deal with prime ideals and can factor elements instead.
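Definition 1
If $R$ is a UFD, then $p \in R$ is a prime element if $(p)$ is a prime ideal and $p \neq 0$. For UFD’s this is equivalent to the following property: if $p = xy$ then either $x$ or $y$ is a unit.
So for example in $\mathbb Z$ the set of prime elements is $\{\pm 2, \pm 3, \pm 5, \pm 7, \dots\}$. Now, since $R$ is a UFD, every nonzero element $r$ factors into a product of prime elements
\[ r = u p_1^{e_1} p_2^{e_2} \dots p_m^{e_m} \qquad (u \text{ a unit}). \]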
Definition 2
We say $r$ divides $s$ if $s = r'r$ for some $r' \in R$. This is written $r \mid s$.
Example 3 (Divisibility in $\mathbb Z$)
The number $0$ is divisible by every element of $\mathbb Z$. All other divisibility is as expected.
Ques 4
Show that $r \mid s$ if and only if the exponent of each prime in $r$ is less than or equal to the corresponding exponent in $s$.
Now, the case of interest is the even stronger case when $R$ is a PID:
Proposition 5 (PID’s are Noetherian UFD’s)
If $R$ is a PID, then it is Noetherian and also a UFD.
Proof: The fact that $R$ is Noetherian is obvious. For $R$ to be a UFD we essentially repeat the proof for $\mathbb Z$, using the fact that $(a,b)$ is principal in order to extract $\gcd(a,b)$.
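In this case, we have a Chinese remainder theorem for elements.
Theorem 6 (Chinese remainder theorem for rings)
Let $R$ be a PID, and let $a$ and $b$ be relatively prime elements, meaning $(a,b) = (1)$. Then
\[ R/(ab) \cong R/(a) \times R/(b). \]
Proof: This is the same as the proof of the usual Chinese remainder theorem. First, since $(a,b) = (1)$ we have $ax + by = 1$ for some $x$ and $y$. Then we have a map
\[ R/(a) \times R/(b) \to R/(ab) \quad\text{by}\quad (r,s) \mapsto r \cdot by + s \cdot ax. \]
One can check that this map is well-defined and an isomorphism of rings. (Diligent readers are invited to do so.)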
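For $R = \mathbb Z$ the isomorphism can be checked by brute force. The sketch below is only an illustration: it assumes the map sends $(r, s)$ to $r \cdot by + s \cdot ax$ where $ax + by = 1$ (one standard choice), and the helper names `ext_gcd` and `crt_map` are made up for this example.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g generates the ideal (a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def crt_map(r, s, a, b):
    """Send (r mod a, s mod b) to r*b*y + s*a*x mod ab, where a*x + b*y = 1."""
    g, x, y = ext_gcd(a, b)
    if g not in (1, -1):
        raise ValueError("a and b must be relatively prime")
    x, y = x * g, y * g  # normalize so that a*x + b*y == 1 exactly
    return (r * b * y + s * a * x) % (a * b)


a, b = 4, 9
images = [crt_map(n % a, n % b, a, b) for n in range(a * b)]
print(images == list(range(a * b)))  # True: each residue mod 36 is recovered from its pair
```

Since $r \cdot by \equiv r \pmod a$ and $s \cdot ax \equiv s \pmod b$, the image reduces to the right residue modulo both $a$ and $b$, which is exactly what the brute-force check confirms.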
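Finally, we need to introduce the concept of a Noetherian $R$-module.
Definition 7
An $R$-module $M$ is Noetherian if it satisfies one of the two equivalent conditions:
- Its submodules obey the ascending chain condition: there is no infinite sequence of submodules $M_1 \subsetneq M_2 \subsetneq \cdots$.
- All submodules of $M$ (including $M$ itself) are finitely generated.
This generalizes the notion of a Noetherian ring: a Noetherian ring $R$ is one for which $R$ is Noetherian as an $R$-module.
Ques 8
Check these two conditions are equivalent. (Copy the proof for rings.)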
Our structure theorem takes two forms:
Theorem 9 (Structure theorem, invariant form)
Let $R$ be a PID and let $M$ be any finitely generated $R$-module. Then
\[ M \cong \bigoplus_{i=1}^m R/(s_i) \]
for some $s_i$ satisfying $s_1 \mid s_2 \mid \dots \mid s_m$.
Corollary 10 (Structure theorem, primary form)
Let $R$ be a PID and let $M$ be any finitely generated $R$-module. Then
\[ M \cong R^{\oplus r} \oplus R/(q_1) \oplus R/(q_2) \oplus \dots \oplus R/(q_m) \]
where $q_i = p_i^{e_i}$ for some prime element $p_i$ and integer $e_i \ge 1$.
Proof: Factor each $s_i$ into prime factors (since $R$ is a UFD), then use the Chinese remainder theorem.
Remark 11
In both theorems the decomposition is unique up to permutations of the summands; good to know, but I won’t prove this.
The proof of the structure theorem proceeds in two main steps. First, we reduce the problem to a linear algebra problem involving free $R$-modules $R^{\oplus d}$. Once that’s done, we just have to play with matrices; this is done in the next section.
Suppose $M$ is finitely generated by $d$ elements. Then there is a surjective map of $R$-modules
\[ R^{\oplus d} \twoheadrightarrow M \]
which sends the basis of $R^{\oplus d}$ to the generators of $M$. Let $K$ denote the kernel.
We claim that $K$ is finitely generated as well. To this end we prove that
Lemma 12 (Direct sum of Noetherian modules is Noetherian)
Let $M$ and $N$ be two Noetherian $R$-modules. Then the direct sum $M \oplus N$ is also a Noetherian $R$-module.
Proof: It suffices to show that if $L \subseteq M \oplus N$, then $L$ is finitely generated. It’s unfortunately not true that $L = P \oplus Q$ with $P \subseteq M$ and $Q \subseteq N$ (take $M = N = \mathbb Z$ and $L = \{(n,n) \mid n \in \mathbb Z\}$) so we will have to be more careful.
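Consider the submodules
\[ A = \{ x \in M \mid (x, 0) \in L \} \subseteq M, \qquad B = \{ y \in N \mid \exists x \in M : (x, y) \in L \} \subseteq N. \]
(Note the asymmetry for $A$ and $B$: the proof doesn’t work otherwise.) Then $A$ is finitely generated by $a_1$, \dots, $a_k$, and $B$ is finitely generated by $b_1$, \dots, $b_\ell$. Let $x_i = (a_i, 0)$ and let $y_i = (\ast, b_i)$ be elements of $L$ (where the $\ast$’s are arbitrary things we don’t care about). Then the $x_i$ and $y_i$ together generate $L$.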
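Ques 13
Deduce that for $R$ a PID, $R^{\oplus d}$ is Noetherian.
Hence $K \subseteq R^{\oplus d}$ is finitely generated as claimed. So we can find another surjective map $R^{\oplus f} \twoheadrightarrow K$. Consequently, we have a composition
\[ R^{\oplus f} \twoheadrightarrow K \hookrightarrow R^{\oplus d} \twoheadrightarrow M. \]
Observe that $M$ is the cokernel of the composition $T : R^{\oplus f} \to R^{\oplus d}$, i.e. we have that
\[ M \cong R^{\oplus d} / \operatorname{im}(T). \]
So it suffices to understand the map $T$ well.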
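The idea is now that we have reduced our problem to studying linear maps $T : R^{\oplus m} \to R^{\oplus n}$, which can be thought of as a generic matrix
\[ T = \begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nm} \end{pmatrix} \]
for the standard basis $e_1$, \dots, $e_m$ of $R^{\oplus m}$ and $f_1$, \dots, $f_n$ of $R^{\oplus n}$.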
Of course, as you might expect it ought to be possible to change the given basis of $T$ such that $T$ has a nicer matrix form. We already saw this in Jordan form, where we had a map $T : V \to V$ and changed the basis so that $T$ was “almost diagonal”. This time, we have two sets of bases we can change, so we would hope to get a diagonal basis, or even better.
Before proceeding let’s think about how we might edit the matrix: what operations are permitted? Here are some examples:
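More generally, if $A$ is an invertible $n \times n$ matrix we can replace $T$ with $AT$. This corresponds to replacing
\[ (f_1, \dots, f_n) \mapsto (A(f_1), \dots, A(f_n)) \]
(the “invertible” condition just guarantees the latter is a basis). Of course similarly we can replace $T$ with $TB$ where $B$ is an invertible $m \times m$ matrix; this corresponds to
\[ (e_1, \dots, e_m) \mapsto (B^{-1}(e_1), \dots, B^{-1}(e_m)). \]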
Armed with this knowledge, we can now approach the following result.
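Theorem 14 (Smith normal form)
Let $R$ be a PID. Let $M = R^{\oplus m}$ and $N = R^{\oplus n}$ be free $R$-modules and let $T : M \to N$ be a linear map. Set $k = \min(m,n)$.
Then we can select a pair of new bases for $M$ and $N$ such that $T$ has only diagonal entries $s_1$, $s_2$, \dots, $s_k$ and $s_1 \mid s_2 \mid \dots \mid s_k$.
So if $m > n$, the matrix should take the form
\[ \begin{pmatrix} s_1 & 0 & 0 & 0 & \dots & 0 \\ 0 & s_2 & 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \dots & \vdots \\ 0 & 0 & 0 & s_n & \dots & 0 \end{pmatrix} \]
and similarly when $m \le n$.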
Ques 15
Show that Smith normal form implies the structure theorem.
Remark 16
Note that this is not a generalization of Jordan form.
Example 17 (Example of Smith normal form)
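To give a flavor of the idea of the proof, let’s work through a concrete example with the following matrix with entries from $\mathbb Z$:
\[ \begin{pmatrix} 18 & 38 & 48 \\ 14 & 30 & 38 \end{pmatrix}. \]
The GCD of all the entries is $2$, and so motivated by this, we perform the Euclidean algorithm on the left column: subtract the second row from the first row, then three times the first row from the second:
\[ \begin{pmatrix} 18 & 38 & 48 \\ 14 & 30 & 38 \end{pmatrix} \mapsto \begin{pmatrix} 4 & 8 & 10 \\ 14 & 30 & 38 \end{pmatrix} \mapsto \begin{pmatrix} 4 & 8 & 10 \\ 2 & 6 & 8 \end{pmatrix}. \]
Now that the GCD of $2$ is present, we move it to the upper-left by switching the two rows, and then kill off all the entries in the same row/column; since $2$ was the GCD all along, we isolate $2$ completely:
\[ \begin{pmatrix} 4 & 8 & 10 \\ 2 & 6 & 8 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 6 & 8 \\ 4 & 8 & 10 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 6 & 8 \\ 0 & -4 & -6 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & -4 & -6 \end{pmatrix}. \]
This reduces the problem to a $1 \times 2$ matrix. So we just apply the Euclidean algorithm again there:
\[ \begin{pmatrix} 2 & 0 & 0 \\ 0 & -4 & -6 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & -4 & 2 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \end{pmatrix}. \]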
Now all we have to do is generalize this proof to work with any PID. It’s intuitively clear how to do this: the PID condition more or less lets you perform a Euclidean algorithm.
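Over $\mathbb Z$ the whole procedure is easy to mechanize. The following is a minimal sketch, not a definitive implementation: it assumes integer entries (so division with remainder stands in for the PID argument), the name `smith_normal_form` is made up for this example, and it returns only the diagonal form rather than tracking the change-of-basis matrices.

```python
def smith_normal_form(matrix):
    """Diagonalize an integer matrix by row/column operations.

    Returns a new matrix whose only nonzero entries are nonnegative diagonal
    entries s_1, s_2, ... with s_1 | s_2 | ... (base changes are not tracked).
    """
    A = [list(row) for row in matrix]
    m, n = len(A), len(A[0])
    for t in range(min(m, n)):
        while True:
            # Smallest nonzero entry (in absolute value) of the block A[t:, t:].
            nonzero = [(abs(A[i][j]), i, j)
                       for i in range(t, m) for j in range(t, n) if A[i][j]]
            if not nonzero:
                return A  # the remaining block is zero, so we are done
            _, pi, pj = min(nonzero)
            A[t], A[pi] = A[pi], A[t]              # swap rows t and pi
            for row in A:                          # swap columns t and pj
                row[t], row[pj] = row[pj], row[t]
            p = A[t][t]
            # Euclidean step: reduce column t and row t modulo the pivot.
            for i in range(t + 1, m):
                q = A[i][t] // p
                for j in range(t, n):
                    A[i][j] -= q * A[t][j]
            for j in range(t + 1, n):
                q = A[t][j] // p
                for i in range(t, m):
                    A[i][j] -= q * A[i][t]
            if any(A[i][t] for i in range(t + 1, m)) or \
               any(A[t][j] for j in range(t + 1, n)):
                continue  # a smaller nonzero remainder appeared; pick a new pivot
            # The pivot is isolated. It must also divide every remaining entry.
            bad_rows = [i for i in range(t + 1, m)
                        if any(A[i][j] % p for j in range(t + 1, n))]
            if not bad_rows:
                break
            for j in range(t, n):                  # add an offending row to row t
                A[t][j] += A[bad_rows[0]][j]
        A[t][t] = abs(A[t][t])
    return A


# A 2 x 3 integer matrix chosen for illustration.
print(smith_normal_form([[18, 38, 48], [14, 30, 38]]))  # [[2, 0, 0], [0, 2, 0]]
```

The loop terminates because every pass through `continue`, or through the "offending row" branch, eventually produces a nonzero entry of strictly smaller absolute value than the current pivot. Reading off the diagonal gives the invariant factors of the cokernel; in the example, $\mathbb Z^{\oplus 2} / \operatorname{im} T \cong \mathbb Z/2 \oplus \mathbb Z/2$.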
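Proof: Begin with a generic matrix
\[ T = \begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nm} \end{pmatrix}. \]
We want to show, by a series of operations (gradually changing the given basis), that we can rearrange the matrix into Smith normal form.
Define $\gcd(x,y)$ to be any generator of the principal ideal $(x,y)$.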
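Claim 18 (“Euclidean algorithm”)
If $a$ and $b$ are entries in the same row or column, we can change bases to replace $a$ with $\gcd(a,b)$ and $b$ with something else.
Proof: We do just the case of columns. By hypothesis, $\gcd(a,b) = xa + yb$ for some $x, y \in R$. We must have $(x,y) = (1)$ now (we’re in a UFD). So there are $u$ and $v$ such that $xu + yv = 1$. Then
\[ \begin{pmatrix} x & y \\ -v & u \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \gcd(a,b) \\ \text{something} \end{pmatrix} \]
and the first matrix is invertible (check this!), as desired.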
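Here is what a single such step looks like over $\mathbb Z$. This is a hedged sketch: the function names are hypothetical, and it makes the specific choice $u = a/d$, $v = b/d$ (where $d = xa + yb$ generates the ideal $(a,b)$), for which the “something” in the second slot happens to be $0$.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g generates the ideal (a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def gcd_matrix(a, b):
    """Return (U, d): U is a 2x2 integer matrix of determinant 1 with U (a, b)^T = (d, 0)^T."""
    d, x, y = ext_gcd(a, b)          # d = x*a + y*b (possibly negative, i.e. up to a unit)
    if d == 0:
        raise ValueError("a and b are both zero")
    u, v = a // d, b // d            # exact divisions, so x*u + y*v == 1
    return [[x, y], [-v, u]], d


def apply_2x2(U, col):
    """Multiply a 2x2 matrix by a column vector of length 2."""
    return [U[0][0] * col[0] + U[0][1] * col[1],
            U[1][0] * col[0] + U[1][1] * col[1]]


U, d = gcd_matrix(18, 14)
print(U, d, apply_2x2(U, [18, 14]))  # [[-3, 4], [-7, 9]] 2 [2, 0]
```

The determinant of `U` is $xu + yv = 1$, so `U` is invertible over $\mathbb Z$ and left-multiplying by it is a legitimate change of basis.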
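Let $s_1$ be the GCD of all entries. Now by repeatedly applying this algorithm, we can cause $s_1$ to appear in the upper left hand corner. Then, we use it to kill off all the entries in the first row and the first column, thus arriving at a matrix
\[ \begin{pmatrix} s_1 & 0 & \dots & 0 \\ 0 & a_{22}' & \dots & a_{2m}' \\ \vdots & \vdots & \ddots & \vdots \\ 0 & a_{n2}' & \dots & a_{nm}' \end{pmatrix}. \]
Now we repeat the same procedure with this lower-right $(n-1) \times (m-1)$ matrix, and so on. This gives the Smith normal form.
With the Smith normal form, we have in the original situation that
\[ M \cong R^{\oplus d} / \operatorname{im}(T) \]
and applying the theorem to $T$ completes the proof of the structure theorem.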
Now, we can apply our structure theorem! I’ll just sketch proofs of these and let the reader fill in details.
Corollary 19 (Finite-dimensional vector spaces are all isomorphic)
Suppose a vector space $V$ over a field $k$ has a finite spanning set of vectors. Then for some $n$, $V \cong k^{\oplus n}$.
Proof: In the structure theorem, $k/(s_i) \in \{k, 0\}$.
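Corollary 20 (Frobenius normal form)
Let $T : V \to V$ where $V$ is a finite-dimensional vector space over an arbitrary field $k$ (not necessarily algebraically closed). Then one can write $T$ as a block-diagonal matrix whose blocks are all of the form
\[ \begin{pmatrix} 0 & 0 & 0 & \dots & 0 & \ast \\ 1 & 0 & 0 & \dots & 0 & \ast \\ 0 & 1 & 0 & \dots & 0 & \ast \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & 1 & \ast \end{pmatrix}. \]
Proof: View $V$ as a $k[x]$-module with action $x \cdot v = T(v)$. By the structure theorem, $V \cong \bigoplus_i k[x]/(s_i)$ for some polynomials $s_i$, where $s_1 \mid s_2 \mid s_3 \mid \dots$. Write each block in the form described.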
Corollary 21 (Jordan normal form)
Let $T : V \to V$ where $V$ is a finite-dimensional vector space over a field $k$ which is algebraically closed. Then $T$ can be written in Jordan form.
Proof: We now use the structure theorem in its primary form. Since $k$ is algebraically closed each $p_i$ is a linear factor, so every summand looks like $k[x]/(x-a)^m$ for some $a \in k$ and $m \ge 1$.
This is a draft of Chapter 15 of the Napkin.
Model theory is really meta, so you will have to pay attention here.
Roughly, a “model of $\mathsf{ZFC}$” is a set with a binary relation that satisfies the $\mathsf{ZFC}$ axioms, just as a group is a set with a binary operation that satisfies the group axioms. Unfortunately, unlike with groups, it is very hard for me to give interesting examples of models, for the simple reason that we are literally trying to model the entire universe.
Prototypical example for this section: $(\omega, \in)$ obeys $\mathrm{PowerSet}$, $V_\kappa$ is a model for $\kappa$ inaccessible (later).
Definition 1 A model $\mathscr M$ consists of a set $M$ and a binary relation $E \subseteq M \times M$. (The $E$ relation is the “$\in$” for the model.)
Remark 2 I’m only considering set-sized models where $M$ is a set. Experts may be aware that I can actually play with $M$ being a class, but that would require too much care for now.
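If you have a model, you can ask certain things about it. For example, you can ask “does it satisfy $\mathrm{EmptySet}$?”. Let me give you an example of what I mean, and then make it rigorous.
Example 3 (A Stupid Model) Let’s take $\mathscr M = (M, E) = (\omega, \in)$. This is not a very good model of $\mathsf{ZFC}$, but let’s see if we can make sense of some of the first few axioms.
- $\mathscr M$ satisfies $\mathrm{Extensionality}$, which is the sentence
\[ \forall x \forall y \, \big( (\forall a \, (a \in x \iff a \in y)) \implies x = y \big). \]
This just follows from the fact that $E$ is actually $\in$.
- $\mathscr M$ satisfies $\mathrm{EmptySet}$, which is the sentence
\[ \exists a \, \forall x \, \neg (x \in a). \]
Namely, take $a = \varnothing \in \omega$.
- $\mathscr M$ does not satisfy $\mathrm{Pairing}$, since $\{1, 3\}$ is not in $\omega$, even though $1, 3 \in \omega$.
- Miraculously, $\mathscr M$ satisfies $\mathrm{Union}$, since for any $n \in \omega$, $\cup n$ is $n-1$ (unless $n = 0$). The Union axiom states that
\[ \forall a \, \exists z \, \forall x \, \big( (x \in z) \iff (\exists y : x \in y \in a) \big). \]
An important thing to notice is that the “$\forall a$” ranges only over the sets in the model of the universe, $\mathscr M$.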
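Example 4 (Important: This Stupid Model Satisfies $\mathrm{PowerSet}$) Most incredibly of all: $\mathscr M = (\omega, \in)$ satisfies $\mathrm{PowerSet}$. This is a really important example. You might think this is ridiculous. Look at $2 = \{0, 1\}$. The power set of this is $\{0, 1, 2, \{1\}\}$ which is not in the model, right?
Well, let’s look more closely at $\mathrm{PowerSet}$. It states that:
\[ \forall x \, \exists a \, \forall y \, (y \in a \iff y \subseteq x). \]
What happens if we set $x = 2 = \{0, 1\}$? Well, actually, we claim that $a = 3 = \{0, 1, 2\}$ works. The key point is “for all $y$”: this only ranges over the objects in $\mathscr M$. In $\mathscr M$, the only subsets of $2$ are $0 = \varnothing$, $1 = \{0\}$ and $2 = \{0, 1\}$. The “set” $\{1\}$ in the “real world” (in $V$) is not a set in the model $\mathscr M$.
In particular, you might say that in this strange new world, we have $2^n = n+1$, since $n = \{0, 1, \dots, n-1\}$ really does have only $n+1$ subsets.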
Example 5 (Sentences with Parameters) The sentences we ask of our model are allowed to have “parameters” as well. For example, if
as before then
satisfies the sentence
With this intuitive notion, we can define what it means for a model to satisfy a sentence.
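Definition 6 Note that any sentence $\phi$ can be written in one of the following five forms:
- $x \in y$
- $x = y$
- $\neg \psi$ (“not $\psi$”) for some shorter sentence $\psi$
- $\psi_1 \lor \psi_2$ (“$\psi_1$ or $\psi_2$”) for some shorter sentences $\psi_1$, $\psi_2$
- $\exists x \, \psi$ (“exists $x$”) for some shorter sentence $\psi$.
Ques 7 What happened to $\land$ (and) and $\forall$ (for all)? (Hint: use $\neg$.)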
Often (almost always, actually) we will proceed by so-called “induction on formula complexity”, meaning that we define or prove something by induction using this. Note that we require all formulas to be finite.
Now suppose we have a sentence $\phi$, like $a = b$ or $\exists a \forall x \neg (x \in a)$, plus a model $\mathscr M = (M, E)$. We want to ask whether $\mathscr M$ satisfies $\phi$.
To give meaning to this, we have to designate certain variables as parameters. For example, if I asked you “Does $a = b$?” the first question you would ask is what $a$ and $b$ are. So $a$, $b$ would be parameters: I have to give them values for this sentence to make sense.
On the other hand, if I asked you “Does $\exists a \forall x \neg (x \in a)$?” then you would just say “yes”. In this case, $x$ and $a$ are not parameters. In general, parameters are those variables whose meaning is not given by some $\forall$ or $\exists$.
In what follows, we will let $\phi(x_1, \dots, x_n)$ denote a formula $\phi$, whose parameters are $x_1$, \dots, $x_n$. Note that possibly $n = 0$, for example all $\mathsf{ZFC}$ axioms have no parameters.
Ques 8 Try to guess the definition of satisfaction before reading it below. (It’s not very hard to guess!)
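Definition 9 Let $\mathscr M = (M, E)$ be a model. Let $\phi(x_1, \dots, x_n)$ be a sentence, and let $b_1, \dots, b_n \in M$. We will define a relation
\[ \mathscr M \vDash \phi[b_1, \dots, b_n] \]
and say $\mathscr M$ satisfies the sentence $\phi$ with parameters $b_1, \dots, b_n$.
The relationship is defined by induction on formula complexity as follows:
- If $\phi$ is “$x_1 = x_2$” then $\mathscr M \vDash \phi[b_1, b_2] \iff b_1 = b_2$.
- If $\phi$ is “$x_1 \in x_2$” then $\mathscr M \vDash \phi[b_1, b_2] \iff b_1 \mathrel{E} b_2$. (This is what we mean by “$E$ interprets $\in$”.)
- If $\phi$ is “$\neg \psi$” then $\mathscr M \vDash \phi[b_1, \dots, b_n] \iff \mathscr M \not\vDash \psi[b_1, \dots, b_n]$.
- If $\phi$ is “$\psi_1 \lor \psi_2$” then $\mathscr M \vDash \phi[b_1, \dots, b_n]$ means $\mathscr M \vDash \psi_i[b_1, \dots, b_n]$ for some $i = 1, 2$.
- Most important case: suppose $\phi$ is $\exists x \, \psi(x, x_1, \dots, x_n)$. Then $\mathscr M \vDash \phi[b_1, \dots, b_n]$ if and only if
\[ \exists b \in M \text{ such that } \mathscr M \vDash \psi[b, b_1, \dots, b_n]. \]
Note that $\psi$ has one extra parameter.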
Notice where the information of the model actually gets used. We only ever use $E$ in interpreting $\in$; unsurprising. But we only ever use the set $M$ when we are running over $\exists$ (and hence $\forall$). That’s well worth keeping in mind: the behavior of a model essentially comes from $\exists$ and $\forall$, which search through the entire model $M$.
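For a finite model, this inductive definition of satisfaction can be run directly. The sketch below is an illustration under stated assumptions: formulas are encoded as nested tuples (an encoding invented for this example), the function names are hypothetical, and the model is a small finite truncation $(\{0,1,2,3\}, <)$ chosen only so that the search over $M$ terminates.

```python
def satisfies(M, E, phi, env=None):
    """Decide whether the finite model (M, E) satisfies phi under the assignment env."""
    env = env or {}
    op = phi[0]
    if op == "eq":                      # x = y
        return env[phi[1]] == env[phi[2]]
    if op == "in":                      # x in y, interpreted by the relation E
        return (env[phi[1]], env[phi[2]]) in E
    if op == "not":
        return not satisfies(M, E, phi[1], env)
    if op == "or":
        return satisfies(M, E, phi[1], env) or satisfies(M, E, phi[2], env)
    if op == "exists":                  # the only place where M itself is searched
        _, var, body = phi
        return any(satisfies(M, E, body, {**env, var: b}) for b in M)
    raise ValueError(f"unknown connective: {op!r}")


# Derived connectives, built from the five primitive forms.
def and_(p, q): return ("not", ("or", ("not", p), ("not", q)))
def implies(p, q): return ("or", ("not", p), q)
def iff(p, q): return and_(implies(p, q), implies(q, p))
def forall(var, p): return ("not", ("exists", var, ("not", p)))


M = range(4)                                        # hypothetical finite model
E = {(i, j) for i in M for j in M if i < j}         # "i E j" means i < j

empty_set = ("exists", "a", forall("x", ("not", ("in", "x", "a"))))
pairing = forall("x", forall("y", ("exists", "z", forall("w",
    iff(("in", "w", "z"), ("or", ("eq", "w", "x"), ("eq", "w", "y")))))))

print(satisfies(M, E, empty_set))  # True: 0 has no elements
print(satisfies(M, E, pairing))    # False: no element of M has 1 as its only member
```

Note how the relation `E` is consulted only in the `"in"` case and the set `M` only in the `"exists"` case, mirroring the remark above.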
And finally,
Definition 10 A model of $\mathsf{ZFC}$ is a model $\mathscr M = (M, E)$ satisfying all $\mathsf{ZFC}$ axioms.
We are especially interested in models of the form $(M, \in)$, where $M$ is a transitive set. (We want our universe to be transitive, otherwise we would have elements of sets which are not themselves in the universe, which is very strange.) Such a model is called a transitive model. If $M$ is a transitive set, the model $(M, \in)$ will be abbreviated to just $M$.
Definition 11 An inner model of $\mathsf{ZFC}$ is a transitive model satisfying $\mathsf{ZFC}$.
Prototypical example for this section: is absolute. The axiom
is
,
is
.
A key point to remember is that the behavior of a model is largely determined by $\exists$ and $\forall$. It turns out we can say even more than this.
Consider a formula such as
which checks whether a given set has a nonempty element. Technically, this has an “
” in it. But somehow this
does not really search over the entire model, because it is bounded to search in
. That is, we might informally rewrite this as
which doesn’t fit into the strict form, but points out that we are only looking over . We call such a quantifier a bounded quantifier.
We like sentences with bounded quantifiers because they designate properties which are absolute over transitive models. It doesn’t matter how strange your surrounding model is. As long as
is transitive,
will always hold. Similarly, the sentence
Sentences with this property are called $\Sigma_0$ or $\Pi_0$.
The situation is different with a sentence like
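which in English means “$y$ is the power set of $x$”, or just $y = \mathcal P(x)$. The $\forall$ is not bounded here. This weirdness is what allows things like
\[ \omega \vDash \text{“$\{0,1,2\}$ is the power set of $\{0,1\}$”} \]
and hence
\[ \omega \vDash \mathrm{PowerSet} \]
which was our stupid example earlier. The sentence consists of an unbounded $\forall$ followed by an absolute sentence, so we say it is $\Pi_1$.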
More generally, the Levy hierarchy keeps track of how bounded our quantifiers are. Specifically,
(A formula which is both $\Sigma_n$ and $\Pi_n$ is called $\Delta_n$, but we won’t use this except for $n = 0$.)
Example 12 (Examples of
Sentences)
- The sentences
,
, as discussed above.
- The formula “$x$ is transitive” can be expanded as a $\Delta_0$ sentence.
- The formula “$x$ is an ordinal” can be expanded as a $\Delta_0$ sentence.
Exercise 13 Write out the expansions for “$x$ is transitive” and “$x$ is an ordinal” in a $\Delta_0$ form.
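Example 14 (More Complex Formulas)
- The axiom $\mathrm{EmptySet}$ is $\Sigma_1$; it is $\exists a \, (\mathrm{isEmpty}(a))$, and $\mathrm{isEmpty}(a)$ is $\Delta_0$.
- The formula “$y = \mathcal P(x)$” is $\Pi_1$, as discussed above.
- The formula “$x$ is countable” is $\Sigma_1$. One way to phrase it is “$\exists f$ an injective map $x \hookrightarrow \omega$”, which necessarily has an unbounded “$\exists f$”.
- The axiom $\mathrm{PowerSet}$ is $\Pi_3$:
\[ \forall y \, \exists P \, \forall x \, (x \subseteq y \iff x \in P). \]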
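Let $\mathscr M_1 = (M_1, E_1)$ and $\mathscr M_2 = (M_2, E_2)$ be models.
Definition 15 We say that $\mathscr M_1 \subseteq \mathscr M_2$ if $M_1 \subseteq M_2$ and $E_1$ agrees with $E_2$; we say $\mathscr M_1$ is a substructure of $\mathscr M_2$.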
That’s boring. The good part is:
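Definition 16 We say $\mathscr M_1 \prec \mathscr M_2$, or $\mathscr M_1$ is an elementary substructure of $\mathscr M_2$, if $\mathscr M_1 \subseteq \mathscr M_2$ and for every sentence $\phi(x_1, \dots, x_n)$ and parameters $b_1, \dots, b_n \in M_1$, we have
\[ \mathscr M_1 \vDash \phi[b_1, \dots, b_n] \iff \mathscr M_2 \vDash \phi[b_1, \dots, b_n]. \]
In other words, $\mathscr M_1$ and $\mathscr M_2$ agree on every sentence possible. Note that the $b_i$ have to come from $M_1$; if the $b_i$ came from $M_2$ then asking something of $\mathscr M_1$ wouldn’t make sense.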
Let’s ask now: how would $\mathscr M_1 \prec \mathscr M_2$ fail to be true? If we look at the possible sentences, none of the atomic formulas, nor the “$\lor$” and “$\neg$”, are going to cause issues.
The intuition you should be getting by now is that things go wrong once we hit $\forall$ and $\exists$. They won’t go wrong for bounded quantifiers. But unbounded quantifiers search the entire model, and that’s where things go wrong.
To give a “concrete example”: imagine $\mathscr M_1$ is MIT, and $\mathscr M_2$ is the state of Massachusetts. If $\mathscr M_1$ thinks there exist hackers at MIT, certainly there exist hackers in Massachusetts. Where things go wrong is something like:
This is true for $\mathscr M_2$ because we can take the witness $x = \text{Math 55}$, say. But it’s false for $\mathscr M_1$, because at MIT all courses are numbered $18.701$ or something similar. The issue is that the witnesses for statements in $\mathscr M_2$ do not necessarily propagate down to witnesses for $\mathscr M_1$, even though they do from $\mathscr M_1$ to $\mathscr M_2$.
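The Tarski-Vaught test says this is the only impediment: if every witness in $\mathscr M_2$ can be replaced by one in $\mathscr M_1$ then $\mathscr M_1 \prec \mathscr M_2$.
Lemma 17 (Tarski-Vaught) Let $\mathscr M_1 \subseteq \mathscr M_2$. Then $\mathscr M_1 \prec \mathscr M_2$ if and only if for every sentence $\phi(x, x_1, \dots, x_n)$ and parameters $b_1, \dots, b_n \in M_1$: if there is a witness $\tilde b \in M_2$ to $\mathscr M_2 \vDash \phi[\tilde b, b_1, \dots, b_n]$ then there is a witness $b \in M_1$ to $\mathscr M_2 \vDash \phi[b, b_1, \dots, b_n]$.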
Proof: Easy after the above discussion. To formalize it, use induction on formula complexity.
Extending the above ideas, one can obtain without much difficulty the following. The idea is that almost all the $\mathsf{ZFC}$ axioms are just $\Sigma_1$ claims about certain desired sets, and so verifying an axiom reduces to checking some appropriate “closure” condition: that the witness to the axiom is actually in the model.
For example, the $\mathrm{EmptySet}$ axiom is “$\exists a \, (\mathrm{isEmpty}(a))$”, and so we’re happy as long as $\varnothing \in M$, which is of course true for any nonempty transitive set $M$.
Lemma 18 (Transitive Sets Inheriting
) Let
be a nonempty transitive set. Then
satisfies
,
,
.
if
.
if
.
if
.
if for every
and every function
which is
-definable with parameters, we have
as well.
as long as
.
Here, a set $X \subseteq M$ is $M$-definable with parameters if it can be realized as
\[ X = \{ x \in M \mid \phi[x, b_1, \dots, b_n] \} \]
for some (fixed) choice of parameters $b_1, \dots, b_n \in M$. We allow $n = 0$, in which case we say $X$ is $M$-definable without parameters. Note that $X$ need not itself be in $M$! As a trivial example, $X = M$ is $M$-definable without parameters (just take $\phi[x]$ to always be true), and certainly we do not have $X \in M$.
Exercise 19 Verify (i)-(iv) above.
Remark 20 Converses to the statements of Lemma 18 are true for all claims other than (vii).
Up until now I have been only talking about transitive models, because they were easier to think about. Here’s a second, better reason we might only care about transitive models.
Lemma 21 (Mostowski Collapse) Let
be a model such that
. Then there exists an isomorphism
for a transitive model
.
This is also called the transitive collapse. In fact, both and
are unique.
Proof: The idea behind the proof is very simple. Since $\in$ is well-founded and extensional, we can look at the $\in$-minimal element $x_\varnothing$ of $X$ with respect to $\in$. Clearly, we want to send that to $0 = \varnothing$.
Then we take the next-smallest set under $\in$, and send it to $1 = \{\varnothing\}$. We “keep doing this”; it’s not hard to see this does exactly what we want.
To formalize, define $\pi$ by transfinite recursion:
\[ \pi(x) := \{ \pi(y) \mid y \in X, \; y \in x \}. \]
This $\pi$, by construction, does the trick.
The picture of this is “collapsing” the elements of $X$ down to the bottom of $V$, hence the name.
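For a finite well-founded, extensional relation, the recursion can be carried out literally. The following is a minimal sketch under those assumptions; the function name and the encoding of the relation as a set of pairs `(y, x)` meaning “$y \mathrel{E} x$” are choices made for this illustration.

```python
def mostowski_collapse(X, E):
    """Collapse a finite well-founded, extensional model (X, E).

    E is a set of pairs (y, x) meaning "y E x". Returns a dict pi with
    pi[x] = frozenset of pi[y] over all y in X with y E x.
    """
    members = {x: [y for (y, z) in E if z == x] for x in X}
    pi = {}

    def collapse(x):
        # Well-foundedness guarantees that this recursion bottoms out.
        if x not in pi:
            pi[x] = frozenset(collapse(y) for y in members[x])
        return pi[x]

    for x in X:
        collapse(x)
    return pi


# The non-transitive set {0, 2} (as von Neumann ordinals) has 0 as the only
# element of 2 that is still visible inside the model.
X = ["zero", "two"]
E = {("zero", "two")}
pi = mostowski_collapse(X, E)
print(pi["zero"] == frozenset())                 # True: sent to 0
print(pi["two"] == frozenset({frozenset()}))     # True: sent to 1 = {0}
```

In this toy case the model $\{0, 2\}$ collapses onto the transitive set $\{0, 1\}$: the “gap” left by the missing element $1$ is squeezed out.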
Prototypical example for this section:
At this point you might be asking, well, where’s my model of $\mathsf{ZFC}$?
I unfortunately have to admit now: $\mathsf{ZFC}$ can never prove that there is a model of $\mathsf{ZFC}$ (unless $\mathsf{ZFC}$ is inconsistent, but that would be even worse). This is a consequence of Gödel’s Incompleteness Theorem.
Nonetheless, with some very modest assumptions added, we can actually show that a model does exist. For example, assuming that there exists a strongly inaccessible cardinal $\kappa$ would do the trick: it turns out $V_\kappa$ will be such a model. Intuitively you can see why: $\kappa$ is so big that any set of rank lower than it can’t escape it even if we take their power sets, or any other method that $\mathsf{ZFC}$ lets us do.
More pessimistically, this shows that it’s impossible to prove in $\mathsf{ZFC}$ that such a $\kappa$ exists. Nonetheless, we now proceed under $\mathsf{ZFC}^+$ for convenience, which adds the existence of such a $\kappa$ as a final axiom. So we now have a model $V_\kappa$ to play with. Joy!
Great. Now we do something really crazy.
Theorem 22 (Countable Transitive Model) Assume $\mathsf{ZFC}^+$. Then there exists a transitive model $X$ of $\mathsf{ZFC}$ such that $X$ is a countable set.
Proof: Fasten your seat belts.
Start with the set $X_0 = \varnothing$. Then for every integer $n$, we do the following to get $X_{n+1}$.
We then add in the element to
.
At every step $X_n$ is countable. Reason: there are countably many possible finite sets of parameters in $X_n$, and countably many possible formulas, so in total we only ever add in countably many things at each step. This exhibits an infinite nested sequence of countable sets
\[ X_0 \subseteq X_1 \subseteq X_2 \subseteq \cdots \]
None of these is an elementary substructure of $V_\kappa$, because each $X_n$ relies on witnesses in $X_{n+1}$. So we instead take the union:
\[ X = \bigcup_n X_n. \]
This satisfies the Tarski-Vaught test, and is countable.
There is one minor caveat: $X$ might not be transitive. We don’t care, because we just take its Mostowski collapse.
Please take a moment to admire how insane this is. It hinges irrevocably on the fact that there are countably many sentences we can write down.
Remark 23 This proof relies heavily on the Axiom of Choice when we add in the element
to
. Without Choice, there is no way of making these decisions all at once.
Usually, the right way to formalize the Axiom of Choice usage is, for every formula $\phi(x, x_1, \dots, x_n)$, to pre-commit (at the very beginning) to a function $f_\phi$, such that given any $b_1, \dots, b_n$, the value $f_\phi(b_1, \dots, b_n)$ will be a suitable witness $x$ (if one exists). Personally, I think this is hiding the spirit of the proof, but it does make it clear how exactly Choice is being used.
These $f_\phi$’s have a name: Skolem functions.
The trick we used in the proof works in more general settings:
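Theorem 24 (Downward Löwenheim-Skolem Theorem) Let $\mathscr M = (M, E)$ be a model, and $A \subseteq M$. Then there exists a set $B$ (called the Skolem hull of $A$) with $A \subseteq B \subseteq M$, such that $(B, E) \prec \mathscr M$, and
\[ |B| \le \max\{\omega, |A|\}. \]
In our case, what we did was simply take $A$ to be the empty set.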
Ques 25 Prove this. (Exactly the same proof as before.)
The most common one is “how is this possible?”, with runner-up “what just happened”.
Let me do my best to answer the first question. It seems like there are two things running up against each other:
(This has confused so many people it has a name, Skolem’s paradox.)
The reason this works I actually pointed out earlier: countability is not absolute, it is a $\Sigma_1$ notion.
Recall that a set $x$ is countable if there exists an injective map $x \hookrightarrow \omega$. The first statement just says that in the universe $V$, there is an injective map $F : X \to \omega$. In particular, for any $x \in X$ (hence $x \subseteq X$, since $X$ is transitive), $x$ is countable in $V$. This is the content of the first statement.
But for $X$ to be a model of $\mathsf{ZFC}$, $X$ only has to think statements in $\mathsf{ZFC}$ are true. More to the point, the fact that $\mathsf{ZFC}$ tells us there are uncountable sets means
\[ X \vDash \exists x \text{ uncountable}. \]
In other words,
\[ X \vDash \exists x \, \forall f : \text{if } f : x \to \omega \text{ then } f \text{ isn't injective}. \]
The key point is that the $\forall f$ searches only functions in our tiny model $X$. It is true that in the “real world” $V$, there are injective functions $f : x \to \omega$. But $X$ has no idea they exist! It is a brain in a vat: $X$ is oblivious to any information outside it.
So in fact, every ordinal which appears in $X$ is countable in the real world. It is just not countable in $X$. Since $X \vDash \mathsf{ZFC}$, $X$ is going to think there is some smallest uncountable cardinal, say $\aleph_1^X$. It will be the smallest (infinite) ordinal in $X$ with the property that there is no bijection in the model $X$ between $\aleph_1^X$ and $\omega$. However, we necessarily know that such a bijection is going to exist in the real world $V$.
Put another way, cardinalities in $X$ can look vastly different from those in the real world, because cardinality is measured by bijections, which I guess is inevitable, but leads to chaos.
Here is a picture of a countable transitive model $M$.
Note that $M$ and $V$ must agree on finite sets, since every finite set has a formula that can express it. However, past $V_\omega$ the model and the true universe start to diverge.
The entire model $M$ is countable, so it only occupies a small portion of the universe, below the first uncountable cardinal $\omega_1^V$ (where the superscript means “of the true universe $V$”). The ordinals in $M$ are precisely the ordinals of $V$ which happen to live inside the model, because the sentence “$\alpha$ is an ordinal” is absolute. On the other hand, $M$ has only a portion of these ordinals, since it is only a lowly set, and a countable set at that. To denote the ordinals of $M$, we write $\mathrm{On}^M$, where the superscript means “the ordinals as computed in $M$”. Similarly, $\mathrm{On}^V$ will now denote the “set of true ordinals”.
Nonetheless, the model $M$ has its own version of the first uncountable cardinal $\omega_1^M$. In the true universe, $\omega_1^M$ is countable (below $\omega_1^V$), but the necessary bijection witnessing this might not be inside $M$. That’s why $M$ can think $\omega_1^M$ is uncountable, even if it is a countable ordinal in the original universe.
So our model $M$ is a brain in a vat. It happens to believe all the axioms of $\mathsf{ZFC}$, and so every statement that is true in $M$ could conceivably be true in $V$ as well. But $M$ can’t see the universe around it; it has no idea that what it believes is the uncountable $\omega_1^M$ is really just an ordinary countable ordinal.
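Problem 1 Show that for any transitive model $M$, the set of ordinals in $M$ is itself some ordinal.
Problem 2 Assume $\mathscr M_1 \subseteq \mathscr M_2$ are transitive models. Show that
- If $\phi$ is $\Delta_0$, then $\mathscr M_1 \vDash \phi[b_1, \dots, b_n] \iff \mathscr M_2 \vDash \phi[b_1, \dots, b_n]$.
- If $\phi$ is $\Sigma_1$, then $\mathscr M_1 \vDash \phi[b_1, \dots, b_n] \implies \mathscr M_2 \vDash \phi[b_1, \dots, b_n]$.
- If $\phi$ is $\Pi_1$, then $\mathscr M_2 \vDash \phi[b_1, \dots, b_n] \implies \mathscr M_1 \vDash \phi[b_1, \dots, b_n]$.
Problem 3 (Reflection) Let $\kappa$ be an inaccessible cardinal such that $|V_\alpha| < \kappa$ for all $\alpha < \kappa$. Prove that for any $\delta < \kappa$ there exists $\delta < \alpha < \kappa$ such that $V_\alpha \prec V_\kappa$; in other words, the set of $\alpha$ such that $V_\alpha \prec V_\kappa$ is unbounded in $\kappa$. This means that properties of $V_\kappa$ reflect down to properties of $V_\alpha$.
Problem 4 (Inaccessible Cardinals Produce Models) Let $\kappa$ be an inaccessible cardinal. Prove that $V_\kappa$ is a model of $\mathsf{ZFC}$.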
(Standard post on cardinals, as a prerequisite for a forthcoming model theory post.)
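An ordinal measures a total ordering. However, it does not do a fantastic job at measuring size. For example, there is a bijection between the elements of $\omega$ and $\omega+1$:
\[ \begin{array}{rccccccc} \omega+1 = & \{ & \omega & 0 & 1 & 2 & \dots & \} \\ \omega = & \{ & 0 & 1 & 2 & 3 & \dots & \}. \end{array} \]
In fact, as you likely already know, there is even a bijection between $\omega$ and $\omega^2$:
\[ \begin{array}{l|cccccc} + & 0 & 1 & 2 & 3 & 4 & \dots \\ \hline 0 & 0 & 1 & 3 & 6 & 10 & \dots \\ \omega & 2 & 4 & 7 & 11 & \dots & \\ \omega \cdot 2 & 5 & 8 & 12 & \dots & & \\ \omega \cdot 3 & 9 & 13 & \dots & & & \\ \omega \cdot 4 & 14 & \dots & & & & \end{array} \]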
So ordinals do not do a good job of keeping track of size. For this, we turn to the notion of a cardinal number.
Definition 1 Two sets $A$ and $B$ are equinumerous, written $A \approx B$, if there is a bijection between them.
Definition 2 A cardinal is an ordinal $\kappa$ such that for no $\alpha < \kappa$ do we have $\alpha \approx \kappa$.
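Example 3 (Examples of Cardinals) Every finite number is a cardinal. Moreover, $\omega$ is a cardinal. However, $\omega+1$, $\omega^2$, $\omega^{2015}$ are not, because they are countable.
Example 4 ($\omega^\omega$ is Countable) Even $\omega^\omega$ is not a cardinal, since it is a countable union
\[ \omega^\omega = \bigcup_n \omega^n \]
and each $\omega^n$ is countable.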
Ques 5 Why must an infinite cardinal be a limit ordinal?
Remark 6 There is something fishy about the definition of a cardinal: it relies on an external function $f$. That is, to verify $\kappa$ is a cardinal I can’t just look at $\kappa$ itself; I need to examine the entire universe $V$ to make sure there does not exist a bijection $f : \kappa \to \alpha$ for $\alpha < \kappa$. For now this is no issue, but later in model theory this will lead to some highly counterintuitive behavior.
Now that we have defined a cardinal, we can discuss the size of a set by linking it to a cardinal.
Definition 7 The cardinality of a set $X$ is the least ordinal $\kappa$ such that $X \approx \kappa$. We denote it by $|X|$.
Ques 8 Why must $|X|$ be a cardinal?
Remark 9 One needs the Well-Ordering Theorem (equivalently, Choice) in order to establish that such an ordinal $\kappa$ actually exists.
Since cardinals are ordinals, it makes sense to ask whether $|X| \le |Y|$, and so on. Our usual intuition works well here.
Proposition 10 (Restatement of Cardinality Properties) Let $X$ and $Y$ be sets.
- $X \approx Y$ if and only if $|X| = |Y|$, if and only if there is a bijection between $X$ and $Y$.
- $|X| \le |Y|$ if and only if there is an injective map $X \hookrightarrow Y$.
Ques 11 Prove this.
Prototypical example for this section: $\aleph_0$ is $\omega$, and $\aleph_1$ is the first uncountable ordinal.
First, let us check that cardinals can get arbitrarily large:
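Proposition 12 We have $|X| < |\mathcal P(X)|$ for every set $X$.
Proof: There is an injective map $X \hookrightarrow \mathcal P(X)$ but there is no injective map $\mathcal P(X) \hookrightarrow X$ by Cantor’s diagonal argument.
Thus we can define:
Definition 13 For a cardinal $\kappa$, we define $\kappa^+$ to be the least cardinal above $\kappa$, called the successor cardinal.
This $\kappa^+$ exists and has $\kappa^+ \le |\mathcal P(\kappa)|$.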
Next, we claim that:
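Exercise 14 Show that if $A$ is a set of cardinals, then $\cup A$ is a cardinal.
Thus by transfinite induction we obtain that:
Definition 15 For any $\alpha \in \mathrm{On}$, we define the aleph numbers as
\[ \aleph_0 = \omega, \qquad \aleph_{\alpha+1} = (\aleph_\alpha)^+, \qquad \aleph_\lambda = \bigcup_{\alpha < \lambda} \aleph_\alpha \text{ for limit ordinals } \lambda. \]
Thus we have the following sequence of cardinals:
\[ 0 < 1 < 2 < \dots < \aleph_0 < \aleph_1 < \dots < \aleph_\omega < \aleph_{\omega+1} < \dots \]
By definition, $\aleph_0$ is the cardinality of the natural numbers, $\aleph_1$ is the first uncountable ordinal, \dots.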
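We claim the aleph numbers constitute all the infinite cardinals:
Lemma 16 (Aleph Numbers Constitute All Infinite Cardinals) If $\kappa$ is a cardinal then either $\kappa$ is finite (i.e. $\kappa \in \omega$) or $\kappa = \aleph_\alpha$ for some $\alpha \in \mathrm{On}$.
Proof: Assume $\kappa$ is infinite, and take $\alpha$ minimal with $\aleph_\alpha \ge \kappa$. Suppose for contradiction that we have $\aleph_\alpha > \kappa$. We may assume $\alpha > 0$, since the case $\alpha = 0$ is trivial.
If $\alpha = \bar\alpha + 1$ is a successor, then
\[ \aleph_{\bar\alpha} < \kappa < \aleph_\alpha = (\aleph_{\bar\alpha})^+ \]
which contradicts the definition of the successor cardinal. If $\alpha = \lambda$ is a limit ordinal, then $\aleph_\lambda$ is the supremum $\bigcup_{\gamma < \lambda} \aleph_\gamma$. So there must be some $\gamma < \lambda$ which has $\aleph_\gamma > \kappa$, which contradicts the minimality of $\alpha$.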
Definition 17 An infinite cardinal which is not a successor cardinal is called a limit cardinal. It is exactly those cardinals of the form $\aleph_\lambda$, for $\lambda$ a limit ordinal, plus $\aleph_0$.
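Prototypical example for this section: $\aleph_0 \cdot \aleph_0 = \aleph_0 + \aleph_0 = \aleph_0$.
Recall the way we set up ordinal arithmetic. Note that in particular, $\omega + \omega > \omega$ and $\omega^2 > \omega$. Since cardinals count size, this property is undesirable, and we want to have
\[ \aleph_0 + \aleph_0 = \aleph_0, \qquad \aleph_0 \cdot \aleph_0 = \aleph_0 \]
because $\omega + \omega$ and $\omega \cdot \omega$ are countable. In the case of cardinals, we simply “ignore order”.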
The definition of cardinal arithmetic is as expected:
Definition 18 (Cardinal Arithmetic) Given cardinals $\kappa$ and $\mu$, define
\[ \kappa + \mu := \left| (\{0\} \times \kappa) \cup (\{1\} \times \mu) \right| \]
and
\[ \kappa \cdot \mu := \left| \mu \times \kappa \right|. \]
Ques 19 Check this agrees with what you learned in pre-school for finite cardinals.
This is a slight abuse of notation since we are using the same symbols as for ordinal arithmetic, even though the results are different ($\omega \cdot 2 \neq \omega$ but $\aleph_0 \cdot 2 = \aleph_0$). In general, I’ll make it abundantly clear whether I am talking about cardinal arithmetic or ordinal arithmetic. To help combat this confusion, we use separate symbols for ordinals and cardinals. Specifically, $\omega$ will always refer to $\{0, 1, \dots\}$ viewed as an ordinal; $\aleph_0$ will always refer to the same set viewed as a cardinal. More generally,
Definition 20 Let $\omega_\alpha = \aleph_\alpha$ viewed as an ordinal.
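However, as we’ve seen already we have that $\aleph_0 \cdot \aleph_0 = \aleph_0$. In fact, this holds even more generally:
Theorem 21 (Infinite Cardinals Squared) Let $\kappa$ be an infinite cardinal. Then $\kappa \cdot \kappa = \kappa$.
Proof: Obviously $\kappa \cdot \kappa \ge \kappa$, so we want to show $\kappa \cdot \kappa \le \kappa$.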
The idea is to repeat the same proof that we had for , so we re-iterate it here. We took the “square” of elements of
, and then re-ordered it according to the diagonal:
Let’s copy this idea for a general .
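We proceed by transfinite induction on $\kappa$. The base case is $\kappa = \aleph_0$, done above. For the inductive step, first we put the “diagonal” ordering $<_+$ on $\kappa \times \kappa$ as follows: for $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ in $\kappa \times \kappa$ we declare $(\alpha_1, \beta_1) <_+ (\alpha_2, \beta_2)$ if
- $\max\{\alpha_1, \beta_1\} < \max\{\alpha_2, \beta_2\}$ (they are on different diagonals), or
- $\max\{\alpha_1, \beta_1\} = \max\{\alpha_2, \beta_2\}$ and $(\alpha_1, \beta_1)$ is lexicographically earlier than $(\alpha_2, \beta_2)$.
Then $<_+$ is a well-ordering of $\kappa \times \kappa$, so we know it is in order-preserving bijection with some ordinal $\gamma$. Our goal is to show that $|\gamma| \le \kappa$. To do so, it suffices to prove that for any $\bar\gamma \in \gamma$, we have $|\bar\gamma| < \kappa$.
Suppose $\bar\gamma$ corresponds to the point $(\alpha, \beta) \in \kappa \times \kappa$ under this bijection. If $\alpha$ and $\beta$ are both finite, then certainly $\bar\gamma$ is finite too. Otherwise, let $\bar\kappa = \max\{|\alpha|, |\beta|\} < \kappa$; then every point below $(\alpha, \beta)$ has both coordinates at most $\max\{\alpha, \beta\}$, so the number of points below $\bar\gamma$ is at most
\[ \bar\kappa \cdot \bar\kappa = \bar\kappa \]
by the inductive hypothesis. So $|\bar\gamma| \le \bar\kappa < \kappa$ as desired.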
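The finite shadow of this argument is easy to experiment with. The sketch below is only an illustration: it orders pairs of natural numbers first by their maximum and then lexicographically (one standard choice of tie-break, assumed here), and the names `diagonal_key` and `enumerate_square` are made up for the example.

```python
def diagonal_key(pair):
    """Sort key: compare by max(a, b) first, then lexicographically by (a, b)."""
    a, b = pair
    return (max(a, b), a, b)


def enumerate_square(n):
    """List all pairs in {0, ..., n-1}^2 in the diagonal ordering."""
    return sorted(((a, b) for a in range(n) for b in range(n)), key=diagonal_key)


order = enumerate_square(4)
print(order[:4])  # [(0, 0), (0, 1), (1, 0), (1, 1)]

# Every pair (a, b) has fewer than (max(a, b) + 1)^2 predecessors, which is
# the finite analogue of "the number of points below is small".
print(all(i < (max(p) + 1) ** 2 for i, p in enumerate(order)))  # True
```

The important design feature is that the ordering of a smaller square is an initial segment of the ordering of a larger one, so the position of a point depends only on the point itself and not on the ambient square.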
From this it follows that cardinal addition and multiplication is really boring:
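Theorem 22 (Infinite Cardinal Arithmetic is Trivial) Given nonzero cardinals $\kappa$ and $\mu$, one of which is infinite, we have
\[ \kappa \cdot \mu = \kappa + \mu = \max(\kappa, \mu). \]
Proof: The point is that both of these are at most the square of the maximum, and at least the maximum. Writing out the details for the upper bound:
\[ \kappa + \mu \le \max(\kappa, \mu) \cdot 2 \le \max(\kappa, \mu) \cdot \max(\kappa, \mu) = \max(\kappa, \mu), \qquad \kappa \cdot \mu \le \max(\kappa, \mu) \cdot \max(\kappa, \mu) = \max(\kappa, \mu). \]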
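Prototypical example for this section: $2^\kappa = |\mathcal P(\kappa)|$.
Definition 23 Suppose $\kappa$ and $\lambda$ are cardinals. Then
\[ \kappa^\lambda := \left| \mathscr F(\lambda, \kappa) \right|. \]
Here $\mathscr F(A, B)$ is the set of functions from $A$ to $B$.
As before, we are using the same notation for both cardinal and ordinal arithmetic. Sorry!
In particular, $2^\kappa = |\mathcal P(\kappa)| > \kappa$, and so from now on we can use the notation $2^\kappa$ freely. (Note that this is totally different from ordinal arithmetic; there we had $2^\omega = \bigcup_{n \in \omega} 2^n = \omega$. In cardinal arithmetic $2^{\aleph_0} > \aleph_0$.)
I have unfortunately not told you what $2^{\aleph_0}$ equals. A natural conjecture is that $2^{\aleph_0} = \aleph_1$; this is called the Continuum Hypothesis. It turns out that this is undecidable: it is not possible to prove or disprove this from the $\mathsf{ZFC}$ axioms.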
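Prototypical example for this section: $\aleph_0$, $\aleph_1$, \dots are all regular, but $\aleph_\omega$ has cofinality $\omega$.
Definition 24 Let $\lambda$ be a limit ordinal, and $\alpha$ another ordinal. A map $f : \alpha \to \lambda$ of ordinals is called cofinal if for every $\bar\lambda < \lambda$, there is some $\bar\alpha \in \alpha$ such that $f(\bar\alpha) \ge \bar\lambda$. In other words, the map reaches arbitrarily high into $\lambda$.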
Example 25 (Example of a Cofinal Map)
- The map $\omega \to \omega^\omega$ by $n \mapsto \omega^n$ is cofinal.
- For any limit ordinal $\alpha$, the identity map $\alpha \to \alpha$ is cofinal.
Definition 26 Let $\lambda$ be a limit ordinal. The cofinality of $\lambda$, denoted $\operatorname{cof}(\lambda)$, is the smallest ordinal $\alpha$ such that there is a cofinal map $\alpha \to \lambda$.
Ques 27 Why must $\operatorname{cof}(\lambda)$ be an infinite cardinal?
Usually, we are interested in taking the cofinality of a cardinal $\kappa$.
Pictorially, you can imagine standing at the bottom of the universe and looking up the chain of ordinals to $\kappa$. You have a machine gun and are firing bullets upwards, and you want to get arbitrarily high but less than $\kappa$. The cofinality is then the number of bullets you need to do this.
We now observe that “most” of the time, the cofinality of a cardinal is itself. Such a cardinal is called regular.
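Example 28 ($\aleph_0$ is Regular) $\operatorname{cof}(\aleph_0) = \aleph_0$, because no finite subset of $\omega$ can reach arbitrarily high.
Example 29 ($\aleph_1$ is Regular) $\operatorname{cof}(\aleph_1) = \aleph_1$. Indeed, assume for contradiction that some countable set of ordinals $A = \{\alpha_0, \alpha_1, \dots\} \subseteq \omega_1$ reaches arbitrarily high inside $\omega_1$. Then $\Lambda = \cup A$ is a countable ordinal, because it is a countable union of countable ordinals. In other words $\Lambda \in \omega_1$. But $\Lambda$ is an upper bound for $A$, contradiction.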
On the other hand, there are cardinals which are not regular; since these are the “rare” cases we call them singular.
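Example 30 ($\aleph_\omega$ is Not Regular) Notice that $\aleph_0 < \aleph_1 < \aleph_2 < \dots$ reaches arbitrarily high in $\aleph_\omega$, despite only having $\aleph_0$ terms. It follows that $\operatorname{cof}(\aleph_\omega) = \aleph_0$.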
We now confirm a suspicion you may have:
Theorem 31 (Successor Cardinals Are Regular) If $\kappa = \bar\kappa^+$ is a successor cardinal, then it is regular.
Proof: We copy the proof that $\aleph_1$ was regular.
Assume for contradiction that for some $\mu \le \bar\kappa$, there are $\mu$ sets reaching arbitrarily high in $\kappa$ as a cardinal. Observe that each of these sets must have cardinality at most $\bar\kappa$. We take the union of all $\mu$ sets, which gives an ordinal $\Lambda$ serving as an upper bound.
The number of elements in the union is at most
\[ \#\text{sets} \cdot \#\text{elements} \le \mu \cdot \bar\kappa = \bar\kappa \]
and hence $|\Lambda| \le \bar\kappa < \kappa$.
So, what about limit cardinals? It seems to be that most of them are singular: if $\lambda$ is a limit ordinal, then the sequence $\{\aleph_\alpha\}_{\alpha \in \lambda}$ (of length $\lambda$) is certainly cofinal in $\aleph_\lambda$.
Example 32 (Beth Fixed Point) Consider the monstrous cardinal
This might look frighteningly huge, as
, but its cofinality is
as it is the limit of the sequence
More generally, one can in fact prove that
\[ \operatorname{cof}(\aleph_\lambda) = \operatorname{cof}(\lambda). \]
But it is actually conceivable that $\lambda$ is so large that $|\lambda| = \aleph_\lambda$.
A regular limit cardinal other than $\aleph_0$ has a special name: it is weakly inaccessible. Such cardinals are so large that it is impossible to prove or disprove their existence in $\mathsf{ZFC}$. It is the first of many so-called “large cardinals”.
An infinite cardinal $\kappa$ is a strong limit cardinal if
\[ 2^{\bar\kappa} < \kappa \]
for any cardinal $\bar\kappa < \kappa$. For example, $\aleph_0$ is a strong limit cardinal.
Ques 33 Why must strong limit cardinals actually be limit cardinals? (This is offensively easy.)
A regular strong limit cardinal other than $\aleph_0$ is called strongly inaccessible.
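Problem 1 Compute $|V_\omega|$.
Problem 2 Prove that for any limit ordinal $\alpha$, $\operatorname{cof}(\alpha)$ is a regular cardinal.
Sproblem 3 (Strongly Inaccessible Cardinals) Show that for any strongly inaccessible $\kappa$, we have $|V_\kappa| = \kappa$.
Problem 4 (Konig’s Theorem) Show that
\[ \kappa^{\operatorname{cof}(\kappa)} > \kappa \]
for every infinite cardinal $\kappa$.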
(This post is a draft of a chapter from my Napkin project.)