Positive Definite Quadratic Forms

I’m reading through Primes of the Form {x^2+ny^2}, by David Cox (link; it’s good!). Here are the high-level notes I took on the first chapter, which is about the theory of quadratic forms.

(Meta point re blog: I’m probably going to start posting more and more of these more high-level notes/sketches on this blog on topics that I’ve been just learning. Up til now I’ve been mostly only posting things that I understand well and for which I have a very polished exposition. But the perfect is the enemy of the good here; given that I’m taking these notes for my own sake, I may as well share them to help others.)

1. Overview

Definition 1

For us a quadratic form is a polynomial {Q = Q(x,y) = ax^2 + bxy + cy^2}, where {a}, {b}, {c} are some integers. We say that it is primitive if {\gcd(a,b,c) = 1}.

For example, we have the famous quadratic form

\displaystyle  Q_{\text{Fermat}}(x,y) = x^2+y^2.

As readers are probably aware, we can say a lot about exactly which integers can be represented by {Q_{\text{Fermat}}}: by Fermat’s Christmas theorem, the primes {p \equiv 1 \pmod 4} (and {p=2}) can all be written as the sum of two squares, while the primes {p \equiv 3 \pmod 4} cannot. For convenience, let us say that:

Definition 2

Let {Q} be a quadratic form. We say it represents the integer {m} if there exists {x,y \in \mathbb Z} with {m = Q(x,y)}. Moreover, {Q} properly represents {m} if one can find such {x} and {y} which are also relatively prime.

The basic question is: what can we say about which primes/integers are properly represented by a quadratic form? In fact, we will later restrict our attention to “positive definite” forms (described later).

For example, Fermat’s Christmas theorem now rewrites as:

Theorem 3 (Fermat’s Christmas theorem for primes)

An odd prime {p} is (properly) represented by {Q_{\text{Fermat}}} if and only if {p \equiv 1 \pmod 4}.

The proof of this is classical, see for example my olympiad handout. We also have the formulation for odd integers:

Theorem 4 (Fermat’s Christmas theorem for odd integers)

An odd integer {m} is properly represented by {Q_{\text{Fermat}}} if and only if all prime factors of {m} are {1 \pmod 4}.

Proof: For the “if” direction, we use the fact that {Q_{\text{Fermat}}} is multiplicative in the sense that

\displaystyle  (x^2+y^2)(u^2+v^2) = (xu \pm yv)^2 + (xv \mp yu)^2.

For the “only if” part we use the fact that if a multiple of a prime {p} is properly represented by {Q_{\text{Fermat}}}, then so is {p}. This follows by noticing that if {x^2+y^2 \equiv 0 \pmod p} (and {xy \not\equiv 0 \pmod p}) then {(x/y)^2 \equiv -1 \pmod p}. \Box
Tangential remark: the two ideas in the proof will grow up in the following way.

  • The fact that {Q_{\text{Fermat}}} “multiplies nicely” will grow up to become the so-called composition of quadratic forms.
  • The second fact will not generalize for an arbitrary form {Q}. Instead, we will see that if a multiple of {p} is represented by a form {Q} then some form of the same “discriminant” will represent the prime {p}, but this form need not be the same as {Q} itself.
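As a throwaway sanity check (mine, not from the book), the multiplicativity identity is easy to verify numerically for both choices of sign:

```python
# (x^2 + y^2)(u^2 + v^2) = (xu + yv)^2 + (xv - yu)^2
#                        = (xu - yv)^2 + (xv + yu)^2
for x, y, u, v in [(1, 2, 3, 4), (5, 0, 2, 7), (3, 3, 1, 6)]:
    lhs = (x * x + y * y) * (u * u + v * v)
    assert lhs == (x * u + y * v) ** 2 + (x * v - y * u) ** 2
    assert lhs == (x * u - y * v) ** 2 + (x * v + y * u) ** 2
```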

2. Equivalence of forms, and the discriminant

The first thing we should do is figure out when two forms are essentially the same: for example, {x^2+5y^2} and {5x^2+y^2} should clearly be considered the same. More generally, if we think of {Q} as acting on {\mathbb Z^{\oplus 2}} and {T} is any automorphism of {\mathbb Z^{\oplus 2}}, then {Q \circ T} should be considered the same as {Q}. Specifically,

Definition 5

Two forms {Q_1} and {Q_2} are said to be equivalent if there exists

\displaystyle  T = \begin{pmatrix} p & q \\ r & s \end{pmatrix} \in \text{GL }(2,\mathbb Z)

such that {Q_2(x,y) = Q_1(px+ry, qx+sy)}. We have {\det T = ps-qr = \pm 1} and so we say the equivalence is

  • a proper equivalence if {\det T = +1}, and
  • an improper equivalence if {\det T = -1}.

So we generally will only care about forms up to proper equivalence. (It will be useful to distinguish between proper/improper equivalence later.)

Naturally we seek some invariants under this operation. By far the most important is:

Definition 6

The discriminant of a quadratic form {Q = ax^2 + bxy + cy^2} is defined as

\displaystyle  D = b^2-4ac.

The discriminant is invariant under equivalence (check this). Note also that {D \equiv 0, 1 \pmod 4}, since {D \equiv b^2 \pmod 4}.

Observe that we have

\displaystyle  4a \cdot (ax^2+bxy+cy^2) = (2ax + by)^2 - Dy^2.

So if {D < 0} and {a > 0} (thus {c > 0} too) then {ax^2+bxy+cy^2 > 0} for all {(x,y) \neq (0,0)}. Such quadratic forms are called positive definite, and we will restrict our attention to these forms.

Now that we have this invariant, we may as well classify equivalence classes of quadratic forms for a fixed discriminant. It turns out this can be done explicitly.

Definition 7

A quadratic form {Q = ax^2 + bxy + cy^2} is reduced if

  • it is primitive and positive definite,
  • {|b| \le a \le c}, and
  • {b \ge 0} if either {|b| = a} or {a = c}.

Exercise 8

Check that there are only finitely many reduced forms of a fixed discriminant.

Then the big huge theorem is:

Theorem 9 (Reduced forms give a set of representatives)

Every primitive positive definite form {Q} of discriminant {D} is properly equivalent to a unique reduced form. We call this the reduction of {Q}.

Proof: Omitted due to length, but completely elementary. It is a reduction argument with some number of cases. \Box
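To make Exercise 8 and Theorem 9 concrete, here is a small Python sketch (my own, not from Cox) that enumerates all reduced forms of a discriminant {D < 0} by brute force, using the bound {a \le \sqrt{-D/3}} forced by {|b| \le a \le c}:

```python
from math import gcd, isqrt

def reduced_forms(D):
    """All reduced forms ax^2 + bxy + cy^2 of discriminant D < 0.

    Reduced means: primitive, |b| <= a <= c, and b >= 0 if |b| = a or a = c.
    Since -D = 4ac - b^2 >= 4a^2 - a^2 = 3a^2, we only scan a <= sqrt(-D/3).
    """
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    for a in range(1, isqrt(-D // 3) + 1):
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a):
                continue                  # c would not be an integer
            c = (b * b - D) // (4 * a)
            if c < a:
                continue                  # need a <= c
            if b < 0 and (-b == a or a == c):
                continue                  # normalization: force b >= 0 here
            if gcd(gcd(a, b), c) == 1:    # primitive
                forms.append((a, b, c))
    return forms

def h(D):                                 # the class number
    return len(reduced_forms(D))
```

For instance, this recovers the tables below, e.g. {h(-4) = 1} and {h(-23) = 3}.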

Thus, for any discriminant {D} we can consider the set

\displaystyle  \text{Cl}(D) = \left\{ \text{reduced forms of discriminant } D \right\}

which, by the theorem just stated, is a set of representatives for the proper equivalence classes of primitive positive definite forms of discriminant {D}. By abuse of notation we will also consider {\text{Cl}(D)} as the set of these equivalence classes.

We also define {h(D) = \left\lvert \text{Cl}(D) \right\rvert}; by the exercise, {h(D) < \infty}. This is called the class number.

Moreover, we have {h(D) \ge 1}, because we can take {x^2 - \frac{D}{4} y^2} for {D \equiv 0 \pmod 4} and {x^2 + xy + \frac{1-D}{4} y^2} for {D \equiv 1 \pmod 4}. We call this form the principal form.

3. Tables of quadratic forms

Example 10 (Examples of quadratic forms with {h(D) = 1}, {D \equiv 0 \pmod 4})

The following discriminants have class number {h(D) = 1}, hence having only the principal form:

  • {D = -4}, with form {x^2 + y^2}.
  • {D = -8}, with form {x^2 + 2y^2}.
  • {D = -12}, with form {x^2+3y^2}.
  • {D = -16}, with form {x^2 + 4y^2}.
  • {D = -28}, with form {x^2 + 7y^2}.

This is in fact the complete list when {D \equiv 0 \pmod 4}.

Example 11 (Examples of quadratic forms with {h(D) = 1}, {D \equiv 1 \pmod 4})

The following discriminants have class number {h(D) = 1}, hence having only the principal form:

  • {D = -3}, with form {x^2 + xy + y^2}.
  • {D = -7}, with form {x^2 + xy + 2y^2}.
  • {D = -11}, with form {x^2 + xy + 3y^2}.
  • {D = -19}, with form {x^2 + xy + 5y^2}.
  • {D = -27}, with form {x^2 + xy + 7y^2}.
  • {D = -43}, with form {x^2 + xy + 11y^2}.
  • {D = -67}, with form {x^2 + xy + 17y^2}.
  • {D = -163}, with form {x^2 + xy + 41y^2}.

This is in fact the complete list when {D \equiv 1 \pmod 4}.

Example 12 (More examples of quadratic forms)

Here are tables for small discriminants with {h(D) > 1}. When {D \equiv 0 \pmod 4} we have

  • {D = -20}, with {h(D) = 2} forms {2x^2 + 2xy + 3y^2} and {x^2 + 5y^2}.
  • {D = -24}, with {h(D) = 2} forms {2x^2 + 3y^2} and {x^2 + 6y^2}.
  • {D = -32}, with {h(D) = 2} forms {3x^2 + 2xy + 3y^2} and {x^2 + 8y^2}.
  • {D = -36}, with {h(D) = 2} forms {2x^2 + 2xy + 5y^2} and {x^2 + 9y^2}.
  • {D = -40}, with {h(D) = 2} forms {2x^2 + 5y^2} and {x^2 + 10y^2}.
  • {D = -44}, with {h(D) = 3} forms {3x^2 \pm 2xy + 4y^2} and {x^2 + 11y^2}.

As for {D \equiv 1 \pmod 4} we have

  • {D = -15}, with {h(D) = 2} forms {2x^2 + xy + 2y^2} and {x^2 + xy + 4y^2}.
  • {D = -23}, with {h(D) = 3} forms {2x^2 \pm xy + 3y^2} and {x^2+ xy + 6y^2}.
  • {D = -31}, with {h(D) = 3} forms {2x^2 \pm xy + 4y^2} and {x^2 + xy + 8y^2}.
  • {D = -39}, with {h(D) = 4} forms {3x^2 + 3xy + 4y^2}, {2x^2 \pm xy + 5y^2} and {x^2 + xy + 10y^2}.

Example 13 (Even More Examples of quadratic forms)

Here are some more selected examples:

  • {D = -56} has {h(D) = 4} forms {x^2+14y^2}, {2x^2+7y^2} and {3x^2 \pm 2xy + 5y^2}.
  • {D = -108} has {h(D) = 3} forms {x^2+27y^2} and {4x^2 \pm 2xy + 7y^2}.
  • {D = -256} has {h(D) = 4} forms {x^2+64y^2}, {4x^2+4xy+17y^2} and {5x^2\pm2xy+13y^2}.

4. The Character {\chi_D}

We can now connect this to primes {p} as follows. Earlier we played with {Q_{\text{Fermat}} = x^2+y^2}, and observed that for odd primes {p}, {p \equiv 1 \pmod 4} if and only if some multiple of {p} is properly represented by {Q_{\text{Fermat}}}.

Our generalization is as follows:

Theorem 14 (Primes represented by some quadratic form)

Let {D < 0} be a discriminant, and let {p \nmid D} be an odd prime. Then the following are equivalent:

  • {\left( \frac Dp \right) = 1}, i.e. {D} is a quadratic residue modulo {p}.
  • The prime {p} is (properly) represented by some reduced quadratic form in {\text{Cl}(D)}.

This generalizes our result for {Q_{\text{Fermat}}}, but note that it uses {h(-4) = 1} in an essential way! That is: if {(-1/p) = 1}, we know {p} is represented by some quadratic form of discriminant {D = -4}\dots but only since {h(-4) = 1} do we know that this form reduces to {Q_{\text{Fermat}} = x^2+y^2}.

Proof: First suppose some multiple of {p} is properly represented, i.e. {Q(x,y) \equiv 0 \pmod p} with {\gcd(x,y) = 1}; replacing {Q} by an equivalent form, we may assume WLOG that {p \nmid 4a}. Then {p \nmid y}: otherwise {p \mid ax^2} would force {x \equiv y \equiv 0 \pmod p}, contradicting properness. Then

\displaystyle  0 \equiv 4a \cdot Q(x,y) \equiv (2ax + by)^2 - Dy^2 \pmod p

hence {D \equiv \left( 2axy^{-1} + b \right)^2 \pmod p}.

The converse direction is amusing: since {\left( \frac Dp \right) = 1}, we may choose integers {m}, {k} with {m^2 = D + 4pk}. (Pick {m} with {m^2 \equiv D \pmod p} and {m \equiv D \pmod 2}; then {m^2 \equiv D \pmod{4p}}.) Consider the quadratic form

\displaystyle  Q(x,y) = px^2 + mxy + ky^2.

It is primitive of discriminant {D} and {Q(1,0) = p}. Now {Q} may not be reduced, but that’s fine: just take the reduction of {Q}, which must also properly represent {p}. \Box
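The construction in the converse direction is easy to carry out by machine. A quick sketch (names mine) which, given {(D/p) = 1}, finds {m} with {m^2 \equiv D \pmod{4p}} and outputs the form {px^2 + mxy + ky^2}:

```python
def form_representing(D, p):
    """Return (p, m, k) with m^2 - 4pk = D, so the form represents p = Q(1,0).

    Assumes p is an odd prime with p not dividing D.  When D is a quadratic
    residue mod p, some 0 <= m < 2p satisfies m^2 ≡ D (mod 4p).
    """
    for m in range(2 * p):
        if (m * m - D) % (4 * p) == 0:
            return (p, m, (m * m - D) // (4 * p))
    return None                           # (D/p) = -1: no such form exists

# e.g. D = -20, p = 3: (-20/3) = (1/3) = 1, and we get 3x^2 + 2xy + 2y^2
a, b, c = form_representing(-20, 3)
assert b * b - 4 * a * c == -20 and a == 3
```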

Thus to every discriminant {D < 0} we can attach a character {\chi_D} (essentially the Kronecker symbol), which is a homomorphism

\displaystyle  \chi_D = \left( \tfrac{D}{\bullet} \right) : \left( \mathbb Z / D\mathbb Z \right)^\times \rightarrow \{ \pm 1 \}

with the property that if {p} is a rational prime not dividing {D}, then {\chi_D(p) = \left( \frac{D}{p} \right)}. This is abuse of notation since I should technically write {\chi_D(p \pmod D)}, but there is no harm done: one can check by quadratic reciprocity that if {p \equiv q \pmod D} then {\chi_D(p) = \chi_D(q)}. Thus our previous result becomes:

Theorem 15 ({\ker(\chi_D)} consists of representable primes)

Let {p \nmid D} be prime. Then {p \in \ker(\chi_D)} if and only if some quadratic form in {\text{Cl}(D)} represents {p}.

As a corollary of this, using the fact that {h(-8) = h(-12) = h(-28) = 1} one can prove that

Corollary 16 (Fermat-type results for {h(-4n) = 1})

Let {p > 7} be a prime. Then {p} is

  • of the form {x^2 + 2y^2} if and only if {p \equiv 1, 3 \pmod 8}.
  • of the form {x^2 + 3y^2} if and only if {p \equiv 1 \pmod 3}.
  • of the form {x^2 + 7y^2} if and only if {p \equiv 1, 2, 4 \pmod 7}.

Proof: The congruence conditions are equivalent to {(-4n/p) = 1}, and as before the only point is that the only reduced quadratic form for these {D = -4n} is the principal one. \Box
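Corollary 16 can be verified by brute force. A quick sketch of mine, checking all three bullets for primes {7 < p < 1000}:

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

def represented(p, n):
    """Is p = x^2 + n*y^2 for some integers x, y >= 0?"""
    return any(isqrt(p - n * y * y) ** 2 == p - n * y * y
               for y in range(isqrt(p // n) + 1))

for p in range(11, 1000, 2):              # all primes 7 < p < 1000
    if is_prime(p):
        assert represented(p, 2) == (p % 8 in (1, 3))
        assert represented(p, 3) == (p % 3 == 1)
        assert represented(p, 7) == (p % 7 in (1, 2, 4))
```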

5. Genus theory

What if {h(D) > 1}? Sometimes, we can still figure out which primes go where just by taking mods.

Let {Q \in \text{Cl}(D)}. Then {Q} represents certain residue classes in {(\mathbb Z/D\mathbb Z)^\times}; we call the set of residue classes so represented the genus of the quadratic form {Q}.

Example 17 (Genus theory of {D = -20})

Consider {D = -20}, with

\displaystyle  \ker(\chi_D) = \left\{ 1, 3, 7, 9 \right\} \subseteq (\mathbb Z/D\mathbb Z)^\times.

We consider the two elements of {\text{Cl}(D)}:

  • {x^2 + 5y^2} represents {1, 9 \in (\mathbb Z/20\mathbb Z)^\times}.
  • {2x^2+2xy+3y^2} represents {3, 7 \in (\mathbb Z/20\mathbb Z)^\times}.

Now suppose for example that {p \equiv 9 \pmod{20}}. It must be represented by one of these two quadratic forms, but the latter form is never {9 \pmod{20}} and so it must be the first one. Thus we conclude that

  • {p = x^2+5y^2} if and only if {p \equiv 1, 9 \pmod{20}}.
  • {p = 2x^2 + 2xy + 3y^2} if and only if {p \equiv 3, 7 \pmod{20}}.
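This genus computation is easy to automate. A small sketch (mine) that lists the residues in {(\mathbb Z/20\mathbb Z)^\times} represented by each form:

```python
from math import gcd

def genus(a, b, c, D):
    """Residues in (Z/|D|Z)^x represented by ax^2 + bxy + cy^2."""
    n = abs(D)
    vals = {(a * x * x + b * x * y + c * y * y) % n
            for x in range(n) for y in range(n)}
    return sorted(v for v in vals if gcd(v, n) == 1)

assert genus(1, 0, 5, -20) == [1, 9]      # the principal form
assert genus(2, 2, 3, -20) == [3, 7]      # the other class
```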

The thing that makes this work is that each genus appears exactly once. We are not always so lucky: for example when {D = -108} we have that

Example 18 (Genus theory of {D = -108})

The elements of {\text{Cl}(-108)} (recall {h(-108) = 3}) are:

  • {x^2+27y^2}, which represents exactly the {1 \pmod 3} elements of {(\mathbb Z/D\mathbb Z)^\times}.
  • {4x^2 \pm 2xy + 7y^2}, which also represents exactly the {1 \pmod 3} elements of {(\mathbb Z/D\mathbb Z)^\times}.

So the best we can conclude is that {p = x^2+27y^2} or {p = 4x^2\pm2xy+7y^2} if and only if {p \equiv 1 \pmod 3}. This is because the distinct reduced forms of discriminant {-108} all happen to have the same genus.

We now prove that:

Theorem 19 (Genera are cosets of {\ker(\chi_D)})

Let {D} be a discriminant and consider the character {\chi_D}.

  • The genus of the principal form of discriminant {D} constitutes a subgroup {H} of {\ker(\chi_D)}, which we call the principal genus.
  • Any genus of a quadratic form in {\text{Cl}(D)} is a coset of the principal genus {H} in {\ker(\chi_D)}.

Proof: For the first part, we aim to show {H} is multiplicatively closed. For {D \equiv 0 \pmod 4}, {D = -4n} we use the fact that

\displaystyle  (x^2+ny^2)(u^2+nv^2) = (xu \pm nyv)^2 + n(xv \mp yu)^2.

For {D \equiv 1 \pmod 4}, we instead appeal to another “magic” identity

\displaystyle  4\left( x^2+xy+\frac{1-D}{4}y^2 \right) \equiv (2x+y)^2 \pmod D

and it follows from here that {H} is actually the set of squares in {(\mathbb Z/D\mathbb Z)^\times}, which is obviously a subgroup.

Now we show that other quadratic forms have genus equal to a coset of the principal genus. For {D \equiv 0 \pmod 4}, with {D = -4n} we can write

\displaystyle  a(ax^2+bxy+cy^2) = (ax+b/2 y)^2 + ny^2

and thus the desired coset is shown to be {a^{-1} H}. As for {D \equiv 1 \pmod 4}, we have

\displaystyle  4a \cdot (ax^2+bxy+cy^2) = (2ax + by)^2 - Dy^2 \equiv (2ax+by)^2 \pmod D

so the desired coset is also {a^{-1} H}, since {H} was the set of squares. \Box

Every genus is therefore a coset of {H} in {\ker(\chi_D)}. This suggests the following definition:

Definition 20

We define the quotient group

\displaystyle  \text{Gen}(D) = \ker(\chi_D) / H

which is the set of all genera of discriminant {D}; it is an abelian group under coset multiplication.

Thus there is a natural map

\displaystyle  \Phi_D : \text{Cl}(D) \twoheadrightarrow \text{Gen}(D).

(The map is surjective by Theorem~14.) We also remark that {\text{Gen}(D)} is quite well-behaved:

Proposition 21 (Structure of {\text{Gen}(D)})

The group {\text{Gen}(D)} is isomorphic to {(\mathbb Z/2\mathbb Z)^{\oplus m}} for some integer {m}.

Proof: Observe that {H} contains all the squares in {\ker(\chi_D)}: if {f} is the principal form then {f(t,0) = t^2}. Thus each element of {\text{Gen}(D)} has order at most {2}, which implies the result since {\text{Gen}(D)} is a finite abelian group. \Box

In fact, one can compute the order of {\text{Gen}(D)} exactly, but for this post I will just state the result.

Theorem 22 (Order of {\text{Gen}(D)})

Let {D < 0} be a discriminant, and let {r} be the number of distinct odd primes which divide {D}. Define {\mu} by:

  • {\mu = r} if {D \equiv 1 \pmod 4}.
  • {\mu = r} if {D = -4n} and {n \equiv 3 \pmod 4}.
  • {\mu = r+1} if {D = -4n} and {n \equiv 1,2 \pmod 4}.
  • {\mu = r+1} if {D = -4n} and {n \equiv 4 \pmod 8}.
  • {\mu = r+2} if {D = -4n} and {n \equiv 0 \pmod 8}.

Then {\left\lvert \text{Gen}(D) \right\rvert = 2^{\mu-1}}.
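Theorem 22 can be checked numerically by counting genera directly as in the examples above. The sketch below is mine (helpers restated so it runs standalone):

```python
from math import gcd, isqrt

def reduced_forms(D):
    forms = []
    for a in range(1, isqrt(-D // 3) + 1):
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (b < 0 and (-b == a or a == c)):
                continue
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
    return forms

def genus(a, b, c, D):
    n = abs(D)
    vals = {(a * x * x + b * x * y + c * y * y) % n
            for x in range(n) for y in range(n)}
    return tuple(sorted(v for v in vals if gcd(v, n) == 1))

def num_genera(D):
    return len({genus(a, b, c, D) for (a, b, c) in reduced_forms(D)})

def mu(D):
    """The exponent mu from Theorem 22."""
    r = sum(1 for p in range(3, abs(D) + 1, 2)
            if D % p == 0 and all(p % d for d in range(2, isqrt(p) + 1)))
    if D % 4 == 1:                      # D ≡ 1 (mod 4)
        return r
    n = -D // 4
    if n % 4 == 3:
        return r
    if n % 4 in (1, 2) or n % 8 == 4:
        return r + 1
    return r + 2                        # n ≡ 0 (mod 8)

for D in (-20, -56, -84, -108, -163):
    assert num_genera(D) == 2 ** (mu(D) - 1)
```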

6. Composition

We have already used once the nice identity

\displaystyle  (x^2+ny^2)(u^2+nv^2) = (xu \pm nyv)^2 + n(xv \mp yu)^2.

We are going to try and generalize this for any two quadratic forms in {\text{Cl}(D)}. Specifically,

Proposition 23 (Composition defines a group operation)

Let {f,g \in \text{Cl}(D)}. Then there is a unique {h \in \text{Cl}(D)} and bilinear forms {B_i(x,y,z,w) = a_ixz + b_ixw + c_iyz + d_iyw} for {i=1,2} such that

  • {f(x,y) g(z,w) = h(B_1(x,y,z,w), B_2(x,y,z,w))}.
  • {a_1b_2 - a_2b_1 = +f(1,0)}.
  • {a_1c_2 - a_2c_1 = +g(1,0)}.

In fact, without the latter two constraints we would instead have {a_1b_2 - a_2b_1 = \pm f(1,0)} and {a_1c_2 - a_2c_1 = \pm g(1,0)}, and each choice of signs would yield one of four (possibly different) forms. So requiring both signs to be positive makes this operation well-defined. (This is why we like proper equivalence; it gives us a well-defined group structure, whereas with improper equivalence it would be impossible to put a group structure on the forms above.)

Taking this for granted, we then have that

Theorem 24 (Form class group)

Let {D \equiv 0, 1 \pmod 4}, {D < 0} be a discriminant. Then {\text{Cl}(D)} becomes an abelian group under composition, where

  • The identity of {\text{Cl}(D)} is the principal form, and
  • The inverse of the form {ax^2+bxy+cy^2} is {ax^2-bxy+cy^2}.

This group is called the form class group.

We then have a group homomorphism

\displaystyle  \Phi_D : \text{Cl}(D) \twoheadrightarrow \text{Gen}(D).

Observe that {ax^2 + bxy + cy^2} and {ax^2 - bxy + cy^2} are inverses and that their {\Phi_D} images coincide (being improperly equivalent); this is reflected in the fact that every element of {\text{Gen}(D)} has order at most {2}. As another corollary, every genus contains the same number of classes, namely {h(D) / \left\lvert \text{Gen}(D) \right\rvert}, since the fibers of a surjective group homomorphism all have the size of its kernel.

We now define:

Definition 25

An integer {n \ge 1} is convenient if the following equivalent conditions hold, where {D = -4n}:

  • The principal form {x^2+ny^2} is the only reduced form in the principal genus.
  • {\Phi_D} is injective (hence an isomorphism).
  • {h(D) = 2^{\mu-1}}.

Thus we arrive at the following corollary:

Corollary 26 (Convenient numbers have nice representations)

Let {n \ge 1} be convenient, and let {p} be an odd prime not dividing {n}. Then {p} is of the form {x^2+ny^2} if and only if {p \pmod{4n}} lies in the principal genus.

Hence representability depends only on {p \pmod{4n}}.

OEIS A000926 lists 65 convenient numbers. This sequence is known to be complete except possibly for one more number; moreover, the list is known to be complete assuming the generalized Riemann Hypothesis.
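The first bullet of Definition 25 gives a finite test for convenience. A brute-force sketch of mine (helpers restated so it runs standalone), which recovers the start of OEIS A000926:

```python
from math import gcd, isqrt

def reduced_forms(D):
    forms = []
    for a in range(1, isqrt(-D // 3) + 1):
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (b < 0 and (-b == a or a == c)):
                continue
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
    return forms

def genus(a, b, c, D):
    m = abs(D)
    vals = {(a * x * x + b * x * y + c * y * y) % m
            for x in range(m) for y in range(m)}
    return tuple(sorted(v for v in vals if gcd(v, m) == 1))

def convenient(n):
    """n is convenient iff x^2 + ny^2 is alone in the principal genus."""
    D = -4 * n
    principal = genus(1, 0, n, D)
    return sum(genus(*f, D) == principal for f in reduced_forms(D)) == 1

# 11 and 14 are the first inconvenient numbers:
assert [n for n in range(1, 17) if convenient(n)] == \
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 15, 16]
```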

7. Cubic and quartic reciprocity

To treat the cases where {n} is not convenient, the correct thing to do is develop class field theory. However, we can still make a little bit more progress if we bring higher reciprocity theorems to bear: we’ll handle the cases {n=27} and {n=64}, two examples of numbers which are not convenient.

7.1. Cubic reciprocity

First, we prove that

Theorem 27 (On {p = x^2+27y^2})

A prime {p > 3} is of the form {x^2+27y^2} if and only if {p \equiv 1 \pmod 3} and {2} is a cubic residue modulo {p}.

To do this we use cubic reciprocity, which requires working in the Eisenstein integers {\mathbb Z[\omega]} where {\omega} is a cube root of unity. There are six units in {\mathbb Z[\omega]} (the sixth roots of unity), hence each nonzero number has six associates (differing by a unit), and the ring is in fact a PID.

Now if we let {\pi} be a prime not dividing {3}, and {\alpha} is coprime to {\pi}, then we can define the cubic Legendre symbol by setting

\displaystyle  \left( \frac{\alpha}{\pi} \right)_3 \equiv \alpha^{\frac13(N\pi-1)} \pmod \pi \in \left\{ 1, \omega, \omega^2 \right\}.

Moreover, we can define a primary prime {\pi \nmid 3} to be one such that {\pi \equiv -1 \pmod 3}; given any prime exactly one of the six associates is primary. We then have the following reciprocity theorem:

Theorem 28 (Cubic reciprocity)

If {\pi} and {\theta} are distinct (non-associate) primary primes in {\mathbb Z[\omega]} then

\displaystyle  \left( \frac{\pi}{\theta} \right)_3 = \left( \frac{\theta}{\pi} \right)_3.

We also have the following supplementary laws: if {\pi = (3m-1) + 3n\omega}, then

\displaystyle  \left( \frac{\omega}{\pi} \right)_3 = \omega^{m+n} \qquad\text{and}\qquad \left( \frac{1-\omega}{\pi} \right)_3 = \omega^{2m}.

The first supplementary law is for the unit (analogous to {(-1/p)}), while the second handles the prime divisors of {3 = -\omega^2(1-\omega)^2} (analogous to {(2/p)}).

We can tie this back into {\mathbb Z} as follows. If {p \equiv 1 \pmod 3} is a rational prime then it is represented by {x^2+xy+y^2}, and thus we can put {p = \pi \overline{\pi}} for some prime {\pi}, {N(\pi) = p}. Consequently, we have a natural isomorphism

\displaystyle  \mathbb Z[\omega] / \pi \mathbb Z[\omega] \cong \mathbb Z / p \mathbb Z.

Therefore, we see that a given {a \in (\mathbb Z/p\mathbb Z)^\times} is a cubic residue if and only if {\left( a/\pi \right)_3 = 1}.

In particular, we have the following corollary, which is all we will need:

Corollary 29 (When {2} is a cubic residue)

Let {p \equiv 1 \pmod 3} be a rational prime, {p > 3}. Write {p = \pi \overline{\pi}} with {\pi} primary. Then {2} is a cubic residue modulo {p} if and only if {\pi \equiv 1 \pmod 2}.

Proof: By cubic reciprocity:

\displaystyle  \left( \frac{2}{\pi} \right)_3 = \left( \frac{\pi}{2} \right)_3 \equiv \pi^{\frac13(N(2)-1)} \equiv \pi \pmod 2. \Box


Now we give the proof of Theorem~27. Proof: First assume

\displaystyle  p = x^2+27y^2 = \left( x+3\sqrt{-3}\, y \right)\left( x-3\sqrt{-3}\, y \right).

Let {\pi = x + 3 \sqrt{-3}\, y = (x+3y) + 6y\omega}, flipping the sign of {x} if necessary so that {\pi} is primary; note that {\pi \equiv 1 \pmod 2} since {x+3y} is odd. Now clearly {p \equiv 1 \pmod 3}, so we are done by the corollary.

For the converse, assume {p \equiv 1 \pmod 3}, {p = \pi \overline{\pi}} with {\pi} primary and {\pi \equiv 1 \pmod 2}. If we set {\pi = a + b\omega} for integers {a} and {b}, then the fact that {\pi \equiv 1 \pmod 2} and {\pi \equiv -1 \pmod 3} is enough to imply that {6 \mid b} (check it!). Moreover,

\displaystyle  p = a^2-ab+b^2 = \left( a - \frac{1}{2} b \right)^2 + 27 \left( \frac16b \right)^2

as desired. \Box
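Theorem 27 is easy to sanity-check numerically, since for {p \equiv 1 \pmod 3} the condition "{2} is a cubic residue" is equivalent to {2^{(p-1)/3} \equiv 1 \pmod p}. A small sketch of mine:

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

for p in range(5, 2000):
    if not is_prime(p):
        continue
    # does p = x^2 + 27y^2 have a solution?
    rep = any(isqrt(p - 27 * y * y) ** 2 == p - 27 * y * y
              for y in range(isqrt(p // 27) + 1))
    cubic = p % 3 == 1 and pow(2, (p - 1) // 3, p) == 1
    assert rep == cubic
```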

7.2. Quartic reciprocity

This time we work in {\mathbb Z[i]}, for which there are four units {\pm 1}, {\pm i}. A prime is primary if {\pi \equiv 1 \pmod{2+2i}}; every prime not dividing {2 = -i(1+i)^2} has a unique associate which is primary. Then we can as before define

\displaystyle  \alpha^{\frac14(N\pi-1)} \equiv \left( \frac{\alpha}{\pi} \right)_4 \pmod{\pi} \in \left\{ \pm 1, \pm i \right\}

where {\pi} is primary and {\alpha} is nonzero mod {\pi}. As before, if {p \equiv 1 \pmod 4} and {p = \pi\overline{\pi}}, then {a} is a quartic residue modulo {p} if and only if {\left( a/\pi \right)_4 = 1}, thanks to the isomorphism

\displaystyle  \mathbb Z[i] / \pi \mathbb Z[i] \cong \mathbb Z / p \mathbb Z.

Now we have

Theorem 30 (Quartic reciprocity)

If {\pi} and {\theta} are distinct primary primes in {\mathbb Z[i]} then

\displaystyle  \left( \frac{\theta}{\pi} \right)_4 = \left( \frac{\pi}{\theta} \right)_4 (-1)^{\frac{1}{16}(N\theta-1)(N\pi-1)}.

We also have supplementary laws that state that if {\pi = a+bi} is primary, then

\displaystyle  \left( \frac{i}{\pi} \right)_4 = i^{-\frac{1}{2}(a-1)} \qquad\text{and}\qquad \left( \frac{1+i}{\pi} \right)_4 = i^{\frac14(a-b-b^2-1)}.

Again, the first law handles units, and the second law handles the prime divisors of {2}. The corollary we care about this time in fact uses only the supplementary laws:

Corollary 31 (When {2} is a quartic residue)

Let {p \equiv 1 \pmod 4} be a prime, and put {p = \pi\overline{\pi}} with {\pi = a+bi} primary. Then

\displaystyle  \left( \frac{2}{\pi} \right)_4 = i^{-b/2}

and in particular {2} is a quartic residue modulo {p} if and only if {b \equiv 0 \pmod 8}.

Proof: Note that {2 = i^3(1+i)^2}, so applying the supplementary laws gives

\displaystyle  \left( \frac{2}{\pi} \right)_4 = \left( \frac{i}{\pi} \right)_4^3 \left( \frac{1+i}{\pi} \right)_4^2 = i^{-\frac32(a-1)} \cdot i^{\frac12(a-b-b^2-1)} = i^{-(a-1) - \frac{1}{2} b(b+1)}.

Now we assumed {a+bi} is primary. We claim that

\displaystyle  a - 1 + \frac{1}{2} b^2 \equiv 0 \pmod 4.

Since {(a+bi)-1} is divisible by {2+2i}, its norm {(a-1)^2+b^2} is divisible by {N(2+2i)=8}. Thus

\displaystyle  2(a-1) + b^2 \equiv 2(a-1) + (a-1)^2 \equiv (a-1)(a-3) \equiv 0 \pmod 8

since {a} is odd and {b} is even. Finally,

\displaystyle  \left( \frac{2}{\pi} \right)_4 = i^{-(a-1) - \frac{1}{2} b(b+1)} = i^{-\frac{1}{2} b} \cdot i^{-\left( a-1+\frac{1}{2} b^2 \right)} = i^{-\frac{1}{2} b}. \Box


From here we quickly deduce

Theorem 32 (On {p = x^2+64y^2})

If {p > 2} is prime, then {p = x^2+64y^2} if and only if {p \equiv 1 \pmod 4} and {2} is a quartic residue modulo {p}.
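As in the cubic case, this is easy to check numerically: for {p \equiv 1 \pmod 4}, "{2} is a quartic residue" is equivalent to {2^{(p-1)/4} \equiv 1 \pmod p}. A sketch of mine:

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

for p in range(3, 3000, 2):
    if not is_prime(p):
        continue
    # does p = x^2 + 64y^2 have a solution?
    rep = any(isqrt(p - 64 * y * y) ** 2 == p - 64 * y * y
              for y in range(isqrt(p // 64) + 1))
    quartic = p % 4 == 1 and pow(2, (p - 1) // 4, p) == 1
    assert rep == quartic
```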

Facts about Lie Groups and Algebras

In Spring 2016 I was taking 18.757 Representations of Lie Algebras. Since I knew next to nothing about either Lie groups or algebras, I was forced to quickly learn about their basic facts and properties. These are the notes that I wrote up accordingly. Proofs of most of these facts can be found in standard textbooks, for example Kirillov.

1. Lie groups

Let {K = \mathbb R} or {K = \mathbb C}, depending on taste.

Definition 1

A Lie group is a group {G} which is also a {K}-manifold; the multiplication maps {G \times G \rightarrow G} (by {(g_1, g_2) \mapsto g_1g_2}) and the inversion map {G \rightarrow G} (by {g \mapsto g^{-1}}) are required to be smooth.

A morphism of Lie groups is a map which is both a map of manifolds and a group homomorphism.

Throughout, we will let {e \in G} denote the identity, or {e_G} if we need further emphasis.

Note that in particular, every group {G} can be made into a Lie group by endowing it with the discrete topology. This is silly, so we usually focus on connected groups:

Proposition 2 (Reduction to connected Lie groups)

Let {G} be a Lie group and {G^0} the connected component of {G} which contains {e}. Then {G^0} is a normal subgroup, itself a Lie group, and the quotient {G/G^0} has the discrete topology.

In fact, we can also reduce this to the study of simply connected Lie groups as follows.

Proposition 3 (Reduction to simply connected Lie groups)

If {G} is connected, let {\pi : \widetilde G \rightarrow G} be its universal cover. Then {\widetilde G} is a Lie group, {\pi} is a morphism of Lie groups, and {\ker \pi \cong \pi_1(G)}.

Here are some examples of Lie groups.

Example 4 (Examples of Lie groups)

  • {\mathbb R} under addition is a real one-dimensional Lie group.
  • {\mathbb C} under addition is a complex one-dimensional Lie group (and a two-dimensional real Lie group)!
  • The unit circle {S^1 \subseteq \mathbb C} is a real Lie group under multiplication.
  • {\text{GL }(n, K) \subset K^{\oplus n^2}} is a Lie group of dimension {n^2}. This example becomes important for representation theory: a representation of a Lie group {G} is a morphism of Lie groups {G \rightarrow \text{GL }(n, K)}.
  • {\text{SL }(n, K) \subset \text{GL }(n, K)} is a Lie group of dimension {n^2-1}.

As geometric objects, Lie groups {G} enjoy a huge amount of symmetry. For example, any neighborhood {U} of {e} can be “copied over” to any other point {g \in G} by the natural map {gU}. There is another theorem worth noting, which is that:

Proposition 5

If {G} is a connected Lie group and {U} is a neighborhood of the identity {e \in G}, then {U} generates {G} as a group.

2. Haar measure

Recall the following result and its proof from representation theory:

Claim 6

For any finite group {G}, {\mathbb C[G]} is semisimple; all finite-dimensional representations decompose into irreducibles.

Proof: Take a representation {V} and equip it with an arbitrary inner product {\left< -,-\right>_0}. Then we can average it to obtain a new inner product

\displaystyle \left< v, w \right> = \frac{1}{|G|} \sum_{g \in G} \left< gv, gw \right>_0.

which is {G}-invariant. Thus given a subrepresentation {W \subseteq V} we can just take its orthogonal complement to decompose {V}. \Box
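The averaging trick is concrete enough to run. A tiny sketch of mine with {G = S_3} acting on {\mathbb R^3} by permuting coordinates:

```python
import random
from itertools import permutations

random.seed(0)
G = list(permutations(range(3)))      # S3, acting on R^3 by permuting coords
c = [1.0 + random.random() for _ in range(3)]   # random diagonal weights

def act(g, v):                        # the permutation action
    return [v[g[i]] for i in range(3)]

def inner0(v, w):                     # arbitrary starting inner product
    return sum(ci * vi * wi for ci, vi, wi in zip(c, v, w))

def inner(v, w):                      # averaged over the group
    return sum(inner0(act(g, v), act(g, w)) for g in G) / len(G)

v, w = [1.0, 2.0, 3.0], [0.0, 1.0, -1.0]
for g in G:
    # G-invariance: <g.v, g.w> = <v, w> for every g
    assert abs(inner(act(g, v), act(g, w)) - inner(v, w)) < 1e-9
```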
We would like to repeat this type of proof with Lie groups. In this case the notion {\sum_{g \in G}} doesn’t make sense, so we want to replace it with an integral {\int_{g \in G}} instead. In order to do this we use the following:

Theorem 7 (Haar measure)

Let {G} be a Lie group. Then there exists a unique Radon measure {\mu} (up to scaling) on {G} which is left-invariant, meaning

\displaystyle \mu(g \cdot S) = \mu(S)

for any Borel subset {S \subseteq G} and “translate” {g \in G}. This measure is called the (left) Haar measure.

Example 8 (Examples of Haar measures)

  • The Haar measure on {(\mathbb R, +)} is the standard Lebesgue measure which assigns {1} to the closed interval {[0,1]}. Of course for any {S}, {\mu(a+S) = \mu(S)} for {a \in \mathbb R}.
  • The Haar measure on {(\mathbb R \setminus \{0\}, \times)} is given by

    \displaystyle \mu(S) = \int_S \frac{1}{|t|} \; dt.

    In particular, {\mu([a,b]) = \log(b/a)}; since {\mu([ca, cb]) = \log(b/a)} as well, one sees the invariance under multiplication by any {c > 0}.

  • Let {G = \text{GL }(n, \mathbb R)}. Then a Haar measure is given by

    \displaystyle \mu(S) = \int_S |\det(X)|^{-n} \; dX.

  • For the circle group {S^1}, consider {S \subseteq S^1}. We can define

    \displaystyle \mu(S) = \frac{1}{2\pi} \int_S d\varphi

    across complex arguments {\varphi}. The normalization factor of {2\pi} ensures {\mu(S^1) = 1}.

Note that we have:

Corollary 9

If the Lie group {G} is compact, there is a unique Haar measure with {\mu(G) = 1}.

This follows by noting that a Radon measure is finite on compact sets, so {\mu(G) < \infty} and we may rescale. This now lets us deduce that

Corollary 10 (Compact Lie groups are semisimple)

{\mathbb C[G]} is semisimple for any compact Lie group {G}.

Indeed, we can now consider

\displaystyle \left< v,w\right> = \int_G \left< g \cdot v, g \cdot w\right>_0 \; dg

as we described at the beginning.

3. The tangent space at the identity

In light of the previous comment about neighborhoods of {e} generating {G}, we see that to get some information about the entire Lie group it actually suffices to just get “local” information of {G} at the point {e} (this is one formalization of the fact that Lie groups are super symmetric).

To do this one idea is to look at the tangent space. Let {G} be an {n}-dimensional Lie group (over {K}) and consider {\mathfrak g = T_eG} the tangent space to {G} at the identity {e \in G}. Naturally, this is a {K}-vector space of dimension {n}. We call it the Lie algebra associated to {G}.

Example 11 (Lie algebras corresponding to Lie groups)

  • {(\mathbb R, +)} has a real Lie algebra isomorphic to {\mathbb R}.
  • {(\mathbb C, +)} has a complex Lie algebra isomorphic to {\mathbb C}.
  • The unit circle {S^1 \subseteq \mathbb C} has a real Lie algebra isomorphic to {\mathbb R}, which we think of as the “tangent line” at the point {1 \in S^1}.

Example 12 ({\mathfrak{gl}(n, K)})

Let’s consider {\text{GL }(n, K) \subset K^{\oplus n^2}}, an open subset of {K^{\oplus n^2}}. Its tangent space should just be an {n^2}-dimensional {K}-vector space. By identifying the components in the obvious way, we can think of this Lie algebra as just the set of all {n \times n} matrices.

This Lie algebra goes by the notation {\mathfrak{gl}(n, K)}.

Example 13 ({\mathfrak{sl}(n, K)})

Recall {\text{SL }(n, K) \subset \text{GL }(n, K)} is a Lie group of dimension {n^2-1}, hence its Lie algebra should have dimension {n^2-1}. To see what it is, let’s look at the special case {n=2} first: then

\displaystyle \text{SL }(2, K) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mid ad - bc = 1 \right\}.

Viewing this as the level surface {f(a,b,c,d) = ad-bc = 1} in {K^{\oplus 4}}, we compute

\displaystyle \nabla f = \left< d, -c, -b, a \right>

and in particular the tangent space to the identity matrix {\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}} is given by the orthogonal complement of the gradient

\displaystyle \nabla f (1,0,0,1) = \left< 1, 0, 0, 1 \right>.

Hence the tangent plane can be identified with matrices satisfying {a+d=0}. In other words, we see

\displaystyle \mathfrak{sl}(2, K) = \left\{ T \in \mathfrak{gl}(2, K) \mid \text{Tr } T = 0 \right\}.

By repeating this example in greater generality, we discover

\displaystyle \mathfrak{sl}(n, K) = \left\{ T \in \mathfrak{gl}(n, K) \mid \text{Tr } T = 0 \right\}.
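One can sanity-check numerically why the trace-zero condition appears: to first order, {\det(I + tX) = 1 + t \, \text{Tr } X + O(t^2)}, so the constraint {\det = 1} linearizes at the identity to {\text{Tr } X = 0}. A throwaway check (the helper names are ours, not from the text):

```python
import random

def det2(M):
    # determinant of a 2x2 matrix
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

random.seed(0)
X = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
t = 1e-6
# the matrix I + tX
M = [[(1.0 if i == j else 0.0) + t * X[i][j] for j in range(2)] for i in range(2)]
trace = X[0][0] + X[1][1]
# det(I + tX) - (1 + t*Tr X) should be O(t^2); for 2x2 it is exactly t^2 * det X
err = det2(M) - (1 + t * trace)
```

Here {t = 10^{-6}}, so the error is on the order of {10^{-12}}, confirming the first-order expansion.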

4. The exponential map

Right now, {\mathfrak g} is just a vector space. However, by using the group structure we can get a map from {\mathfrak g} back into {G}. The trick is “differential equations”:

Proposition 14 (Differential equations for Lie theorists)

Let {G} be a Lie group over {K} and {\mathfrak g} its Lie algebra. Then for every {x \in \mathfrak g} there is a unique homomorphism

\displaystyle \gamma_x : K \rightarrow G

which is a morphism of Lie groups, such that

\displaystyle \gamma_x'(0) = x \in T_eG = \mathfrak g.

We will write {\gamma_x(t)} to emphasize the argument {t \in K} being thought of as “time”. Thus this proposition should be intuitively clear: the theory of differential equations guarantees that {\gamma_x} is defined and unique in a small neighborhood of {0 \in K}. Then, the group structure allows us to extend {\gamma_x} uniquely to the rest of {K}, giving a trajectory across all of {G}. This is sometimes called a one-parameter subgroup of {G}, but we won’t use this terminology anywhere in what follows.

This lets us define:

Definition 15

Retain the setting of the previous proposition. Then the exponential map is defined by

\displaystyle \exp : \mathfrak g \rightarrow G \qquad\text{by}\qquad x \mapsto \gamma_x(1).

The exponential map gets its name from the fact that for all the examples I discussed before, it is actually just the map {e^\bullet}. Note that below, {e^T = \sum_{k \ge 0} \frac{T^k}{k!}} for a matrix {T}; this is called the matrix exponential.

Example 16 (Exponential Maps of Lie algebras)

  • If {G = \mathbb R}, then {\mathfrak g = \mathbb R} too. We observe {\gamma_x(t) = e^{tx} \in \mathbb R} (where {t \in \mathbb R}) is a morphism of Lie groups {\gamma_x : \mathbb R \rightarrow G}. Hence

    \displaystyle \exp : \mathbb R \rightarrow \underbrace{\mathbb R}_{=G} \qquad \exp(x) = \gamma_x(1) = e^x \in \mathbb R = G.

  • Ditto for {\mathbb C}.
  • For {S^1} and {x \in \mathbb R}, the map {\gamma_x : \mathbb R \rightarrow S^1} given by {t \mapsto e^{itx}} works. Hence

    \displaystyle \exp : \mathbb R \rightarrow S^1 \qquad \exp(x) = \gamma_x(1) = e^{ix} \in S^1.

  • For {\text{GL }(n, K)}, the map {\gamma_X : K \rightarrow \text{GL }(n, K)} given by {t \mapsto e^{tX}} works nicely (now {X} is a matrix). (Note that we have to check {e^{tX}} is actually invertible for this map to be well-defined.) Hence the exponential map is given by

    \displaystyle \exp : \mathfrak{gl}(n,K) \rightarrow \text{GL }(n,K) \qquad \exp(X) = \gamma_X(1) = e^X \in \text{GL }(n, K).

  • Similarly,

    \displaystyle \exp : \mathfrak{sl}(n,K) \rightarrow \text{SL }(n,K) \qquad \exp(X) = \gamma_X(1) = e^X \in \text{SL }(n, K).

    Here we had to check that if {X \in \mathfrak{sl}(n,K)}, meaning {\text{Tr } X = 0}, then {\det(e^X) = 1}. This can be seen by writing {X} in an upper triangular basis.
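These exponential examples are easy to test numerically. Below is a minimal truncated-Taylor-series matrix exponential (our own throwaway sketch, not a library routine), used to check that a traceless {2 \times 2} real matrix exponentiates into {\text{SL }(2, \mathbb R)}, i.e. that {\det(e^X) = e^{\text{Tr } X} = 1}:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(X, terms=30):
    # e^X = sum_{k>=0} X^k / k!, truncated; accurate for modestly sized X
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(1, terms):
        term = mat_mul(term, X)                               # X^k / (k-1)!
        term = [[entry / k for entry in row] for row in term]  # X^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

X = [[0.3, 0.7], [0.5, -0.3]]                 # Tr X = 0, so X is in sl(2, R)
E = expm(X)
det_E = E[0][0] * E[1][1] - E[0][1] * E[1][0]  # should be e^{Tr X} = 1
```

The truncation at 30 terms is overkill for a matrix this small; the computed determinant agrees with {1} to floating-point precision.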

Actually, taking the tangent space at the identity is a functor. Consider a map {\varphi : G_1 \rightarrow G_2} of Lie groups, with Lie algebras {\mathfrak g_1} and {\mathfrak g_2}. Because {\varphi} is a group homomorphism, it sends {e_1 \in G_1} to {e_2 \in G_2}. Now, by manifold theory we know that a map {f : M \rightarrow N} between manifolds gives a linear map between the corresponding tangent spaces, say {Tf : T_pM \rightarrow T_{fp}N}. For us we obtain a linear map

\displaystyle \varphi_\ast = T \varphi : \mathfrak g_1 \rightarrow \mathfrak g_2.

In fact, this {\varphi_\ast} fits into a commutative diagram with the exponential maps:

\displaystyle \varphi \circ \exp = \exp \circ \varphi_\ast.

Here are a few more properties of {\exp}:

  • {\exp(0) = e \in G}, which is immediate by looking at the constant trajectory {\gamma_0(t) \equiv e}.
  • The total derivative {D\exp : \mathfrak g \rightarrow \mathfrak g} at the point {0 \in \mathfrak g} is the identity map, since {\gamma_x'(0) = x}. This is again by construction.
  • In particular, by the inverse function theorem this implies that {\exp} is a diffeomorphism in a neighborhood of {0 \in \mathfrak g}, onto a neighborhood of {e \in G}.
  • {\exp} commutes with morphisms of Lie groups: {\varphi \circ \exp = \exp \circ \varphi_\ast}. (By the above diagram.)

5. The commutator

Right now {\mathfrak g} is still just a vector space, the tangent space. But now that there is map {\exp : \mathfrak g \rightarrow G}, we can use it to put a new operation on {\mathfrak g}, the so-called commutator.

The idea is as follows: we want to “multiply” two elements of {\mathfrak g}. But {\mathfrak g} is just a vector space, so we can’t do that directly. However, {G} itself has a group multiplication, so we should pass to {G} using {\exp}, use the multiplication in {G}, and then come back.

Here are the details. As we just mentioned, {\exp} is a diffeomorphism near {e \in G}. So for {x}, {y} close to the origin of {\mathfrak g}, we can look at {\exp(x)} and {\exp(y)}, which are two elements of {G} close to {e}. Multiplying them gives an element still close to {e}, so it’s equal to {\exp(z)} for some unique {z}, call it {\mu(x,y)}.

One can show in fact that {\mu} can be written as a Taylor series in two variables as

\displaystyle \mu(x,y) = x + y + \frac{1}{2} [x,y] + \text{third order terms} + \dots

where {[x,y]} is a skew-symmetric bilinear map, meaning {[x,y] = -[y,x]}. It will be more convenient to work with {[x,y]} than {\mu(x,y)} itself, so we give it a name:

Definition 17

This {[x,y]} is called the commutator of {G}.

Now we know multiplication in {G} is associative, so this should give us some nontrivial relation on the bracket {[,]}. Specifically, since

\displaystyle \exp(x) \left( \exp(y) \exp(z) \right) = \left( \exp(x) \exp(y) \right) \exp(z),

we should have that {\mu(x, \mu(y,z)) = \mu(\mu(x,y), z)}, and this should tell us something. In fact, the claim is:

Theorem 18

The bracket {[,]} satisfies the Jacobi identity

\displaystyle [x,[y,z]] + [y,[z,x]] + [z,[x,y]] = 0.

Proof: Although I won’t prove it, the third-order terms (and all the rest) in our definition of {[x,y]} can be written out explicitly as well: for example, we actually have

\displaystyle \mu(x,y) = x + y + \frac{1}{2} [x,y] + \frac{1}{12} \left( [x, [x,y]] + [y,[y,x]] \right) + \text{fourth order terms} + \dots.

The general formula is called the Baker-Campbell-Hausdorff formula.

Then we can force ourselves to expand this using the first three terms of the BCH formula and then equate the degree three terms. The left-hand side expands initially as {\mu\left( x, y + z + \frac{1}{2} [y,z] + \frac{1}{12} \left( [y,[y,z]] + [z,[z,y]] \right) \right)}, and the next step would be something ugly.

This computation is horrifying and painful, so I’ll pretend I did it and tell you the end result is as claimed. \Box

There is a more natural way to see why this identity is the “right one”; see Qiaochu. However, with this proof I want to make the point that the Jacobi identity is not our decision: it is forced upon us by associativity in {G}.

Example 19 (Examples of commutators attached to Lie groups)

  • If {G} is an abelian group, then {[x,y] = -[y,x]} by skew-symmetry, while {[x,y] = [y,x]} since {\mu(x,y) = \mu(y,x)}. Thus {[x,y] = 0} in {\mathfrak g} for any abelian Lie group {G}.
  • In particular, the brackets for {G \in \{\mathbb R, \mathbb C, S^1\}} are trivial.
  • Let {G = \text{GL }(n, K)}. Then one can show that

    \displaystyle [T,S] = TS - ST \qquad \forall S, T \in \mathfrak{gl}(n, K).

  • Ditto for {\text{SL }(n, K)}.

In any case, with the Jacobi identity we can define a general Lie algebra as an intrinsic object with a Jacobi-satisfying bracket:

Definition 20

A Lie algebra over {k} is a {k}-vector space equipped with a skew-symmetric bilinear bracket {[,]} satisfying the Jacobi identity.

A morphism of Lie algebras is a linear map which preserves the bracket.

Note that a Lie algebra may even be infinite-dimensional (though since we are assuming {G} is finite-dimensional, infinite-dimensional Lie algebras will never arise as tangent spaces for us).

Example 21 (Associative algebra {\rightarrow} Lie algebra)

Any associative algebra {A} over {k} can be made into a Lie algebra by taking the same underlying vector space, and using the bracket {[a,b] = ab - ba}.
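For the bracket {[a,b] = ab - ba}, skew-symmetry and the Jacobi identity can be verified by direct computation; with integer matrices the check is even exact. A quick sketch of ours (helper functions are not from the text):

```python
import random

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def neg(A):
    return [[-a for a in row] for row in A]

def bracket(A, B):
    # commutator [A, B] = AB - BA
    return add(mul(A, B), neg(mul(B, A)))

random.seed(1)
def rand_matrix():
    return [[random.randint(-5, 5) for _ in range(3)] for _ in range(3)]

x, y, z = rand_matrix(), rand_matrix(), rand_matrix()
# [x,[y,z]] + [y,[z,x]] + [z,[x,y]] is exactly the zero matrix
jacobi = add(bracket(x, bracket(y, z)),
             add(bracket(y, bracket(z, x)), bracket(z, bracket(x, y))))
```

Since all arithmetic is over {\mathbb Z}, the Jacobi sum comes out to the zero matrix exactly, not merely up to rounding.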

6. The fundamental theorems

We finish this list of facts by stating the three “fundamental theorems” of Lie theory. They are based upon the functor

\displaystyle \mathscr{L} : G \mapsto T_e G

we have described earlier, which is a functor

  • from the category of Lie groups
  • into the category of finite-dimensional Lie algebras.

The first theorem requires the following definition:

Definition 22

A Lie subgroup {H} of a Lie group {G} is a subgroup {H} such that the inclusion map {H \hookrightarrow G} is also an injective immersion.

A Lie subalgebra {\mathfrak h} of a Lie algebra {\mathfrak g} is a vector subspace preserved under the bracket (meaning that {[\mathfrak h, \mathfrak h] \subseteq \mathfrak h}).

Theorem 23 (Lie I)

Let {G} be a real or complex Lie group with Lie algebra {\mathfrak g}. Then the map

\displaystyle H \mapsto \mathscr{L}(H) \subseteq \mathfrak g

is a bijection between connected Lie subgroups of {G} and Lie subalgebras of {\mathfrak g}.

Theorem 24 (The Lie functor is an equivalence of categories)

Restrict {\mathscr{L}} to a functor

  • from the category of simply connected Lie groups over {K}
  • to the category of finite-dimensional Lie algebras over {K}.


Then:

  1. (Lie II) {\mathscr{L}} is fully faithful, and
  2. (Lie III) {\mathscr{L}} is essentially surjective on objects.

If we drop the “simply connected” condition, we obtain a functor which is still faithful, but not full: non-isomorphic Lie groups can have isomorphic Lie algebras (one example is {\text{SO }(3)} and {\text{SU }(2)}).

The Structure Theorem over PID’s

In this post I’ll describe the structure theorem over PID’s which generalizes the following results:

  • Finite-dimensional vector spaces over {k} are all of the form {k^{\oplus n}},
  • The classification theorem for finitely generated abelian groups,
  • The Frobenius normal form of a matrix,
  • The Jordan decomposition of a matrix.

1. Some ring theory prerequisites

Prototypical example for this section: {R = \mathbb Z}.

Before I can state the main theorem, I need to define a few terms for UFD’s, which behave much like {\mathbb Z}: our intuition from the case {R = \mathbb Z} basically carries over verbatim. We don’t even need to deal with prime ideals and can factor elements instead.

Definition 1

If {R} is a UFD, then {p \in R} is a prime element if {(p)} is a prime ideal and {p \neq 0}. For UFD’s this is equivalent to the following property: {p} is not a unit, and whenever {p = xy}, either {x} or {y} is a unit.

So for example in {\mathbb Z} the set of prime elements is {\{\pm2, \pm3, \pm5, \dots\}}. Now, since {R} is a UFD, every element {r} factors into a product of prime elements

\displaystyle  r = u p_1^{e_1} p_2^{e_2} \dots p_m^{e_m}

where {u} is a unit and the {p_i} are prime elements.

Definition 2

We say {r} divides {s} if {s = r'r} for some {r' \in R}. This is written {r \mid s}.

Example 3 (Divisibility in {\mathbb Z})

The number {0} is divisible by every element of {\mathbb Z}. All other divisibility relations are as expected.

Ques 4

Show that {r \mid s} if and only if the exponent of each prime in {r} is less than or equal to the corresponding exponent in {s}.
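A quick empirical check of this exercise in the prototypical case {R = \mathbb Z} (the trial-division factorizer is our own helper, written just for this sketch):

```python
def factor(n):
    # prime factorization of a positive integer, as {prime: exponent}
    f, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            f[d] = f.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def divides_by_exponents(r, s):
    # r | s iff every prime exponent of r is at most the one in s
    fr, fs = factor(r), factor(s)
    return all(fs.get(p, 0) >= e for p, e in fr.items())

# the exponent criterion agrees with direct divisibility testing
agrees = all(divides_by_exponents(r, s) == (s % r == 0)
             for r in range(1, 60) for s in range(1, 60))
```

The brute-force comparison over all pairs up to {60} confirms the criterion.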

Now, the case of interest is the even stronger case when {R} is a PID:

Proposition 5 (PID’s are Noetherian UFD’s)

If {R} is a PID, then it is Noetherian and also a UFD.

Proof: The fact that {R} is Noetherian is immediate, since every ideal is principal and hence finitely generated. For {R} to be a UFD we essentially repeat the proof for {\mathbb Z}, using the fact that {(a,b)} is principal in order to extract {\gcd(a,b)}. \Box

In this case, we have a Chinese remainder theorem for elements.

Theorem 6 (Chinese remainder theorem for rings)

Let {m} and {n} be relatively prime elements, meaning {(m) + (n) = (1)}. Then

\displaystyle  R / (mn) \cong R/(m) \times R/(n).

Proof: This is the same as the proof of the usual Chinese remainder theorem. First, since {(m,n)=(1)} we have {am+bn=1} for some {a} and {b}. Then we have a map

\displaystyle  R/(m) \times R/(n) \rightarrow R/(mn) \quad\text{by}\quad (r,s) \mapsto r \cdot bn + s \cdot am.

One can check that this map is well-defined and an isomorphism of rings. (Diligent readers invited to do so.) \Box
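Here is the map from the proof written out for {R = \mathbb Z} with {m = 4}, {n = 9} (our choice of example; we use Python’s three-argument `pow`, available since 3.8, to produce the Bézout coefficients {a}, {b}):

```python
m, n = 4, 9                      # relatively prime
a = pow(m, -1, n)                # a*m ≡ 1 (mod n)
b = pow(n, -1, m)                # b*n ≡ 1 (mod m)
assert (a * m + b * n) % (m * n) == 1   # so am + bn = 1 holds mod mn

def phi(r, s):
    # the map (r, s) -> r*bn + s*am from the proof, reduced mod mn
    return (r * b * n + s * a * m) % (m * n)

images = {phi(r, s) for r in range(m) for s in range(n)}
bijective = len(images) == m * n
# phi inverts the two reduction maps: phi(r, s) ≡ r (mod m) and ≡ s (mod n)
compatible = all(phi(r, s) % m == r and phi(r, s) % n == s
                 for r in range(m) for s in range(n))
```

Both flags come out true, so {\phi} really is the inverse of the reduction map {R/(mn) \rightarrow R/(m) \times R/(n)} in this case.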

Finally, we need to introduce the concept of a Noetherian {R}-module.

Definition 7

An {R}-module {M} is Noetherian if it satisfies one of the two equivalent conditions:

  • Its submodules obey the ascending chain condition: there is no infinite strictly ascending chain of submodules {M_1 \subsetneq M_2 \subsetneq \dots}.
  • All submodules of {M} (including {M} itself) are finitely generated.

This generalizes the notion of a Noetherian ring: a Noetherian ring {R} is one for which {R} is Noetherian as an {R}-module.

Ques 8

Check these two conditions are equivalent. (Copy the proof for rings.)

2. The structure theorem

Our structure theorem takes two forms:

Theorem 9 (Structure theorem, invariant form)

Let {R} be a PID and let {M} be any finitely generated {R}-module. Then

\displaystyle  M \cong \bigoplus_{i=1}^m R/s_i

for some {s_i} satisfying {s_1 \mid s_2 \mid \dots \mid s_m}.

Corollary 10 (Structure theorem, primary form)

Let {R} be a PID and let {M} be any finitely generated {R}-module. Then

\displaystyle  M \cong R^{\oplus r} \oplus R/(q_1) \oplus R/(q_2) \oplus \dots \oplus R/(q_m)

where {q_i = p_i^{e_i}} for some prime element {p_i} and integer {e_i \ge 1}.

Proof: Factor each {s_i} into prime factors (since {R} is a UFD), then use the Chinese remainder theorem. \Box

Remark 11

In both theorems the decomposition is unique up to permutations of the summands; good to know, but I won’t prove this.

3. Reduction to maps of free {R}-modules

The proof of the structure theorem proceeds in two main steps. First, we reduce the problem to a linear algebra problem involving free {R}-modules {R^{\oplus d}}. Once that’s done, we just have to play with matrices; this is done in the next section.

Suppose {M} is finitely generated by {d} elements. Then there is a surjective map of {R}-modules

\displaystyle  R^{\oplus d} \twoheadrightarrow M

sending the standard basis elements of {R^{\oplus d}} to the generators of {M}. Let {K} denote the kernel of this map.

We claim that {K} is finitely generated as well. To this end we prove that

Lemma 12 (Direct sum of Noetherian modules is Noetherian)

Let {M} and {N} be two Noetherian {R}-modules. Then the direct sum {M \oplus N} is also a Noetherian {R}-module.

Proof: It suffices to show that if {L \subseteq M \oplus N}, then {L} is finitely generated. It’s unfortunately not true that {L = P \oplus Q} for submodules {P \subseteq M} and {Q \subseteq N} (take {M = N = \mathbb Z} and {L = \{(n,n) \mid n \in \mathbb Z\}}), so we will have to be more careful.

Consider the submodules

\displaystyle  \begin{aligned} A &= \left\{ x \in M \mid (x,0) \in L \right\} \subseteq M \\ B &= \left\{ y \in N \mid \exists x \in M : (x,y) \in L \right\} \subseteq N. \end{aligned}

(Note the asymmetry for {A} and {B}: the proof doesn’t work otherwise.) Then {A} is finitely generated by {a_1}, \dots, {a_k}, and {B} is finitely generated by {b_1}, \dots, {b_\ell}. Let {x_i = (a_i, 0)} and let {y_i = (\ast, b_i)} be elements of {L} (where the {\ast}‘s are arbitrary things we don’t care about). Then {x_i} and {y_i} together generate {L}: given {(x,y) \in L}, subtract a suitable combination of the {y_i} to make the second coordinate vanish, and what remains lies in {A \oplus \{0\}}. \Box

Ques 13

Deduce that for {R} a PID, {R^{\oplus d}} is Noetherian.

Hence {K \subseteq R^{\oplus d}} is finitely generated as claimed. So we can find another surjective map {R^{\oplus f} \twoheadrightarrow K}. Consequently, we have a composition

\displaystyle  R^{\oplus f} \twoheadrightarrow K \hookrightarrow R^{\oplus d} \twoheadrightarrow M.

Observe that {M} is the cokernel of the composition {T : R^{\oplus f} \rightarrow R^{\oplus d}}, i.e. we have that

\displaystyle  M \cong R^{\oplus d} / \text{im }(T).

So it suffices to understand the map {T} well.

4. Smith normal form

The idea is now that we have reduced our problem to studying linear maps {T : R^{\oplus m} \rightarrow R^{\oplus n}}, which can be thought of as a generic matrix

\displaystyle  T = \begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nm} \end{pmatrix}

with respect to the standard basis {e_1}, \dots, {e_m} of {R^{\oplus m}} and {f_1}, \dots, {f_n} of {R^{\oplus n}}.

Of course, as you might expect, it ought to be possible to change the given bases so that {T} has a nicer matrix form. We already saw this in Jordan form, where we had a map {T : V \rightarrow V} and changed the basis so that {T} was “almost diagonal”. This time we have two bases we can change independently, so we would hope to get a diagonal matrix, or even better.

Before proceeding let’s think about how we might edit the matrix: what operations are permitted? Here are some examples:

  • Swapping rows and columns, which just corresponds to re-ordering the basis.
  • Adding a multiple of a column to another column. For example, if we add {3} times the first column to the second column, this is equivalent to replacing the basis

    \displaystyle  (e_1, e_2, e_3, \dots, e_m) \mapsto (e_1, e_2+3e_1, e_3, \dots, e_m).

  • Adding a multiple of a row to another row. One can see that adding {3} times the first row to the second row is equivalent to replacing the basis

    \displaystyle  (f_1, f_2, f_3, \dots, f_n) \mapsto (f_1-3f_2, f_2, f_3, \dots, f_n).

More generally, if {A} is an invertible {n \times n} matrix, we can replace {T} with {AT}. This corresponds to replacing

\displaystyle  (f_1, \dots, f_n) \mapsto (A(f_1), \dots, A(f_n))

(the “invertible” condition just guarantees the latter is a basis). Of course, similarly we can replace {T} with {TB} where {B} is an invertible {m \times m} matrix; this corresponds to

\displaystyle  (e_1, \dots, e_m) \mapsto (B^{-1}(e_1), \dots, B^{-1}(e_m)).

Armed with this knowledge, we can now approach the following result.

Theorem 14 (Smith normal form)

Let {R} be a PID. Let {M = R^{\oplus m}} and {N = R^{\oplus n}} be free {R}-modules and let {T : M \rightarrow N} be a linear map. Set {k = \min(m,n)}.

Then we can select a pair of new bases for {M} and {N} such that the matrix of {T} is diagonal with entries {s_1}, {s_2}, \dots, {s_k} satisfying {s_1 \mid s_2 \mid \dots \mid s_k}.

So if {m > n}, the matrix should take the form

\displaystyle  \begin{pmatrix} s_1 & 0 & 0 & 0 & \dots & 0 \\ 0 & s_2 & 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \dots & \vdots \\ 0 & 0 & 0 & s_n & \dots & 0 \end{pmatrix}.

and similarly when {m \le n}.

Ques 15

Show that Smith normal form implies the structure theorem.

Remark 16

Note that this is not a generalization of Jordan form.

  • In Jordan form we consider maps {T : V \rightarrow V}; note that the source and target space are the same, and we are considering one basis for the space {V}.
  • In Smith form the maps {T : M \rightarrow N} are between different modules, and we pick two sets of bases (one for {M} and one for {N}).

Example 17 (Example of Smith normal form)

To give a flavor of the idea of the proof, let’s work through a concrete example with the following matrix with entries from {\mathbb Z}:

\displaystyle  \begin{pmatrix} 18 & 38 & 48 \\ 14 & 30 & 38 \end{pmatrix}.

The GCD of all the entries is {2}, and so motivated by this, we perform the Euclidean algorithm on the left column: subtract the second row from the first row, then three times the first row from the second:

\displaystyle  \begin{pmatrix} 18 & 38 & 48 \\ 14 & 30 & 38 \end{pmatrix} \mapsto \begin{pmatrix} 4 & 8 & 10 \\ 14 & 30 & 38 \end{pmatrix} \mapsto \begin{pmatrix} 4 & 8 & 10 \\ 2 & 6 & 8 \end{pmatrix}.

Now that the GCD of {2} is present, we move it to the upper-left by switching the two rows, and then kill off all the entries in the same row/column; since {2} was the GCD all along, we isolate {2} completely:

\displaystyle  \begin{pmatrix} 4 & 8 & 10 \\ 2 & 6 & 8 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 6 & 8 \\ 4 & 8 & 10 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 6 & 8 \\ 0 & -4 & -6 \\ \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & -4 & -6 \end{pmatrix}.

This reduces the problem to a {1 \times 2} matrix. So we just apply the Euclidean algorithm again there:

\displaystyle  \begin{pmatrix} 2 & 0 & 0 \\ 0 & -4 & -6 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & -6 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \end{pmatrix}.

Now all we have to do is generalize this proof to work with any PID. It’s intuitively clear how to do this: the PID condition more or less lets you perform a Euclidean algorithm.
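The reduction in the example is mechanical enough to code up. Below is a rough sketch of ours over {\mathbb Z} (not from the text): it repeatedly pivots on an entry of smallest absolute value and clears its row and column. Note it does not in general enforce the full divisibility chain {s_1 \mid s_2 \mid \dots} — the complete algorithm needs an extra folding step for that — though on this example the chain comes out right anyway.

```python
def smith_diagonal(A):
    """Diagonalize an integer matrix by the row/column Euclidean
    algorithm sketched above; returns the list of diagonal entries."""
    A = [row[:] for row in A]
    n, m = len(A), len(A[0])
    diag = []
    for r in range(min(n, m)):
        while True:
            entries = [(abs(A[i][j]), i, j)
                       for i in range(r, n) for j in range(r, m) if A[i][j]]
            if not entries:
                return diag              # remaining block is zero
            _, i0, j0 = min(entries)
            A[r], A[i0] = A[i0], A[r]    # move pivot into row r ...
            for row in A:                # ... and into column r
                row[r], row[j0] = row[j0], row[r]
            p = A[r][r]
            for i in range(r + 1, n):    # Euclidean step down column r
                q = A[i][r] // p
                for j in range(r, m):
                    A[i][j] -= q * A[r][j]
            for j in range(r + 1, m):    # Euclidean step across row r
                q = A[r][j] // p
                for i in range(r, n):
                    A[i][j] -= q * A[i][r]
            if all(A[i][r] == 0 for i in range(r + 1, n)) and \
               all(A[r][j] == 0 for j in range(r + 1, m)):
                break                    # row and column fully cleared
        diag.append(abs(A[r][r]))
    return diag
```

On the matrix from the example this returns `[2, 2]`, matching the hand computation; each failed clearing pass leaves a strictly smaller nonzero remainder, which is why the loop terminates.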

Proof: Begin with a generic matrix

\displaystyle  T = \begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nm} \end{pmatrix}

We want to show, by a series of operations (gradually changing the given basis) that we can rearrange the matrix into Smith normal form.

Define {\gcd(x,y)} to be any generator of the principal ideal {(x,y)}.

Claim 18 (“Euclidean algorithm”)

If {a} and {b} are entries in the same row or column, we can change bases to replace {a} with {\gcd(a,b)} and {b} with something else.

Proof: We do just the case of columns. By hypothesis, {\gcd(a,b) = xa+yb} for some {x,y \in R}. Dividing this relation through by {\gcd(a,b)} exhibits {1 \in (x,y)}, so {(x,y) = (1)}. Hence there are {u} and {v} such that {xu + yv = 1}. Then

\displaystyle  \begin{pmatrix} x & y \\ -v & u \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \gcd(a,b) \\ \text{something} \end{pmatrix}

and the first matrix is invertible (check this!), as desired. \Box

Let {s_1 = \gcd_{i,j} a_{ij}} be the GCD of all entries. Now by repeatedly applying this algorithm, we can cause {s_1} to appear in the upper left hand corner. Then, we use it to kill off all the entries in the first row and the first column, thus arriving at a matrix

\displaystyle  \begin{pmatrix} s_1 & 0 & 0 & \dots & 0 \\ 0 & a_{22}' & a_{23}' & \dots & a_{2m}' \\ 0 & a_{32}' & a_{33}' & \dots & a_{3m}' \\ \vdots&\vdots&\vdots&\ddots&\vdots \\ 0 & a_{n2}' & a_{n3}' & \dots & a_{nm}' \\ \end{pmatrix}.

Now we repeat the same procedure with the lower-right {(n-1) \times (m-1)} matrix, and so on. This gives the Smith normal form. \Box

With the Smith normal form, we have in the original situation that

\displaystyle  M \cong R^{\oplus d} / \text{im } T

and applying the theorem to {T} completes the proof of the structure theorem.

5. Applications

Now, we can apply our structure theorem! I’ll just sketch proofs of these and let the reader fill in details.

Corollary 19 (Finite-dimensional vector spaces are all isomorphic)

Let {V} be a vector space over a field {k} with a finite spanning set of vectors. Then {V \cong k^{\oplus n}} for some {n}.

Proof: In the structure theorem, {k / (s_i) \in \{0,k\}}. \Box

Corollary 20 (Frobenius normal form)

Let {T : V \rightarrow V} where {V} is a finite-dimensional vector space over an arbitrary field {k} (not necessarily algebraically closed). Then one can write {T} as a block-diagonal matrix whose blocks are all of the form

\displaystyle  \begin{pmatrix} 0 & 0 & 0 & \dots & 0 & \ast \\ 1 & 0 & 0 & \dots & 0 & \ast \\ 0 & 1 & 0 & \dots & 0 & \ast \\ \vdots&\vdots&\vdots&\ddots&\vdots&\vdots \\ 0 & 0 & 0 & \dots & 1 & \ast \\ \end{pmatrix}.

Proof: View {V} as a {k[x]}-module with action {x \cdot v = T(v)}. By the structure theorem, {V \cong \bigoplus_i k[x] / (s_i)} for some polynomials {s_i} with {s_1 \mid s_2 \mid s_3 \mid \dots}. Each summand, with basis {1, x, \dots, x^{\deg s_i - 1}}, contributes a block of the form described (the companion matrix of {s_i}). \Box

Corollary 21 (Jordan normal form)

Let {T : V \rightarrow V} where {V} is a finite-dimensional vector space over a field {k} which is algebraically closed. Then {T} can be written in Jordan form.

Proof: We now use the structure theorem in its primary form. Since {k} is algebraically closed, each {p_i} is a linear polynomial, so every summand looks like {k[x] / (x-a)^m} for some {m}; in the basis {1, (x-a), \dots, (x-a)^{m-1}} this summand becomes a Jordan block. \Box

This is a draft of Chapter 15 of the Napkin.

Things Fourier

For some reason several classes at MIT this year involve Fourier analysis. I was always confused about this as a high schooler, because no one ever gave me the “orthonormal basis” explanation, so here goes. As a bonus, I also prove a form of Arrow’s Impossibility Theorem using binary Fourier analysis, and then talk about the fancier generalizations using Pontryagin duality and the Peter-Weyl theorem.

In what follows, we let {\mathbb T = \mathbb R/\mathbb Z} denote the “circle group”, thought of as the additive group of “real numbers modulo {1}”. There is a canonical map {e : \mathbb T \rightarrow \mathbb C} sending {\mathbb T} to the complex unit circle, given by {e(\theta) = \exp(2\pi i \theta)}.

Disclaimer: I will deliberately be sloppy with convergence issues, in part because I don’t fully understand them myself, and in part because I don’t care.

1. Synopsis

Suppose we have a domain {Z} and are interested in functions {f : Z \rightarrow \mathbb C}. Naturally, the set of such functions forms a complex vector space. We like to equip this space with a positive definite inner product. The idea of Fourier analysis is to then select an orthonormal basis for this set of functions, say {(e_\xi)_{\xi}}, which we call the characters; the indexing variables {\xi} are called frequencies. In that case, since we have a basis, every function {f : Z \rightarrow \mathbb C} becomes a sum

\displaystyle  f(x) = \sum_{\xi} \widehat f(\xi) e_\xi

where {\widehat f(\xi)} are complex coefficients of the basis; appropriately we call {\widehat f} the Fourier coefficients. The variable {x \in Z} is referred to as the physical variable. This is generally good because the characters are deliberately chosen to be nice “symmetric” functions, like sine or cosine waves or other periodic functions. Thus we decompose an arbitrarily complicated function into a sum of nice ones.

For convenience, we record a few facts about orthonormal bases.

Proposition 1 (Facts about orthonormal bases)

Let {V} be a complex Hilbert space with inner form {\left< -,-\right>} and suppose {x = \sum_\xi a_\xi e_\xi} and {y = \sum_\xi b_\xi e_\xi} where {e_\xi} are an orthonormal basis. Then

\displaystyle  \begin{aligned} \left< x,x \right> &= \sum_\xi |a_\xi|^2 \\ a_\xi &= \left< x, e_\xi \right> \\ \left< x,y \right> &= \sum_\xi a_\xi \overline{b_\xi}. \end{aligned}

2. Common Examples

2.1. Binary Fourier analysis on {\{\pm1\}^n}

Let {Z = \{\pm 1\}^n} for some positive integer {n}, so we are considering functions {f(x_1, \dots, x_n)} of {n} binary arguments. Then the functions {Z \rightarrow \mathbb C} form a {2^n}-dimensional vector space {\mathbb C^Z}, and we endow it with the inner form

\displaystyle  \left< f,g \right> = \frac{1}{2^n} \sum_{x \in Z} f(x) \overline{g(x)}.

In particular,

\displaystyle  \left< f,f \right> = \frac{1}{2^n} \sum_{x \in Z} \left\lvert f(x) \right\rvert^2

is the average of the squared magnitudes {\left\lvert f(x) \right\rvert^2}; this also shows that {\left< -,-\right>} is positive definite.

In that case, the multilinear monomials form an orthonormal basis of {\mathbb C^Z}, namely the polynomials

\displaystyle  \chi_S(x_1, \dots, x_n) = \prod_{s \in S} x_s.

Thus our frequency set is actually the collection of subsets {S \subseteq \{1, \dots, n\}}, and we have a decomposition

\displaystyle  f = \sum_{S \subseteq \{1, \dots, n\}} \widehat f(S) \chi_S.

Example 2 (An example of binary Fourier analysis)

Let {n = 2}. Then binary functions {\{ \pm 1\}^2 \rightarrow \mathbb C} have a basis given by the four polynomials

\displaystyle  1, \quad x_1, \quad x_2, \quad x_1x_2.

For example, consider the function {f} which is {1} at {(1,1)} and {0} elsewhere. Then we can put

\displaystyle  f(x_1, x_2) = \frac{x_1+1}{2} \cdot \frac{x_2+1}{2} = \frac14 \left( 1 + x_1 + x_2 + x_1x_2 \right).

So the Fourier coefficients are {\widehat f(S) = \frac 14} for each of the four {S}‘s.
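This small example is easy to verify by computing the inner products directly (the functions here are real-valued, so we may drop the conjugation; the code is a throwaway sketch of ours):

```python
from itertools import product

Z = list(product([1, -1], repeat=2))      # the four points of {±1}^2

def inner(f, g):
    # <f, g> = (1/|Z|) sum_x f(x) g(x), real-valued case
    return sum(f(x) * g(x) for x in Z) / len(Z)

def chi(S):
    # the character chi_S(x) = prod_{s in S} x_s
    def character(x):
        out = 1
        for s in S:
            out *= x[s]
        return out
    return character

f = lambda x: 1 if x == (1, 1) else 0
subsets = [(), (0,), (1,), (0, 1)]
coeffs = {S: inner(f, chi(S)) for S in subsets}   # all equal to 1/4
# reconstruction: f = sum_S fhat(S) chi_S, pointwise on Z
recon_ok = all(sum(coeffs[S] * chi(S)(x) for S in subsets) == f(x) for x in Z)
```

All four coefficients come out to {\frac14}, and summing the series back up reproduces {f} exactly.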

This notion is useful in particular for binary functions {f : \{\pm1\}^n \rightarrow \{\pm1\}}; for these functions (and products thereof), we always have {\left< f,f \right> = 1}.

It is worth noting that the frequency {\varnothing} plays a special role:

Exercise 3

Show that

\displaystyle  \widehat f(\varnothing) = \frac{1}{|Z|} \sum_{x \in Z} f(x).

2.2. Fourier analysis on finite groups {Z}

This is the Fourier analysis used in this post and this post. Here, we have a finite abelian group {Z}, and consider functions {Z \rightarrow \mathbb C}; this is a {|Z|}-dimensional vector space. The inner product is the same as before:

\displaystyle  \left< f,g \right> = \frac{1}{|Z|} \sum_{x \in Z} f(x) \overline{g(x)}.

Now here is how we generate the characters. We equip {Z} with a non-degenerate symmetric bilinear form

\displaystyle  Z \times Z \xrightarrow{\cdot} \mathbb T \qquad (\xi, x) \mapsto \xi \cdot x.

Experts may already recognize this as a choice of isomorphism between {Z} and its Pontryagin dual. This time the characters are given by

\displaystyle  \left( e_\xi \right)_{\xi \in Z} \qquad \text{where} \qquad e_\xi(x) = e(\xi \cdot x).

In this way, the set of frequencies is also {Z}, but the {\xi \in Z} play very different roles from the “physical” {x \in Z}. (It is not too hard to check these indeed form an orthonormal basis in the function space {\mathbb C^{\left\lvert Z \right\rvert}}, since we assumed that {\cdot} is non-degenerate.)

Example 4 (Cube roots of unity filter)

Suppose {Z = \mathbb Z/3\mathbb Z}, with the inner form given by {\xi \cdot x = (\xi x)/3}. Let {\omega = \exp(\frac 23 \pi i)} be a primitive cube root of unity. Note that

\displaystyle  e_\xi(x) = \begin{cases} 1 & \xi = 0 \\ \omega^x & \xi = 1 \\ \omega^{2x} & \xi = 2. \end{cases}

Then given {f : Z \rightarrow \mathbb C} with {f(0) = a}, {f(1) = b}, {f(2) = c}, we obtain

\displaystyle  f(x) = \frac{a+b+c}{3} \cdot 1 + \frac{a + \omega^2 b + \omega c}{3} \cdot \omega^x + \frac{a + \omega b + \omega^2 c}{3} \cdot \omega^{2x}.

In this way we derive that the transforms are

\displaystyle  \begin{aligned} \widehat f(0) &= \frac{a+b+c}{3} \\ \widehat f(1) &= \frac{a+\omega^2 b+ \omega c}{3} \\ \widehat f(2) &= \frac{a+\omega b+\omega^2c}{3}. \end{aligned}
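Numerically, with sample values {a = 5}, {b = 2}, {c = 7} (an arbitrary choice of ours), one can confirm both the coefficient formulas and the reconstruction:

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)    # primitive cube root of unity
a, b, c = 5, 2, 7                   # f(0), f(1), f(2)
f = [a, b, c]
# fhat(k) = <f, e_k> = (1/3) sum_x f(x) * conj(w^{kx}) = (1/3) sum_x f(x) w^{-kx}
fhat = [sum(f[x] * w ** (-k * x) for x in range(3)) / 3 for k in range(3)]
# reconstruction: f(x) = sum_k fhat(k) * w^{kx}
recon = [sum(fhat[k] * w ** (k * x) for k in range(3)) for x in range(3)]
```

Up to floating-point error, `fhat` matches the three displayed formulas (using {\omega^{-1} = \omega^2}) and `recon` returns the original values {a, b, c}.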

Exercise 5

Show that

\displaystyle  \widehat f(0) = \frac{1}{|Z|} \sum_{x \in Z} f(x).

Olympiad contestants may recognize the previous example as a “roots of unity filter”, which is exactly the point. For concreteness, suppose one wants to compute

\displaystyle  \binom{1000}{0} + \binom{1000}{3} + \dots + \binom{1000}{999}.

In that case, we can consider the function

\displaystyle  w : \mathbb Z/3 \rightarrow \mathbb C

such that {w(0) = 1} but {w(1) = w(2) = 0}. By abuse of notation we will also think of {w} as a function {w : \mathbb Z \twoheadrightarrow \mathbb Z/3 \rightarrow \mathbb C}. Then the sum in question is

\displaystyle  \begin{aligned} \sum_n \binom{1000}{n} w(n) &= \sum_n \binom{1000}{n} \sum_{k=0,1,2} \widehat w(k) \omega^{kn} \\ &= \sum_{k=0,1,2} \widehat w(k) \sum_n \binom{1000}{n} \omega^{kn} \\ &= \sum_{k=0,1,2} \widehat w(k) (1+\omega^k)^{1000}. \end{aligned}

In our situation, we have {\widehat w(0) = \widehat w(1) = \widehat w(2) = \frac13}, and we have evaluated the desired sum. More generally, we can take any periodic weight {w} and use Fourier analysis in order to interchange the order of summation.
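To see the filter in action with honest numbers, take the smaller exponent {N = 10} (our choice, to keep the floating point tame). Then {(1+\omega)^{10}} and {(1+\omega^2)^{10}} each have real part {-\frac12}, so the filter predicts {\frac{2^{10} - 1}{3} = 341}:

```python
import cmath
from math import comb

w = cmath.exp(2j * cmath.pi / 3)
N = 10
# direct sum of binomial coefficients with index divisible by 3
direct = sum(comb(N, n) for n in range(0, N + 1, 3))
# roots of unity filter: (1/3) * sum_k (1 + w^k)^N, using w-hat(k) = 1/3
filtered = sum((1 + w ** k) ** N for k in range(3)) / 3
```

Both sides come out to {341}; the same cancellation at {N = 1000} gives {\frac{2^{1000}-1}{3}} for the sum in the text.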

Example 6 (Binary Fourier analysis)

Suppose {Z = \{\pm 1\}^n}, viewed as an abelian group under pointwise multiplication hence isomorphic to {(\mathbb Z/2\mathbb Z)^{\oplus n}}. Assume we pick the dot product defined by

\displaystyle  \xi \cdot x = \frac{1}{2} \# \left\{ i : \xi_i = x_i = -1 \right\} \in \mathbb T

where {\xi = (\xi_1, \dots, \xi_n)} and {x = (x_1, \dots, x_n)}; one checks this is bilinear, and that {e_\xi(x) = (-1)^{\#\{ i \,:\, \xi_i = x_i = -1 \}}}.

We claim this coincides with the first example we gave. Indeed, let {S \subseteq \{1, \dots, n\}} and let {\xi \in \{\pm1\}^n} be {-1} at positions in {S}, and {+1} at positions not in {S}. Then the character {\chi_S} from the previous example coincides with the character {e_\xi} in the new notation. In particular, {\widehat f(S) = \widehat f(\xi)}.

Thus Fourier analysis on a finite group {Z} subsumes binary Fourier analysis.
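To make the binary case concrete, here is a small Python sketch (my own illustration, not from the text) computing the coefficients {\widehat f(S)} of the majority function on {3} bits by brute force. The nonzero coefficients sit on the singletons and on the full set {\{1,2,3\}}:

```python
from itertools import product

def maj(x):  # majority of three bits in {-1, +1}
    return 1 if sum(x) > 0 else -1

cube = list(product([-1, 1], repeat=3))

def chi(S, x):  # the character chi_S(x) = product of x_s over s in S
    r = 1
    for s in S:
        r *= x[s]
    return r

# All 8 subsets of {0, 1, 2}, encoded as tuples of indices.
subsets = [tuple(s for s in range(3) if (mask >> s) & 1) for mask in range(8)]

# fhat(S) = <f, chi_S> = (1/8) * sum over x of f(x) chi_S(x)
fhat = {S: sum(maj(x) * chi(S, x) for x in cube) / 8 for S in subsets}
```

One finds {\widehat f(\{i\}) = \frac12} for each singleton and {\widehat f(\{1,2,3\}) = -\frac12}, so {\mathrm{Maj}_3(x) = \frac12(x_1+x_2+x_3) - \frac12 x_1x_2x_3}, and the squared coefficients sum to {1} as Parseval (below) predicts.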

2.3. Fourier series for functions in {L^2([-\pi, \pi])}

Now we consider the space {L^2([-\pi, \pi])} of square-integrable functions {[-\pi, \pi] \rightarrow \mathbb C}, with inner form

\displaystyle  \left< f,g \right> = \frac{1}{2\pi} \int_{[-\pi, \pi]} f(x) \overline{g(x)}.

Sadly, this is not a finite-dimensional vector space, but fortunately it is a Hilbert space, so we are still fine. In this setting, an orthonormal basis is allowed to enter into infinite linear combinations, as long as the sum of the squares of the coefficients is finite.

Now, it turns out in this case that

\displaystyle  (e_n)_{n \in \mathbb Z} \qquad\text{where}\qquad e_n(x) = \exp(inx)

is an orthonormal basis for {L^2([-\pi, \pi])}. Thus this time the frequency set {\mathbb Z} is infinite. So every function {f \in L^2([-\pi, \pi])} decomposes as

\displaystyle  f(x) = \sum_n \widehat f(n) \exp(inx)

for appropriate coefficients {\widehat f(n) \in \mathbb C}.

This is a little worse than our finite examples: instead of a finite sum on the right-hand side, we actually have an infinite sum. This is because our set of frequencies is now {\mathbb Z}, which isn’t finite. In this case the {\widehat f} need not be finitely supported, but do satisfy {\sum_n |\widehat f(n)|^2 < \infty}.

Since the frequency set is indexed by {n \in \mathbb Z}, we call this decomposition a Fourier series.

Exercise 7

Show once again

\displaystyle  \widehat f(0) = \frac{1}{2\pi} \int_{[-\pi, \pi]} f(x).

Often we require that the function {f} satisfies {f(-\pi) = f(\pi)}, so that {f} becomes a periodic function, and we can think of it as {f : \mathbb T \rightarrow \mathbb C}.

2.4. Summary

We summarize our various flavors of Fourier analysis in the following table.

\displaystyle  \begin{array}{llll} \hline \text{Type} & \text{Physical var} & \text{Frequency var} & \text{Basis functions} \\ \hline \textbf{Binary} & \{\pm1\}^n & \text{Subsets } S \subseteq \left\{ 1, \dots, n \right\} & \prod_{s \in S} x_s \\ \textbf{Finite group} & Z & \xi \in Z, \text{ choice of } \cdot, & e(\xi \cdot x) \\ \textbf{Fourier series} & \mathbb T \text{ or } [-\pi, \pi] & n \in \mathbb Z & \exp(inx) \\ \end{array}

In fact, we will soon see that all these examples are subsumed by Pontryagin duality for compact groups {G}.

3. Parseval and friends

The notion of an orthonormal basis makes several “big-name” results in Fourier analysis quite lucid. Basically, we can take every result from Proposition~1, translate it into the context of our Fourier analysis, and get a big-name result.

Corollary 8 (Parseval theorem)

Let {f : Z \rightarrow \mathbb C}, where {Z} is a finite abelian group. Then

\displaystyle  \sum_\xi |\widehat f(\xi)|^2 = \frac{1}{|Z|} \sum_{x \in Z} |f(x)|^2.

Similarly, if {f : [-\pi, \pi] \rightarrow \mathbb C} is square-integrable then its Fourier series satisfies

\displaystyle  \sum_n |\widehat f(n)|^2 = \frac{1}{2\pi} \int_{[-\pi, \pi]} |f(x)|^2.

Proof: Recall that {\left< f,f\right>} is equal to the square sum of the coefficients. \Box
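Concretely, one can check Parseval numerically on {Z = \mathbb Z/6\mathbb Z}. A Python sketch (my addition; the test function is arbitrary):

```python
import cmath

n = 6
f = [3, -1, 4, 1, -5, 9]  # an arbitrary function f : Z/6 -> C (here real-valued)

def e(t):  # e(theta) = exp(2 pi i theta)
    return cmath.exp(2j * cmath.pi * t)

# fhat(xi) = <f, e_xi> = (1/n) * sum over x of f(x) e(-xi x / n)
fhat = [sum(f[x] * e(-xi * x / n) for x in range(n)) / n for xi in range(n)]

lhs = sum(abs(c) ** 2 for c in fhat)  # sum of |fhat(xi)|^2
rhs = sum(v * v for v in f) / n       # (1/|Z|) * sum of |f(x)|^2
```

The two sides agree up to floating-point error.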

Corollary 9 (Formulas for {\widehat f})

Let {f : Z \rightarrow \mathbb C}, where {Z} is a finite abelian group. Then

\displaystyle  \widehat f(\xi) = \frac{1}{|Z|} \sum_{x \in Z} f(x) \overline{e_\xi(x)}.

Similarly, if {f : [-\pi, \pi] \rightarrow \mathbb C} is square-integrable then its Fourier series is given by

\displaystyle  \widehat f(n) = \frac{1}{2\pi} \int_{[-\pi, \pi]} f(x) \exp(-inx).

Proof: Recall that in an orthonormal basis {(e_\xi)_\xi}, the coefficient of {e_\xi} in {f} is {\left< f, e_\xi\right>}. \Box
Note in particular what happens if we select {\xi = 0} in the above!

Corollary 10 (Plancherel theorem)

Let {f : Z \rightarrow \mathbb C}, where {Z} is a finite abelian group. Then

\displaystyle  \left< f,g \right> = \sum_{\xi \in Z} \widehat f(\xi) \overline{\widehat g(\xi)}.

Similarly, if {f : [-\pi, \pi] \rightarrow \mathbb C} is square-integrable then

\displaystyle  \left< f,g \right> = \sum_n \widehat f(n) \overline{\widehat g(n)}.

Proof: Guess! \Box

4. (Optional) Arrow’s Impossibility Theorem

As an application, we now prove a form of Arrow’s theorem. Consider {n} voters voting among {3} candidates {A}, {B}, {C}. Each voter specifies a tuple {v_i = (x_i, y_i, z_i) \in \{\pm1\}^3} as follows:

  • {x_i = 1} if voter {i} ranks {A} ahead of {B}, and {x_i = -1} otherwise.
  • {y_i = 1} if voter {i} ranks {B} ahead of {C}, and {y_i = -1} otherwise.
  • {z_i = 1} if voter {i} ranks {C} ahead of {A}, and {z_i = -1} otherwise.

Tacitly, we only consider {3! = 6} possibilities for {v_i}: we forbid “paradoxical” votes of the form {x_i = y_i = z_i} by assuming that people’s votes are consistent (meaning the preferences are transitive).

Then, we can consider a voting mechanism

\displaystyle  \begin{aligned} f : \{\pm1\}^n &\rightarrow \{\pm1\} \\ g : \{\pm1\}^n &\rightarrow \{\pm1\} \\ h : \{\pm1\}^n &\rightarrow \{\pm1\} \end{aligned}

such that {f(x_\bullet)} is the global preference of {A} vs. {B}, {g(y_\bullet)} is the global preference of {B} vs. {C}, and {h(z_\bullet)} is the global preference of {C} vs. {A}. We’d like to avoid situations where the global preference {(f(x_\bullet), g(y_\bullet), h(z_\bullet))} is itself paradoxical.

In fact, we will prove the following theorem:

Theorem 11 (Arrow Impossibility Theorem)

Assume that {(f,g,h)} always avoids paradoxical outcomes, and assume {\mathbf E f = \mathbf E g = \mathbf E h = 0}. Then {(f,g,h)} is either a dictatorship or anti-dictatorship: there exists a “dictator” {k} such that

\displaystyle  f(x_\bullet) = \pm x_k, \qquad g(y_\bullet) = \pm y_k, \qquad h(z_\bullet) = \pm z_k

where all three signs coincide.

The assumption {\mathbf E f = \mathbf E g = \mathbf E h = 0} provides symmetry (and e.g. excludes the possibility that {f}, {g}, {h} are constant functions which ignore voter input). Unlike the usual Arrow theorem, we do not assume that {f(+1, \dots, +1) = +1} (hence the possibility of anti-dictatorship).

To this end, we actually prove the following result:

Lemma 12

Assume the {n} voters vote independently at random among the {3! = 6} possibilities. The probability of a paradoxical outcome is exactly

\displaystyle  \frac14 + \frac14 \sum_{S \subseteq \{1, \dots, n\}} \left( -\frac13 \right)^{\left\lvert S \right\rvert} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right) .

Proof: Define the Boolean function {D : \{\pm 1\}^3 \rightarrow \mathbb R} by

\displaystyle  D(a,b,c) = ab + bc + ca = \begin{cases} 3 & a,b,c \text{ all equal} \\ -1 & a,b,c \text{ not all equal}. \end{cases}

Thus paradoxical outcomes arise when {D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) = 3}. Now, we compute that for randomly selected {x_\bullet}, {y_\bullet}, {z_\bullet} that

\displaystyle  \begin{aligned} \mathbf E D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) &= \mathbf E \sum_S \sum_T \left( \widehat f(S) \widehat g(T) + \widehat g(S) \widehat h(T) + \widehat h(S) \widehat f(T) \right) \left( \chi_S(x_\bullet)\chi_T(y_\bullet) \right) \\ &= \sum_S \sum_T \left( \widehat f(S) \widehat g(T) + \widehat g(S) \widehat h(T) + \widehat h(S) \widehat f(T) \right) \mathbf E\left( \chi_S(x_\bullet)\chi_T(y_\bullet) \right). \end{aligned}

Now we observe that (the same computation applies verbatim to the pairs {(y_\bullet, z_\bullet)} and {(z_\bullet, x_\bullet)} appearing in the other two products):

  • If {S \neq T}, then {\mathbf E \chi_S(x_\bullet) \chi_T(y_\bullet) = 0}, since if say {s \in S}, {s \notin T} then {x_s} affects the parity of the product with 50% either way, and is independent of any other variables in the product.
  • On the other hand, suppose {S = T}. Then

    \displaystyle  \chi_S(x_\bullet) \chi_T(y_\bullet) = \prod_{s \in S} x_sy_s.

    Note that {x_sy_s} is equal to {1} with probability {\frac13} and {-1} with probability {\frac23} (since {(x_s, y_s, z_s)} is uniform from {3!=6} choices, which we can enumerate). From this an inductive calculation on {|S|} gives that

    \displaystyle  \prod_{s \in S} x_sy_s = \begin{cases} +1 & \text{ with probability } \frac{1}{2}(1+(-1/3)^{|S|}) \\ -1 & \text{ with probability } \frac{1}{2}(1-(-1/3)^{|S|}). \end{cases}


    In particular,

    \displaystyle  \mathbf E \left( \prod_{s \in S} x_sy_s \right) = \left( -\frac13 \right)^{|S|}.

Piecing this all together, we now have that

\displaystyle  \mathbf E D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) = \sum_S \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right) \left( -\frac13 \right)^{|S|}.

Then, we obtain that

\displaystyle  \begin{aligned} &\mathbf E \frac14 \left( 1 + D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) \right) \\ =& \frac14 + \frac14\sum_S \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right) \left( -\frac13 \right)^{|S|}. \end{aligned}

Since {\frac14(1+D)} equals {1} for paradoxical outcomes and {0} otherwise, the left-hand side is exactly the probability of a paradoxical outcome, which gives the desired result. \Box
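The lemma can be verified by brute force for tiny cases. The Python sketch below (my own verification, not part of the original argument) takes {n = 2} voters and the expectation-zero rule {f = g = h = x_1x_2}, computes the paradox probability exactly over all {6^2} profiles, and compares it with the formula; both give {\frac13}:

```python
from itertools import product
from fractions import Fraction

n = 2  # number of voters (small enough to enumerate every profile)

# The 6 consistent votes: triples in {-1,+1}^3 that are not all equal.
valid = [v for v in product([-1, 1], repeat=3) if len(set(v)) > 1]

def parity(bits):  # the voting rule f = g = h = x_1 * ... * x_n (so E f = 0)
    r = 1
    for b in bits:
        r *= b
    return r

# Exact paradox probability by brute force over all 6^n profiles.
profiles = list(product(valid, repeat=n))
paradox = sum(
    1 for prof in profiles
    if parity([v[0] for v in prof]) == parity([v[1] for v in prof])
       == parity([v[2] for v in prof])
)
brute = Fraction(paradox, len(profiles))

# The same probability via the lemma, with fhat(S) = E[f(x) chi_S(x)]
# itself computed by brute force.
cube = list(product([-1, 1], repeat=n))
subsets = [tuple(i for i in range(n) if (m >> i) & 1) for m in range(2 ** n)]

def chi(S, x):
    r = 1
    for s in S:
        r *= x[s]
    return r

fh = {S: Fraction(sum(parity(x) * chi(S, x) for x in cube), len(cube)) for S in subsets}
formula = Fraction(1, 4) + Fraction(1, 4) * sum(
    Fraction(-1, 3) ** len(S) * 3 * fh[S] ** 2 for S in subsets  # f = g = h
)
```

Here exact rational arithmetic (`Fraction`) avoids any floating-point ambiguity in the comparison.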

Now for the proof of the main theorem. We see that

\displaystyle  1 = \sum_{S \subseteq \{1, \dots, n\}} -\left( -\frac13 \right)^{\left\lvert S \right\rvert} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right).

But now we can just use weak inequalities. We have {\widehat f(\varnothing) = \mathbf E f = 0} and similarly for {\widehat g} and {\widehat h}, so we restrict attention to {|S| \ge 1}. We then apply the famous inequality {|ab+bc+ca| \le a^2+b^2+c^2} (valid for all real numbers {a}, {b}, {c}) to deduce that

\displaystyle  \begin{aligned} 1 &= \sum_{S \subseteq \{1, \dots, n\}} -\left( -\frac13 \right)^{\left\lvert S \right\rvert} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right) \\ &\le \sum_{S \subseteq \{1, \dots, n\}} \left( \frac13 \right)^{\left\lvert S \right\rvert} \left( \widehat f(S)^2 + \widehat g(S)^2 + \widehat h(S)^2 \right) \\ &\le \sum_{S \subseteq \{1, \dots, n\}} \left( \frac13 \right)^1 \left( \widehat f(S)^2 + \widehat g(S)^2 + \widehat h(S)^2 \right) \\ &= \frac13 (1+1+1) = 1. \end{aligned}

with the last step by Parseval. So all inequalities must be sharp, and in particular {\widehat f}, {\widehat g}, {\widehat h} are supported on one-element sets, i.e. they are linear in inputs. As {f}, {g}, {h} are {\pm 1} valued, each {f}, {g}, {h} is itself either a dictator or anti-dictator function. Since {(f,g,h)} is always consistent, this implies the final result.

5. Pontryagin duality

In fact all the examples we have covered can be subsumed as special cases of Pontryagin duality, where we replace the domain with a general group {G}. In what follows, we assume {G} is a locally compact abelian (LCA) group, which just means that:

  • {G} is an abelian topological group,
  • the topology on {G} is Hausdorff, and
  • the topology on {G} is locally compact: every point of {G} has a compact neighborhood.

Notice that our previous examples fall into this category:

Example 13 (Examples of locally compact abelian groups)

  • Any finite group {Z} with the discrete topology is LCA.
  • The circle group {\mathbb T} is LCA and also in fact compact.
  • The real numbers {\mathbb R} are an example of an LCA group which is not compact.

5.1. The Pontryagin dual

The key definition is:

Definition 14

Let {G} be an LCA group. Then its Pontryagin dual is the abelian group

\displaystyle  \widehat G \overset{\mathrm{def}}{=} \left\{ \text{continuous group homomorphisms } \xi : G \rightarrow \mathbb T \right\}.

The maps {\xi} are called characters. By equipping it with the compact-open topology, we make {\widehat G} into an LCA group as well.

Example 15 (Examples of Pontryagin duals)

  • {\widehat{\mathbb Z} \cong \mathbb T}.
  • {\widehat{\mathbb T} \cong \mathbb Z}. The characters are given by {\theta \mapsto n\theta} for {n \in \mathbb Z}.
  • {\widehat{\mathbb R} \cong \mathbb R}. This is because a nonzero continuous homomorphism {\mathbb R \rightarrow S^1} is determined by the fiber above {1 \in S^1}. (Covering projections, anyone?)
  • {\widehat{\mathbb Z/n\mathbb Z} \cong \mathbb Z/n\mathbb Z}, characters {\xi} being determined by the image {\xi(1) \in \mathbb T}.
  • {\widehat{G \times H} \cong \widehat G \times \widehat H}.
  • If {Z} is a finite abelian group, then the previous two examples (and the structure theorem for abelian groups) imply that {\widehat{Z} \cong Z}, though not canonically. You may now recognize that the bilinear form {\cdot : Z \times Z \rightarrow \mathbb T} is exactly a choice of isomorphism {Z \rightarrow \widehat Z}.
  • For any group {G}, the dual of {\widehat G} is canonically isomorphic to {G}, id est there is a natural isomorphism

    \displaystyle  G \cong \widehat{\widehat G} \qquad \text{by} \qquad x \mapsto \left( \xi \mapsto \xi(x) \right).

    This is the Pontryagin duality theorem. (It is analogous to the isomorphism {(V^\vee)^\vee \cong V} for finite-dimensional vector spaces {V}.)
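For instance, the characters of {\mathbb Z/5\mathbb Z} can be listed and checked explicitly. The Python sketch below (an illustration I added for the fourth bullet) verifies that each {\chi_\xi(x) = e(\xi x/5)} is a homomorphism into {\mathbb T}, and that the five characters are orthonormal:

```python
import cmath

n = 5

def e(t):  # e(theta) = exp(2 pi i theta), as in the text
    return cmath.exp(2j * cmath.pi * t)

# The n characters of Z/nZ: chi_xi(x) = e(xi * x / n), one per xi in Z/nZ.
chars = [[e(xi * x / n) for x in range(n)] for xi in range(n)]

# Each chi_xi is a homomorphism Z/n -> T.
hom_ok = all(
    abs(chars[xi][(x + y) % n] - chars[xi][x] * chars[xi][y]) < 1e-9
    for xi in range(n) for x in range(n) for y in range(n)
)

def inner(u, v):  # <f, g> = (1/n) * sum over x of f(x) conj(g(x))
    return sum(a * b.conjugate() for a, b in zip(u, v)) / n

# The Gram matrix of the characters should be the identity.
gram = [[inner(chars[a], chars[b]) for b in range(n)] for a in range(n)]
```

This makes visible both halves of {\widehat{\mathbb Z/n\mathbb Z} \cong \mathbb Z/n\mathbb Z}: there are exactly {n} characters, indexed by {\xi \in \mathbb Z/n\mathbb Z}.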

5.2. The orthonormal basis in the compact case

Now assume {G} is LCA but also compact, and thus has a unique Haar measure {\mu} such that {\mu(G) = 1}; this lets us integrate over {G}. Let {L^2(G)} be the space of square-integrable functions to {\mathbb C}, i.e.

\displaystyle  L^2(G) = \left\{ f : G \rightarrow \mathbb C \quad\text{such that}\quad \int_G |f|^2 \; d\mu < \infty \right\}.

Thus we can equip it with the inner form

\displaystyle  \left< f,g \right> = \int_G f\overline{g} \; d\mu.

In that case, we get all the results we wanted before:

Theorem 16 (Characters form an orthonormal basis)

Assume {G} is LCA and compact. Then {\widehat G} is discrete, and the characters

\displaystyle  (e_\xi)_{\xi \in \widehat G} \qquad\text{by}\qquad e_\xi(x) = e(\xi(x)) = \exp(2\pi i \xi(x))

form an orthonormal basis of {L^2(G)}. Thus for each {f \in L^2(G)} we have

\displaystyle  f = \sum_{\xi \in \widehat G} \widehat f(\xi) e_\xi


where

\displaystyle  \widehat f(\xi) = \left< f, e_\xi \right> = \int_G f(x) \exp(-2\pi i \xi(x)) \; d\mu.

The sum {\sum_{\xi \in \widehat G}} makes sense since {\widehat G} is discrete. In particular,

  • Letting {G = Z} gives “Fourier transform on finite groups”.
  • The special case {G = \mathbb Z/n\mathbb Z} has its own Wikipedia page.
  • Letting {G = \mathbb T} gives the “Fourier series” earlier.

5.3. The Fourier transform of the non-compact case

If {G} is LCA but not compact, then Theorem~16 becomes false. On the other hand, it is still possible to define a transform, but one needs to be a little more careful. The generic example to keep in mind in what follows is {G = \mathbb R}.

In what follows, we fix a Haar measure {\mu} for {G}. (This {\mu} is still unique up to scaling, but there is no longer a canonical normalization, since now {\mu(G) = \infty}.)

One considers this time the space {L^1(G)} of absolutely integrable functions. Then one directly defines the Fourier transform of {f \in L^1(G)} to be

\displaystyle  \widehat f(\xi) = \int_G f \overline{e_\xi} \; d\mu

imitating the previous definitions in the absence of an inner product. This {\widehat f} may not be {L^1}, but it is at least bounded. Then we manage to at least salvage:

Theorem 17 (Fourier inversion on {L^1(G)})

Take an LCA group {G} and fix a Haar measure {\mu} on it. One can select a unique dual measure {\widehat \mu} on {\widehat G} such that whenever {f \in L^1(G)} and {\widehat f \in L^1(\widehat G)}, the “Fourier inversion formula”

\displaystyle  f(x) = \int_{\widehat G} \widehat f(\xi) e_\xi(x) \; d\widehat\mu.

holds almost everywhere. It holds everywhere if {f} is continuous.

Notice the extra nuance of having to select measures, because it is no longer the case that {G} has a single distinguished measure.
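As a concrete illustration with {G = \mathbb R} (where Lebesgue measure is self-dual under the convention {\widehat f(\xi) = \int f(x)e^{-2\pi i \xi x}\,dx}), the Python sketch below approximates both the transform and the inversion formula by Riemann sums for the Gaussian {f(x) = e^{-\pi x^2}}, which is its own transform. This is my own numerical check; the grid parameters are ad hoc.

```python
import cmath
import math

def f(x):  # the Gaussian e^{-pi x^2}, its own Fourier transform
    return math.exp(-math.pi * x * x)

N, L = 500, 5.0           # ad hoc grid: N steps covering [-L, L]
dx = 2 * L / N
xs = [-L + k * dx for k in range(N + 1)]

def fourier(xi):  # fhat(xi) ~ integral of f(x) e^{-2 pi i xi x} dx (Riemann sum)
    return sum(f(x) * cmath.exp(-2j * cmath.pi * xi * x) for x in xs) * dx

fhat = [fourier(xi) for xi in xs]

def invert(x):    # f(x) ~ integral of fhat(xi) e^{+2 pi i xi x} d(xi)
    return sum(c * cmath.exp(2j * cmath.pi * xi * x) for c, xi in zip(fhat, xs)) * dx

recovered = invert(0.3)
```

Both the self-transform property {\widehat f = f} and the recovery of {f(0.3)} from {\widehat f} hold to high accuracy, since the Gaussian decays so fast that the truncation and discretization errors are negligible.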

Despite the fact that the {e_\xi} no longer form an orthonormal basis, the transformed function {\widehat f : \widehat G \rightarrow \mathbb C} is still often useful. In particular, it has special names for a few special choices of {G}, summarized in the table below.

5.4. Summary

In summary,

  • Given any LCA group {G}, we can transform sufficiently nice functions on {G} into functions on {\widehat G}.
  • If {G} is compact, then we have the nicest situation possible: {L^2(G)} is an inner product space with {\left< f,g \right> = \int_G f \overline{g} \; d\mu}, and the {e_\xi} form an orthonormal basis across {\xi \in \widehat G}.
  • If {G} is not compact, then we no longer get an orthonormal basis or even an inner product space, but it is still possible to define the transform

    \displaystyle  \widehat f : \widehat G \rightarrow \mathbb C

    for {f \in L^1(G)}. If {\widehat f} is also in {L^1(\widehat G)} we still get a “Fourier inversion formula” expressing {f} in terms of {\widehat f}.

We summarize our various flavors of Fourier analysis for various {G} in the following. In the first half {G} is compact, in the second half {G} is not.

\displaystyle  \begin{array}{llll} \hline \text{Name} & \text{Domain }G & \text{Dual }\widehat G & \text{Characters} \\ \hline \textbf{Binary Fourier analysis} & \{\pm1\}^n & S \subseteq \left\{ 1, \dots, n \right\} & \prod_{s \in S} x_s \\ \textbf{Fourier transform on finite groups} & Z & \xi \in \widehat Z \cong Z & e(\xi \cdot x) \\ \textbf{Discrete Fourier transform} & \mathbb Z/n\mathbb Z & \xi \in \mathbb Z/n\mathbb Z & e(\xi x / n) \\ \textbf{Fourier series} & \mathbb T \cong [-\pi, \pi] & n \in \mathbb Z & \exp(inx) \\ \hline \textbf{Continuous Fourier transform} & \mathbb R & \xi \in \mathbb R & e(\xi x) \\ \textbf{Discrete time Fourier transform} & \mathbb Z & \xi \in \mathbb T \cong [-\pi, \pi] & \exp(i \xi n) \\ \end{array}

You might notice that the various names are awful. This is part of the reason I got confused as a high school student: every type of Fourier analysis above has its own Wikipedia article. If it were up to me, we would just use the term “{G}-Fourier transform”, and that would make everyone’s lives a lot easier.

6. Peter-Weyl

In fact, if {G} is a Lie group, even if {G} is not abelian we can still give an orthonormal basis of {L^2(G)} (the square-integrable functions on {G}). It turns out in this case the characters are attached to complex irreducible representations of {G} (and in what follows all representations are complex).

The result is given by the Peter-Weyl theorem. First, we need the following result:

Lemma 18 (Compact Lie groups have unitary reps)

Any finite-dimensional (complex) representation {V} of a compact Lie group {G} is unitary, meaning it can be equipped with a {G}-invariant inner form. Consequently, {V} is completely reducible: it splits into the direct sum of irreducible representations of {G}.

Proof: Suppose {B : V \times V \rightarrow \mathbb C} is any inner product. Equip {G} with a right-invariant Haar measure {dg}. Then we can define the “averaged” inner form

\displaystyle  \widetilde B(v,w) = \int_G B(gv, gw) \; dg.

Then {\widetilde B} is the desired {G}-invariant inner form. Now, the fact that {V} is completely reducible follows from the fact that given a subrepresentation of {V}, its orthogonal complement is also a subrepresentation. \Box
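The averaging trick is easy to see in miniature with a finite group standing in for the compact Lie group (the Haar integral becomes a finite average). The Python sketch below, a toy example of my own, takes a non-unitary two-dimensional real representation of {\mathbb Z/3} and checks that the averaged form is invariant while the original dot product is not:

```python
import math

# A 2-dimensional real representation of Z/3: rho(g) = (M R M^{-1})^g,
# where R is rotation by 120 degrees and M is a shear making rho non-orthogonal.
c, s = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
R = [[c, -s], [s, c]]
M = [[1.0, 1.0], [0.0, 1.0]]
Minv = [[1.0, -1.0], [0.0, 1.0]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_vec(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

rho1 = mat_mul(mat_mul(M, R), Minv)                           # rho(1)
reps = [[[1.0, 0.0], [0.0, 1.0]], rho1, mat_mul(rho1, rho1)]  # rho(0), rho(1), rho(2)

def B(v, w):       # the standard dot product: NOT G-invariant here
    return sum(a * b for a, b in zip(v, w))

def B_avg(v, w):   # averaged over the group: invariant by construction
    return sum(B(mat_vec(g, v), mat_vec(g, w)) for g in reps) / 3

v, w = [1.0, 2.0], [-3.0, 0.5]
```

Invariance of `B_avg` is just the statement that translating the sum over the group by a fixed element permutes its terms.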

The Peter-Weyl theorem then asserts that the finite-dimensional irreducible unitary representations essentially give an orthonormal basis for {L^2(G)}, in the following sense. Let {V = (V, \rho)} be such a representation of {G}, and fix an orthonormal basis of {e_1}, \dots, {e_d} for {V} (where {d = \dim V}). The {(i,j)}th matrix coefficient for {V} is then given by

\displaystyle  G \xrightarrow{\rho} \mathop{\mathrm{GL}}(V) \xrightarrow{\pi_{ij}} \mathbb C

where {\pi_{ij}} is the projection onto the {(i,j)}th entry of the matrix. We abbreviate {\pi_{ij} \circ \rho} to {\rho_{ij}}. Then the theorem is:

Theorem 19 (Peter-Weyl)

Let {G} be a compact Lie group. Let {\Sigma} denote the (pairwise non-isomorphic) irreducible finite-dimensional unitary representations of {G}. Then

\displaystyle  \left\{ \sqrt{\dim V} \rho_{ij} \; \Big\vert \; (V, \rho) \in \Sigma, \text{ and } 1 \le i,j \le \dim V \right\}

is an orthonormal basis of {L^2(G)}.

Strictly, I should say {\Sigma} is a set of representatives of the isomorphism classes of irreducible unitary representations, one for each isomorphism class.

In the special case {G} is abelian, all irreducible representations are one-dimensional. A one-dimensional representation of {G} is a map {G \rightarrow \mathop{\mathrm{GL}}(\mathbb C) \cong \mathbb C^\times}, but the unitary condition implies it actually maps into {S^1 \cong \mathbb T}, i.e. it is an element of {\widehat G}.

Artin Reciprocity

I will tell you a story about the Reciprocity Law. After my thesis, I had the idea to define {L}-series for non-abelian extensions. But for them to agree with the {L}-series for abelian extensions, a certain isomorphism had to be true. I could show it implied all the standard reciprocity laws. So I called it the General Reciprocity Law and tried to prove it but couldn’t, even after many tries. Then I showed it to the other number theorists, but they all laughed at it, and I remember Hasse in particular telling me it couldn’t possibly be true.

Still, I kept at it, but nothing I tried worked. Not a week went by — for three years! — that I did not try to prove the Reciprocity Law. It was discouraging, and meanwhile I turned to other things. Then one afternoon I had nothing special to do, so I said, `Well, I try to prove the Reciprocity Law again.’ So I went out and sat down in the garden. You see, from the very beginning I had the idea to use the cyclotomic fields, but they never worked, and now I suddenly saw that all this time I had been using them in the wrong way — and in half an hour I had it.

— Emil Artin

Algebraic number theory is assumed (e.g. the ANT chapters of Napkin). In this post, I’m going to state some big theorems of global class field theory and use them to deduce the Kronecker-Weber theorem and the existence of the Hilbert class field. For experts: this is global class field theory, without ideles.

Here’s the executive summary: let {K} be a number field. Then all abelian extensions {L/K} can be understood using solely information intrinsic to {K}: namely, the ray class groups (generalizing ideal class groups).

1. Infinite primes

Let {K} be a number field of degree {n}. We know what a prime ideal of {\mathcal O_K} is, but we now allow for the so-called infinite primes, which I’ll describe using embeddings. We know there are {n} embeddings {\sigma : K \rightarrow \mathbb C}, which consist of

  • {r} real embeddings where {\mathop{\mathrm{im}}\sigma \subseteq \mathbb R}, and
  • {s} pairs of conjugate complex embeddings.

Hence {r+2s = n}. The real embeddings give the real infinite primes, while each pair of conjugate complex embeddings gives a single complex infinite prime. We say {K} is totally real (resp. totally complex) if all its infinite primes are real (resp. complex).

Example 1 (Examples of infinite primes)

  • {\mathbb Q} has a single real infinite prime. We often write it as {\infty}.
  • {\mathbb Q(\sqrt{-5})} has a single complex infinite prime, and no real infinite primes. Hence totally complex.
  • {\mathbb Q(\sqrt{5})} has two real infinite primes, and no complex infinite primes. Hence totally real.

The motivation for this actually comes from the theory of valuations. Every prime corresponds exactly to a valuation: the infinite primes correspond to the Archimedean valuations, while the finite primes correspond to the non-Archimedean valuations.

2. Modular arithmetic with infinite primes

A modulus is a formal product

\displaystyle \mathfrak m = \prod_{\mathfrak p} \mathfrak p^{\nu(\mathfrak p)}

where the product runs over all primes, finite and infinite. (Here {\nu(\mathfrak p)} is a nonnegative integer, of which only finitely many are nonzero.) We also require that

  • {\nu(\mathfrak p) \le 1} for any real infinite prime {\mathfrak p}, and
  • {\nu(\mathfrak p) = 0} for any complex infinite prime {\mathfrak p}.

Obviously, every {\mathfrak m} can be written as {\mathfrak m = \mathfrak m_0\mathfrak m_\infty} by separating the finite from the (real) infinite primes.

We say {a \equiv b \pmod{\mathfrak m}} if

  • If {\mathfrak p} is a finite prime, then {a \equiv b \pmod{\mathfrak p^{\nu(\mathfrak p)}}} means exactly what you think it should mean: {a-b \in \mathfrak p^{\nu(\mathfrak p)}}.
  • If {\mathfrak p} is a real infinite prime {\sigma : K \rightarrow \mathbb R}, then {a \equiv b \pmod{\mathfrak p}} means that {\sigma(a/b) > 0}.

Of course, {a \equiv b \pmod{\mathfrak m}} means {a \equiv b} modulo each prime power in {\mathfrak m}. With this, we can define a generalization of the class group:

Definition 2

Let {\mathfrak m} be a modulus of a number field {K}.

  • Let {I_K(\mathfrak m)} be the set of all fractional ideals of {K} which are relatively prime to {\mathfrak m}; this is an abelian group under multiplication.
  • Let {P_K(\mathfrak m)} be the subgroup of {I_K(\mathfrak m)} generated by

    \displaystyle \left\{ \alpha \mathcal O_K \mid \alpha \in K^\times \text{ and } \alpha \equiv 1 \pmod{\mathfrak m} \right\}.

    This is sometimes called a “ray” of principal ideals.

Finally define the ray class group of {\mathfrak m} to be {C_K(\mathfrak m) = I_K(\mathfrak m) / P_K(\mathfrak m)}.

This group is known to always be finite. Note the usual class group is {C_K(1)}.

One last definition that we’ll use right after Artin reciprocity:

Definition 3

A congruence subgroup of {\mathfrak m} is a subgroup {H} with

\displaystyle P_K(\mathfrak m) \subseteq H \subseteq I_K(\mathfrak m).

Thus the quotients {I_K(\mathfrak m) / H} by congruence subgroups {H} are exactly the quotients of {C_K(\mathfrak m)}, and they form a lattice.

This definition takes a while to get used to, so here are examples.

Example 4 (Ray class groups in {\mathbb Q})

Consider {K = \mathbb Q} with infinite prime {\infty}. Then

  • If we take {\mathfrak m = 1} then {I_\mathbb Q(1)} is all fractional ideals, and {P_\mathbb Q(1)} is all principal fractional ideals. Their quotient is the usual class group of {\mathbb Q}, which is trivial.


  • Now take {\mathfrak m = 8}. Thus {I_\mathbb Q(8) \cong \left\{ \frac ab\mathbb Z \mid a/b \equiv 1,3,5,7 \pmod 8 \right\}}. Moreover

    \displaystyle P_\mathbb Q(8) \cong \left\{ \frac ab\mathbb Z \mid a/b \equiv 1 \pmod 8 \right\}.

    You might at first glance think that the quotient is thus {(\mathbb Z/8\mathbb Z)^\times}. But the issue is that we are dealing with ideals: specifically, we have

    \displaystyle 7\mathbb Z = -7\mathbb Z \in P_\mathbb Q(8)

    because {-7 \equiv 1 \pmod 8}. So actually, we get

    \displaystyle C_\mathbb Q(8) \cong \left\{ 1,3,5,7 \text{ mod } 8 \right\} / \left\{ 1,7 \text{ mod } 8 \right\} \cong (\mathbb Z/4\mathbb Z)^\times.


  • Now take {\mathfrak m = \infty}. As before, {I_\mathbb Q(\infty)} consists of all fractional ideals. Now, by definition we have

    \displaystyle P_\mathbb Q(\infty) = \left\{ \frac ab \mathbb Z \mid a/b > 0 \right\}.

    At first glance you might think this was {\mathbb Q_{>0}}, but since {\alpha\mathbb Z = (-\alpha)\mathbb Z}, the same behavior with ideals shows that {P_\mathbb Q(\infty)} still contains all principal fractional ideals. Therefore, {C_\mathbb Q(\infty)} is still trivial.


  • Finally, let {\mathfrak m = 8\infty}. As before {I_\mathbb Q(8\infty) \cong \left\{ \frac ab\mathbb Z \mid a/b \equiv 1,3,5,7 \pmod 8 \right\}}. Now in this case:

    \displaystyle P_\mathbb Q(8\infty) \cong \left\{ \frac ab\mathbb Z \mid a/b \equiv 1 \pmod 8 \text{ and } a/b > 0 \right\}.

    This time, we really do have {-7\mathbb Z \notin P_\mathbb Q(8\infty)}: we have {7 \not\equiv 1 \pmod 8} and also {-7 < 0}. So neither of the generators of {7\mathbb Z} is in {P_\mathbb Q(8\infty)}. Thus we finally obtain

    \displaystyle C_\mathbb Q(8\infty) \cong \left\{ 1,3,5,7 \text{ mod } 8 \right\} / \left\{ 1 \text{ mod } 8 \right\} \cong (\mathbb Z/8\mathbb Z)^\times

    with the bijection {C_\mathbb Q(8\infty) \rightarrow (\mathbb Z/8\mathbb Z)^\times} given by {a\mathbb Z \mapsto |a| \pmod 8}.

    Generalizing these examples, we see that

    \displaystyle \begin{aligned} C_\mathbb Q(m) &= (\mathbb Z/m\mathbb Z)^\times / \{\pm1\} \\ C_\mathbb Q(m\infty) &= (\mathbb Z/m\mathbb Z)^\times. \end{aligned}
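These two computations are small enough to replicate by brute force. The Python sketch below (my own illustration) realizes the classes of {C_\mathbb Q(8)} as units mod {8} up to sign, and the classes of {C_\mathbb Q(8\infty)} as units mod {8}:

```python
from math import gcd

m = 8
units = [a for a in range(1, m) if gcd(a, m) == 1]  # (Z/8)^x = {1, 3, 5, 7}

# For modulus 8 (no infinite part): aZ = (-a)Z, so an ideal class only
# remembers the residue of a generator up to sign.
classes_8 = {frozenset({a % m, (-a) % m}) for a in units}

# For modulus 8*infinity: the positivity condition pins down the sign,
# so the classes are exactly the units mod 8.
classes_8inf = set(units)
```

This recovers {|C_\mathbb Q(8)| = 2} (the classes {\{1,7\}} and {\{3,5\}}) and {|C_\mathbb Q(8\infty)| = 4}, matching {(\mathbb Z/4\mathbb Z)^\times} and {(\mathbb Z/8\mathbb Z)^\times} above.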


3. Infinite primes in extensions

I want to emphasize that everything above is intrinsic to a particular number field {K}. After this point we are going to consider extensions {L/K}, but it is important to keep in mind that the modulus and the ray class group are objects defined solely in terms of {K}, not of any extension {L}.

Now take a Galois extension {L/K} of degree {m}. We already know prime ideals {\mathfrak p} of {K} break into a product of prime ideals {\mathfrak P} of {L} in a nice way, so we want to do the same thing with infinite primes. This is straightforward: each of the {n} embeddings {\sigma : K \rightarrow \mathbb C} lifts to {m} embeddings {\tau : L \rightarrow \mathbb C}, by which I mean that {\tau} restricted to {K} equals {\sigma}. Hence like before, each infinite prime {\sigma} of {K} has {m} infinite primes {\tau} of {L} which lie above it.

If {\sigma} is a real prime of {K} and any of the resulting {\tau} above it are complex, we say that the prime {\sigma} ramifies in the extension {L/K}. Otherwise it is unramified in {L/K}. A complex prime of {K} is always unramified in {L/K}. In this way, we can talk about an unramified Galois extension {L/K}: it is one where all primes (finite or infinite) are unramified.

Example 5 (Ramification of {\infty})

Let {\infty} be the real infinite prime of {\mathbb Q}.

  • {\infty} is ramified in {\mathbb Q(\sqrt{-5})/\mathbb Q}.
  • {\infty} is unramified in {\mathbb Q(\sqrt{5})/\mathbb Q}.

Note also that if {K} is totally complex then the infinite primes of any extension {L/K} are automatically unramified.

4. Frobenius element and Artin symbol

Recall the following key result:

Theorem 6 (Frobenius element)

Let {L/K} be a Galois extension. If {\mathfrak p} is a prime of {K} unramified in {L}, and {\mathfrak P} is a prime of {L} above it, then there is a unique element of {\mathop{\mathrm{Gal}}(L/K)}, denoted {\mathrm{Frob}_\mathfrak P}, obeying

\displaystyle \mathrm{Frob}_\mathfrak P(\alpha) \equiv \alpha^{N\mathfrak p} \pmod{\mathfrak P} \qquad \forall \alpha \in \mathcal O_L.

Example 7 (Example of Frobenius elements)

Let {L = \mathbb Q(i)}, {K = \mathbb Q}. We have {\mathop{\mathrm{Gal}}(L/K) \cong \mathbb Z/2\mathbb Z}.

If {p} is an odd prime with {\mathfrak P} above it, then {\mathrm{Frob}_\mathfrak P} is the unique element such that

\displaystyle (a+bi)^p \equiv \mathrm{Frob}_\mathfrak P(a+bi) \pmod{\mathfrak P}

in {\mathbb Z[i]}. In particular,

\displaystyle \mathrm{Frob}_\mathfrak P(i) = i^p = \begin{cases} i & p \equiv 1 \pmod 4 \\ -i & p \equiv 3 \pmod 4. \end{cases}

From this we see that {\mathrm{Frob}_\mathfrak P} is the identity when {p \equiv 1 \pmod 4} and {\mathrm{Frob}_\mathfrak P} is complex conjugation when {p \equiv 3 \pmod 4}.
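These two cases are easy to check by machine. The Python sketch below (an illustration I added) computes {(a+bi)^p \bmod p} in {\mathbb Z[i]} by repeated squaring; here the congruence is taken modulo {p}, which suffices in both the split and inert cases:

```python
def gauss_mul(u, v, p):
    # Multiply Gaussian integers, representing a + b*i as the pair (a, b), mod p.
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % p, (a * d + b * c) % p)

def gauss_pow(z, e, p):
    # Square-and-multiply exponentiation in Z[i] modulo p.
    result = (1, 0)
    while e:
        if e & 1:
            result = gauss_mul(result, z, p)
        z = gauss_mul(z, z, p)
        e >>= 1
    return result

def frob(a, b, p):
    # (a + b*i)^p mod p: by the theorem, the action of Frob_P on a + b*i.
    return gauss_pow((a % p, b % p), p, p)
```

Testing {a+bi = 2+3i}: for {p \equiv 1 \pmod 4} one gets back {2+3i}, while for {p \equiv 3 \pmod 4} one gets {2-3i}, i.e. complex conjugation.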

Example 8 (Cyclotomic Frobenius element)

Generalizing the previous example, let {L = \mathbb Q(\zeta)} and {K = \mathbb Q}, with {\zeta} an {m}th root of unity. It’s well-known that {L/K} is unramified outside {\infty} and the prime factors of {m}. Moreover, the Galois group {\mathop{\mathrm{Gal}}(L/K)} is {(\mathbb Z/m\mathbb Z)^\times}: the Galois group consists of elements of the form

\displaystyle \sigma_n : \zeta \mapsto \zeta^n

and {\mathop{\mathrm{Gal}}(L/K) = \left\{ \sigma_n \mid n \in (\mathbb Z/m\mathbb Z)^\times \right\}}.

Then it follows just like before that if {p \nmid m} is prime and {\mathfrak P} is above {p}, then

\displaystyle \mathrm{Frob}_\mathfrak P = \sigma_p.

An important property of the Frobenius element is its order is related to the decomposition of {\mathfrak p} in the higher field {L} in the nicest way possible:

Lemma 9 (Order of the Frobenius element)

The Frobenius element {\mathrm{Frob}_\mathfrak P \in \mathop{\mathrm{Gal}}(L/K)} of an extension {L/K} has order equal to the inertial degree of {\mathfrak P}, that is,

\displaystyle \mathop{\mathrm{ord}} \mathrm{Frob}_\mathfrak P = f(\mathfrak P \mid \mathfrak p).

In particular, {\mathrm{Frob}_\mathfrak P = \mathrm{id}} if and only if {\mathfrak p} splits completely in {L/K}.

Proof: We want to understand the order of the map {T : x \mapsto x^{N\mathfrak p}} on the field {\mathcal O_L / \mathfrak P}. But the latter is isomorphic to {\mathbb F_{N\mathfrak P}}, the splitting field of {X^{N\mathfrak P} - X} over {\mathbb F_p}, by the Galois theory of finite fields. Hence the order is {\log_{N\mathfrak p} (N\mathfrak P) = f(\mathfrak P \mid \mathfrak p)}. \Box

Exercise 10

Deduce from this that the rational primes which split completely in {\mathbb Q(\zeta)} are exactly those with {p \equiv 1 \pmod m}. Here {\zeta} is an {m}th root of unity.
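
As a numerical companion to this exercise (my own Python sketch, not from the text): the order of {\mathrm{Frob}_\mathfrak P = \sigma_p} is the multiplicative order of {p} modulo {m}, which equals {1} exactly when {p \equiv 1 \pmod m}. The modulus {m = 12} below is an arbitrary choice.

```python
# The inertial degree f of an unramified prime p in Q(zeta_m) is the
# multiplicative order of p mod m, so p splits completely iff f = 1,
# i.e. iff p ≡ 1 (mod m).

from math import gcd

def mult_order(p, m):
    """Multiplicative order of p modulo m (requires gcd(p, m) == 1)."""
    assert gcd(p, m) == 1
    k, x = 1, p % m
    while x != 1:
        x = (x * p) % m
        k += 1
    return k

m = 12
for p in [5, 7, 11, 13, 17, 23, 37]:
    f = mult_order(p, m)
    assert (f == 1) == (p % m == 1)
```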

The Galois group acts transitively on the set of primes {\mathfrak P} above a given {\mathfrak p}, and we have

\displaystyle \mathrm{Frob}_{\sigma(\mathfrak P)} = \sigma \circ \mathrm{Frob}_{\mathfrak P} \circ \sigma^{-1}.

Thus {\mathrm{Frob}_\mathfrak P} is determined by its underlying {\mathfrak p} up to conjugation.

In class field theory, we are interested in abelian extensions (which just means that {\mathop{\mathrm{Gal}}(L/K)} is abelian), in which case the theory becomes extra nice: the conjugacy classes have size one.

Definition 11

Assume {L/K} is an abelian extension. Then for a given unramified prime {\mathfrak p} in {K}, the element {\mathrm{Frob}_\mathfrak P} doesn’t depend on the choice of {\mathfrak P}. We call the resulting element the Artin symbol, and denote it by

\displaystyle \left( \frac{L/K}{\mathfrak p} \right).

The definition of the Artin symbol is written deliberately to look like the Legendre symbol. To see why:

Example 12 (Legendre symbol subsumed by Artin symbol)

Suppose we want to understand {\left( \frac 2p \right) \equiv 2^{\frac{p-1}{2}} \pmod p}, where {p > 2} is prime. Consider the element

\displaystyle \left( \frac{\mathbb Q(\sqrt 2)/\mathbb Q}{p\mathbb Z} \right) \in \mathop{\mathrm{Gal}}(\mathbb Q(\sqrt 2) / \mathbb Q).

It is uniquely determined by where it sends {\sqrt 2}. But in fact we have

\displaystyle \left( \frac{\mathbb Q(\sqrt 2)/\mathbb Q}{p\mathbb Z} \right) \left( \sqrt 2 \right) \equiv \left( \sqrt 2 \right)^{p} \equiv 2^{\frac{p-1}{2}} \cdot \sqrt 2 \equiv \left( \frac 2p \right) \sqrt 2 \pmod{\mathfrak P}

where {\left( \frac 2p \right)} is the usual Legendre symbol, and {\mathfrak P} is above {p} in {\mathbb Q(\sqrt 2)}. Thus the Artin symbol generalizes the quadratic Legendre symbol.
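
One can check Euler's criterion, the computational heart of this example, numerically. The Python sketch below is my own illustration; the closed form {\left( \frac 2p \right) = 1} iff {p \equiv \pm 1 \pmod 8} is the standard supplementary law, quoted here without proof.

```python
# Euler's criterion: 2^((p-1)/2) mod p is +1 or -1 according as 2 is
# or is not a quadratic residue mod the odd prime p.

def legendre2(p):
    """Compute the Legendre symbol (2/p) via Euler's criterion."""
    e = pow(2, (p - 1) // 2, p)
    return 1 if e == 1 else -1

for p in [7, 17, 23, 31]:   # p ≡ ±1 (mod 8): 2 is a QR
    assert legendre2(p) == 1
for p in [3, 5, 11, 13]:    # p ≡ ±3 (mod 8): 2 is not a QR
    assert legendre2(p) == -1
```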

Example 13 (Cubic Legendre symbol subsumed by Artin symbol)

Similarly, it also generalizes the cubic Legendre symbol. To see this, let {\theta} be a primary prime in {K = \mathbb Q(\sqrt{-3}) = \mathbb Q(\omega)} (thus {\mathcal O_K = \mathbb Z[\omega]} is the ring of Eisenstein integers). Then for example

\displaystyle \left( \frac{K(\sqrt[3]{2})/K}{\theta \mathcal O_K} \right) \left( \sqrt[3]{2} \right) \equiv \left( \sqrt[3]{2} \right)^{N\theta} \equiv 2^{\frac{N\theta-1}{3}} \cdot \sqrt[3]{2} \equiv \left( \frac{2}{\theta} \right)_3 \sqrt[3]{2} \pmod{\mathfrak P}

where {\mathfrak P} is above {\theta} in {K(\sqrt[3]{2})}.

5. Artin reciprocity

Now, we further capitalize on the fact that {\mathop{\mathrm{Gal}}(L/K)} is abelian. For brevity, in what follows let {\mathop{\mathrm{Ram}}(L/K)} denote the primes of {K} (either finite or infinite) which ramify in {L}.

Definition 14

Let {L/K} be an abelian extension and let {\mathfrak m} be divisible by every prime in {\mathop{\mathrm{Ram}}(L/K)}. Then since {L/K} is abelian we can extend the Artin symbol multiplicatively to a map

\displaystyle \left( \frac{L/K}{\bullet} \right) : I_K(\mathfrak m) \twoheadrightarrow \mathop{\mathrm{Gal}}(L/K).

This is called the Artin map, and it is surjective (for example by Chebotarev Density).

Since the map above is surjective, we denote its kernel by {H_{L/K}(\mathfrak m) \subseteq I_K(\mathfrak m)}. Thus we have

\displaystyle \mathop{\mathrm{Gal}}(L/K) \cong I_K(\mathfrak m) / H_{L/K}(\mathfrak m).

We can now present the long-awaited Artin reciprocity theorem.

Theorem 15 (Artin reciprocity)

Let {L/K} be an abelian extension. Then there is a modulus {\mathfrak f = \mathfrak f(L/K)}, divisible by exactly the primes of {\mathop{\mathrm{Ram}}(L/K)}, with the following property: if {\mathfrak m} is divisible by all primes of {\mathop{\mathrm{Ram}}(L/K)}, then

\displaystyle P_K(\mathfrak m) \subseteq H_{L/K}(\mathfrak m) \subseteq I_K(\mathfrak m) \quad\text{if and only if}\quad \mathfrak f \mid \mathfrak m.

We call {\mathfrak f} the conductor of {L/K}.

So the conductor {\mathfrak f} plays a similar role to the discriminant (divisible by exactly the primes which ramify), and when {\mathfrak m} is divisible by the conductor, {H(L/K, \mathfrak m)} is a congruence subgroup.

Note that, for example, if we pick {L} such that {H(L/K, \mathfrak m) = P_K(\mathfrak m)}, then Artin reciprocity means that there is an isomorphism

\displaystyle \mathop{\mathrm{Gal}}(L/K) \cong I_K(\mathfrak m) / P_K(\mathfrak m) \cong C_K(\mathfrak m).

More generally, we see that {\mathop{\mathrm{Gal}}(L/K)} is always a quotient of the group {C_K(\mathfrak f)}.

Example 16 (Cyclotomic field)

Let {\zeta} be a primitive {m}th root of unity. We show in this example that

\displaystyle H_{\mathbb Q(\zeta) / \mathbb Q} (m\infty) = P_\mathbb Q(m\infty) = \left\{ \frac ab \mathbb Z \mid a/b \equiv 1 \pmod m \right\}.

This is the generic example of achieving the lower bound in Artin reciprocity. It also implies that {\mathfrak f(\mathbb Q(\zeta)/\mathbb Q) \mid m\infty}.

It’s well-known that {\mathbb Q(\zeta)/\mathbb Q} is unramified outside the finite primes dividing {m}, so the Artin symbol is defined on {I_\mathbb Q(m\infty)}. Now the Artin map is given by

\displaystyle \left( \frac{\mathbb Q(\zeta)/\mathbb Q}{\bullet} \right) : p\mathbb Z \mapsto \sigma_p \in \mathop{\mathrm{Gal}}(\mathbb Q(\zeta)/\mathbb Q).

So we see that a fractional ideal {\frac ab \mathbb Z} lies in the kernel exactly when {\sigma_{a/b} = \mathrm{id}}, i.e. when {a/b \equiv 1 \pmod m}; hence the kernel is {P_\mathbb Q(m\infty)}, as claimed. (Conversely, the Takagi existence theorem asserts that every congruence subgroup {H} with {P_K(\mathfrak m) \subseteq H \subseteq I_K(\mathfrak m)} satisfies {H = H_{L/K}(\mathfrak m)} for a unique abelian extension {L/K}.)

Finally, such subgroups reverse inclusion in the best way possible:

Lemma 17 (Inclusion-reversing congruence subgroups)

Fix a modulus {\mathfrak m}. Let {L/K} and {M/K} be abelian extensions and suppose {\mathfrak m} is divisible by the conductors of {L/K} and {M/K}. Then

\displaystyle L \subseteq M \quad\text{if and only if}\quad H_{M/K}(\mathfrak m) \subseteq H_{L/K}(\mathfrak m)

Here by {L \subseteq M} we mean that {L} is isomorphic to some subfield of {M}. Proof: Let us first prove the equivalence with {\mathfrak m} fixed. In one direction, assume {L \subseteq M}; one can check from the definitions that the diagram

formed by the two Artin maps and the restriction {\mathop{\mathrm{Gal}}(M/K) \twoheadrightarrow \mathop{\mathrm{Gal}}(L/K)} commutes: it suffices to verify this on prime ideals, where it says exactly that Frobenius elements behave well with respect to restriction. Then the inclusion of kernels follows directly. The reverse direction is essentially the Takagi existence theorem. \Box
Note that we can always take {\mathfrak m} to be the product of conductors here.

6. Consequences

With all this theory we can deduce the following two results.

Corollary 18 (Kronecker-Weber theorem)

Let {L} be an abelian extension of {\mathbb Q}. Then {L} is contained in a cyclotomic extension {\mathbb Q(\zeta)}, where {\zeta} is an {m}th root of unity (for some {m}).

Proof: Suppose {\mathfrak f(L/\mathbb Q) \mid m\infty} for some {m}. Then by the example from earlier we have the chain

\displaystyle P_{\mathbb Q}(m\infty) = H(\mathbb Q(\zeta)/\mathbb Q, m\infty) \subseteq H(L/\mathbb Q, m\infty) \subseteq I_\mathbb Q(m\infty).

So by inclusion reversal we’re done. \Box

Corollary 19 (Hilbert class field)

Let {K} be any number field. Then there exists a unique abelian extension {E/K} which is unramified at all primes (finite or infinite) and such that

  • {E/K} is the maximal such extension by inclusion.
  • {\mathop{\mathrm{Gal}}(E/K)} is isomorphic to the class group of {K}.
  • A prime {\mathfrak p} of {K} splits completely in {E} if and only if it is principal.

We call {E} the Hilbert class field of {K}.

Proof: Apply the Takagi existence theorem with {\mathfrak m = 1} to obtain an unramified extension {E/K} such that {H(E/K, 1) = P_K(1)}. We claim this works:

  • To see it is maximal by inclusion, note that any other extension {M/K} with this property has conductor {1} (no primes divide the conductor), and then we have {P_K(1) = H(E/K, 1) \subseteq H(M/K, 1) \subseteq I_K(1)}, so inclusion reversal gives {M \subseteq E}.
  • We have {\mathop{\mathrm{Gal}}(E/K) \cong I_K(1) / P_K(1) = C_K(1)}, the class group of {K}.
  • The isomorphism in the previous part is given by the Artin symbol. So {\mathfrak p} splits completely if and only if {\left( \frac{E/K}{\mathfrak p} \right) = \mathrm{id}}, if and only if {\mathfrak p} is principal (trivial in {C_K(1)}).

This completes the proof. \Box

One can also derive quadratic and cubic reciprocity from Artin reciprocity; see this link for QR and this link for CR.

Tannakian Reconstruction

These notes are from the February 23, 2016 lecture of 18.757, Representations of Lie Algebras, taught by Laura Rider.

Fix a field {k} and let {G} be a finite group. In this post we will show that one can reconstruct the group {G} from the monoidal category of {k[G]}-modules (i.e. its {G}-representations).

1. Hopf algebras

We won’t do anything with Hopf algebras per se, but it will be convenient to have the language.

Recall that an associative {k}-algebra is a {k}-vector space {A} equipped with a multiplication map {m : A \otimes A \rightarrow A} and a unit {i : k \hookrightarrow A}, satisfying certain axioms (associativity and unitality).

Dually, a {k}-coalgebra is a {k}-vector space {A} equipped with maps

\displaystyle \Delta : A \rightarrow A \otimes A \qquad \varepsilon : A \rightarrow k

called comultiplication and counit respectively, which satisfy the dual axioms; see https://en.wikipedia.org/wiki/Coalgebra. A bialgebra is a {k}-vector space carrying both structures, suitably compatible.

Now a Hopf algebra {A} is a bialgebra {A} over {k} plus a so-called antipode {S : A \rightarrow A}. We require that {S} satisfy

\displaystyle m \circ (S \otimes \mathrm{id}) \circ \Delta = i \circ \varepsilon = m \circ (\mathrm{id} \otimes S) \circ \Delta.

Given a Hopf algebra {A}, a group-like element of {A} is a nonzero element of

\displaystyle G = \left\{ 0 \neq x \in A \mid \Delta(x) = x \otimes x \right\}.

Exercise 1

Show that {G} is a group with multiplication {m} and inversion {S}.

Now, the key example:

Example 2 (Group algebra is Hopf algebra)

The group algebra {k[G]} is a Hopf algebra with

  • {m}, {i} as expected.
  • {\varepsilon} the counit is the trivial representation.
  • {\Delta} comes from {g \mapsto g \otimes g}, extended linearly.
  • {S} takes {g \mapsto g^{-1}} extended linearly.


Theorem 3

The group-like elements of {k[G]} are precisely the basis elements {1_k \cdot g \in k[G]}.


Proof: Assume {x = \sum_{g \in G} a_g g} is group-like. Then by assumption we should have

\displaystyle \sum_{g \in G} a_g (g \otimes g) = \Delta(x) = \sum_{g \in G} \sum_{h \in G} a_ga_h (g \otimes h).

Comparing each coefficient, we get that

\displaystyle a_ga_h = \begin{cases} a_g & g = h \\ 0 & \text{otherwise}. \end{cases}

Hence each {a_g} is {0} or {1}, and no two can be nonzero at once; excluding {x = 0}, exactly one {a_g} is {1} and the remaining coefficients are all zero. \Box

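The computation in this proof is easy to verify by brute force for a small group. The Python sketch below is my own toy check for {G = \mathbb Z/3\mathbb Z}, comparing {\Delta(x)} with {x \otimes x} coefficientwise over a small grid of integer coefficients.

```python
# For x = sum a_g g in k[G], Delta(x) = x (x) x holds iff
# a_g a_h = a_g when g = h and 0 otherwise; scanning a small grid of
# coefficient vectors, only the basis elements survive.

from itertools import product

G = [0, 1, 2]  # Z/3Z, written additively

def is_grouplike(a):
    """a: dict g -> coefficient a_g; compare Delta(x) with x (x) x."""
    return all(
        a[g] * a[h] == (a[g] if g == h else 0)
        for g, h in product(G, repeat=2)
    )

grouplike = [
    c for c in product([0, 1, 2], repeat=3)          # small search grid
    if any(c) and is_grouplike(dict(zip(G, c)))      # nonzero candidates
]
assert sorted(grouplike) == [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
```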

2. Monoidal functors

Recall that a monoidal category (or tensor category) is a category {\mathscr C} equipped with a functor {\otimes : \mathscr C \times \mathscr C \rightarrow \mathscr C} which has an identity object {I} and satisfies certain coherence conditions. For example, for any {A,B,C \in \mathscr C} we should have a natural isomorphism

\displaystyle A \otimes (B \otimes C) \xrightarrow{a_{A,B,C}} (A \otimes B) \otimes C.

The generic example is of course suggested by the notation: vector spaces over {k}, abelian groups, or more generally modules/algebras over a ring {R}.

Now take two monoidal categories {(\mathscr C, \otimes_\mathscr C)} and {(\mathscr D, \otimes_\mathscr D)}. Then a monoidal functor {F : \mathscr C \rightarrow \mathscr D} is a functor for which we additionally need to select an isomorphism

\displaystyle F(A \otimes B) \xrightarrow{t_{A,B}} F(A) \otimes F(B).

We then require that the square relating the maps {t_{A,B}} to the associators of {\mathscr C} and {\mathscr D} commutes, plus some additional compatibility conditions with the identities of the {\otimes}‘s (see Wikipedia for the list).

We also have a notion of a natural transformation between two such functors {t : F \rightarrow G}; this is just requiring the usual naturality squares to commute. Now, suppose {F : \mathscr C \rightarrow \mathscr C} is a monoidal functor. Then an automorphism of {F} is a natural transformation {t : F \rightarrow F} which is invertible, i.e. a natural isomorphism.

3. Application to {k[G]}

With this language, we now reach the main point of the post. Consider the category of {k[G]}-modules endowed with the monoidal product {\otimes} (which is just the tensor product over {k}, with the usual group action). We want to reconstruct {G} from this category.

Let {U} be the forgetful functor

\displaystyle U : \mathsf{Mod}_{k[G]} \rightarrow \mathsf{Vect}_k.

It’s easy to see this is in fact a monoidal functor. Now let {\text{Aut }^{\otimes}(U)} be the set of monoidal automorphisms of {U}.

The key claim is the following:

Theorem 4 ({G} is isomorphic to {\text{Aut }^\otimes(U)})

Consider the map

\displaystyle i : G \rightarrow \text{Aut }^\otimes(U) \quad\text{by}\quad g \mapsto T^g.

Here, the natural transformation {T^g} is defined by the components

\displaystyle T^g_{(V,\phi)} : U(V, \phi) \rightarrow U(V, \phi) \quad\text{by}\quad v \mapsto \phi(g) v.

Then {i} is an isomorphism of groups.

In particular, using only {\otimes} structure this exhibits an isomorphism {G \cong \text{Aut }^\otimes(U)}. Consequently this solves the problem proposed at the beginning of the lecture.

Proof: It’s easy to see {i} is a group homomorphism.

To see it’s injective, we show that for {1_G \neq g \in G} the transformation {T^g} is not the identity automorphism; i.e. we need to find some representation {(V, \phi)} for which {g} acts nontrivially on {V}. The regular representation, which is faithful, does the trick!

The hard part is showing that it’s surjective. For this we want to reduce it to the regular representation.

Lemma 5

Any {T \in \text{Aut }^\otimes(U)} is completely determined by {T_{k[G]}(1_{k[G]}) \in k[G]}.


Proof: Let {(V, \phi)} be a representation of {G}. Then for all {v \in V}, we have a unique morphism of representations

\displaystyle f_v : k[G] \rightarrow (V, \phi) \quad\text{by}\quad 1_{k[G]} \mapsto v.

Applying the forgetful functor {U} and using the naturality of {T} with respect to {U(f_v)}, we get {T_{(V, \phi)}(v) = f_v\left( T_{k[G]}(1_{k[G]}) \right)}. Hence {T} is determined everywhere by {T_{k[G]}(1_{k[G]})}. \Box
Next, we claim

Lemma 6

{T_{k[G]}(1_{k[G]})} is a grouplike element of {k[G]}.


Proof: Since {T} is monoidal, we have {T_{k[G] \otimes k[G]} = T_{k[G]} \otimes T_{k[G]}}; applying naturality of {T} to the comultiplication {\Delta : k[G] \rightarrow k[G] \otimes k[G]} (a morphism of representations) and evaluating at {1_{k[G]}}, we find

\displaystyle \Delta(T_{k[G]}(1_{k[G]})) = T_{k[G]}(1_{k[G]}) \otimes T_{k[G]}(1_{k[G]}).

Since the group-like elements of {k[G]} are exactly the basis elements {g \in G}, we get {T_{k[G]}(1_{k[G]}) = g} for some {g \in G}; by Lemma 5 it follows that {T = T^g}. This proves surjectivity. \Box


Rant: Matrices Are Not Arrays of Numbers

The following is an excerpt from a current work of mine. I thought I’d share it here, as some people have told me they enjoyed it.

As I’ll stress repeatedly, a matrix represents a linear map between two vector spaces. Writing it in the form of an {m \times n} matrix is merely a very convenient way to see the map concretely. But it obfuscates the fact that this map is, well, a map, not an array of numbers.

If you took high school precalculus, you’ll see everything done in terms of matrices. To any typical high school student, a matrix is an array of numbers. No one is sure what exactly these numbers represent, but they’re told how to magically multiply these arrays to get more arrays. They’re told that the matrix

\displaystyle \left( \begin{array}{cccc} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \\ \end{array} \right)

is an “identity matrix”, because when you multiply by another matrix it doesn’t change. Then they’re told that the determinant is some magical combination of these numbers formed by this weird multiplication rule. No one knows what this determinant does, other than the fact that {\det(AB) = \det A \det B}, and something about areas and row operations and Cramer’s rule.

Then you go into linear algebra in college, and you do more magic with these arrays of numbers. You’re told that two matrices {T_1} and {T_2} are similar if

\displaystyle T_2 = ST_1S^{-1}

for some invertible matrix {S}. You’re told that the trace of a matrix {\text{Tr } T} is the sum of the diagonal entries. Somehow this doesn’t change if you look at a similar matrix, but you’re not sure why. Then you define the characteristic polynomial as

\displaystyle p_T = \det (XI - T).

Somehow this also doesn’t change if you take a similar matrix, but now you really don’t know why. And then you have the Cayley-Hamilton Theorem in all its black magic: {p_T(T)} is the zero map. Out of curiosity you Google the proof, and you find some ad-hoc procedure which still leaves you with no idea why it’s true.

This is terrible. Who gives a — about {T_2 = ST_1S^{-1}}? Only if you know that the matrices are linear maps does this make sense: {T_2} is just {T_1} rewritten with a different choice of basis.

In my eyes, this mess is evil. Linear algebra is the study of linear maps, but it is taught as the study of arrays of numbers, and no one knows what these numbers mean. And for a good reason: the numbers are meaningless. They are a highly convenient way of encoding the matrix, but they are not the main objects of study, any more than the dates of events are the main objects of study in history.

When I took Math 55a as a freshman at Harvard, I got the exact opposite treatment: we did all of linear algebra without writing down a single matrix. During all this time I was quite confused. What’s wrong with a basis? I didn’t appreciate until later that this approach was the morally correct way to treat the subject: it made it clear what was happening.

Throughout this project, I’ve tried to strike a balance between these two approaches, using matrices to illustrate the maps and to simplify proofs, but writing theorems and definitions in their morally correct form. I hope that this has both the advantage of giving the “right” definitions while being concrete enough to be digested. But I would just like to say for the record that, if I had to pick between the high school approach and the 55a approach, I would pick 55a in a heartbeat.

Representation Theory, Part 4: The Finite Regular Representation

Good luck to everyone taking the January TST for the IMO 2015 tomorrow!

Now that we have products of irreducibles under our belt, I’ll talk about the finite regular representation and use it to derive the following two results about irreducibles.

  1. The number of (isomorphism classes of) irreducibles {\rho_\alpha} is equal to the number of conjugacy classes of {G}.
  2. We have { \left\lvert G \right\rvert = \sum_\alpha \left( \dim \rho_\alpha \right)^2 }.

These will actually follow as corollaries from the complete decomposition of the finite regular representation.

In what follows {k} is an algebraically closed field, {G} is a finite group, and the characteristic of {k} does not divide {\left\lvert G \right\rvert}. As a reminder, here are the representations we’ve already seen in the order we met them, plus two new ones we’ll introduce properly below.

\displaystyle  	\begin{array}{|l|lll|} 		\hline 		\text{Representation} & \text{Group} & \text{Space} & \text{Action} \\ \hline 		\rho & G & V & g \cdot_\rho v \\ 		\text{Fun}(X) & G & \text{Fun}(X) & (g \cdot f)(x) = f(g^{-1} \cdot x) \\ 		\text{triv}_G & G & k & g \cdot a = a \\ 		\rho_1 \oplus \rho_2 & G & V_1 \oplus V_2 & g \cdot (v_1 + v_2) = (g \cdot_{\rho_1} v_1) + (g \cdot_{\rho_2} v_2) \\ 		\rho_1 \boxtimes \rho_2 & G_1 \times G_2 & V_1 \otimes V_2 & (g_1, g_2) \cdot (v_1 \otimes v_2) \\ 		&&& = (g_1 \cdot_{\rho_1} v_1) \otimes (g_2 \cdot_{\rho_2} v_2) \\ 		\text{Res}^G_H(\rho) & H & V & h \cdot v = h \cdot_\rho v\\ 		\rho_1 \otimes \rho_2 & G & V_1 \otimes V_2 & g \cdot (v_1 \otimes v_2) = (g \cdot_{\rho_1} v_1) \otimes (g \cdot_{\rho_2} v_2) \\ 		\text{Reg}(G) & G \times G & \text{Fun}(G) & (g_1, g_2) \cdot f(g) = f(g_2 g g_1^{-1}) \\ 		\rho^\vee & G & V^\vee & (g \cdot \xi)(v) = \xi(g^{-1} \cdot_\rho v) \\ 		\hline 	\end{array}

1. The Regular Representation

Recall that {\mathrm{Fun}(G)} is the vector space of functions from {G} to {k}, with addition being defined canonically. It has a basis of functions {\delta_g} for each {g \in G}, where

\displaystyle  	\delta_g(x) 	= 	\begin{cases} 		1 & x = g \\ 		0 & \text{otherwise} 	\end{cases}

for every {x \in G}. (Throughout this post, I’ll be trying to use {x} to denote inputs to a function from {G} to {k}.)

Definition Let {G} be a finite group. Then the finite regular representation {\mathrm{Reg}(G)} is a representation of {G \times G} defined on the vector space {\mathrm{Fun}(G)}, with the following action for each {f \in \mathrm{Fun}(G)} and {(g_1, g_2) \in G \times G}:

\displaystyle  ( g_1, g_2 ) \cdot f(x) \overset{\text{def}}{=} f(g_2 x g_1^{-1}).

Note that this is a representation of the product {G \times G}, not {G}! (As an aside, you can also define this representation for infinite groups {G} by replacing {\mathrm{Fun}(G)} with {\mathrm{Fun}_c(G)}, the functions which are nonzero at only finitely many {g \in G}.)

In any case, we can now make {\mathrm{Reg}(G)} into a representation of {G} by restriction, giving {\mathrm{Res}_G^{G \times G} \left( \mathrm{Reg}(G) \right)}, which I will abbreviate as just {\mathrm{Reg}^\ast(G)} throughout this post (this is not a standard notation). The action for this is

\displaystyle  	(g \cdot_{\mathrm{Reg}^\ast(G)} f)(x) 	\overset{\text{def}}{=} 	\left( (g, g) \cdot_{\mathrm{Reg}(G)} f \right)(x) 	= f\left( g^{-1} x g \right).

Exercise Consider the invariant subspace of {\mathrm{Reg}^\ast(G)}, which is

\displaystyle  \left( \mathrm{Reg}^\ast(G) \right)^G 		= 		\left\{ f : G \rightarrow k \mid f(g^{-1} x g) = f(x) \; \forall x,g \in G \right\}.

Prove that the dimension of this space is equal to the number of conjugacy classes of {G}. (Look at the {\delta_g} basis.)

Recall that in general, the invariant subspace {\rho^G} is defined as

\displaystyle  \rho^G \overset{\text{def}}{=} \left\{ v \in V \mid g \cdot_\rho v = v \; \forall g \in G \right\}.

2. Dual Representations

Before I can state the main theorem of this post, I need to define the dual representation.

Recall that given a vector space {V}, we define the dual space {V^\vee} by

\displaystyle  V^\vee \overset{\text{def}}{=} \mathrm{Hom}(V,k)

i.e. it is the set of linear maps from {V} to {k}. If {V} is finite-dimensional, we can think of this as follows: if {V} consists of the column vectors of length {m}, then {V^\vee} is the row vectors of length {m}, which can be multiplied onto elements of {V}. (This analogy breaks down for {V} infinite dimensional.) Recall that if {V} is finite-dimensional then there is a canonical isomorphism {V \simeq (V^\vee)^\vee} by the map {v \mapsto \mathrm{ev}_v}, where {\mathrm{ev}_v : V^\vee \rightarrow k} sends {\xi \mapsto \xi(v)}.

Now we can define the dual representation in a similar way.

Definition Let {\rho = (V, \cdot_\rho)} be a {G}-representation. Then we define the dual representation {\rho^\vee} by

\displaystyle  \rho^\vee = \left( V^\vee, \cdot_{\rho^\vee} \right) 		\quad\text{where}\quad 		\left( g \cdot_{\rho^\vee} \xi \right)(v) 		= \xi \left( g^{-1} \cdot_\rho v \right).

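To make this definition concrete, here is a small check of my own (not from the text), using the permutation representation of {S_3} on {k^3} as a convenient test case: the dual action {(g \cdot \xi)(v) = \xi(g^{-1} \cdot_\rho v)} has matrix {\rho(g^{-1})^T}, and {g \mapsto \rho(g^{-1})^T} is again a group homomorphism.

```python
# Matrices are tuples of row tuples; permutations of {0,1,2} are tuples.

from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def perm_matrix(p):
    """Matrix sending e_j to e_{p(j)}: entry (i, j) is [p(j) == i]."""
    n = len(p)
    return tuple(tuple(int(p[j] == i) for j in range(n)) for i in range(n))

def transpose(A):
    return tuple(zip(*A))

def matmul(A, B):
    n = len(A)
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def dual(p):
    """Matrix of the dual action: rho(p^{-1}) transposed."""
    return transpose(perm_matrix(inverse(p)))

# dual(gh) = dual(g) dual(h) for all g, h in S_3
for g in permutations(range(3)):
    for h in permutations(range(3)):
        assert dual(compose(g, h)) == matmul(dual(g), dual(h))
```
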
Lemma 1 If {\rho} is finite-dimensional then {(\rho^\vee)^\vee \simeq \rho} by the same isomorphism.

Proof: We want to check that the isomorphism {V = (V^\vee)^\vee} by {v \mapsto \mathrm{ev}_v} respects the action of {G}. That’s equivalent to checking

\displaystyle \mathrm{ev}_{g \cdot_\rho v} = g \cdot_{(\rho^\vee)^\vee} \mathrm{ev}_v.

On the one hand, by definition,

\displaystyle \mathrm{ev}_{g \cdot_\rho v}(\xi) = \xi(g \cdot_\rho v).

On the other hand,

\displaystyle \left( g \cdot_{(\rho^\vee)^\vee} \mathrm{ev}_v \right)(\xi) = \mathrm{ev}_v(g^{-1} \cdot_{\rho^\vee} \xi) = \left( g^{-1} \cdot_{\rho^\vee} \xi \right)(v) = \xi(g \cdot_\rho v).

So the functions are indeed equal. \Box

Along with that lemma, we also have the following property.

Lemma 2 For any finite-dimensional {\rho_1}, {\rho_2} we have {\mathrm{Hom}_G(\rho_1, \rho_2) \simeq \mathrm{Hom}_G(\rho_1 \otimes \rho_2^\vee, \text{triv}_G)}.

Proof: Let {\rho_1 = (V_1, \cdot_{\rho_1})} and {\rho_2 = (V_2, \cdot_{\rho_2})}. We already know that we have an isomorphism of vector spaces

\displaystyle  		\mathrm{Hom}_{\textbf{Vect}}(V_1, V_2) 		\simeq \mathrm{Hom}_{\textbf{Vect}} (V_1 \otimes V_2^\vee, k)

by sending each {T \in \mathrm{Hom}_{\textbf{Vect}}(V_1, V_2)} to the map {T' \in \mathrm{Hom}_{\textbf{Vect}} (V_1 \otimes V_2^\vee, k)} which has {T'(v \otimes \xi) = \xi(T(v))}. So the point is to check that {T} respects the {G}-action if and only if {T'} does. This is just a computation. \Box

You can deduce as a corollary the following.

Exercise Use the lemma to show {\mathrm{Hom}_G(\rho, \text{triv}_G) \simeq \mathrm{Hom}_G(\text{triv}_G, \rho^\vee)}.

Finally, we want to talk about when {\rho^\vee} is irreducible. The main result is the following.

Lemma 3 Consider a representation {\rho}, not necessarily finite-dimensional. If {\rho^\vee} is irreducible then so is {\rho}.

When {\rho} is finite dimensional we have {(\rho^\vee)^\vee \simeq \rho}, and so it is true for finite-dimensional irreducible {\rho} that {\rho^\vee} is also irreducible. Interestingly, this result fails for infinite-dimensional spaces as this math.SE thread shows.

Proof: Let {\rho = (V, \cdot_\rho)}. Let {W} be a {\rho}-invariant subspace of {V}. Then consider

\displaystyle  W^\perp = \left\{ \xi \in V^\vee : \xi(w) = 0 \; \forall w \in W \right\}.

This is a {\rho^\vee}-invariant subspace of {V^\vee}, so since {\rho^\vee} is irreducible, either {W^\perp = V^\vee} or {W^\perp = \{0\}}. You can check that these imply {W=0} and {W=V}, respectively. \Box

3. Main Result

Now that we know about the product of representations and dual modules, we can state the main result of this post: the complete decomposition of {\mathrm{Reg}(G)}.

Theorem 4 We have an isomorphism

\displaystyle  		\mathrm{Reg}(G) \simeq 		\bigoplus_{\alpha} \rho_\alpha \boxtimes \rho_\alpha^\vee.

Before we can begin the proof of the theorem we need one more lemma.

Lemma 5 Let {\pi} be a representation of {G \times G}. Then there is an isomorphism

\displaystyle  \mathrm{Hom}_{G \times G}(\pi, \mathrm{Reg}(G)) 		\simeq \mathrm{Hom}_G(\mathrm{Res}^{G \times G}_G(\pi), \text{triv}_G).

Proof: Let {\pi = (V, \cdot_\pi)}. Given a map {T : V \rightarrow \mathrm{Fun}(G)} which respects the {G \times G} action, we send it to the map {\xi_T : V \rightarrow k} with {\xi_T(v) = T(v)(1)}. Conversely, given a map {\xi : V \rightarrow k} which respects the {G} action, we send it to the map {T_\xi : V \rightarrow \mathrm{Fun}(G)} so that {T_\xi(v)(x) = \xi\left( (x,x^{-1}) \cdot v \right)}.

Some very boring calculations show that the two maps are mutually inverse and respect the action. We’ll just do one of them here: let us show that {\xi_T(v)} respects the {G} action given that {T} respects the {G \times G} action. We want to prove

\displaystyle  \xi_T\left( (g,g) \cdot_\pi v \right) 	= g \cdot_\text{triv} \xi_T(v) = \xi_T(v).

Using the definition of {\xi_T}

\displaystyle  		\begin{aligned} 		\xi_T\left( (g,g) \cdot_\pi (v) \right) 		&= T\left( (g,g) \cdot_\pi v \right)(1) \\ 		&= \left( (g,g) \cdot_{\mathrm{Fun}(G)} T(v) \right)(1) \\ 		&= T(v)\left( g 1 g^{-1} \right) = T(v)(1) = \xi_T(v). 		\end{aligned}

The remaining computations are left to a very diligent reader. \Box

Now let’s prove the main theorem!

Proof: We have that {\mathrm{Reg}(G)} is the sum of finite-dimensional irreducibles {\rho_\alpha \boxtimes \rho_\beta}, meaning

\displaystyle  		\mathrm{Reg}(G) = 		\bigoplus_{\alpha, \beta} 		\left( \rho_\alpha \boxtimes \rho_\beta \right) 		\otimes 		\mathrm{Hom}_{G \times G}\left( \rho_\alpha \boxtimes \rho_\beta, \mathrm{Reg}(G) \right).

But using our lemmas, we have that

\displaystyle  		\mathrm{Hom}_{G \times G}\left( \rho_\alpha \boxtimes \rho_\beta, \mathrm{Reg}(G) \right) 		\simeq 		\mathrm{Hom}_G(\rho_\alpha \otimes \rho_\beta, \text{triv}_G) 		\simeq \mathrm{Hom}_G(\rho_\alpha, \rho_\beta^\vee).

We know that {\rho_\beta^\vee} is also irreducible, since {\rho_\beta} is (and we’re in a finite-dimensional situation). So

\displaystyle  		\mathrm{Hom}_G\left( \rho_\alpha, \rho_\beta^\vee \right) 		\simeq 		\begin{cases} 			k & \rho_\beta^\vee = \rho_\alpha \\ 			\{0\} & \text{otherwise}. 		\end{cases}

Thus we deduce

\displaystyle  \mathrm{Reg}(G) 		\simeq \bigoplus_{\alpha} 		\left( \rho_\alpha \boxtimes \rho_\alpha^\vee \right) 		\otimes k 		\simeq \bigoplus_{\alpha} 		\left( \rho_\alpha \boxtimes \rho_\alpha^\vee \right)

and we’re done. \Box

4. Corollaries

Recall that {\mathrm{Fun}(G)}, the space underlying {\mathrm{Reg}(G)}, has a basis of size {\left\lvert G \right\rvert}. Hence by comparing dimensions in the isomorphism, we obtain the following corollary.

Theorem 6 We have {\left\lvert G \right\rvert = \sum_\alpha \left( \dim \rho_\alpha \right)^2}.

Moreover, by restriction to {G} we can obtain the corollary

\displaystyle  	\mathrm{Reg}^\ast(G) 	\simeq \bigoplus_\alpha \mathrm{Res}_{G}^{G \times G} \left( \rho_\alpha \boxtimes \rho_\alpha^\vee \right) 	= \bigoplus_\alpha \rho_\alpha \otimes \rho_\alpha^\vee.

Now let us look at the {G}-invariant spaces in this decomposition. We claim that

\displaystyle  \left( \rho_\alpha \otimes \rho_\alpha^\vee \right)^G \simeq k.

Indeed, Proposition 1 in Part 1 tells us that we have an isomorphism of vector spaces

\displaystyle  \left( \rho_\alpha \otimes \rho_\alpha^\vee \right)^G \simeq 	\mathrm{Hom}_G(\text{triv}_G, \rho_\alpha \otimes \rho_\alpha^\vee).

Then we can write

\displaystyle  \begin{aligned} 	\mathrm{Hom}_G(\text{triv}_G, \rho_\alpha \otimes \rho_\alpha^\vee) 	&\simeq 	\mathrm{Hom}_G\left(\text{triv}_G, \left( \rho_\alpha^\vee \otimes \rho_\alpha \right)^\vee \right) \\ 	&\simeq \mathrm{Hom}_G\left(\rho_\alpha^\vee \otimes \rho_\alpha, \text{triv}_G \right) \\ 	&\simeq \mathrm{Hom}_G\left(\rho_\alpha \otimes \rho_\alpha^\vee, \text{triv}_G \right) \\ 	&\simeq \mathrm{Hom}_G\left(\rho_\alpha, \rho_\alpha \right) \\ 	&\simeq k \end{aligned}

by the lemma, where we have also used Schur’s Lemma at the last step. So that means the dimension of the invariant space {(\mathrm{Reg}^\ast (G))^G} is just the number of irreducibles.

But we already showed that {(\mathrm{Reg}^\ast (G))^G} has dimension equal to the number of conjugacy classes of {G}. Thus we conclude the second result.

Theorem 7 The number of conjugacy classes of {G} equals the number of irreducible representations of {G}.

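As a concrete illustration of the two corollaries (my own Python sketch, not part of the original post): for {G = S_3} we have {\left\lvert G \right\rvert = 1^2 + 1^2 + 2^2 = 6} (trivial, sign, and the {2}-dimensional irreducible), and the number of conjugacy classes equals the number of irreducibles, namely {3}. The class count is verified by brute force below.

```python
# Count conjugacy classes of S_3 by conjugating every element by every
# other, and compare with the sum-of-squares formula |G| = sum (dim)^2.

from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p[q[i]], permutations as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(3)))
classes = set()
for g in G:
    cls = frozenset(compose(compose(s, g), inverse(s)) for s in G)
    classes.add(cls)

assert len(G) == 6
assert len(classes) == 3                       # id, transpositions, 3-cycles
assert sum(d * d for d in [1, 1, 2]) == len(G)  # 1 + 1 + 4 = 6
```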

Time permitting I might talk about the irreducibles of {S_n} in subsequent posts. No promises here though.

Thanks to Dennis Gaitsgory, who taught me this in his course Math 55a. My notes for Math 55a can be found at my website.

Representation Theory, Part 3: Products of Representations

Happy New Year to all! A quick reminder that {2015 = 5 \cdot 13 \cdot 31}.

This post will set the stage by examining products of two representations. In particular, I’ll characterize all the irreducibles of {G_1 \times G_2} in terms of those for {G_1} and {G_2}. This will set the stage for our discussion of the finite regular representation in Part 4.

In what follows {k} is an algebraically closed field, {G} is a finite group, and the characteristic of {k} does not divide {\left\lvert G \right\rvert}.

1. Products of Representations

First, I need to tell you how to take the product of two representations.

Definition Let {G_1} and {G_2} be groups. Given a {G_1} representation {\rho_1 = (V_1, \cdot_{\rho_1})} and a {G_2} representation {\rho_2 = (V_2, \cdot_{\rho_2})}, we define

\displaystyle  \rho_1 \boxtimes \rho_2 \overset{\text{def}}{=} 	\left( V_1 \otimes V_2, \cdot \right)

as a representation of {G_1 \times G_2} on {V_1 \otimes V_2}. The action is given by

\displaystyle  (g_1, g_2) \cdot (v_1 \otimes v_2) 	= \left( g_1 \cdot_{\rho_1} v_1 \right) \otimes (g_2 \cdot_{\rho_2} v_2).

In the special case {G_1 = G_2 = G}, we can also restrict {\rho_1 \boxtimes \rho_2} to a representation of {G}. Note that we can interpret {G} itself as a subgroup of {G \times G} by just looking along the diagonal: there’s an obvious isomorphism

\displaystyle  G \simeq \left\{ (g,g) \mid g \in G \right\}.

So, let me set up the general definition.

Definition Let {\mathcal G} be a group, and let {\mathcal H} be a subgroup of {\mathcal G}. Then for any representation {\rho = (V, \cdot_\rho)} of {\mathcal G}, we let

\displaystyle  \mathrm{Res}^{\mathcal G}_{\mathcal H} (\rho)

denote the representation of {\mathcal H} on {V} by the same action.

This notation might look intimidating, but it’s not really saying anything, and I include the notation just to be pedantic. All we’re doing is taking a representation and restricting which elements of the group are acting on it.

We now apply this to get {\rho_1 \otimes \rho_2} out of {\rho_1 \boxtimes \rho_2}.

Definition Let {\rho_1 = (V_1, \cdot_{\rho_1})} and {\rho_2 = (V_2, \cdot_{\rho_2})} be representations of {G}. Then we define

\displaystyle  \rho_1 \otimes \rho_2 		\overset{\text{def}}{=} 		\mathrm{Res}^{G \times G}_G \left( \rho_1 \boxtimes \rho_2 \right)

meaning {\rho_1 \otimes \rho_2} has vector space {V_1 \otimes V_2} and action {g \cdot (v_1 \otimes v_2) = (g \cdot_{\rho_1} v_1) \otimes (g \cdot_{\rho_2} v_2)}.
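In matrices, {g} then acts on {V_1 \otimes V_2} by the Kronecker product of its two matrices. A tiny sketch with one-dimensional representations of {{\mathbb Z}_3} (my own example choice), where the two characters {\omega^k} and {\omega^{2k}} multiply to the trivial one:

```python
import numpy as np

# Two 1-dimensional representations of G = Z_3 over C, realized as 1x1
# matrices: the generator acts by omega and by omega^2 respectively.
w = np.exp(2j * np.pi / 3)
rho1 = {k: np.array([[w ** k]]) for k in range(3)}
rho2 = {k: np.array([[w ** (2 * k)]]) for k in range(3)}

# The tensor product restricts boxtimes along the diagonal g -> (g, g).
tensor = {k: np.kron(rho1[k], rho2[k]) for k in range(3)}

# Since omega * omega^2 = omega^3 = 1, rho1 (x) rho2 is the trivial rep.
for k in range(3):
    assert np.allclose(tensor[k], np.eye(1))
```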

This tensor product obeys some nice properties, for example the following.

Lemma 1 Given representations {\rho}, {\rho_1}, {\rho_2} we have

\displaystyle  		\rho \otimes \left( \rho_1 \oplus \rho_2 \right) 		\simeq 		\left( \rho \otimes \rho_1 \right) \oplus \left( \rho \otimes \rho_2 \right).

Proof: There’s an obvious isomorphism between the underlying vector spaces, and that isomorphism respects the action of {G}. \Box

To summarize all the above, here is a table of the representations we’ve seen, in the order we met them.

\displaystyle  	\begin{array}{|l|lll|} 		\hline 		\text{Representation} & \text{Group} & \text{Space} & \text{Action} \\ \hline 		\rho & G & V & g \cdot_\rho v \\ 		\text{Fun}(X) & G & \text{Fun}(X) & (g \cdot f)(x) = f(g^{-1} \cdot x) \\ 		\text{triv}_G & G & k & g \cdot a = a \\ 		\rho_1 \oplus \rho_2 & G & V_1 \oplus V_2 & g \cdot (v_1 + v_2) = (g \cdot_{\rho_1} v_1) + (g \cdot_{\rho_2} v_2) \\ 		\rho_1 \boxtimes \rho_2 & G_1 \times G_2 & V_1 \otimes V_2 & (g_1, g_2) \cdot (v_1 \otimes v_2) \\ 		&&& = (g_1 \cdot_{\rho_1} v_1) \otimes (g_2 \cdot_{\rho_2} v_2) \\ 		\text{Res}^G_H(\rho) & H & V & h \cdot v = h \cdot_\rho v\\ 		\rho_1 \otimes \rho_2 & G & V_1 \otimes V_2 & g \cdot (v_1 \otimes v_2) = (g \cdot_{\rho_1} v_1) \otimes (g \cdot_{\rho_2} v_2) \\ 		\hline 	\end{array}

2. Revisiting Schur and Maschke

Defining a tensor product of representations gives us another way to express {\rho^{\oplus n}}, as follows. By an abuse of notation, given a vector space {k^m} we can define an associated {G}-representation {k^m = (k^m, \cdot_{k^m})} on it by the trivial action, i.e. {g \cdot_{k^m} v = v} for {v \in k^m}. A special case of this is using {k} to represent {\text{triv}_G}. With this abuse of notation, we have the following lemma.

Lemma 2 Let {M} be an {m}-dimensional vector space over {k}. Then {\rho^{\oplus m} \simeq \rho \otimes M}.

Proof: It reduces to checking that {\rho \otimes k \overset{\text{def}}{=} \rho \otimes \text{triv}_G} is isomorphic to {\rho}, which is evident. We then induct on {m}: {\rho \otimes (k \oplus k^{m-1}) 	\simeq (\rho \otimes k) \oplus (\rho \otimes k^{m-1})}. \Box
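In matrix terms Lemma 2 says that {\mathrm{kron}(A, I_m)} (the tensor action) and the block-diagonal matrix {\mathrm{diag}(A, \dots, A)} (the direct sum action) define isomorphic representations; in particular their traces and eigenvalues agree on every group element. A small numpy check, with a rotation matrix as an arbitrary example of mine:

```python
import numpy as np

# rho: a generator of Z_4 acting on R^2 by 90-degree rotation (example
# choice, not from the post); M = k^3 carries the trivial action.
A = np.array([[0.0, -1.0], [1.0, 0.0]])
m = 3

# In rho (x) k^m the generator acts by kron(A, I_m); in rho^{(+)m} it acts
# by the block-diagonal matrix diag(A, A, A) = kron(I_m, A).
tensor_action = np.kron(A, np.eye(m))
direct_sum_action = np.kron(np.eye(m), A)

# The two matrices are conjugate (by a perfect-shuffle permutation), so
# they share the same trace...
assert np.isclose(np.trace(tensor_action), m * np.trace(A))
# ...and the same eigenvalues (+-i, each three times; compare imag parts).
ev1 = np.sort(np.linalg.eigvals(tensor_action).imag)
ev2 = np.sort(np.linalg.eigvals(direct_sum_action).imag)
assert np.allclose(ev1, ev2)
```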

So, we can actually rewrite Maschke’s Theorem and Schur’s Lemma as a single statement. Instead of

\displaystyle  \rho \simeq \bigoplus_\alpha \rho_\alpha^{\oplus n_\alpha} \quad\text{where}\quad n_\alpha = \dim \mathrm{Hom}_G(\rho, \rho_\alpha)

we now have instead

\displaystyle  	\bigoplus_\alpha \rho_\alpha \otimes \mathrm{Hom}_G(\rho_\alpha, \rho) 	\simeq \rho.

Now we’re going to explicitly write down the isomorphism between these maps. It suffices to write down the isomorphism {\rho_\alpha \otimes \mathrm{Hom}_G(\rho, \rho_\alpha) \rightarrow \rho_\alpha^{\oplus n_\alpha}}, and then take the sum over each of the {\alpha}‘s. But

\displaystyle  \mathrm{Hom}_G(\rho_\alpha, \rho) \simeq \mathrm{Hom}_G(\rho_\alpha, \rho_\alpha^{\oplus n_\alpha}) \simeq \mathrm{Hom}_G(\rho_\alpha, \rho_\alpha)^{\oplus n_\alpha}.

So to write the isomorphism {\rho_\alpha \otimes \mathrm{Hom}_G(\rho_\alpha, \rho_\alpha)^{\oplus n_\alpha} \rightarrow \rho_\alpha^{\oplus n_\alpha}}, we just have to write down the isomorphism {\rho_\alpha \otimes \mathrm{Hom}_G(\rho_\alpha, \rho_\alpha) \rightarrow \rho_\alpha}.

Schur’s Lemma tells us that {\mathrm{Hom}_G(\rho_\alpha, \rho_\alpha) \simeq k}; that is, every {\xi \in \mathrm{Hom}_G(\rho_\alpha, \rho_\alpha)} is just multiplication by some scalar. So this case is easy: the map

\displaystyle  v \otimes \xi \mapsto \xi(v)

works nicely. And since all we’ve done is break over a bunch of direct sums, the isomorphism propagates all the way up, resulting in the following theorem.

Theorem 3 (Maschke and Schur) For any finite-dimensional {\rho}, the homomorphism of {G} representations

\displaystyle  	\bigoplus_\alpha \rho_\alpha \otimes \mathrm{Hom}_G(\rho_\alpha, \rho) 	\rightarrow \rho

given by sending every simple tensor via

\displaystyle  v \otimes \xi \mapsto \xi(v)

is an isomorphism.

Note that it’s much easier to write the map from left to right than vice-versa, even though the inverse map does exist (since it’s an isomorphism). (Tip: as a general rule of thumb, always map out of the direct sum.)

3. Characterizing the {G_1 \times G_2} irreducibles

Now we are in a position to state the main theorem for this post, which shows that the irreducibles we defined above are very well behaved.

Theorem 4 Let {G_1} and {G_2} be finite groups. Then a finite-dimensional representation {\rho} of {G_1 \times G_2} is irreducible if and only if it is of the form

\displaystyle  \rho_1 \boxtimes \rho_2

where {\rho_1} and {\rho_2} are irreducible representations of {G_1} and {G_2}, respectively.

Proof: First, suppose {\rho = (V, \cdot_\rho)} is an irreducible representation of {G_1 \times G_2}. Set

\displaystyle  \rho^1 \overset{\text{def}}{=} \mathrm{Res}^{G_1 \times G_2}_{G_1} (\rho).

Then by Maschke’s Theorem, we may write {\rho^1} as a direct sum of the irreducibles

\displaystyle  	\bigoplus_\alpha \rho_\alpha^1 \otimes \mathrm{Hom}_{G_1} (\rho_\alpha^1, \rho^1) \simeq \rho^1

with the map {v \otimes \xi \mapsto \xi(v)} being the isomorphism. Now we can put a {G_2} representation structure on {\mathrm{Hom}_{G_1} (\rho_\alpha^1, \rho^1)} by

\displaystyle  	(g_2 \cdot f)(v) 	= g_2 \cdot_{\rho} (f(v)).

It is easy to check that this is indeed a {G_2} representation. Thus it makes sense to talk about the {G_1 \times G_2} representation

\displaystyle  	\bigoplus_\alpha \rho_\alpha^1 \boxtimes \mathrm{Hom}_{G_1} (\rho_\alpha^1, \rho^1).

We claim that the isomorphism for {\rho^1} as a {G_1} representation now lifts to an isomorphism of {G_1 \times G_2} representations. That is, we claim that

\displaystyle  	\bigoplus_\alpha \rho_\alpha^1 \boxtimes \mathrm{Hom}_{G_1} (\rho_\alpha^1, \rho^1) \simeq \rho

by the same isomorphism as for {\rho^1}. To see this, we only have to check that the isomorphism {v \otimes \xi \mapsto \xi(v)} commutes with the action of {g_2 \in G_2}. But this is obvious, since {g_2 \cdot (v \otimes \xi) = v \otimes (g_2 \cdot \xi) \mapsto (g_2 \cdot \xi)(v) = g_2 \cdot_\rho \xi(v)}.

Thus the isomorphism holds. But {\rho} is irreducible, so exactly one summand {\rho_\alpha^1 \boxtimes \mathrm{Hom}_{G_1}(\rho_\alpha^1, \rho^1)} can be nonzero; moreover {\mathrm{Hom}_{G_1}(\rho_\alpha^1, \rho^1)} must itself be irreducible as a {G_2} representation, since any nontrivial proper {G_2}-subrepresentation {W'} of it would give the nontrivial proper subrepresentation {\rho_\alpha^1 \boxtimes W'} of {\rho}. Thus we derive the required decomposition of {\rho}.

Now for the other direction: take {\rho_1} and {\rho_2} irreducible, and let {\sigma} be any nonzero irreducible subrepresentation of {\rho_1 \boxtimes \rho_2}. By the first direction, {\sigma \simeq \sigma_1 \boxtimes \sigma_2} for some irreducibles {\sigma_1} and {\sigma_2} of {G_1} and {G_2}. Restricting to {G_1}, we see that {\sigma_1^{\oplus \dim \sigma_2}} embeds into {\rho_1^{\oplus \dim \rho_2}}; since {\rho_1} is irreducible, this forces {\sigma_1 \simeq \rho_1}, and similarly {\sigma_2 \simeq \rho_2}. But then {\dim \sigma = \dim \rho_1 \cdot \dim \rho_2 = \dim \left( \rho_1 \boxtimes \rho_2 \right)}, so {\sigma} is the whole space. Hence {\rho_1 \boxtimes \rho_2} is irreducible. \Box
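For readers who know characters (which these posts don’t develop), Theorem 4 can also be spot-checked numerically: a character is irreducible iff its norm {\left\langle \chi, \chi \right\rangle = 1}, and the character of {\rho_1 \boxtimes \rho_2} at {(g_1, g_2)} is {\chi_1(g_1)\chi_2(g_2)}. A sketch for {S_3 \times S_3}, using standard character values for {S_3}:

```python
import numpy as np
from itertools import permutations, product

# Characters of two irreducibles of S_3 (standard facts): the sign
# character, and the 2-dim standard character chi(g) = #fix(g) - 1.
perms = list(permutations(range(3)))

def perm_matrix(p):
    M = np.zeros((3, 3))
    for i, j in enumerate(p):
        M[j, i] = 1.0
    return M

chi_sign = {p: int(round(np.linalg.det(perm_matrix(p)))) for p in perms}
chi_std = {p: sum(p[i] == i for i in range(3)) - 1 for p in perms}

# A character chi is irreducible iff <chi, chi> = (1/|G|) sum |chi(g)|^2 = 1.
def norm(chi, group):
    return sum(abs(chi[g]) ** 2 for g in group) / len(group)

assert norm(chi_sign, perms) == 1 and norm(chi_std, perms) == 1

# The character of rho1 boxtimes rho2 at (g1, g2) is chi1(g1) * chi2(g2),
# so its norm over G1 x G2 is the product of the two norms: still 1.
pairs = list(product(perms, perms))
chi_box = {(g1, g2): chi_sign[g1] * chi_std[g2] for (g1, g2) in pairs}
assert norm(chi_box, pairs) == 1
```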

4. Conclusion

In particular, this means that any finite-dimensional representation {\rho} of {G \times G} decomposes as

\displaystyle  \rho \simeq \bigoplus_{\alpha, \beta} \left( \rho_\alpha \boxtimes \rho_\beta \right)^{\oplus n_{\alpha\beta}}

for some nonnegative integers {n_{\alpha\beta}}, and we even have

\displaystyle  \mathrm{Res}_{G}^{G\times G} \rho \simeq \bigoplus_{\alpha, \beta} \left( \rho_\alpha \otimes \rho_\beta \right)^{\oplus n_{\alpha\beta}}.
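As a concrete instance (again using characters, which these posts don’t develop): for {G = S_3}, the tensor square of the {2}-dimensional standard representation decomposes as trivial {\oplus} sign {\oplus} standard. A quick numerical check, with standard character values for {S_3}:

```python
import numpy as np
from itertools import permutations

# Decompose std (x) std for G = S_3 via characters. The character values
# below are standard facts about S_3.
perms = list(permutations(range(3)))

def perm_matrix(p):
    M = np.zeros((3, 3))
    for i, j in enumerate(p):
        M[j, i] = 1.0
    return M

chi_triv = {p: 1.0 for p in perms}
chi_sign = {p: np.linalg.det(perm_matrix(p)) for p in perms}  # = +-1
chi_std = {p: sum(p[i] == i for i in range(3)) - 1.0 for p in perms}

# The character of std (x) std is chi_std^2 pointwise; the multiplicity of
# an irreducible chi_a in it is (1/|G|) * sum_g chi_std(g)^2 * chi_a(g).
def mult(chi_a):
    total = sum(chi_std[p] ** 2 * chi_a[p] for p in perms)
    return int(round(total / len(perms)))

# std (x) std decomposes as triv (+) sign (+) std:
assert (mult(chi_triv), mult(chi_sign), mult(chi_std)) == (1, 1, 1)
```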

In the next post I’ll invoke this on the so-called finite regular representation to get the elegant results I promised at the end of Part 2.

Thanks to Dennis Gaitsgory, who taught me this in his course Math 55a. My notes for Math 55a can be found at my website.

Representation Theory, Part 2: Schur’s Lemma

Merry Christmas!

In the previous post I introduced the idea of an irreducible representation and showed that except in fields of low characteristic, these representations decompose completely. In this post I’ll present Schur’s Lemma and talk about what Schur and Maschke tell us about homomorphisms of representations.

1. Motivation

Fix a group {G} now, and consider all isomorphism classes of finite-dimensional irreducible representations of {G}. We’ll denote this set by {\mathrm{Irrep}(G)}. Maschke’s Theorem tells us that any finite-dimensional representation {\rho} can be decomposed as

\displaystyle  	\bigoplus_{\rho_\alpha \in \mathrm{Irrep}(G)} \rho_{\alpha}^{\oplus n_\alpha}

where {n_\alpha} is some nonnegative integer. This raises the question: what is {n_\alpha}? Is it even uniquely determined by {\rho}?

To answer this I first need to compute {\mathrm{Hom}_G(\rho, \pi)} for any two irreducible representations {\rho} and {\pi}. One case is easy.

Lemma 1 Let {\rho} and {\pi} be non-isomorphic irreducible representations (not necessarily finite dimensional). Then there are no nontrivial homomorphisms {\phi : \rho \rightarrow \pi}. In other words, {\mathrm{Hom}_G(\rho, \pi) = \{0\}}.

I haven’t actually told you what it means for representations to be isomorphic, but you can guess — it just means that there’s a homomorphism of {G}-representations between them which is also a bijection of the underlying vector spaces.

Proof: Let {\phi : \rho \rightarrow \pi} be a nonzero homomorphism. We can actually prove the following stronger results.

  • If {\pi} is irreducible then {\phi} is surjective.
  • If {\rho} is irreducible then {\phi} is injective.

Exercise Prove the above two results. (Hint: show that {\text{Im } \phi} and {\ker \phi} give rise to subrepresentations.)

Combining these two results, any nonzero {\phi} is a bijection, hence an isomorphism, contradicting the assumption that {\rho} and {\pi} are non-isomorphic. So {\mathrm{Hom}_G(\rho, \pi) = \{0\}}. \Box

2. Schur’s Lemma

Thus we only have to consider the case {\rho \simeq \pi}. The result which relates these is called Schur’s Lemma, but is important enough that we refer to it as a theorem.

Theorem 2 (Schur’s Lemma) Assume {k} is algebraically closed. Let {\rho} be a finite dimensional irreducible representation. Then {\mathrm{Hom}_{G} (\rho, \rho)} consists precisely of maps of the form {v \mapsto \lambda v}, where {\lambda \in k}; the only possible maps are multiplication by a scalar. In other words,

\displaystyle  \mathrm{Hom}_{G} (\rho, \rho) \simeq k

and {\dim \mathrm{Hom}_G(\rho, \rho) = 1}.

This is NOT in general true without the algebraically closed condition, as the following example shows.

Example Let {k = {\mathbb R}}, let {V = {\mathbb R}^2}, and let {G = {\mathbb Z}_3} act on {V} by rotating every {\vec x \in {\mathbb R}^2} by {120^{\circ}} around the origin, giving a representation {\rho}. Then {\rho} is a counterexample to Schur’s Lemma.

Proof: This representation is irreducible: a nontrivial proper subrepresentation would be a line through the origin invariant under rotation by {120^{\circ}}, and no such line exists.

We can identify {V = {\mathbb R}^2} with {{\mathbb C}}, so that the generator of {G} acts by multiplication by {e^{\frac{2\pi i}{3}}}. Then for any complex number {\xi}, the ({\mathbb R}-linear) map “multiplication by {\xi}” commutes with the map “multiplication by {e^{\frac{2\pi i}{3}}}”. So in fact

\displaystyle  \mathrm{Hom}_G(\rho, \rho) \simeq {\mathbb C}

which has dimension {2} over {{\mathbb R}}. \Box
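This can be confirmed numerically: the space of real matrices commuting with the {120^{\circ}} rotation is {2}-dimensional. A short numpy check, using the standard vectorization identity {\mathrm{vec}(XA - AX) = (A^T \otimes I - I \otimes A)\,\mathrm{vec}(X)}:

```python
import numpy as np

# R: rotation by 120 degrees on R^2, the action of a generator of Z_3.
c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
R = np.array([[c, -s], [s, c]])

# Solve X R = R X by vectorizing: (R^T kron I - I kron R) vec(X) = 0.
M = np.kron(R.T, np.eye(2)) - np.kron(np.eye(2), R)
commutant_dim = 4 - np.linalg.matrix_rank(M)

# Hom_G(rho, rho) is 2-dimensional over R: spanned by I and R itself
# (i.e. by 1 and e^{2 pi i / 3} once we identify R^2 with C).
assert commutant_dim == 2
```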

Now we can give the proof of Schur’s Lemma.

Proof: Clearly any map {v \mapsto \lambda v} respects the {G}-action.

Now consider any {T \in \mathrm{Hom}_G(\rho, \rho)}. Set {\rho = (V, \cdot_\rho)}. Here’s the key: because {k} is algebraically closed, and we’re over a finite dimensional vector space {V}, the map {T} has an eigenvalue {\lambda}. Hence by definition {V} has a subspace {V_\lambda} over which {T} is just multiplication by {\lambda}.

But then {V_\lambda} is a {G}-invariant subspace of {V}! Since {\rho} is irreducible, this can only happen if {V = V_\lambda}. That means {T} is multiplication by {\lambda} for the entire space {V}, as desired. \Box
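By contrast, for an absolutely irreducible representation the commutant really is one-dimensional. Here is a numpy sketch using the {2}-dimensional standard representation of {S_3}; the explicit generator matrices are my own computation, not from the post:

```python
import numpy as np

# The 2-dimensional standard representation of S_3, written in the basis
# e0 - e1, e1 - e2 of the sum-zero plane in R^3.
S = np.array([[-1.0, 1.0], [0.0, 1.0]])   # the transposition (0 1)
T = np.array([[0.0, -1.0], [1.0, -1.0]])  # the 3-cycle (0 1 2)
assert np.allclose(S @ S, np.eye(2))
assert np.allclose(T @ T @ T, np.eye(2))

# X commutes with every rho(g) iff it commutes with the generators S, T.
# Stack the linearized equations X A = A X via (A^T kron I - I kron A).
M = np.vstack([np.kron(A.T, np.eye(2)) - np.kron(np.eye(2), A)
               for A in (S, T)])
end_dim = 4 - np.linalg.matrix_rank(M)
assert end_dim == 1  # only the scalar maps v -> lambda v, as Schur predicts
```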

3. Computing dimensions of homomorphisms

Since we can now compute the dimension of the {\mathrm{Hom}_G} of any two irreducible representations, we can compute the dimension of the {\mathrm{Hom}_G} for any direct sum of irreducibles, as follows.

Corollary 3 We have

\displaystyle  		\dim \mathrm{Hom}_G 		\left( \bigoplus_\alpha \rho_\alpha^{\oplus n_\alpha}, 		\bigoplus_\beta \rho_\beta^{\oplus m_\beta} \right) 		= \sum_{\alpha} n_\alpha m_\alpha

where the direct sums run over the isomorphism classes of irreducibles.

Proof: The {\mathrm{Hom}} just decomposes over each of the components as

\displaystyle  		\begin{aligned} 		\mathrm{Hom}_G 		\left( \bigoplus_\alpha \rho_\alpha^{\oplus n_\alpha}, 		\bigoplus_\beta \rho_\beta^{\oplus m_\beta} \right) 		&\simeq 		\bigoplus_{\alpha, \beta} 		\mathrm{Hom}_G(\rho_\alpha^{\oplus n_\alpha}, \rho_\beta^{\oplus m_\beta}) \\ 		&\simeq 		\bigoplus_{\alpha, \beta} 		\mathrm{Hom}_G(\rho_\alpha, \rho_\beta)^{\oplus n_\alpha m_\beta}. 		\end{aligned}

Here we’re using the fact that {\mathrm{Hom}_G(\rho_1 \oplus \rho_2, \rho) = \mathrm{Hom}_G(\rho_1, \rho) \oplus \mathrm{Hom}_G(\rho_2, \rho)} (obvious) and its analog. The claim follows from our lemmas now. \Box

As a special case of this, we can quickly derive the following.

Corollary 4 Suppose {\rho = \bigoplus_\alpha \rho_\alpha^{\oplus n_\alpha}} as above. Then for any particular {\beta},

\displaystyle  n_\beta = \dim \mathrm{Hom}_G(\rho, \rho_\beta).

Proof: We have

\displaystyle  \dim \mathrm{Hom}_G(\rho, \rho_\beta) = n_\beta \dim \mathrm{Hom}_G(\rho_\beta, \rho_\beta) = n_\beta

as desired. \Box

This settles the “unique decomposition” in the affirmative. Hurrah!

It might be worth noting that we didn’t actually need Schur’s Lemma if we were solely interested in uniqueness, since without it we would have obtained

\displaystyle  n_\beta = \frac{\dim \mathrm{Hom}_G(\rho, \rho_\beta)}{\dim \mathrm{Hom}_G(\rho_\beta, \rho_\beta)}.

However, the denominator in that expression is rather unsatisfying, don’t you think?

4. Conclusion

In summary, we have shown the following main results for finite dimensional representations of a group {G}.

  • Maschke’s Theorem: If {G} is finite and {\text{char } k} does not divide {\left\lvert G \right\rvert}, then any finite dimensional representation is a direct sum of irreducibles. This decomposition is unique up to isomorphism.
  • Schur’s Lemma: If {k} is algebraically closed, then {\mathrm{Hom}_G(\rho, \rho) \simeq k} for any irreducible {\rho}, while there are no nontrivial homomorphisms between non-isomorphic irreducibles.

In the next post I’ll talk about products of irreducibles, and use them in the fourth post to prove two very elegant results about the irreducibles, as follows.

  1. The number of isomorphism classes of irreducibles {\rho_\alpha} is equal to the number of conjugacy classes of {G}.
  2. We have { \left\lvert G \right\rvert = \sum_\alpha \left( \dim \rho_\alpha \right)^2 }.

Thanks to Dennis Gaitsgory, who taught me this in his course Math 55a. My notes for Math 55a can be found at my website.