For some reason several classes at MIT this year involve Fourier analysis. I was always confused about this as a high schooler, because no one ever gave me the “orthonormal basis” explanation, so here goes. As a bonus, I also prove a form of Arrow’s Impossibility Theorem using binary Fourier analysis, and then talk about the fancier generalizations using Pontryagin duality and the Peter-Weyl theorem.

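In what follows, we let $\mathbb T = \mathbb R/\mathbb Z$ denote the “circle group”, thought of as the additive group of “real numbers modulo $1$”. There is a canonical map $e : \mathbb T \to \mathbb C$ sending $\mathbb T$ to the complex unit circle, given by $e(\theta) = \exp(2\pi i \theta)$.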

Disclaimer: I will deliberately be sloppy with convergence issues, in part because I don’t fully understand them myself, and in part because I don’t care.

## 1. Synopsis

Suppose we have a domain $Z$ and are interested in functions $f : Z \to \mathbb C$. Naturally, the set of such functions forms a complex vector space. We like to equip the set of such functions with a positive definite **inner product**. The idea of Fourier analysis is to then select an **orthonormal basis** for this set of functions, say $(e_\xi)_\xi$, which we call the **characters**; the indices $\xi$ are called **frequencies**. In that case, since we have a basis, every function $f : Z \to \mathbb C$ becomes a sum

$$ f(x) = \sum_\xi \widehat f(\xi) e_\xi(x) $$

where $\widehat f(\xi)$ are complex coefficients of the basis; appropriately we call $\widehat f$ the **Fourier coefficients**. The variable $x \in Z$ is referred to as the **physical** variable. This is generally good because the characters are deliberately chosen to be nice “symmetric” functions, like sine or cosine waves or other periodic functions. Thus we decompose an arbitrarily complicated function into a sum of nice ones.

For convenience, we record a few facts about orthonormal bases.

**Proposition 1** **(Facts about orthonormal bases)**

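Let $V$ be a complex Hilbert space with inner form $\langle -, - \rangle$ and suppose $x = \sum_\xi a_\xi e_\xi$ and $y = \sum_\xi b_\xi e_\xi$ where $e_\xi$ are an orthonormal basis. Then

$$ \begin{aligned} \langle x, x \rangle &= \sum_\xi |a_\xi|^2 \\ a_\xi &= \langle x, e_\xi \rangle \\ \langle x, y \rangle &= \sum_\xi a_\xi \overline{b_\xi}. \end{aligned} $$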

## 2. Common Examples

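### 2.1. Binary Fourier analysis on $\{\pm 1\}^n$

Let $Z = \{\pm 1\}^n$ for some positive integer $n$, so we are considering functions $f(x_1, \dots, x_n)$ accepting binary values. Then the functions $Z \to \mathbb C$ form a $2^n$-dimensional vector space $\mathbb C^Z$, and we endow it with the inner form

$$ \langle f, g \rangle = \frac{1}{2^n} \sum_{x \in Z} f(x) \overline{g(x)}. $$

In particular,

$$ \langle f, f \rangle = \frac{1}{2^n} \sum_{x \in Z} |f(x)|^2 $$

is the average of the squares; this establishes also that $\langle -, - \rangle$ is positive definite.

In that case, the **multilinear polynomials** form a basis of $\mathbb C^Z$, that is the polynomials

$$ \chi_S(x_1, \dots, x_n) = \prod_{s \in S} x_s. $$

Thus our frequency set is actually the subsets $S \subseteq \{1, \dots, n\}$. So we have a decomposition

$$ f = \sum_{S \subseteq \{1, \dots, n\}} \widehat f(S) \chi_S. $$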

**Example 2** **(An example of binary Fourier analysis)**

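Let $n = 2$. Then binary functions $\{\pm 1\}^2 \to \mathbb C$ have a basis given by the four polynomials

$$ 1, \qquad x_1, \qquad x_2, \qquad x_1 x_2. $$

For example, consider the function $f$ which is $1$ at $(1,1)$ and $0$ elsewhere. Then we can put

$$ f(x_1, x_2) = \frac{x_1 + 1}{2} \cdot \frac{x_2 + 1}{2} = \frac14 \left( 1 + x_1 + x_2 + x_1 x_2 \right). $$

So the Fourier coefficients are $\widehat f(S) = \frac14$ for each of the four $S$'s.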
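The coefficients in this example can be checked by brute force. Here is a minimal sketch, assuming the function in the example is the indicator of the point $(1,1)$; the helper name `binary_fourier` and the 0-based indexing of coordinates are my own conventions, not part of the post.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

def binary_fourier(f, n):
    """Return {S: f_hat(S)} for f on {+1,-1}^n, where S is a tuple of 0-based indices."""
    points = list(product((1, -1), repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for S in combinations(range(n), size):
            # <f, chi_S> is the average of f(x) * chi_S(x); chi_S is real,
            # so no complex conjugate is needed.
            total = sum(Fraction(f(x)) * prod(x[s] for s in S) for x in points)
            coeffs[S] = total / len(points)
    return coeffs

# Indicator of the point (1, 1) on {+1,-1}^2 (assumed to be the function of the example)
indicator = lambda x: 1 if x == (1, 1) else 0
coeffs = binary_fourier(indicator, 2)
for S, value in coeffs.items():
    print(S, value)
```

Exact rational arithmetic is used so that the four coefficients can be compared with $\frac14$ without rounding concerns.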

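This notion is useful in particular for binary functions $f : \{\pm 1\}^n \to \{\pm 1\}$; for these functions (and products thereof), we always have $\langle f, f \rangle = 1$.

It is worth noting that the frequency $\varnothing$ plays a special role:

**Exercise 3**

Show that

$$ \widehat f(\varnothing) = \frac{1}{|Z|} \sum_{x \in Z} f(x). $$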

### 2.2. Fourier analysis on finite groups

This is the Fourier analysis used in this post and this post. Here, we have a finite abelian group $Z$, and consider functions $Z \to \mathbb C$; this is a $|Z|$-dimensional vector space. The inner product is the same as before:

$$ \langle f, g \rangle = \frac{1}{|Z|} \sum_{x \in Z} f(x) \overline{g(x)}. $$

Now here is how we generate the characters. We equip $Z$ with a non-degenerate symmetric bilinear form

$$ Z \times Z \xrightarrow{\cdot} \mathbb T \qquad (\xi, x) \mapsto \xi \cdot x. $$

Experts may already recognize this as a choice of isomorphism between $Z$ and its Pontryagin dual. This time the characters are given by

$$ e_\xi(x) = e(\xi \cdot x). $$

In this way, the set of frequencies is also $Z$, but the $\xi \in Z$ play very different roles from the “physical” $x \in Z$. (It is not too hard to check these indeed form an orthonormal basis in the function space $\mathbb C^Z$, since we assumed that $\cdot$ is non-degenerate.)

**Example 4** **(Cube roots of unity filter)**

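Suppose $Z = \mathbb Z/3\mathbb Z$, with the bilinear form given by $\xi \cdot x = (\xi x)/3$. Let $\omega = \exp(\frac 23 \pi i)$ be a primitive cube root of unity. Note that

$$ e_\xi(x) = \begin{cases} 1 & \xi = 0 \\ \omega^x & \xi = 1 \\ \omega^{2x} & \xi = 2. \end{cases} $$

Then given $f : Z \to \mathbb C$ with $f(0) = a$, $f(1) = b$, $f(2) = c$, we obtain

$$ f(x) = \frac{a+b+c}{3} \cdot 1 + \frac{a + \omega^2 b + \omega c}{3} \cdot \omega^x + \frac{a + \omega b + \omega^2 c}{3} \cdot \omega^{2x}. $$

In this way we derive that the transforms are

$$ \begin{aligned} \widehat f(0) &= \frac{a+b+c}{3} \\ \widehat f(1) &= \frac{a + \omega^2 b + \omega c}{3} \\ \widehat f(2) &= \frac{a + \omega b + \omega^2 c}{3}. \end{aligned} $$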
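As a sanity check on this example, here is a small numerical sketch of the transform on $\mathbb Z/n\mathbb Z$, assuming the bilinear form $\xi \cdot x = \xi x / n$ and the coefficient formula $\widehat f(\xi) = \langle f, e_\xi \rangle$. The function names and the sample values of $a$, $b$, $c$ are arbitrary choices for illustration.

```python
import cmath

def e(theta):
    """The canonical map from R/Z to the unit circle: e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def fourier_coefficients(values):
    """f_hat(xi) = (1/n) * sum_x f(x) * conj(e(xi*x/n)) for f given as a list on Z/nZ."""
    n = len(values)
    return [sum(values[x] * e(-xi * x / n) for x in range(n)) / n for xi in range(n)]

def reconstruct(coeffs, x):
    """Evaluate sum_xi f_hat(xi) * e(xi*x/n) at the point x."""
    n = len(coeffs)
    return sum(coeffs[xi] * e(xi * x / n) for xi in range(n))

a, b, c = 2.0, -1.0, 5.0   # arbitrary sample values for f(0), f(1), f(2)
f_hat = fourier_coefficients([a, b, c])

# Closed forms in terms of a primitive cube root of unity
omega = e(1 / 3)
expected = [
    (a + b + c) / 3,
    (a + omega**2 * b + omega * c) / 3,
    (a + omega * b + omega**2 * c) / 3,
]
print(max(abs(u - v) for u, v in zip(f_hat, expected)))
```

The printed value is the largest discrepancy between the computed coefficients and the closed forms, which should be at the level of floating-point rounding.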

**Exercise 5**

Show that

$$ \widehat f(0) = \frac{1}{|Z|} \sum_{x \in Z} f(x). $$

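Olympiad contestants may recognize the previous example as a “roots of unity filter”, which is exactly the point. For concreteness, suppose one wants to compute

$$ \binom{N}{0} + \binom{N}{3} + \binom{N}{6} + \cdots $$

for a positive integer $N$. In that case, we can consider the function

$$ w : \mathbb Z/3\mathbb Z \to \mathbb C $$

such that $w(0) = 1$ but $w(1) = w(2) = 0$. By abuse of notation we will also think of $w$ as a function $w : \mathbb Z \twoheadrightarrow \mathbb Z/3\mathbb Z \to \mathbb C$. Then the sum in question is

$$ \begin{aligned} \sum_n \binom{N}{n} w(n) &= \sum_n \binom{N}{n} \sum_{k=0,1,2} \widehat w(k) \omega^{kn} \\ &= \sum_{k=0,1,2} \widehat w(k) \sum_n \binom{N}{n} \omega^{kn} \\ &= \sum_{k=0,1,2} \widehat w(k) (1 + \omega^k)^N. \end{aligned} $$

In our situation, we have $\widehat w(0) = \widehat w(1) = \widehat w(2) = \frac13$, and we have evaluated the desired sum. More generally, we can take any periodic weight $w$ and use Fourier analysis in order to interchange the order of summation.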
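To see a roots of unity filter in action, here is a hedged sketch that compares a direct sum of binomial coefficients over indices divisible by $3$ with the filtered expression $\frac13 \sum_{k} (1+\omega^k)^N$. The binomial sum and the parameter `N` are an illustrative choice on my part; floating-point arithmetic limits this to moderately small `N`.

```python
import cmath
from math import comb

def filtered_binomial_sum(N):
    """Sum of C(N, n) over n divisible by 3, computed as (1/3) * sum_k (1 + omega^k)^N."""
    omega = cmath.exp(2j * cmath.pi / 3)
    total = sum((1 + omega**k) ** N for k in range(3)) / 3
    # The exact answer is an integer, so round away the floating-point noise.
    return round(total.real)

def direct_binomial_sum(N):
    """The same sum, computed term by term."""
    return sum(comb(N, n) for n in range(0, N + 1, 3))

N = 20
print(filtered_binomial_sum(N), direct_binomial_sum(N))
```

The point of the filter is that the right-hand side has only three terms no matter how large $N$ is, whereas the direct sum has about $N/3$ terms.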

**Example 6** **(Binary Fourier analysis)**

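Suppose $Z = \{\pm 1\}^n$, viewed as an abelian group under pointwise multiplication hence isomorphic to $(\mathbb Z/2\mathbb Z)^{\oplus n}$. Assume we pick the dot product defined by

$$ \xi \cdot x = \frac12 \sum_i \frac{\xi_i - 1}{2} \cdot \frac{x_i - 1}{2} $$

where $\xi = (\xi_1, \dots, \xi_n)$ and $x = (x_1, \dots, x_n)$.

We claim this coincides with the first example we gave. Indeed, let $S \subseteq \{1, \dots, n\}$ and let $\xi \in \{\pm 1\}^n$ be the element which is $-1$ at positions in $S$, and $+1$ at positions not in $S$. Then the character $\chi_S$ from the previous example coincides with the character $e_\xi$ in the new notation. In particular, $\widehat f(S) = \widehat f(\xi)$.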

Thus Fourier analysis on a finite group subsumes binary Fourier analysis.
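The claimed coincidence of characters can be verified exhaustively for small $n$. The sketch below assumes the dot product $\xi \cdot x = \frac12 \sum_i \frac{\xi_i - 1}{2} \cdot \frac{x_i - 1}{2}$ and identifies a subset $S$ with the vector $\xi$ that is $-1$ exactly on $S$; since $\xi \cdot x$ is always a half-integer, $e(\xi \cdot x)$ is evaluated exactly as $\pm 1$.

```python
from fractions import Fraction
from itertools import product
from math import prod

def dot(xi, x):
    """xi . x = (1/2) * sum_i ((xi_i - 1)/2) * ((x_i - 1)/2), viewed in R/Z."""
    return Fraction(sum(((a - 1) // 2) * ((b - 1) // 2) for a, b in zip(xi, x)), 2)

def e(theta):
    """e(theta) = exp(2*pi*i*theta), evaluated exactly for half-integer theta."""
    return 1 if theta % 1 == 0 else -1

n = 3
mismatches = 0
for xi in product((1, -1), repeat=n):
    S = [i for i in range(n) if xi[i] == -1]   # the subset corresponding to xi
    for x in product((1, -1), repeat=n):
        chi_S = prod(x[s] for s in S)          # multilinear character from section 2.1
        if chi_S != e(dot(xi, x)):
            mismatches += 1
print(mismatches)
```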

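### 2.3. Fourier series for functions $L^2([-\pi, \pi])$

Now we consider the space $L^2([-\pi, \pi])$ of square-integrable functions $[-\pi, \pi] \to \mathbb C$, with inner form

$$ \langle f, g \rangle = \frac{1}{2\pi} \int_{[-\pi, \pi]} f(x) \overline{g(x)} \, dx. $$

Sadly, this is *not* a finite-dimensional vector space, but fortunately it is a Hilbert space so we are still fine. In this case, an orthonormal basis must allow infinite linear combinations, as long as the sum of squares is finite.

Now, it turns out in this case that

$$ e_n(x) = \exp(inx) \qquad \text{for } n \in \mathbb Z $$

is an orthonormal basis for $L^2([-\pi, \pi])$. Thus this time the frequency set $\mathbb Z$ is infinite. So every function $f \in L^2([-\pi, \pi])$ decomposes as

$$ f(x) = \sum_n \widehat f(n) \exp(inx) $$

for some coefficients $\widehat f(n) \in \mathbb C$.

This is a little worse than our finite examples: instead of a finite sum on the right-hand side, we actually have an infinite sum. This is because our set of frequencies is now $\mathbb Z$, which isn’t finite. In this case the $\widehat f$ need not be finitely supported, but do satisfy $\sum_n |\widehat f(n)|^2 < \infty$.

Since the frequency set is indexed by $\mathbb Z$, we call this a **Fourier series** to reflect the fact that the index is $n \in \mathbb Z$.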

**Exercise 7**

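Show once again

$$ \widehat f(0) = \frac{1}{2\pi} \int_{[-\pi, \pi]} f(x) \, dx. $$

Often we require that the function $f$ satisfies $f(-\pi) = f(\pi)$, so that $f$ becomes a periodic function, and we can think of it as $f : \mathbb T \to \mathbb C$.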

### 2.4. Summary

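We summarize our various flavors of Fourier analysis in the following table.

| Type | Physical variable | Frequency variable | Basis functions |
|---|---|---|---|
| Binary | $\{\pm 1\}^n$ | Subsets $S \subseteq \{1, \dots, n\}$ | $\prod_{s \in S} x_s$ |
| Finite group | $Z$ | $\xi \in Z$, with a choice of $\cdot$ | $e(\xi \cdot x)$ |
| Fourier series | $\mathbb T$ or $[-\pi, \pi]$ | $n \in \mathbb Z$ | $\exp(inx)$ |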

In fact, we will soon see that all these examples are subsumed by *Pontryagin duality* for compact groups $G$.

## 3. Parseval and friends

The notion of an orthonormal basis makes several “big-name” results in Fourier analysis quite lucid. Basically, we can take every result from Proposition 1, translate it into the context of our Fourier analysis, and get a big-name result.

**Corollary 8** **(Parseval theorem)**

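Let $f : Z \to \mathbb C$, where $Z$ is a finite abelian group. Then

$$ \sum_\xi |\widehat f(\xi)|^2 = \frac{1}{|Z|} \sum_{x \in Z} |f(x)|^2. $$

Similarly, if $f : [-\pi, \pi] \to \mathbb C$ is square-integrable then its Fourier series satisfies

$$ \sum_n |\widehat f(n)|^2 = \frac{1}{2\pi} \int_{[-\pi, \pi]} |f(x)|^2 \, dx. $$

*Proof:* Recall that $\langle f, f \rangle$ is equal to the square sum of the coefficients.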
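A quick numerical illustration of Parseval on a finite group: the sketch below takes $Z = \mathbb Z/5\mathbb Z$ with characters $e_\xi(x) = \exp(2\pi i \xi x / 5)$ and compares the sum of squared coefficient magnitudes with the average of $|f(x)|^2$. The sample values of $f$ are arbitrary.

```python
import cmath

def fourier_coefficients(values):
    """f_hat(xi) = (1/n) * sum_x f(x) * exp(-2*pi*i*xi*x/n) for f given as a list on Z/nZ."""
    n = len(values)
    return [
        sum(values[x] * cmath.exp(-2j * cmath.pi * xi * x / n) for x in range(n)) / n
        for xi in range(n)
    ]

values = [3, 1 - 2j, 0, 4j, -1]                        # arbitrary f : Z/5Z -> C
f_hat = fourier_coefficients(values)

lhs = sum(abs(c) ** 2 for c in f_hat)                  # sum over frequencies of |f_hat|^2
rhs = sum(abs(v) ** 2 for v in values) / len(values)   # <f, f>, the average of |f|^2
print(lhs, rhs)
```

Note that the normalization matters: with the factor $\frac{1}{|Z|}$ placed in the inner product, no extra constant appears on either side.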

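**Corollary 9** **(Formulas for $\widehat f$)**

Let $f : Z \to \mathbb C$, where $Z$ is a finite abelian group. Then

$$ \widehat f(\xi) = \frac{1}{|Z|} \sum_{x \in Z} f(x) \overline{e_\xi(x)}. $$

Similarly, if $f : [-\pi, \pi] \to \mathbb C$ is square-integrable then its Fourier series is given by

$$ \widehat f(n) = \frac{1}{2\pi} \int_{[-\pi, \pi]} f(x) \exp(-inx) \, dx. $$

*Proof:* Recall that in an orthonormal basis $(e_\xi)_\xi$, the coefficient of $e_\xi$ in $f$ is $\langle f, e_\xi \rangle$.

Note in particular what happens if we select $\xi = 0$ in the above!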
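For the Fourier series case, the coefficient formula can be approximated numerically. The sketch below uses a midpoint rule for $\frac{1}{2\pi}\int_{[-\pi,\pi]} f(x)\exp(-inx)\,dx$ and tests it on $f(x) = x$, a sample function of my choosing, for which integration by parts gives $\widehat f(n) = i(-1)^n/n$ when $n \neq 0$ and $\widehat f(0) = 0$. The sample count is an arbitrary accuracy knob.

```python
import cmath
import math

def fourier_series_coefficient(f, n, samples=20000):
    """Midpoint-rule approximation of (1/(2*pi)) * integral of f(x)*exp(-i*n*x) over [-pi, pi]."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h      # midpoint of the k-th subinterval
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)

identity = lambda x: x
for n in range(4):
    print(n, fourier_series_coefficient(identity, n))
```

Taking $n = 0$ recovers the average value of $f$ over $[-\pi, \pi]$, which is zero for this odd function.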

**Corollary 10** **(Plancherel theorem)**

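Let $f, g : Z \to \mathbb C$, where $Z$ is a finite abelian group. Then

$$ \langle f, g \rangle = \sum_\xi \widehat f(\xi) \overline{\widehat g(\xi)}. $$

Similarly, if $f, g : [-\pi, \pi] \to \mathbb C$ are square-integrable then

$$ \langle f, g \rangle = \sum_n \widehat f(n) \overline{\widehat g(n)}. $$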

*Proof:* Guess!

## 4. (Optional) Arrow’s Impossibility Theorem

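As an application, we now prove a form of Arrow’s theorem. Consider $n$ voters voting among $3$ candidates $A$, $B$, $C$. Each voter $i$ specifies a tuple $v_i = (x_i, y_i, z_i) \in \{\pm 1\}^3$ as follows:

- $x_i = 1$ if voter $i$ ranks $A$ ahead of $B$, and $x_i = -1$ otherwise.
- $y_i = 1$ if voter $i$ ranks $B$ ahead of $C$, and $y_i = -1$ otherwise.
- $z_i = 1$ if voter $i$ ranks $C$ ahead of $A$, and $z_i = -1$ otherwise.

Tacitly, we only consider $3! = 6$ possibilities for $v_i$: we forbid “paradoxical” votes of the form $x_i = y_i = z_i$ by assuming that people’s votes are consistent (meaning the preferences are transitive).

Then, we can consider a voting mechanism

$$ \begin{aligned} f &: \{\pm 1\}^n \to \{\pm 1\} \\ g &: \{\pm 1\}^n \to \{\pm 1\} \\ h &: \{\pm 1\}^n \to \{\pm 1\} \end{aligned} $$

such that $f(x_\bullet)$ is the global preference of $A$ vs. $B$, $g(y_\bullet)$ is the global preference of $B$ vs. $C$, and $h(z_\bullet)$ is the global preference of $C$ vs. $A$. We’d like to avoid situations where the global preference $(f(x_\bullet), g(y_\bullet), h(z_\bullet))$ is itself paradoxical.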

In fact, we will prove the following theorem:

**Theorem 11** **(Arrow Impossibility Theorem)**

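Assume that $(f, g, h)$ always avoids paradoxical outcomes, and assume $\mathbf E f = \mathbf E g = \mathbf E h = 0$. Then $(f, g, h)$ is either a dictatorship or anti-dictatorship: there exists a “dictator” $k$ such that

$$ f(x_\bullet) = \pm x_k, \qquad g(y_\bullet) = \pm y_k, \qquad h(z_\bullet) = \pm z_k $$

where all three signs coincide.

The “independence of irrelevant alternatives” is reflected in the fact that $f$ depends only on the votes $x_\bullet$ comparing $A$ with $B$, and similarly for $g$ and $h$. The assumption $\mathbf E f = \mathbf E g = \mathbf E h = 0$ provides symmetry (and e.g. excludes the possibility that $f$, $g$, $h$ are constant functions which ignore voter input). Unlike the usual Arrow theorem, we do *not* assume that $f(+1, \dots, +1) = +1$ (hence possibility of anti-dictatorship).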

To this end, we actually prove the following result:

**Lemma 12**

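Assume the $n$ voters vote independently at random among the $3! = 6$ possibilities. The probability of a paradoxical outcome is exactly

$$ \frac14 + \frac14 \sum_{S \subseteq \{1, \dots, n\}} \left( -\frac13 \right)^{|S|} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right). $$

*Proof:* Define the Boolean function $D : \{\pm 1\}^3 \to \mathbb R$ by

$$ D(a, b, c) = ab + bc + ca = \begin{cases} 3 & a = b = c \\ -1 & \text{otherwise}. \end{cases} $$

Thus paradoxical outcomes arise when $D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) = 3$. Now, we compute for randomly selected $x_\bullet$, $y_\bullet$, $z_\bullet$ that

$$ \begin{aligned} \mathbf E \left( f(x_\bullet) g(y_\bullet) \right) &= \mathbf E \sum_S \sum_T \widehat f(S) \widehat g(T) \chi_S(x_\bullet) \chi_T(y_\bullet) \\ &= \sum_S \sum_T \widehat f(S) \widehat g(T) \, \mathbf E \left( \chi_S(x_\bullet) \chi_T(y_\bullet) \right). \end{aligned} $$

Now we observe that:

- If $S \neq T$, then $\mathbf E \left( \chi_S(x_\bullet) \chi_T(y_\bullet) \right) = 0$, since if say $s \in S$, $s \notin T$, then $x_s$ affects the parity of the product with 50% either way, and is independent of any other variables in the product.
- On the other hand, suppose $S = T$. Then
  $$ \chi_S(x_\bullet) \chi_T(y_\bullet) = \prod_{s \in S} x_s y_s. $$
  Note that $x_s y_s$ is equal to $1$ with probability $\frac13$ and $-1$ with probability $\frac23$ (since $(x_s, y_s, z_s)$ is uniform from $3! = 6$ choices, which we can enumerate). From this an inductive calculation on $|S|$ gives that
  $$ \prod_{s \in S} x_s y_s = \begin{cases} +1 & \text{with probability } \frac12 \left( 1 + (-1/3)^{|S|} \right) \\ -1 & \text{with probability } \frac12 \left( 1 - (-1/3)^{|S|} \right). \end{cases} $$
  Thus
  $$ \mathbf E \left( \prod_{s \in S} x_s y_s \right) = \left( -\frac13 \right)^{|S|}. $$

Piecing this all together, we now have that

$$ \mathbf E \left( f(x_\bullet) g(y_\bullet) \right) = \sum_S \widehat f(S) \widehat g(S) \left( -\frac13 \right)^{|S|}. $$

Then, we obtain that

$$ \mathbf E \, D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) = \sum_S \left( -\frac13 \right)^{|S|} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right). $$

Comparing this with the definition of $D$ gives the desired result: if $p$ is the probability of a paradoxical outcome, then $\mathbf E \, D = 3p - (1-p) = 4p - 1$, so $p = \frac14 + \frac14 \mathbf E \, D$.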
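The lemma can be checked by exhaustive enumeration for a small number of voters. The sketch below assumes the statement in the form $\frac14 + \frac14 \sum_S (-\frac13)^{|S|} (\widehat f \widehat g + \widehat g \widehat h + \widehat h \widehat f)(S)$ and compares it with a brute-force count over all $6^n$ ballot profiles. The majority and dictator rules are illustrative choices, and all helper names are my own.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

# The six consistent ballots (x, y, z): all of {+1,-1}^3 except x = y = z.
BALLOTS = [v for v in product((1, -1), repeat=3) if len(set(v)) > 1]

def fourier(f, n):
    """Binary Fourier coefficients {S: f_hat(S)} of f on {+1,-1}^n (S uses 0-based indices)."""
    points = list(product((1, -1), repeat=n))
    return {
        S: Fraction(sum(f(x) * prod(x[s] for s in S) for x in points), len(points))
        for size in range(n + 1)
        for S in combinations(range(n), size)
    }

def paradox_probability_brute(f, g, h, n):
    """Fraction of the 6^n ballot profiles on which f(x) = g(y) = h(z)."""
    bad = 0
    for profile in product(BALLOTS, repeat=n):
        xs, ys, zs = zip(*profile)          # split the ballots into the three comparisons
        if f(xs) == g(ys) == h(zs):
            bad += 1
    return Fraction(bad, 6 ** n)

def paradox_probability_fourier(f, g, h, n):
    """1/4 + 1/4 * sum_S (-1/3)^|S| * (f_hat*g_hat + g_hat*h_hat + h_hat*f_hat)(S)."""
    fh, gh, hh = fourier(f, n), fourier(g, n), fourier(h, n)
    total = sum(
        Fraction(-1, 3) ** len(S) * (fh[S] * gh[S] + gh[S] * hh[S] + hh[S] * fh[S])
        for S in fh
    )
    return Fraction(1, 4) + total / 4

majority = lambda x: 1 if sum(x) > 0 else -1   # well defined for an odd number of voters
dictator = lambda x: x[0]

n = 3
print(paradox_probability_brute(majority, majority, majority, n),
      paradox_probability_fourier(majority, majority, majority, n))
print(paradox_probability_brute(dictator, dictator, dictator, n),
      paradox_probability_fourier(dictator, dictator, dictator, n))
```

With three voters and majority rule in each pairwise contest, both computations give $\frac{1}{18}$, while a dictatorship gives $0$, as one would expect since a single voter's ballot is always consistent.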

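Now for the proof of the main theorem. Since paradoxical outcomes never occur, the probability in Lemma 12 is zero, and so we see that

$$ 1 = \sum_{S \subseteq \{1, \dots, n\}} -\left( -\frac13 \right)^{|S|} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right). $$

But now we can just use weak inequalities. We have $\widehat f(\varnothing) = \mathbf E f = 0$ and similarly for $\widehat g$ and $\widehat h$, so we restrict attention to $|S| \ge 1$. We then combine the famous inequality $|ab + bc + ca| \le a^2 + b^2 + c^2$ (which is true across all real numbers) to deduce that

$$ \begin{aligned} 1 &= \sum_{S} -\left( -\frac13 \right)^{|S|} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right) \\ &\le \sum_{S} \left( \frac13 \right)^{|S|} \left( \widehat f(S)^2 + \widehat g(S)^2 + \widehat h(S)^2 \right) \\ &\le \sum_{S} \left( \frac13 \right)^{1} \left( \widehat f(S)^2 + \widehat g(S)^2 + \widehat h(S)^2 \right) \\ &= \frac13 (1 + 1 + 1) = 1 \end{aligned} $$

with the last step by Parseval. So all inequalities must be sharp, and in particular $\widehat f$, $\widehat g$, $\widehat h$ are supported on one-element sets, i.e. they are linear in inputs. As $f$, $g$, $h$ are $\pm 1$ valued, each of $f$, $g$, $h$ is itself either a dictator or anti-dictator function. Since $(f, g, h)$ is always consistent, this implies the final result.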

## 5. Pontryagin duality

In fact all the examples we have covered can be subsumed as special cases of *Pontryagin duality*, where we replace the domain with a general group $G$. In what follows, we assume $G$ is a **locally compact abelian (LCA) group**, which just means that:

- $G$ is an *abelian* topological group,
- the topology on $G$ is Hausdorff, and
- the topology on $G$ is *locally compact*: every point of $G$ has a compact neighborhood.

Notice that our previous examples fall into this category:

**Example 13** **(Examples of locally compact abelian groups)**

- Any finite abelian group $Z$ with the discrete topology is LCA.
- The circle group $\mathbb T$ is LCA and also in fact compact.
- The real numbers $\mathbb R$ are an example of an LCA group which is *not* compact.

### 5.1. The Pontryagin dual

The key definition is:

**Definition 14**

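Let $G$ be an LCA group. Then its **Pontryagin dual** is the abelian group

$$ \widehat G = \left\{ \text{continuous group homomorphisms } \xi : G \to \mathbb T \right\}. $$

The maps $\xi$ are called **characters**. By equipping it with the compact-open topology, we make $\widehat G$ into an LCA group as well.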

**Example 15** **(Examples of Pontryagin duals)**

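- $\widehat{\mathbb Z} \cong \mathbb T$.
- $\widehat{\mathbb T} \cong \mathbb Z$. The characters are given by $\theta \mapsto n\theta$ for $n \in \mathbb Z$.
- $\widehat{\mathbb R} \cong \mathbb R$. This is because a nonzero continuous homomorphism $\mathbb R \to S^1$ is determined by the fiber above $1 \in S^1$. (Covering projections, anyone?)
- $\widehat{\mathbb Z/n\mathbb Z} \cong \mathbb Z/n\mathbb Z$, characters $\xi$ being determined by the image $\xi(1) \in \mathbb T$.
- $\widehat{G \times H} \cong \widehat G \times \widehat H$.
- If $Z$ is a finite abelian group, then the previous two examples (and the structure theorem for abelian groups) imply that $\widehat Z \cong Z$, though not canonically. You may now recognize that the bilinear form $\cdot : Z \times Z \to \mathbb T$ is exactly a choice of isomorphism $Z \to \widehat Z$.
- For any LCA group $G$, the dual of $\widehat G$ is canonically isomorphic to $G$, id est there is a natural isomorphism
  $$ G \cong \widehat{\widehat G} \qquad \text{by} \qquad x \mapsto (\xi \mapsto \xi(x)). $$
  This is the **Pontryagin duality theorem**. (It is an analogy to the isomorphism $(V^\vee)^\vee \cong V$ for finite-dimensional vector spaces $V$.)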

### 5.2. The orthonormal basis in the compact case

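Now assume $G$ is LCA but also compact, and thus has a unique Haar measure $\mu$ such that $\mu(G) = 1$; this lets us integrate over $G$. Let $L^2(G)$ be the space of square-integrable functions to $\mathbb C$, i.e.

$$ L^2(G) = \left\{ f : G \to \mathbb C \quad \text{such that} \quad \int_G |f|^2 \, d\mu < \infty \right\}. $$

Thus we can equip it with the inner form

$$ \langle f, g \rangle = \int_G f \cdot \overline{g} \, d\mu. $$

In that case, we get all the results we wanted before:

**Theorem 16** **(Characters of $\widehat G$ form an orthonormal basis)**

Assume $G$ is LCA and compact. Then $\widehat G$ is **discrete**, and the characters

$$ (e_\xi)_{\xi \in \widehat G} \qquad \text{by} \qquad e_\xi(x) = e(\xi(x)) = \exp(2\pi i \xi(x)) $$

form an orthonormal basis of $L^2(G)$. Thus for each $f \in L^2(G)$ we have

$$ f = \sum_{\xi \in \widehat G} \widehat f(\xi) e_\xi $$

where

$$ \widehat f(\xi) = \langle f, e_\xi \rangle = \int_G f(x) \exp(-2\pi i \xi(x)) \, d\mu. $$

The sum makes sense since $\widehat G$ is discrete. In particular,

- Letting $G = Z$ gives “Fourier transform on finite groups”.
- The special case $G = \mathbb Z/n\mathbb Z$ has its own Wikipedia page.
- Letting $G = \mathbb T$ gives the “Fourier series” earlier.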

### 5.3. The Fourier transform of the non-compact case

If $G$ is LCA but not compact, then Theorem 16 becomes false. On the other hand, it is still possible to define a transform, but one needs to be a little more careful. The generic example to keep in mind in what follows is $G = \mathbb R$.

In what follows, we fix a Haar measure $\mu$ for $G$. (This is now unique only up to scaling, since $\mu(G) = \infty$ rules out the normalization $\mu(G) = 1$.)

One considers this time the space $L^1(G)$ of absolutely integrable functions. Then one directly defines the Fourier transform of $f \in L^1(G)$ to be

$$ \widehat f(\xi) = \int_G f \, \overline{e_\xi} \, d\mu $$

imitating the previous definitions in the absence of an inner product. This $\widehat f$ may not be $L^1$, but it is at least bounded. Then we manage to at least salvage:

**Theorem 17** **(Fourier inversion on $L^1(G)$)**

Take an LCA group $G$ and fix a Haar measure $\mu$ on it. One can select a unique **dual measure** $\widehat \mu$ on $\widehat G$ such that if $f \in L^1(G)$, $\widehat f \in L^1(\widehat G)$, the “Fourier inversion formula”

$$ f(x) = \int_{\widehat G} \widehat f(\xi) e_\xi(x) \, d\widehat\mu $$

holds almost everywhere. It holds everywhere if $f$ is continuous.

Notice the extra nuance of having to select measures, because it is no longer the case that $G$ has a single distinguished measure.

Despite the fact that the $e_\xi$ no longer form an orthonormal basis, the transformed function $\widehat f : \widehat G \to \mathbb C$ is still often useful. In particular, there are special names for a few special $G$:

- If $G = \mathbb R$, then $\widehat G = \mathbb R$, and this construction gives the poorly named “(continuous) Fourier transform”.
- If $G = \mathbb Z$, then $\widehat G = \mathbb T$, and this construction gives the poorly named “DTFT”.
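For the case of the real line, here is a rough numerical sketch of the transform $\widehat f(\xi) = \int f(x) \exp(-2\pi i \xi x)\,dx$, assuming Lebesgue measure as the Haar measure and characters $e_\xi(x) = e(\xi x)$. The test function $\exp(-\pi x^2)$ is my own choice of example: under this normalization it is a standard fact that it equals its own transform. The truncation window and sample count are arbitrary.

```python
import cmath
import math

def fourier_transform(f, xi, half_width=10.0, samples=20000):
    """Midpoint-rule approximation of the integral of f(x)*exp(-2*pi*i*xi*x) over [-half_width, half_width]."""
    h = 2 * half_width / samples
    total = 0j
    for k in range(samples):
        x = -half_width + (k + 0.5) * h   # midpoint of the k-th subinterval
        total += f(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h

gaussian = lambda x: math.exp(-math.pi * x * x)

for xi in (0.0, 0.5, 1.0):
    print(xi, fourier_transform(gaussian, xi), gaussian(xi))
```

Truncating to a finite window is harmless here only because the Gaussian decays so quickly; for a general $f \in L^1(\mathbb R)$ the window would need more care.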

### 5.4. Summary

In summary,

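- Given any LCA group $G$, we can transform sufficiently nice functions on $G$ into functions on $\widehat G$.
- If $G$ is compact, then we have the nicest situation possible: $L^2(G)$ is an inner product space with $\langle f, g \rangle = \int_G f \cdot \overline{g} \, d\mu$, and the $e_\xi$ form an orthonormal basis across $\xi \in \widehat G$.
- If $G$ is *not* compact, then we no longer get an orthonormal basis or even an inner product space, but it is still possible to define the transform $\widehat f : \widehat G \to \mathbb C$ for $f \in L^1(G)$. If $\widehat f$ is also in $L^1(\widehat G)$ we still get a “Fourier inversion formula” expressing $f$ in terms of $\widehat f$.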

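We summarize our various flavors of Fourier analysis for various $G$ in the following. In the first half $G$ is compact, in the second half $G$ is not.

| Name | Domain $G$ | Dual $\widehat G$ | Characters |
|---|---|---|---|
| Binary Fourier analysis | $\{\pm 1\}^n$ | $S \subseteq \{1, \dots, n\}$ | $\prod_{s \in S} x_s$ |
| Fourier transform on finite groups | $Z$ | $\xi \in \widehat Z \cong Z$ | $e(\xi \cdot x)$ |
| Discrete Fourier transform | $\mathbb Z/n\mathbb Z$ | $\xi \in \mathbb Z/n\mathbb Z$ | $e(\xi x / n)$ |
| Fourier series | $\mathbb T \cong [-\pi, \pi]$ | $n \in \mathbb Z$ | $\exp(inx)$ |
| Continuous Fourier transform | $\mathbb R$ | $\xi \in \mathbb R$ | $e(\xi x)$ |
| Discrete-time Fourier transform | $\mathbb Z$ | $\xi \in \mathbb T \cong [-\pi, \pi]$ | $\exp(i \xi n)$ |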

You might notice that the **various names are awful**. This is part of the reason I got confused as a high school student: every type of Fourier series above has its own Wikipedia article. If it were up to me, we would just use the term “$G$-Fourier transform”, and that would make everyone’s lives a lot easier.

## 6. Peter-Weyl

In fact, if $G$ is a compact Lie group, even if $G$ is not abelian we can still give an orthonormal basis of $L^2(G)$ (the square-integrable functions on $G$). It turns out in this case the characters are attached to complex irreducible representations of $G$ (and in what follows all representations are complex).

The result is given by the Peter-Weyl theorem. First, we need the following result:

**Lemma 18** **(Compact Lie groups have unitary reps)**

Any finite-dimensional (complex) representation $V$ of a compact Lie group $G$ is unitary, meaning it can be equipped with a $G$-invariant inner form. Consequently, $V$ is completely reducible: it splits into the direct sum of irreducible representations of $G$.

*Proof:* Suppose $B : V \times V \to \mathbb C$ is any inner product. Equip $G$ with a right-invariant Haar measure $dg$. Then we can equip $V$ with an “averaged” inner form

$$ \widetilde B(v, w) = \int_G B(gv, gw) \, dg. $$

Then $\widetilde B$ is the desired $G$-invariant inner form. Now, the fact that $V$ is completely reducible follows from the fact that given a subrepresentation of $V$, its orthogonal complement is also a subrepresentation.

The Peter-Weyl theorem then asserts that the finite-dimensional irreducible unitary representations essentially give an orthonormal basis for $L^2(G)$, in the following sense. Let $V = (V, \rho)$ be such a representation of $G$, and fix an orthonormal basis $e_1$, $\dots$, $e_d$ for $V$ (where $d = \dim V$). The $(i,j)$th **matrix coefficient** for $V$ is then given by

$$ G \xrightarrow{\rho} \mathrm{GL}(V) \xrightarrow{\pi_{ij}} \mathbb C $$

where $\pi_{ij}$ is the projection onto the $(i,j)$th entry of the matrix. We abbreviate $\pi_{ij} \circ \rho$ to $\rho_{ij}$. Then the theorem is:

**Theorem 19** **(Peter-Weyl)**

Let $G$ be a compact Lie group. Let $\Sigma$ denote the (pairwise non-isomorphic) irreducible finite-dimensional unitary representations of $G$. Then

$$ \left\{ \sqrt{\dim V} \, \rho_{ij} \;\Big|\; (V, \rho) \in \Sigma, \ 1 \le i, j \le \dim V \right\} $$

is an orthonormal basis of $L^2(G)$.

Strictly, I should say $\Sigma$ is a set of representatives of the isomorphism classes of irreducible unitary representations, one for each isomorphism class.
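To make the statement concrete, here is a sketch for the smallest non-abelian case. It treats the finite group $S_3$ as a compact (zero-dimensional) Lie group whose Haar measure is normalized counting measure; that choice of example is an assumption on my part, as is the realization of $S_3$ by $2 \times 2$ orthogonal matrices (rotations by multiples of $120^\circ$ and their compositions with a reflection). The three irreducible representations are then the trivial one, the sign (given by the determinant), and this two-dimensional one, and the code checks that the scaled matrix coefficients have the identity as Gram matrix.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

c, s = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
I = [[1.0, 0.0], [0.0, 1.0]]
R = [[c, -s], [s, c]]            # rotation by 120 degrees
F = [[1.0, 0.0], [0.0, -1.0]]    # a reflection

# The six elements of S_3, realized as matrices of the two-dimensional representation
rotations = [I, R, matmul(R, R)]
group = rotations + [matmul(r, F) for r in rotations]

# Functions sqrt(dim V) * rho_ij, each stored as its list of values over the group
basis = [
    [1.0 for g in group],                        # trivial representation, dim 1
    [det(g) for g in group],                     # sign representation, dim 1
] + [
    [math.sqrt(2) * g[i][j] for g in group]      # two-dimensional representation
    for i in range(2) for j in range(2)
]

def inner(u, v):
    """<u, v> = (1/|G|) * sum_g u(g) * conj(v(g)); all values here are real."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

gram = [[inner(u, v) for v in basis] for u in basis]
for row in gram:
    print([round(entry, 10) for entry in row])
```

There are $1 + 1 + 4 = 6$ functions here, matching $|S_3| = 6$, so orthonormality already forces them to be a basis of the six-dimensional function space.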

In the special case $G$ is abelian, all irreducible representations are one-dimensional. A one-dimensional representation of $G$ is a map $G \to \mathrm{GL}(\mathbb C) \cong \mathbb C^\times$, but the unitary condition implies it is actually a map $G \to S^1 \cong \mathbb T$, i.e. it is an element of $\widehat G$.

Great post. The Arrow’s paper was one of my favorite combinatorics papers when I was reading stuff in high school. Fourier analysis also shows up a lot when proving lower bounds for parity functions (for instance, lower bounds on statistical query dimension). I’ll link to some papers once I’m at a laptop.


Here are the promised links:

Weakly Learning DNF and Characterizing Statistical Query Learning Using Fourier Analysis by Blum, Furst, and Jackson. (http://www.mathcs.duq.edu/~jackson/dnfsq.pdf)

Also shows up in one of my recent papers:

Memory, Communication, and Statistical Queries by Steinhardt, Valiant, and Wager. (http://eccc.hpi-web.de/report/2015/126/)
