Rant: Matrices Are Not Arrays of Numbers

The following is an excerpt from a current work of mine. I thought I’d share it here, as some people have told me they enjoyed it.

As I’ll stress repeatedly, a matrix represents a linear map between two vector spaces. Writing it in the form of an {m \times n} matrix is merely a very convenient way to see the map concretely. But it obfuscates the fact that this map is, well, a map, not an array of numbers.

If you took high school precalculus, you’ll see everything done in terms of matrices. To any typical high school student, a matrix is an array of numbers. No one is sure what exactly these numbers represent, but they’re told how to magically multiply these arrays to get more arrays. They’re told that the matrix

\displaystyle \left( \begin{array}{cccc} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \\ \end{array} \right)

is an “identity matrix”, because when you multiply by another matrix it doesn’t change. Then they’re told that the determinant is some magical combination of these numbers formed by this weird multiplication rule. No one knows what this determinant does, other than the fact that {\det(AB) = \det A \det B}, and something about areas and row operations and Cramer’s rule.

Then you go into linear algebra in college, and you do more magic with these arrays of numbers. You’re told that two matrices {T_1} and {T_2} are similar if

\displaystyle T_2 = ST_1S^{-1}

for some invertible matrix {S}. You’re told that the trace of a matrix {\text{Tr } T} is the sum of the diagonal entries. Somehow this doesn’t change if you look at a similar matrix, but you’re not sure why. Then you define the characteristic polynomial as

\displaystyle p_T = \det (XI - T).

Somehow this also doesn’t change if you take a similar matrix, but now you really don’t know why. And then you have the Cayley-Hamilton Theorem in all its black magic: {p_T(T)} is the zero map. Out of curiosity you Google the proof, and you find some ad-hoc procedure which still leaves you with no idea why it’s true.

This is terrible. Who gives a — about {T_2 = ST_1S^{-1}}? Only if you know that the matrices are linear maps does this make sense: {T_2} is just {T_1} rewritten with a different choice of basis.

In my eyes, this mess is evil. Linear algebra is the study of linear maps, but it is taught as the study of arrays of numbers, and no one knows what these numbers mean. And for a good reason: the numbers are meaningless. They are a highly convenient way of encoding the matrix, but they are not the main objects of study, any more than the dates of events are the main objects of study in history.

When I took Math 55a as a freshman at Harvard, I got the exact opposite treatment: we did all of linear algebra without writing down a single matrix. During all this time I was quite confused. What’s wrong with a basis? I didn’t appreciate until later that this approach was the morally correct way to treat the subject: it made it clear what was happening.

Throughout this project, I’ve tried to strike a balance between these two approaches, using matrices to illustrate the maps and to simplify proofs, but writing theorems and definitions in their morally correct form. I hope that this has both the advantage of giving the “right” definitions while being concrete enough to be digested. But I would just like to say for the record that, if I had to pick between the high school approach and the 55a approach, I would pick 55a in a heartbeat.