Some Thoughts on Olympiad Material Design

(This is a bit of a follow-up to the solution reading post last month. Spoiler warnings: USAMO 2014/6, USAMO 2012/2, TSTST 2016/4, and hints for ELMO 2013/1, IMO 2016/2.)

I want to say a little about the process which I use to design my olympiad handouts and classes these days (and thus by extension the way I personally think about problems). The short summary is that my teaching style is centered around showing connections and recurring themes between problems.

Now let me explain this in more detail.

1. Main ideas

Solutions to olympiad problems can look quite different from one another at a surface level, but typically they center around one or two main ideas, as I describe in my post on reading solutions. Because details are easy to work out once you have the main idea, as far as learning is concerned you can more or less throw away the details and pay most of your attention to main ideas.

Thus whenever I solve an olympiad problem, I make a deliberate effort to summarize the solution in a few sentences, such that I basically know how to do it from there. I also make a deliberate effort, whenever I write up a solution in my notes, to structure it so that my future self can see all the key ideas at a glance and thus be able to understand the general path of the solution immediately.

The example I’ve previously mentioned is USAMO 2014/6.

Example 1 (USAMO 2014, Gabriel Dospinescu)

Prove that there is a constant {c>0} with the following property: If {a, b, n} are positive integers such that {\gcd(a+i, b+j)>1} for all {i, j \in \{0, 1, \dots, n\}}, then

\displaystyle  \min\{a, b\}> (cn)^n.

If you look at any complete solution to the problem, you will see a lot of technical estimates involving {\zeta(2)} and the like. But the main idea is very simple: “consider an {N \times N} table of primes and note the small primes cannot adequately cover the board, since {\sum p^{-2} < \frac{1}{2}}”. Once you have this main idea the technical estimates are just the grunt work that you force yourself to do if you’re a contestant (and don’t do if you’re retired like me).
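If you want to convince yourself of the numerical fact behind that main idea, here is a minimal sketch in Python. It is only a sanity check, not part of any official solution; the helper name `primes_up_to` and the cutoff of {10^5} are my own choices for illustration.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit (assumes limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross out p*p, p*p + p, ... in one slice assignment.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


LIMIT = 10**5
partial = sum(1 / (p * p) for p in primes_up_to(LIMIT))

# The primes above LIMIT contribute less than the sum of 1/n^2 over n > LIMIT,
# which is itself less than 1/LIMIT, so the full sum is below this bound.
upper_bound = partial + 1 / LIMIT
print(partial, upper_bound)
```

Both printed numbers come out around {0.452}, comfortably below {\frac12}, which is the slack that the technical estimates then have to exploit.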

Thus the study of olympiad problems is reduced to the study of main ideas behind these problems.

2. Taxonomy

So how do we come up with the main ideas? Of course I won’t be able to answer this question completely, because therein lies most of the difficulty of olympiads.

But I have made some progress in this direction. It comes down to seeing how main ideas are similar to each other. I spend a lot of time trying to classify the main ideas into categories or themes, based on how similar they feel to one another. If I see one theme pop up over and over, then I can make it into a class.

I think olympiad taxonomy is severely underrated, and generally not done correctly. The status quo is that people do bucket sorts based on the particular technical details which are present in the problem. This is correlated with the main ideas, but the two do not always coincide.

An example where technical sort works okay is Euclidean geometry. Here is a simple example: harmonic bundles in projective geometry. As I explain in my book, there are a few “basic” configurations involved:

  • Midpoints and parallel lines
  • The Ceva / Menelaus configuration
  • Harmonic quadrilateral / symmedian configuration
  • Apollonian circle (right angle and bisectors)

(For a reference, see Lemmas 2, 4, 5 and Exercise 0 here.) Thus from experience, any time I see one of these pictures inside the current diagram, I think to myself that “this problem feels projective”; and if there is a way to do so I try to use harmonic bundles on it.

An example where technical sort fails is the “pigeonhole principle”. A typical problem in such a class looks something like USAMO 2012/2.

Example 2 (USAMO 2012, Gregory Galperin)

A circle is divided into congruent arcs by {432} points. The points are colored in four colors such that some {108} points are colored Red, some {108} points are colored Green, some {108} points are colored Blue, and the remaining {108} points are colored Yellow. Prove that one can choose three points of each color in such a way that the four triangles formed by the chosen points of the same color are congruent.

It’s true that the official solution uses the words “pigeonhole principle” but that is not really the heart of the matter; the key idea is that you consider all possible rotations and count the number of incidences. (In any case, such calculations are better done using expected value anyways.)
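To make "count the number of incidences" concrete, here is a rough sketch of the arithmetic as I understand the standard argument. Each pair (kept point, point of the next color) is aligned by exactly one of the {431} nontrivial rotations, so some rotation aligns at least the average number of pairs. The function and constant names below are hypothetical, chosen just for this illustration.

```python
POINTS = 432            # equally spaced points on the circle
PER_COLOR = 108         # points of each color
ROTATIONS = POINTS - 1  # nontrivial rotations of the circle


def guaranteed_overlap(kept):
    """Averaging bound: some nontrivial rotation sends at least this many
    of the `kept` points onto points of the next color."""
    total_incidences = kept * PER_COLOR
    return -(-total_incidences // ROTATIONS)  # ceiling division


# Start with all red points, then rotate onto green, blue, yellow in turn,
# keeping only the points that land on the target color each time.
chain = [PER_COLOR]
for _ in range(3):
    chain.append(guaranteed_overlap(chain[-1]))
print(chain)  # [108, 28, 8, 3]
```

The last entry is the three points per color that the problem asks for, and the numbers {432} and {108} turn out to be chosen so that the ceilings work out with almost no room to spare.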

Now why is taxonomy a good thing for learning and teaching? The reason is that building connections and seeing similarities is most easily done by simultaneously presenting several related problems. I’ve actually mentioned this already in a different blog post, but let me give the demonstration again.

Suppose I wrote down the following:

\displaystyle  \begin{array}{lll} A1 & B11 & C8 \\ A9 & B44 & C27 \\ A49 & B33 & C343 \\ A16 & B99 & C1 \\ A25 & B22 & C125 \end{array}

You can tell what each of the {A}’s, {B}’s, {C}’s have in common by looking for a few moments. But what happens if I intertwine them?

\displaystyle  \begin{array}{lllll} B11 & C27 & C343 & A1 & A9 \\ C125 & B33 & A49 & B44 & A25 \\ A16 & B99 & B22 & C8 & C1 \end{array}

This is the same information, but now you have to work much harder to notice the association between the letters and the numbers they’re next to.

This is why, if you are an olympiad student, I strongly encourage you to keep a journal or blog of the problems you’ve done. Solving olympiad problems takes lots of time and so it’s worth it to spend at least a few minutes jotting down the main ideas. And once you have enough of these, you can start to see new connections between problems you haven’t seen before, rather than being confined to thinking about individual problems in isolation. (Additionally, it means you will never have to redo problems to which you forgot the solution; learn from my mistake here.)

3. Ten buckets of geometry

I want to elaborate more on geometry in general. These days, if I see a solution to a Euclidean geometry problem, then I mentally store the problem and solution into one (or more) buckets. I can even tell you what my buckets are:

  1. Direct angle chasing
  2. Power of a point / radical axis
  3. Homothety, similar triangles, ratios
  4. Recognizing some standard configuration (see Yufei for a list)
  5. Doing some length calculations
  6. Complex numbers
  7. Barycentric coordinates
  8. Inversion
  9. Harmonic bundles or pole/polar and homography
  10. Spiral similarity, Miquel points

which my dedicated fans probably recognize as the ten chapters of my textbook. (Problems may also fall in more than one bucket if for example they are difficult and require multiple key ideas, or if there are multiple solutions.)

Now whenever I see a new geometry problem, the diagram will often “feel” similar to problems in a certain bucket. Exactly what I mean by “feel” is hard to formalize — it’s a certain gut feeling that you pick up by doing enough examples. There are some things you can say, such as “problems which feature a central circle and feet of altitudes tend to fall in bucket 6”, or “problems which only involve incidence always fall in bucket 9”. But it seems hard to come up with an exhaustive list of hard rules that will do better than human intuition.

4. How do problems feel?

But as I said in my post on reading solutions, there are deeper lessons to teach than just technical details.

For examples of themes on opposite ends of the spectrum, let’s move on to combinatorics. Geometry is quite structured and so the themes in the main ideas tend to translate to specific theorems used in the solution. Combinatorics is much less structured and many of the themes I use in combinatorics cannot really be formalized. (Consequently, since everyone else seems to mostly teach technical themes, several of the combinatorics themes I teach are idiosyncratic, and to my knowledge are not taught by anyone else.)

For example, one of the unusual themes I teach is called Global. It’s about the idea that to solve a problem, you can just kind of “add up everything at once”, for example using linearity of expectation, or by double-counting, or whatever. In particular these kinds of approach ignore the “local” details of the problem. It’s hard to make this precise, so I’ll just give two recent examples.

Example 3 (ELMO 2013, Ray Li)

Let {a_1,a_2,\dots,a_9} be nine real numbers, not necessarily distinct, with average {m}. Let {A} denote the number of triples {1 \le i < j < k \le 9} for which {a_i + a_j + a_k \ge 3m}. What is the minimum possible value of {A}?

Example 4 (IMO 2016)

Find all positive integers {n} for which each cell of an {n \times n} table can be filled with one of the letters {I}, {M} and {O} in such a way that:

  • In each row and column, one third of the entries are {I}, one third are {M} and one third are {O}; and
  • in any diagonal, if the number of entries on the diagonal is a multiple of three, then one third of the entries are {I}, one third are {M} and one third are {O}.

If you look at the solutions to these problems, they have the same “feeling” of adding everything up, even though the specific techniques are somewhat different (double-counting for the former, diagonals modulo {3} for the latter). Nonetheless, my experience with problems similar to the former was immensely helpful for the latter, and it’s why I was able to solve the IMO problem.

5. Gaps

This perspective also explains why I’m relatively bad at functional equations. There are some things I can say that may be useful (see my handouts), but much of the time these are just technical tricks. (When sorting functional equations in my head, I have a bucket called “standard fare” meaning that you “just do work”; as far as I can tell this bucket is pretty useless.) I always feel stupid teaching functional equations, because I never have many good insights to offer.

Part of the reason is that functional equations often don’t have a main idea at all. Consequently it’s hard for me to do useful taxonomy on them.

Then sometimes you run into something like the windmill problem, the solution of which is fairly “novel”, not being similar to problems that come up in training. I have yet to figure out a good way to train students to be able to solve windmill-like problems.

6. Surprise

I’ll close by mentioning one common way I come up with a theme.

Sometimes I will run across an olympiad problem {P} which I solve quickly, and think should be very easy, and yet once I start grading {P} I find that the scores are much lower than I expected. Since the way I solve problems is by drawing experience from similar previous problems, this must mean that I’ve subconsciously found a general framework to solve problems like {P}, which is not obvious to my students yet. So if I can put my finger on what that framework is, then I have something new to say.

The most recent example I can think of when this happened was TSTST 2016/4 which was given last June (and was also a very elegant problem, at least in my opinion).

Example 5 (TSTST 2016, Linus Hamilton)

Let {n > 1} be a positive integer. Prove that we must apply the Euler {\varphi} function at least {\log_3 n} times before reaching {1}.

I solved this problem very quickly when we were drafting the TSTST exam, figuring out the solution while walking to dinner. So I was quite surprised when I looked at the scores for the problem and found out that empirically it was not that easy.

After I thought about this, I have a new tentative idea. You see, when doing this problem I really was thinking about “what does this {\varphi} operation do?”. You can think of {n} as an infinite tuple

\displaystyle  \left(\nu_2(n), \nu_3(n), \nu_5(n), \nu_7(n), \dots \right)

of prime exponents. Then the {\varphi} can be thought of as an operation which takes each nonzero component, decreases it by one, and then adds some particular vector back. For example, if {\nu_7(n) > 0} then {\nu_7} is decreased by one and each of {\nu_2(n)} and {\nu_3(n)} is increased by one. In any case, if you look at this behavior for long enough you will see that the {\nu_2} coordinate is a natural way to “track time” in successive {\varphi} operations; once you figure this out, getting the bound of {\log_3 n} is quite natural. (Details left as exercise to reader.)
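If you would like to watch this operation in action before doing the exercise, here is a small Python sketch. It is an experiment rather than a proof, and the helper names (`factorize`, `phi`, `phi_steps`, `trajectory`) are my own, not anything from the contest materials.

```python
def factorize(n):
    """Return {prime: exponent} for n >= 1, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def phi(n):
    """Euler's totient, computed from the prime factorization."""
    result = n
    for p in factorize(n):
        result = result // p * (p - 1)
    return result


def phi_steps(n):
    """Number of applications of phi needed to get from n down to 1."""
    steps = 0
    while n > 1:
        n = phi(n)
        steps += 1
    return steps


def trajectory(n):
    """Exponent vectors (as dicts) along n, phi(n), phi(phi(n)), ..., 1."""
    out = [factorize(n)]
    while n > 1:
        n = phi(n)
        out.append(factorize(n))
    return out


print(trajectory(7))  # [{7: 1}, {2: 1, 3: 1}, {2: 1}, {}]
# Empirical check of the claimed bound n <= 3^k for small n.
print(all(3 ** phi_steps(n) >= n for n in range(2, 2000)))
```

Printing `trajectory(n)` for a handful of values of {n} and staring at the exponent of {2} is a reasonable way to start guessing what is going on.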

Now when I read through the solutions, I found that many of them had not really tried to think of the problem in such a “structured” way, and had tried to directly solve it by, for example, trying to prove {\varphi(n) \ge n/3} (which is false) or something similar to this. I realized that had the students just ignored the task “prove {n \le 3^k}” and spent some time getting a better understanding of the {\varphi} structure, they would have had a much better chance at solving the problem. Why had I known that structural thinking would be helpful? I couldn’t quite explain it, but it had something to do with the fact that the “main object” of the question was “set in stone”; there were no “degrees of freedom” in it, and it was concrete enough that I felt like I could understand it. Once I understood how multiple {\varphi} operations behaved, the bit about {\log_3 n} almost served as an “answer extraction” mechanism.

These thoughts led to the recent development of a class which I named Rigid, which is all about problems where the point is not to immediately try to prove what the question asks for, but to first step back and understand completely how a particular rigid structure (like the {\varphi} in this problem) behaves, and to then solve the problem using this understanding.

On Reading Solutions

(Ed Note: This was earlier posted under the incorrect title “On Designing Olympiad Training”. How I managed to mess that up is a long story involving some incompetence with Python scripts, but this is fixed now.)

Spoiler warnings: USAMO 2014/1, and hints for Putnam 2014 A4 and B2. You may want to work on these problems yourself before reading this post.

1. An Apology

At last year’s USA IMO training camp, I prepared a handout on writing/style for the students at MOP. One of the things I talked about was the “ocean-crossing point”, which for our purposes you can think of as the discrete jump from a problem being “essentially not solved” ({0+}) to “essentially solved” ({7-}). The name comes from a Scott Aaronson post:

Suppose your friend in Boston blindfolded you, drove you around for twenty minutes, then took the blindfold off and claimed you were now in Beijing. Yes, you do see Chinese signs and pagoda roofs, and no, you can’t immediately disprove him — but based on your knowledge of both cars and geography, isn’t it more likely you’re just in Chinatown? . . . We start in Boston, we end up in Beijing, and at no point is anything resembling an ocean ever crossed.

I then gave two examples of how to write a solution to the following example problem.

Problem 1 (USAMO 2014)

Let {a}, {b}, {c}, {d} be real numbers such that {b-d \ge 5} and all zeros {x_1}, {x_2}, {x_3}, and {x_4} of the polynomial {P(x)=x^4+ax^3+bx^2+cx+d} are real. Find the smallest value the product

\displaystyle  (x_1^2+1)(x_2^2+1)(x_3^2+1)(x_4^2+1)

can take.

Proof: (Not-so-good write-up) Since {x_j^2+1 = (x+i)(x-i)} for every {j=1,2,3,4} (where {i=\sqrt{-1}}), we get {\prod_{j=1}^4 (x_j^2+1) = \prod_{j=1}^4 (x_j+i)(x_j-i) = P(i)P(-i)} which equals to {|P(i)|^2 = (b-d-1)^2 + (a-c)^2}. If {x_1 = x_2 = x_3 = x_4 = 1} this is {16} and {b-d = 5}. Also, {b-d \ge 5}, this is {\ge 16}. \Box

Proof: (Better write-up) The answer is {16}. This can be achieved by taking {x_1 = x_2 = x_3 = x_4 = 1}, whence the product is {2^4 = 16}, and {b-d = 5}.

Now, we prove this is a lower bound. Let {i = \sqrt{-1}}. The key observation is that

\displaystyle  \prod_{j=1}^4 \left( x_j^2 + 1 \right) 		= \prod_{j=1}^4 (x_j - i)(x_j + i) 		= P(i)P(-i).

Consequently, we have

\displaystyle  \begin{aligned} \left( x_1^2 + 1 \right) \left( x_2^2 + 1 \right) \left( x_3^2 + 1 \right) \left( x_4^2 + 1 \right) &= (b-d-1)^2 + (a-c)^2 \\ &\ge (5-1)^2 + 0^2 = 16. \end{aligned}

This proves the lower bound. \Box

You’ll notice that it’s much easier to see the key idea in the second solution: namely,

\displaystyle  \prod_j (x_j^2+1) = P(i)P(-i) = (b-d-1)^2 + (a-c)^2

which allows you to use the enigmatic condition {b-d \ge 5}.
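In case the second equality in that display is not obvious, here is one way to expand it. This is just my own bookkeeping, using {i^2 = -1}, {i^3 = -i}, {i^4 = 1}, and the fact that {P} has real coefficients, so that {P(-i)} is the complex conjugate of {P(i)}.

```latex
\begin{aligned}
P(i) &= i^4 + a i^3 + b i^2 + c i + d \\
     &= 1 - a i - b + c i + d \\
     &= (1 - b + d) + (c - a) i, \\
P(i) P(-i) &= P(i) \overline{P(i)} = |P(i)|^2 \\
     &= (b - d - 1)^2 + (a - c)^2 .
\end{aligned}
```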

Unfortunately I have the following confession to make:

In practice, most solutions are written more like the first one than the second one.

The truth is that writing up solutions is sort of a chore that people never really want to do but have to, much like washing dishes. So most solutions won’t be written in a way that helps you learn from them. This means that when you read solutions, you should assume that the thing you really want (i.e., the ocean-crossing point) is buried somewhere amidst a haystack of other unimportant details.

2. Diff

But in practice even the “better write-up” I mentioned above still has too much information in it.

Suppose you were explaining how to solve this problem to a friend. You would probably not start your explanation by saying that the minimum is {16}, achieved by {x_1 = x_2 = x_3 = x_4 = 1} — even though this is indeed a logically necessary part of the solution. Instead, the first thing you would probably tell them is to notice that

\displaystyle  \prod_{j=1}^4 \left( x_j^2 + 1 \right) = P(i)P(-i) 	= (b-d-1)^2 + (a-c)^2 \ge 4^2 = 16.

In fact, if your friend has been working on the problem for more than ten minutes, this is probably the only thing you need to tell them. They probably already figured out by themselves that there was a good chance the answer would be {2^4 = 16}, just based on the condition {b-d \ge 5}. This “one-liner” is all that they need to finish the problem. You don’t need to spell out to them the rest of the details.

When you explain a problem to a friend in this way, you’re communicating just the difference: the one or two sentences such that your friend could work out the rest of the details themselves with these directions. When reading the solution yourself, you should try to extract the main idea in the same way. Olympiad problems generally have only a few main ideas in them, from which the rest of the details can be derived. So reading the solution should feel much like searching for a needle in a haystack.

3. Don’t Read Line by Line

In particular: you should rarely read most of the words in the solution, and you should almost never read every word of the solution.

Whenever I read solutions to problems I didn’t solve, I often read less than 10% of the words in the solution. Instead I search aggressively for the one or two sentences which tell me the key step that I couldn’t find myself. (Functional equations are the glaring exception to this rule, since in these problems there sometimes isn’t any main idea other than “stumble around randomly”, and the steps really are all about equally important. But this is rarer than you might guess.)

I think a common mistake students make is to treat the solution as a sequence of logical steps: that is, reading the solution line by line, and then verifying that each line follows from the previous ones. This seems to entirely miss the point, because not all lines are created equal, and most lines can be easily derived once you figure out the main idea.

If you find that the only way that you can understand the solution is reading it step by step, then the problem may simply be too hard for you. This is because what counts as “details” and “main ideas” are relative to the absolute difficulty of the problem. Here’s an example of what I mean: the solution to a USAMO 3/6 level geometry problem, call it {P}, might look as follows.

Proof: First, we prove lemma {L_1}. (Proof of {L_1}, which is USAMO 1/4 level.)

Then, we prove lemma {L_2}. (Proof of {L_2}, which is USAMO 1/4 level.)

Finally, we remark that putting together {L_1} and {L_2} solves the problem. \Box

Likely the main difficulty of {P} is actually finding {L_1} and {L_2}. So a very experienced student might think of the sub-proofs {L_i} as “easy details”. But younger students might find {L_i} challenging in their own right, and be unable to solve the problem even after being told what the lemmas are: which is why it is hard for them to tell that {\{L_1, L_2\}} were the main ideas to begin with. In that case, the problem {P} is probably way over their head.

This is also why it doesn’t make sense to read solutions to problems which you have not worked on at all — there are often details, natural steps and notation, et cetera which are obvious to you if and only if you have actually tried the problem for a little while yourself.

4. Reflection

The earlier sections describe how to extract the main idea of an olympiad solution. This is neat because instead of having to remember an entire solution, you only need to remember a few sentences now, and it gives you a good understanding of the solution at hand.

But this still isn’t achieving your ultimate goal in learning: you are trying to maximize your scores on future problems. Unless you are extremely fortunate, you will probably never see the exact same problem on an exam again.

So one question you should often ask is:

“How could I have thought of that?”

(Or in my case, “how could I train a student to think of this?”.)

There are probably some surface-level skills that you can pick out of this. The lowest hanging fruit is things that are technical. A small number of examples, with varying amounts of depth:

  • This problem is “purely projective”, so we can take a projective transformation!
  • This problem had a segment {AB} with midpoint {M}, and a line {\ell} parallel to {AB}, so I should consider projecting {(AB;M\infty)} through a point on {\ell}.
  • Drawing a grid of primes is the only real idea in this problem, and the rest of it is just calculations.
  • This main claim is easy to guess since in some small cases, the frogs have “violating points” in a large circle.
  • In this problem there are {n} numbers on a circle, {n} odd. The counterexamples for {n} even alternate up and down, which motivates proving that no three consecutive numbers are in sorted order.
  • This is a juggling problem!

(Brownie points if any contest enthusiasts can figure out which problems I’m talking about in this list!)

5. Learn Philosophy, not Formalism

But now I want to point out that the best answers to the above question are often not formalizable. Lists of triggers and actions are “cheap forms of understanding”, because going through a list of methods will only get you so far.

On the other hand, the un-formalizable philosophy that you can extract from reading a question is part of that legendary “intuition” that people are always talking about: you can’t describe it in words, but it’s certainly there. Maybe it would even be better if I reframed the question as:

“What does this problem feel like?”

So let’s talk about our feelings. Here is David Yang’s take on it:

Whenever you see a problem you really like, store it (and the solution) in your mind like a cherished memory . . . The point of this is that you will see problems which will remind you of that problem despite having no obvious relation. You will not be able to say concretely what the relation is, but think a lot about it and give a name to the common aspect of the two problems. Eventually, you will see new problems for which you feel like could also be described by that name.

Do this enough, and you will have a very powerful intuition that cannot be described easily concretely (and in particular, that nobody else will have).

This itself doesn’t make sense without an example, so here is an example of one philosophy I’ve developed. Here are two problems on Putnam 2014:

Problem 2 (Putnam 2014 A4)

Suppose {X} is a random variable that takes on only nonnegative integer values, with {\mathbb E[X] = 1}, {\mathbb E[X^2] = 2}, and {\mathbb E[X^3] = 5}. Determine the smallest possible value of the probability of the event {X=0}.

Problem 3 (Putnam 2014 B2)

Suppose that {f} is a function on the interval {[1,3]} such that {-1\le f(x)\le 1} for all {x} and

\displaystyle  \int_1^3 f(x) \; dx=0.

How large can {\int_1^3 \frac{f(x)}{x} \; dx} be?

At a glance there seems to be nearly no connection between these problems. One of them is a combinatorics/algebra question, and the other is an integral. Moreover, if you read the official solutions or even my own write-ups, you will find very little in common joining them.

Yet it turns out that these two problems do have something in common to me, which I’ll try to describe below. My thought process in solving either question went as follows:

In both problems, I was able to quickly make a good guess as to what the optimal {X}/{f} was, and then come up with a heuristic explanation (not a proof) why that guess had to be correct, namely, “by smoothing, you should put all the weight on the left”. Let me call this optimal argument {A}.

That conjectured {A} gave a numerical answer to the actual problem: but for both of these problems, it turns out that numerical answer is completely uninteresting, as are the exact details of {A}. It should be philosophically interpreted as “this is the number that happens to pop out when you plug in the optimal choice”. And indeed that’s what both solutions feel like. These solutions don’t actually care what the exact values of {A} are, they only care about the properties that made me think they were optimal in the first place.

I gave this philosophy the name Equality, with poster description “problems where looking at the equality case is important”. This text description feels more or less useless to me; I suppose it’s the thought that counts. But ever since I came up with this name, it has helped me solve new problems that come up, because they would give me the same feeling that these two problems did.

Two more examples of these themes that I’ve come up with are Global and Rigid, which will be described in a future post on how I design training materials.

Against Perfect Scores

One of the pieces of advice I constantly give to young students preparing for math contests is that they should probably do harder problems. But perhaps I don’t preach this zealously enough for them to listen, so here’s a concrete reason (with actual math!) why I give this advice.

1. The AIME and USAMO

In the USA many students who seriously prepare for math contests eventually qualify for an exam called the AIME (American Invitational Mathematics Examination). This is a 3-hour exam with 15 short-answer problems; the median score is maybe about 5 problems.

Correctly solving maybe 10 of the problems qualifies for the much more difficult USAMO. This national olympiad is much more daunting, with six proof-based problems given over nine hours. It is not uncommon for olympiad contestants to not solve a single problem (this certainly happened to me a fair share of times!).

You’ll notice the stark difference in the scale of these contests (Tanya Khovanova has a longer complaint about this here). For students who are qualifying for USAMO for the first time, the olympiad is terrifying: I certainly remember the first time I took the olympiad with a super lofty goal of solving any problem.

Now, my personal opinion is that the difference between AIME and USAMO is generally exaggerated, and less drastic than appearances suggest. But even then, the psychological fear is still there — so what do you think happens to this demographic of students?

Answer: they don’t move on from AIME training. They think, “oh, the USAMO is too hard, I can only solve 10 problems on the AIME so I should stick to solving hard problems on the AIME until I can comfortably solve most of them”. So they keep on working through old AIME papers.

This is a bad idea.

2. Perfect Scores

To understand why this is a bad idea, let’s ask the following question: how good do you have to be to consistently get a perfect score on the AIME?

Consider first a student who averages a score of {10} on the AIME, which is a fairly comfortable qualifying score. For illustration, let’s crudely simplify and assume that on a 15-question exam, he has an independent {\frac23} probability of getting each question right. Then the chance he sweeps the AIME is

\displaystyle \left( \frac23 \right)^{15} \approx 0.228\%.

This is pretty low, which makes sense: {10} and {15} on the AIME feel like quite different scores.

Now suppose we bump that up to averaging {12} problems on the AIME, which is almost certainly enough to qualify for the USAMO. This time, the chance of sweeping is

\displaystyle \left( \frac{4}{5} \right)^{15} \approx 3.52\%.

This should feel kind of low to you as well. So if you consistently solve {80\%} of problems in training, your chance at netting a perfect score is still dismal, even though on average you’re only three problems away.

Well, that’s annoying, so let’s push this as far as we can: consider a student who’s averaging {14} problems (thus, {93\%} success), id est a near-perfect score. Then the probability of getting a perfect score is

\displaystyle \left( \frac{14}{15} \right)^{15} \approx 35.5\%.

Which is... just over {\frac 13}.

At which point you throw up your hands and say, what more could you ask for? I’m already averaging one less than a perfect score, and I still don’t have a good chance of acing the exam? This should feel very unfair: on average you’re only one problem away from full marks, and yet doing one problem better than normal is still a splotchy hit-or-miss.

3. Some Combinatorics

Those of you who know either statistics or combinatorics might be able to see what’s going on now. The problem is that

\displaystyle (1-\varepsilon)^{15} \approx 1 - 15\varepsilon

for small {\varepsilon}. That is, if your accuracy is even a little {\varepsilon} away from perfect, that difference gets amplified by a factor of {15} against you.
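For what it's worth, the linear approximation can be sandwiched between two standard bounds, which also shows how it behaves when {\varepsilon} is not so small. The left inequality is Bernoulli's inequality and the right one comes from {1 - \varepsilon \le e^{-\varepsilon}}; the choice {\varepsilon = \frac{1}{15}} is just the "average {14}" student from above.

```latex
1 - 15\varepsilon \;\le\; (1-\varepsilon)^{15} \;\le\; e^{-15\varepsilon},
\qquad \text{e.g. } \varepsilon = \tfrac{1}{15}: \quad
0 \;\le\; \left(\tfrac{14}{15}\right)^{15} \approx 0.355 \;\le\; e^{-1} \approx 0.368 .
```

So for the near-perfect student the exponential upper bound is already quite close to the truth, while the linear estimate has become useless.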

Below is a nice chart that shows you, based on this oversimplified naïve model, how likely you are to do a little better than your average.

\displaystyle \begin{array}{lrrrrrr} \textbf{Avg} & \ge 10 & \ge 11 & \ge 12 & \ge 13 & \ge 14 & \ge 15 \\ \hline \mathbf{10} & 61.84\% & 40.41\% & 20.92\% & 7.94\% & 1.94\% & 0.23\% \\ \mathbf{11} & & 63.04\% & 40.27\% & 19.40\% & 6.16\% & 0.95\% \\ \mathbf{12} & & & 64.82\% & 39.80\% & 16.71\% & 3.52\% \\ \mathbf{13} & & & & 67.71\% & 38.66\% & 11.69\% \\ \mathbf{14} & & & & & 73.59\% & 35.53\% \\ \mathbf{15} & & & & & & 100.00\% \\ \end{array}
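If you want to play with the model yourself, the chart can be reproduced with a few lines of Python. This is a sketch under the same oversimplified assumption as in the text (each of the 15 questions is solved independently with probability equal to average divided by 15); the function name `at_least` is just something I made up.

```python
from math import comb

QUESTIONS = 15


def at_least(k, average, n=QUESTIONS):
    """P(score >= k) when each of n questions is solved independently
    with probability average / n (the simplified model from the text)."""
    p = average / n
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# One row per average score, one column per threshold 10, 11, ..., 15.
for average in range(10, 16):
    cells = [
        f"{100 * at_least(k, average):6.2f}%" if k >= average else " " * 7
        for k in range(10, 16)
    ]
    print(f"{average:>3}", *cells)
```

Changing `QUESTIONS` or the range of averages is an easy way to see that the same repulsion effect shows up for other exam lengths too.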

Even if you’re not aiming for that lofty perfect score, we see the same repulsion effect: it’s quite hard to do even a little better than average. If you get an average score of {k}, the probability of getting at least {k+1} looks to be about {\frac25}. As for {k+2}, the chances are even more dismal. In fact, merely staying afloat (getting at least your average score) isn’t a comfortable proposition.

And this is in my simplified model of “independent events”. Those of you who actually take the AIME know just how costly small arithmetic errors are, and just how steep the difficulty curve on this exam is.

All of this goes to show: to reliably and consistently ace the AIME, it’s not enough to be able to do 95% of AIME problems (which is already quite a feat). You almost need to be able to solve AIME problems in your sleep. On any given AIME some people will get luckier than others, but coming out with a perfect score every time is a huge undertaking.

4. 90% Confidence?

By the way, did I ever mention that it’s really hard to be 90% confident in something? In most contexts, 90% is a really big number.

If you don’t know what I’m talking about:

take three or four minutes and do the following quiz.

This is also the first page of this worksheet. The idea of this quiz is to give you a sense of just how high 90% is. To do this, you are asked 10 numerical questions and must provide an interval which you think the answer lies within with probability 90%. (So ideally, you would get exactly 9 intervals correct.)

As a hint: almost everyone is overconfident. Second hint: almost everyone is overconfident even after being told that their intervals should be embarrassingly wide. Third hint: I just tried this again and got a low score.
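To get a feel for what a low score means, here is a rough sketch. It assumes a perfectly calibrated quiz-taker whose 10 intervals each contain the true answer independently with probability 0.9; the independence is my simplifying assumption, and the function name is hypothetical.

```python
from math import comb


def prob_at_most(hits, questions=10, p=0.9):
    """P(at most `hits` intervals contain the answer), assuming each
    interval is correct independently with probability p."""
    return sum(comb(questions, k) * p**k * (1 - p) ** (questions - k)
               for k in range(hits + 1))


for hits in (9, 8, 7, 6, 5):
    print(f"P(at most {hits} correct) = {prob_at_most(hits):.4f}")
```

Under these assumptions, a calibrated person gets 6 or fewer intervals right only about 1.3% of the time, so a score that low points to overconfidence rather than bad luck.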

(For more fun of this form: calibration game.)

5. Practice

So what do you do if you really want to get a perfect score on the AIME?

Well, first of all, my advice is that you have better things to do (like USAMO). But even if you are unshakeable in your desire to get a 15, my advice still remains the same: do some USAMO problems.

Why? The reason is that going from average {14} to average {15} means going from 95% accuracy to 99% accuracy, as I’ve discussed above.

So what you don’t want to do is keep doing AIME problems. You are not using your time well if you get 95% accuracy in training. I’m well on record saying that you learn the most from problems that are just a little above your ability level, and massing AIME problems is basically the exact opposite of that. You’d maybe only run into a problem you couldn’t solve once every 10 or 20 or 30 problems. That’s just grossly inefficient.

The way out of this is to do harder problems, and that’s why I explicitly suggest people start working on USAMO problems even before they’re 90% confident they will qualify for it. At the very least, you certainly won’t be bored.

Stop Paying Me Per Hour

Occasionally I am approached by parents who ask me if I am available to teach their child in olympiad math. This is flattering enough that I’ve even said yes a few times, but I’m always confused why the question is “can you tutor my child?” instead of “do you think tutoring would help, and if so, can you tutor my child?”.

Here are my thoughts on the latter question.

Charging by Salt

I’m going to start by clearing up the big misconception which inspired the title of this post.

The way tutoring works is very roughly like the following: I meet with the student once every week, with custom-made materials. Then I give them some practice problems to work on (“homework”), which I also grade. I throw in some mock olympiads. I strongly encourage my students to email me with questions as they come up. Rinse and repeat.

The actual logistics vary; for example, for small in-person groups I prefer to do every other week for 3 hours. But the thing that never changes is how the parents pay me. It’s always the same: I get {N \gg 0} dollars per hour for the actual in-person meeting, and 0 dollars per hour for preparing materials, grading homework, responding to questions, and writing the mock olympiads.

Now I’m not complaining, because {N} is embarrassingly large. But one day I realized that this pricing system is giving parents the wrong impression. They now think the bulk of the work is from me taking the time to meet with their child, and that the homework is to help reinforce what I talk about in class. After all, this is what high school does, right?

I’m here to tell you that this is completely wrong.

It’s the other way around: the class is meant to supplement the homework. Think salt: for most dishes you can’t get away with having zero salt, but you still don’t price a dish based on how much salt is in it. Similarly, you can’t excise the in-person meeting altogether, but the dirty secret is that the classtime isn’t the core component.

I mean, here’s the thing.

  • When you take the IMO, you are alone with a sheet of paper that says “Problem 1”, “Problem 2”, “Problem 3”.
  • When you do my homework, you are alone with a sheet of paper that says “Problem 1”, “Problem 2”, “Problem 3”.
  • When you’re in my class, you get to see my beautiful smiling face plus a sheet of paper that says “Theorem 1”, “Example 2”, “Example 3”.

Which of these is not like the others?

Active Ingredients

So we’ve established that the main active ingredient is actually you working on problems alone in your room. If so, why do you need a teacher at all?

The answer depends on what the word “need” means. No USA IMO contestant in my recent memory has had a coach, so you don’t need a coach. But there are some good reasons why one might be helpful.

Some obvious reasons are social:

  • Forces you to work regularly; though most top students don’t really have a problem with self-motivation
  • You have a person to talk to. This can be nice if you are relatively isolated from the rest of the math community (e.g. due to geography).
  • You have someone who will answer your questions. (I can’t tell you how jealous I am right now.)
  • Feedback on solutions to problems. This includes the student’s written solutions (stylistic remarks, or things like “this lemma you proved in your solution is actually just a special case of X” and so on) as well as explaining solutions to problems the student fails to solve.

In short, it’s much more engaging to study math with a real person.

Those reasons don’t depend so much on the instructor’s actual ability. Here are some reasons which do:

  • Guidance. An instructor can tell you what things to learn or work on based on their own experience in the past, and can often point you to things that you didn’t know existed.
  • It’s a big plus if the instructor has good taste in problems. Some problems are bad and don’t teach you anything; some (old) problems don’t resemble the flavor of problems that actually appear on olympiads. On the flip side, some problems are very instructive or very pretty, and it’s great if your teacher knows what these are.
  • Ideally, also good taste in topics. For example, I strongly object to classes titled “collinearity and concurrence” because this may as well be called “geometry”, and I think that such global classes are too broad to do anything useful. Conversely, examples of topics I think should be classes but aren’t: “looking at equality cases”, “explicit constructions”, “Hall’s marriage theorem”, “greedy algorithms”. I make this point a lot more explicitly in Section 2 of this blog post of mine.

In short, you’re also paying for the material and expertise. Past IMO medalists know how the contest scene works. Parents and (beginning) students less so.

Lastly, the reason which I personally think is most important:

  • Conveys strong intuition/heuristics, both globally and for specific problems. It’s hard to give concrete examples of this, but a few global ones I know were particularly helpful for me: “look at maximal things” (Po-Shen Loh on greedy algorithms), “DURR WE WANT STUFF TO CANCEL” (David Yang on FE’s), “use obvious inequalities” (Gabriel Dospinescu on analytic NT), which are take-aways that have gotten me a lot of points. This is also my biggest criterion for evaluating my own written exposition.

You guys know this feeling, I’m sure: when your English teacher assigns you a passage to read, the fastest way to understand it is to not read the passage but to ask the person sitting next to you what it’s saying. I think this is in part because most people are awful at writing and don’t even know how to write for other human beings.

The situation in olympiads is the same. I estimate listening to me explain a solution is maybe 4 to 10 times faster than reading the official solution. Turns out that writing up official solutions for contests is a huge chore, so most people just throw a sequence of steps at the reader without even bothering to identify the main ideas. (As a contest organizer, I’m certainly guilty of this laziness too!)

Aside: I think this is a large part of why my olympiad handouts and other writings have been so well-received. Disclaimer: this was supposed to be a list of what makes a good instructor, but due to narcissism it ended up being a list of things I focus on when teaching.

Caveat Emptor

And now I explain why the top IMO candidates still got by without teachers.

It turns out that the amount of math preparation time that students put in doesn’t seem to follow a normal distribution. It follows a log-normal distribution. And the reason is this: it’s hard to do a really good job on anything you don’t think about in the shower.

Officially, when I was a contestant I spent maybe 20 hours a week doing math contest preparation. But the actual amount of time is higher. The reason is that I would think about math contests more like 24/7. During English class, I would often be daydreaming about the inequality I worked on last night. On the car ride home, I would idly think about what I was going to teach my middle school students the next week. To say nothing of showers: during my showers I would draw geometry diagrams on the wall with water on my finger.

So spiritually, I maybe spent 10 times as much time on math olympiads compared to an average USA(J)MO qualifier.

And that factor of 10 is enormous. Even if I as a coach can cause you to learn two or three or four times more efficiently, you will still lose to that factor of 10. I’d guess my actual multiplier is somewhere between 2 and 3, so there you go. (Edit: this used to say 3 to 4, I think that’s too high now.)

The best I can do is hope that, in addition to making my students’ training more efficient, I also cause my students to like math more.

On Problem Sets

(It appears to be May 7 — good luck to all the national MathCounts competitors tomorrow!)

1. An 8.044 Problem

Recently I saw an 8.044 physics problem set which contained the problem

Consider a system of {N} almost independent harmonic oscillators whose energy in a microcanonical ensemble is given by {E = \frac 12 \hbar \omega N + \hbar \omega M}. Show that the number of ways this energy can be obtained is {\frac{(M+N-1)!}{M!(N-1)!}}.

Once you remove the physics fluff, it immediately reduces to

Show the number of nonnegative integer solutions to {M = \sum_{i=1}^N n_i} is {\frac{(M+N-1)!}{M!(N-1)!}}.

And as anyone who has done lots of math contests knows, this is the famous stars and bars problem (also known as balls and urns).
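For readers who have not seen it, the trick is to arrange {M} stars and {N-1} bars in a row, so the count is the binomial coefficient {\binom{M+N-1}{M}}. Here is a small sketch that checks the formula by brute force on small cases; the function names are my own illustrative choices, and the check is of course not a proof.

```python
from itertools import product
from math import comb


def count_solutions(M, N):
    """Brute force: count tuples (n_1, ..., n_N) of nonnegative
    integers with n_1 + ... + n_N = M."""
    return sum(1 for ns in product(range(M + 1), repeat=N) if sum(ns) == M)


def stars_and_bars(M, N):
    """Closed form (M+N-1)! / (M! (N-1)!), i.e. C(M+N-1, M)."""
    return comb(M + N - 1, M)


# Compare the two counts on a grid of small values.
for M in range(6):
    for N in range(1, 5):
        assert count_solutions(M, N) == stars_and_bars(M, N)
```

The brute force is exponential in {N}, so it is only useful as a sanity check on small inputs.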

This made me really upset when I saw it, for two reasons. First, the main difficulty of the question isn’t related to the physics at hand at all. Once you plug in the definition you get a fairly elegant combinatorics problem, not a physics problem. Second, although the solution to the (unrelated) combinatorics is nice, it’s very tricky. I don’t think I could have come up with it easily if I hadn’t seen it before. Either you’ve seen the stars-and-bars trick before and the problem is trivial, or you haven’t seen the trick, and you could easily spend a couple of hours trying to come up with a solution, and none of that time is teaching you any physics.

You can see why a physics instructor might give this as a homework problem. The solution is short and elementary, something that an undergraduate student could understand and write down. But somewhere at MIT, some poor non-mathematician just spent a good chunk of their evening struggling with this one-trick classic and probably not learning much from it.

2. Don’t I Like Hard Problems?

Well, “not learning much from it” is not entirely accurate…

Something that bothered me (and which I hope also bothers the reader) was that I complained that the problem was “tricky”. That seems off, because as you might already know, I like hard problems; in fact, in high school I was well despised for helping teachers find hard extra credit problems to pose. (“Hard” isn’t quite the same as “tricky”, but that’s a different direction altogether.) After all, hard problems from math contests taught me to think, isn’t that right?

Well, maybe what’s wrong is that there’s no physics in the hard part of the problem; the bonus problems I provided for my teachers were all closely tied to the material at hand. But that doesn’t seem right either. Euclidean geometry might be useless outside of high school, but nonetheless all the time I spent developing barycentric coordinates still made me a smarter person. Similarly, Richard Rusczyk will often tell you that geometry problems trained him for running the business that is now the Art of Problem Solving. For exactly the same reason, thinking about the stars and bars problem is certainly good for the mind, isn’t that right? Why was I upset about it?

Well, I still hold my objection that there’s no physics in the problem. Why? So at this point we’re naturally led to ask: what was the point of the problem set in the first place? And to answer this, you have to ask: what was the point of the class in the first place?

On paper, it’s to learn physics. Is that really all? Maybe the professor thinks it’s important to teach students how to think as well. Does she? And the answer here is I really don’t know, because I have no idea who’s teaching the class. So I’ll instead ask the more idealistic question: should she?

And surprisingly, I think the answer can be very different from place to place.

On one extreme, I think high school math should be mainly about teaching students to think. Virtually none of the students will actually use the specific content being taught in the class. Why does the average high school student need to know what {\int_{[0,1]} x^2 \; dx} is? They don’t, and that shouldn’t be the point of the class; not the least of reasons being that in ten years half of them won’t even remember what {\int} means anymore.

But on the other extreme, if you have a math major trying to learn the undergraduate curriculum the picture can change entirely, just because there is so much math to cover. It’s kind of ridiculous, honestly: take the average incoming freshman and the average senior math major, and the latter will know so much more than the former. So in this case I would be much more worried about the content of the course; assuming for example that I’m hoping to be a math major, the chance that the (main ideas of) the specific content will be useful later on is far higher.

This is especially true for, say, students who did math contests extensively in high school, because that ability to solve hard problems is already there; it’s not an interesting use of time to be slowly doing challenging exercises in group theory when there’s still modules, rings, fields, categories, algebraic geometry, homological algebra, all untouched (to say nothing of analysis).

What this boils down to is trying to distinguish between the actual content of the given class (something very local) versus the more general skill of problem-solving or thinking. In high school I focused almost exclusively on the latter; as time passes I’ve been shifting my focus farther and farther to the former.

3. {\text{A} \ge 90\%}

Now suppose that we are interested in teaching how to think on these problem sets. There’s one other difference between the problem sets and math contests. You’re expected to finish your problem sets and you’re not expected to finish math contests.

I want to complain that there seems to be a stigma that you have to do exercises in order to learn math or physics or whatever, and that people who give up on them are somehow lazy or something. It is true in some sense that you can only learn math by doing. It is probably true that thinking about a hard problem will teach you something. What is not true is that you should always stare at a problem until either it or you cracks.

This is obviously true in math contests too. One of the things I was really bad at was giving up on a problem after hours of no progress. In some sense the time limit of contests is kind of nice; it cuts you off from spending too long on any one problem. You can’t be expected to be able to solve all hard problems, or else they’re not hard.

Problem sets fare much more poorly in this respect. The benefit of thinking about the hard problem diminishes over time (e.g. a typical exercise can teach you more in the first hour than it does in the next six) and sometimes you’re just totally dead in the water after a couple hours of staring. The big guys seem to implicitly tell you that you should keep working because it’s supposed to be hard. Is that really true? It certainly wasn’t true in the math contest world, so I don’t see any reason why it’s true here.

In other words, I don’t think our poor physics student would have lost much by giving up on balls and urns after a few hours. And really, for all the warnings that looking up problems online is immoral, is asking your friend to help really that different?

Writing

In high school, I hated English class and thought it was a waste of time. Now I’m in college, and I still hate English class and think it’s a waste of time. (Nothing on my teachers, they were all nice people, and I hope they’re not reading this.)

However, I no longer think writing itself is a waste of time. Otherwise, I wouldn’t be blogging, even about math. This post explains why I changed my mind.

1. Guts

My impression is that teachers in high school got it all wrong.

In high school, students are told to learn algebra because “we all use math every day”. This is obviously false, and yet somehow the students are eventually led to believe it.

You can’t actually be serious. Do people really think that knowing the Pythagorean Theorem will help in your daily life? I sure don’t, and I’m an aspiring mathematician. (Tip: Even real mathematicians stopped doing Euclidean geometry ages ago.) It’s hilarious when you think about it. We’ve convinced millions of kids all over the country that they’re learning math because it’s useful in their lives, and they grudgingly believe it.

The actual answer of why we teach math in schools is that it is supposed to teach students how to think. But even the teachers have lost sight of this. Most high school math teachers are now just interested in making sure their students can “do” certain classes of problems in a short time, where “do” here doesn’t refer to solving the problem but regurgitating the solution that’s already been presented. The process is so repetitive and artificial that in high school I wrote computer programs to do my homework for me, because all the “problems” were just the same thing with numbers changed. If you’re interested in just how far off math is, I encourage you to read Lockhart’s Lament.

How can this happen? I think the answer is that many high schoolers don’t really have the guts to think, “my math teachers don’t have a clue”, even though they like to joke about it. I have the guts to say this now because I know lots of math. And it’s amazing to know that millions and millions of people are just plain wrong about something I believe in.

But on to the topic of this post…

2. The world lied to me

I was always told that the purpose of English class was to learn to write. Why is this important? Because it was important to be able to communicate my ideas.

Dead wrong. Somehow the skill of being able to argue on the nature of love in Romeo and Juliet was going to help me when I was writing a paper on Evan’s Theorem years down the road? That’s what my parents said. It sounds absurd when I put it this way, but people believe it. (And let’s not forget the fact that theorems are named by last name…)

I claim that the situation is just like math. People are just being boneheads. As it turns out, the standard structure of an English essay is nothing more than a historical accident. Even the fact that essays are about literature is a historical accident. But that’s beyond the scope of what I have to say.

So what is the purpose of writing? It turns out that there is one, and that it has nothing to do with communication. It’s that writing clarifies thinking.

3. Writing lets you see everything

“I sometimes find, and I am sure you know the feeling, that I simply have too many thoughts and memories crammed into my mind…. At these times… I use the Pensieve. One simply siphons the excess thoughts from one’s mind, pours them into the basin, and examines them at one’s leisure.”

— Harry Potter and the Goblet of Fire

Here’s some advice to all of you still doing math contests: start keeping track of the problems you solve.

There are superficial reasons for doing this. A few days ago I was trying to write a handout on polynomials, and I was looking for some problems on irreducibility. I knew I had seen and done a bunch of these problems in the past, but of course like most people I hadn’t bothered to keep track of every problem I did, so I could only remember a few off the top of my head. So I had to go through the painful process of looking through my old posts on the Art of Problem Solving forums, searching through old databases, mucking through pages of garbage looking for problems that I did ages ago that I could use for my handout. And all the time I was thinking, “man, I should have kept track of all the problems I did”.

But there are deeper reasons for this. As I started collating the problems and solutions into a list, I started noticing some themes in the solutions that I never noticed before. For example, basically every solution started with the line “Assume for contradiction that {f} is not irreducible and write {f = g \cdot h}”. And then from there, one of three things happened.

  • The problem would take the coefficients modulo some prime or prime power, and then deduce some things about {g} and {h}. Obviously this only worked on the problems with integer coefficients.
  • The problem would start looking at absolute values of the coefficients and try to achieve some bound that showed the polynomial had to reduce in a certain way.
  • If the problem had multiple variables, the solution would reduce to a case with just one variable. This was always the case with problems that had complex coefficients as well.

You can’t really be serious — I’m only noticing this now? Here I was, already a retired contestant, looking at problems I had done long long ago and only realizing now there was a common theme. I had already done all the work by having done all the problems. The only difference was that I didn’t write anything down; as a result I could only look at one problem at a time.

Needless to say, I was very angry for the rest of the day.

4. External and Working Memory

Why does this happen? More profoundly, it turns out that humans have a finite working memory. You can only keep so many things in your head at once. That’s why it’s a stupid idea to not write down problems and (sketches of) solutions after you solve them and keep them somewhere you can look at.

I probably did at least 1000 olympiad problems over the course of my life. Did I manage to keep all the solutions in my head? Of course not. That’s why at the IMO in 2014, I didn’t try a maximality argument despite the {\sqrt n} in the problem. I think if I had kept better records I wouldn’t have missed this. How else do you get exactly {\sqrt n} in the lower bound? It’s not even an integer! Poof. There goes my neat 42.

I didn’t realize this wasn’t just a math thing until much later. I was talking about something along these lines during my interview for Harvard College; my interviewer was an artist. When I was talking about writing things down because I couldn’t keep them all in my head, he said something that surprised me — his easel was covered with sticky notes where he wrote down any ideas that occurred to him. He called it “external memory”, a term I still use now.

It’s actually obvious when you think about it. Why do people have to-do lists and calendars and reminders? Because you can’t keep track of everything in your head. You can try and might even get good at it, but you’ll never do as well as the old-fashioned pen and paper.

This isn’t just about “I need to remember to do {X} in exactly {Y} time”. There’s a reason we use blackboards during math lectures instead of just talking. The ideas in math are really, really hard, because math is only about ideas, and nothing else. If the professors didn’t write the steps on the board, no one would be able to keep more than two or three steps in their head at once. The difficulty is only compounded by the fact that math has its own notation. We didn’t develop this notation because we were bored. We developed notation because the ideas we’re trying to express are so complex that the English language can’t even express them. In other words, mathematicians were forced to create a whole new set of symbols just to write down their ideas.

5. An Imperfect Analogy to Teaching

But so far I haven’t really argued anything other than “if you want to remember something you better write it down”. There’s a difference between a to-do list and an exposition. One is just a collection of disconnected bullet points. The other needs to do more: it needs to explain.

The following quote is excerpted from Richard Rusczyk’s article “Learning Through Teaching”.

You can’t just “kind of get it” or know it just well enough to get by on a test; teaching calls for complete understanding of the concept.

  • How do you know that?
  • When would you use that?
  • How could you come up with that in the first place?

If you can’t answer these questions for something you “know”, then you can’t teach it.

I knew this was true from my own experiences teaching, but it took me more time to realize that writing well is a similar skill. The difference is the medium: when you’re teaching in person, you get real-time feedback on whether what you said makes sense. You don’t get this live feedback when you’re writing, and so you need to be much more careful. Yet all the nuances of teaching are still there — distinguishing between details, main ideas, hardest steps; deciding what can be worked out from what other things, even deciding which things are worth including and which things should be omitted.

This all really started to become obvious to me when I started my olympiad geometry textbook. In senior year of high school, I decided that I had a good enough understanding of olympiad geometry to write a textbook on it. I felt like I could probably do better than all the existing resources; not as hard as it sounds, since to my knowledge there aren’t any dedicated books for olympiad geometry.

After I had around 200 pages written, I realized that I had gotten a lot better at geometry. There were lots of things that happened in the process of thinking about the best way to teach geometry.

  1. Most basically, I did in fact fill in gaps in my knowledge. For example, I studied projective transformations for the first time in order to write the corresponding section in my book. The ideas definitely clicked much faster when I was thinking about how to teach it.
  2. I made new connections. I realized for the first time that symmedians and harmonic quadrilaterals are actually the same concept; I discovered a lemma about directed angles that I wished I had known before; I found a new proof of Menelaus using an elegant strategy I had used on Monge’s Theorem. None of this would have happened from just doing problems.
  3. Most profoundly, I got a much better understanding of when to apply certain techniques. One of the main goals of my book was to make solutions natural: a reader should be able to understand where a solution came from. That meant that on every page I was constantly fighting to try and explain how I had thought of something. This unending reflection was exhausting and reduced me to a rate of about one page written per hour, but it improved my own ability significantly. (Conveniently, this process is something that just requires a laptop, not even paper and pencil, so I got a lot of pages written during office assistant.)

Ultimately what this exemplifies is that trying to explain something lets you understand it better. And that’s in part because you can only manage so many things in your head at once. If you think keeping track of your appointments in your head is hard, try doing that with a complex argument. Can’t do it. Writing solves this problem.

6. Finding the Truth

But that’s not a perfect analogy. What I’ve presented above is a model where you have ideas in your head and you output them onto paper. This isn’t totally accurate, because as you write, something else can happen: the ideas can change.

I’ll draw an analogy from painting, again courtesy of Paul Graham.

The model of painting I used to have is that you would have something you want to draw, and then you sit down and draw it, then polish up the details. (That’s how I did all my high school art projects, anyways.) But this turns out to not be true: Countless paintings, when you look at them in x-rays, turn out to have limbs that have been moved or facial features that have been readjusted. I was surprised when I first read this. But it makes sense if you think about it: how can you be sure what’s in your head is what you want if you can’t even see it yet?

I propose that writing does the same thing. I don’t start by thinking “these are the ideas and I will now write them down”. Rather, I just write my thoughts down, not sure where they’re going to end up. That’s how my geometry textbook actually got written. I didn’t start with a table of contents. I started by putting down ideas, finding the connections between them, noticing new things I hadn’t before. I created new sections on the fly as the need arose, added new things as I thought of them, and let the whole thing sort itself out with a simple \tableofcontents. You can even think of the table of contents as a natural bucket sort: put down related ideas near each other, add section headers as needed, and bam, you have an outline of the main ideas. And I never know what this outline will look like until it’s actually been written.

By the same token, revising shouldn’t be the art of modifying the presentation of an idea to be more convincing. It should be the art of changing the idea itself to be closer to the truth, which will automatically make it more convincing. This is consistent with the Latin: the word “revise” literally means “see again”.

This is where high school and college essays get it really wrong. In a college essay, the goal is to “sell an idea” to the reader. If something in the essay looks unconvincing, you fix it by trickery: re-writing it in a way that it sounds more convincing without changing the underlying idea. The way you say something goes a long way in selling it. That’s what English class should have taught you. Sure, some teachers tell you to make concessions or counterarguments, but you’re doing this to try and pretend to be “honest”. You only write such things with an agenda in mind.

But since when are you always right? That’s absurd. The English class model is “I have a thesis that I know is right, and now I’m going to explain to the reader why”. But how can you know you’re right about a thesis before you’ve written it down? If the thesis and its accompanying argument are even remotely complex, it wouldn’t have been possible to sort through the whole thing in your head. Worse still, if the thesis is nontrivial, odds are that someone who is about as smart as you will disagree with you. And as Yan Zhang often reminds the SPARC attendees, you should really only expect to be right about half the time when you disagree with someone about as smart as you. If an essay is supposed to move you closer to the truth, and your original thesis is wrong half the time, do you scrap half your essays? Unfortunately, I don’t think you’d ever pass English class that way.

The culture that’s been instilled, where the goal of writing is to convince, is intellectually dishonest. I might even go so far as to say it’s dangerous; I’ll have to think about that for a while. There are times when you do want to write to convince others (grant proposals, anyone?) but it seems highly unfortunate that this type of writing has become synonymous with writing as a whole.

7. Conclusion

So this post has a few main ideas. The main purpose of writing is not in fact communication, at least not if you’re interested in thinking well. Rather, the benefits (at least the ones I perceive) are

  • Writing serves as an external memory, letting you see all your ideas and their connections at once, rather than trying to keep them in your head.
  • Explaining the ideas forces you to think well about them, the same way that teaching something is only possible with a full understanding of the concept.
  • Writing is a way to move closer to the truth, rather than to convince someone what the truth is.

So now I’ll tell you how I actually wrote my geometry book, or this blog post, or any of my various olympiad articles. It starts because I have an idea, just a passing thought, like “this would be a good way to explain Maschke’s Theorem”. Some time later I’ll have another such thought which is related to the first. Then a third. My memory is especially bad, so pretty soon it bothers me so much that I have to write it down, because I’m starting to lose track. And as I write the first ideas down, I start noticing new ideas, so I add in these ideas, and then more new ideas start flooding in. There are so many things I want to say and I just keep writing them down. That’s how I ended up with a 400-page textbook written from what originally was just meant to be a short article. There were too many things to say that other people hadn’t said yet, and I just had to write them all down. The miraculous thing is that these ideas naturally sorted themselves out. The bulleted main ideas I listed above weren’t things I realized until I looked at the resulting table of contents.

I’m sometimes told by people I respect that they like my writing. But I think this actually just translates to “I like the ideas in your writing”, and so I take it as a big compliment.

Why do roots come in conjugate pairs?

This is an expanded version of an answer I gave to a question that came up while I was assisting the 2014-2015 WOOT class. It struck me as an unusually good way to motivate higher math using stuff that people notice in high school but for some reason decide to not think about.

In high school precalculus, you’ll often be asked to find the roots of some polynomial with integer coefficients. For instance,

\displaystyle x^3 - x^2 - x - 15 = (x-3)(x^2+2x+5)

has roots {3}, {-1+2i}, {-1-2i}. Or as another example,

\displaystyle x^3 - 3x^2 - 2x + 2 = (x+1)(x^2-4x+2)

has roots {-1}, {2 + \sqrt 2}, {2 - \sqrt 2}. You’ll notice that the “weird” roots, like {-1 \pm 2i} and {2 \pm \sqrt 2}, are coming up in pairs. In fact, I think precalculus explicitly tells you that the imaginary roots come in conjugate pairs. More generally, it seems like all the roots of the form {a + b \sqrt c} come in “conjugate pairs”. And you can see why: they are the two roots of a quadratic factor, so the quadratic formula produces them together, differing only in the sign of the square root.

But a polynomial like

\displaystyle x^3 - 8x + 4

has no rational roots. (The roots of this are approximately {-3.0514}, {0.51730}, {2.5341}.) Or even simpler,

\displaystyle x^3 - 2

has only one real root, {\sqrt[3]{2}}. These roots, even though they are irrational, have no “conjugate” pairs. Or do they?
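The claim that these cubics have no rational roots can be checked by hand with the rational root theorem: any rational root of a polynomial with integer coefficients, written in lowest terms as {p/q}, must have {p} dividing the constant term and {q} dividing the leading coefficient. Here is a minimal sketch of that check in Python. The function name `rational_roots` is my own, and it assumes the leading and constant coefficients are nonzero.

```python
from fractions import Fraction

def rational_roots(coeffs):
    """Rational roots of an integer polynomial (highest degree first).

    Assumes the leading and constant coefficients are nonzero.
    """
    def divisors(n):
        n = abs(n)
        return [d for d in range(1, n + 1) if n % d == 0]

    def evaluate(x):
        # Horner's rule, done in exact rational arithmetic.
        value = Fraction(0)
        for c in coeffs:
            value = value * x + c
        return value

    # Rational root theorem: the only candidates are +-p/q with
    # p | constant term and q | leading coefficient.
    candidates = {
        sign * Fraction(p, q)
        for p in divisors(coeffs[-1])
        for q in divisors(coeffs[0])
        for sign in (1, -1)
    }
    return sorted(x for x in candidates if evaluate(x) == 0)

print(rational_roots([1, 0, -8, 4]))    # x^3 - 8x + 4
print(rational_roots([1, 0, 0, -2]))    # x^3 - 2
print(rational_roots([1, -3, -2, 2]))   # x^3 - 3x^2 - 2x + 2
```

The first two calls return an empty list, while the third finds the single rational root {-1} of the earlier example.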

Let’s try and figure out exactly what’s happening. Let {\alpha} be any complex number. We define the minimal polynomial of {\alpha} to be the monic polynomial {P(x)} such that

  • {P(x)} has rational coefficients, and leading coefficient {1},
  • {P(\alpha) = 0}, and
  • the degree of {P} is as small as possible.

For example, {\sqrt 2} has minimal polynomial {x^2-2}. Note that {100x^2 - 200} is also a polynomial of the same degree which has {\sqrt 2} as a root; that’s why we want to require the polynomial to be monic. That’s also why we choose to work in the rational numbers; that way, we can divide by leading coefficients without worrying if we get non-integers.

Why do we care? The point is as follows: suppose we have another polynomial {A(x)} with rational coefficients such that {A(\alpha) = 0}. Then we claim that {P(x)} actually divides {A(x)}! That means that all the other roots of {P} will also be roots of {A}.

The proof is by contradiction: if not, by polynomial long division, we can find a quotient {Q(x)} and remainder {R(x)}, both with rational coefficients, such that

\displaystyle A(x) = Q(x) P(x) + R(x)

and {R(x) \not\equiv 0}. Notice that by plugging in {x = \alpha}, we find that {R(\alpha) = 0}. But {\deg R < \deg P}, and {P(x)} was supposed to be the minimal polynomial. That’s impossible!

Let’s look at a more concrete example. Consider {A(x) = x^3-3x^2-2x+2} from the beginning. The minimal polynomial of {2 + \sqrt 2} is {P(x) = x^2 - 4x + 2} (why?). Now we know that if {2 + \sqrt 2} is a root, then {A(x)} is divisible by {P(x)}. And that’s how we know that if {2 + \sqrt 2} is a root of {A}, then so is {2 - \sqrt 2}.
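If you want to see the division actually happen, here is a rough sketch of polynomial long division over {\mathbb Q} in Python, applied to this {A} and {P}. The helper `poly_divmod` and the list-of-coefficients representation are my own choices for illustration, not anything standard.

```python
from fractions import Fraction

def poly_divmod(a, p):
    """Divide a by p over the rationals.

    Polynomials are lists of coefficients, highest degree first.
    Returns (quotient, remainder) as lists of Fractions.
    """
    a = [Fraction(c) for c in a]
    p = [Fraction(c) for c in p]
    quotient = []
    # Each pass cancels the current leading term of a.
    while len(a) >= len(p):
        factor = a[0] / p[0]
        quotient.append(factor)
        for i in range(len(p)):
            a[i] -= factor * p[i]
        a.pop(0)  # the leading coefficient is now zero, so drop it
    return quotient, a

A = [1, -3, -2, 2]   # x^3 - 3x^2 - 2x + 2
P = [1, -4, 2]       # x^2 - 4x + 2
q, r = poly_divmod(A, P)
print(q, r)
```

The quotient comes out as {x + 1} and every remainder coefficient is zero, matching the factorization at the start of the post. Working with `Fraction` instead of floats mirrors the reason for working over the rationals: dividing by a leading coefficient never leaves the number system.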

As another example, the minimal polynomial of {\sqrt[3]{2}} is {x^3-2}. So {\sqrt[3]{2}} actually has two conjugates, namely, {\alpha = \sqrt[3]{2} \left( \cos 120^\circ + i \sin 120^\circ \right)} and {\beta = \sqrt[3]{2} \left( \cos 240^\circ + i \sin 240^\circ \right)}. Thus any polynomial which vanishes at {\sqrt[3]{2}} also has {\alpha} and {\beta} as roots!
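A quick numerical sanity check of this, as a sketch: take some polynomial that vanishes at {\sqrt[3]{2}} and evaluate it at the two complex conjugates. The polynomial {(x^3-2)(x^2+1)} below is an arbitrary choice of mine, and since this uses floating point the values come out only approximately zero.

```python
import cmath

def evaluate(coeffs, x):
    """Horner evaluation; coefficients are listed highest degree first."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

cube_root = 2 ** (1 / 3)
# The three roots of x^3 - 2: the real cube root rotated by 0, 120, 240 degrees.
conjugates = [cube_root * cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

# A(x) = (x^3 - 2)(x^2 + 1) = x^5 + x^3 - 2x^2 - 2, chosen arbitrarily.
A = [1, 0, 1, -2, 0, -2]

for z in conjugates:
    # Each printed magnitude should be zero up to rounding error.
    print(z, abs(evaluate(A, z)))
```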

You can generalize this by replacing {\mathbb Q} with any field and all of this still works. One central idea of Galois theory is that these “conjugates” all “look the same” as far as {\mathbb Q} can tell.

As another aside: does the minimal polynomial exist for every {\alpha}? It turns out the answer is no, and the numbers for which there is no minimal polynomial are called the transcendental numbers.