I want to say a little about the process which I use to design my olympiad handouts and classes these days (and thus by extension the way I personally think about problems). The short summary is that my teaching style is centered around **showing connections and recurring themes between problems**.

Now let me explain this in more detail.

Solutions to olympiad problems can look quite different from one another at a surface level, but typically they center around one or two **main ideas**, as I describe in my post on reading solutions. Because details are easy to work out once you have the main idea, as far as learning is concerned you can more or less throw away the details and pay most of your attention to main ideas.

Thus whenever I solve an olympiad problem, I make a deliberate effort to summarize the solution in a few sentences, such that I basically know how to do it from there. I also make a deliberate effort, whenever I write up a solution in my notes, to structure it so that my future self can see all the key ideas at a glance and thus be able to understand the general path of the solution immediately.

The example I’ve previously mentioned is USAMO 2014/6.

**Example 1** **(USAMO 2014, Gabriel Dospinescu)**

Prove that there is a constant $c > 0$ with the following property: If $a$, $b$, $n$ are positive integers such that $\gcd(a+i, b+j) > 1$ for all $i, j \in \{0, 1, \dots, n\}$, then

\[ \min\{a, b\} > c^n \cdot n^{n/2}. \]

If you look at any complete solution to the problem, you will see a lot of technical estimates. But the main idea is very simple: “consider an $(n+1) \times (n+1)$ table of primes and note the small primes cannot adequately cover the board, since $\sum_p p^{-2} < \frac{1}{2}$”. Once you have this main idea the technical estimates are just the grunt work that you force yourself to do if you’re a contestant (and don’t do if you’re retired like me).
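The prime-sum inequality driving the main idea is easy to corroborate numerically. Here is a quick sanity check (my own script, not from the post; the truncation point is arbitrary):

```
# Check numerically that sum over primes of 1/p^2 stays below 1/2.
def primes_up_to(n):
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, n + 1, p):
                sieve[q] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]

partial = sum(1 / p**2 for p in primes_up_to(10**6))
# The tail past 10^6 is smaller than sum_{m > 10^6} 1/m^2 < 1e-6.
print(partial)  # ~0.45224, comfortably below 1/2
```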

Thus the study of olympiad problems is reduced to the study of main ideas behind these problems.

So how do we come up with the main ideas? Of course I won’t be able to answer this question completely, because therein lies most of the difficulty of olympiads.

But I have made some progress in this direction. It comes down to seeing how main ideas are similar to each other. I spend a lot of time trying to **classify the main ideas** into categories or themes, based on how similar they feel to one another. If I see one theme pop up over and over, then I can make it into a class.

I think **olympiad taxonomy** is severely underrated, and generally not done correctly. The status quo is that people do bucket sorts based on the particular *technical details* which are present in the problem. This is correlated with the main ideas, but the two do not always coincide.

An example where technical sort works okay is Euclidean geometry. Here is a simple example: harmonic bundles in projective geometry. As I explain in my book, there are a few “basic” configurations involved:

- Midpoints and parallel lines
- The Ceva / Menelaus configuration
- Harmonic quadrilateral / symmedian configuration
- Apollonian circle (right angle and bisectors)

(For a reference, see Lemmas 2, 4, 5 and Exercise 0 here.) Thus from experience, any time I see one of these pictures inside the current diagram, I think to myself that “this problem feels projective”; and if there is a way to do so I try to use harmonic bundles on it.

An example where technical sort fails is the “pigeonhole principle”. A typical problem in such a class looks something like USAMO 2012/2.

**Example 2** **(USAMO 2012, Gregory Galperin)**

A circle is divided into $432$ congruent arcs by $432$ points. The points are colored in four colors such that some $108$ points are colored Red, some $108$ points are colored Green, some $108$ points are colored Blue, and the remaining $108$ points are colored Yellow. Prove that one can choose three points of each color in such a way that the four triangles formed by the chosen points of the same color are congruent.

It’s true that the official solution uses the words “pigeonhole principle” but that is not really the heart of the matter; the key idea is that you consider all possible rotations and count the number of incidences. (In any case, such calculations are better done using expected value anyways.)
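To make the counting idea concrete, here is a small sketch (my own illustration, not from any official solution; the variable names are mine) of the double count that powers the rotation/expected-value approach:

```
import random

# Among N = 432 points with 108 red, summing |R ∩ (R + k)| over the 431
# nontrivial rotations k counts each ordered pair of distinct red points
# exactly once, so the total is 108*107. The average rotation therefore
# matches about 26.8 red points with red points, far more than needed to
# start building congruent triangles.
N, reds = 432, 108
R = set(random.sample(range(N), reds))
overlap = [len(R & {(x + k) % N for x in R}) for k in range(1, N)]
assert sum(overlap) == reds * (reds - 1)
print(sum(overlap) / (N - 1))  # ~26.8 regardless of the random choice of R
```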

Now why is taxonomy a good thing for learning and teaching? The reason is that building connections and seeing similarities is most easily done by simultaneously presenting several related problems. I’ve actually mentioned this already in a different blog post, but let me give the demonstration again.

Suppose I wrote down a list of letter–number pairs, with all the pairs for each letter grouped together.

You can tell what the numbers next to each letter have in common by looking for a few moments. But what happens if I intertwine the groups?

This is the same information, but now you have to work much harder to notice the association between the letters and the numbers they’re next to.

This is why, if you are an olympiad student, I strongly encourage you to keep a journal or blog of the problems you’ve done. Solving olympiad problems takes lots of time and so it’s worth it to spend at least a few minutes jotting down the main ideas. And once you have enough of these, you can start to see new connections between problems you haven’t seen before, rather than being confined to thinking about individual problems in isolation. (Additionally, it means you will never have to redo problems to which you forgot the solution — learn from my mistake here.)

I want to elaborate more on geometry in general. These days, if I see a solution to a Euclidean geometry problem, then I mentally store the problem and solution into one (or more) buckets. I can even tell you what my buckets are:

- Direct angle chasing
- Power of a point / radical axis
- Homothety, similar triangles, ratios
- Recognizing some standard configuration (see Yufei for a list)
- Doing some length calculations
- Complex numbers
- Barycentric coordinates
- Inversion
- Harmonic bundles or pole/polar and homography
- Spiral similarity, Miquel points

which my dedicated fans probably recognize as the ten chapters of my textbook. (Problems may also fall in more than one bucket if for example they are difficult and require multiple key ideas, or if there are multiple solutions.)

Now whenever I see a new geometry problem, the diagram will often “feel” similar to problems in a certain bucket. Exactly what I mean by “feel” is hard to formalize — it’s a certain gut feeling that you pick up by doing enough examples. There are some things you can say, such as “problems which feature a central circle and feet of altitudes tend to fall in bucket 6”, or “problems which only involve incidence always fall in bucket 9”. But it seems hard to come up with an exhaustive list of hard rules that will do better than human intuition.

But as I said in my post on reading solutions, there are deeper lessons to teach than just technical details.

For examples of themes on opposite ends of the spectrum, let’s move on to combinatorics. Geometry is quite structured and so the themes in the main ideas tend to translate to specific theorems used in the solution. Combinatorics is much less structured and many of the themes I use in combinatorics cannot really be formalized. (Consequently, since everyone else seems to mostly teach technical themes, several of the combinatorics themes I teach are idiosyncratic, and to my knowledge are not taught by anyone else.)

For example, one of the unusual themes I teach is called **Global**. It’s about the idea that to solve a problem, you can just kind of “add up everything at once”, for example using linearity of expectation, or by double-counting, or whatever. In particular these kinds of approaches ignore the “local” details of the problem. It’s hard to make this precise, so I’ll just give two recent examples.

**Example 3** **(ELMO 2013, Ray Li)**

Let be nine real numbers, not necessarily distinct, with average . Let denote the number of triples for which . What is the minimum possible value of ?
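For intuition, here is a quick script (my own construction; that this configuration is optimal is the content of the problem, not something checked here) exhibiting a configuration with few qualifying triples:

```
from itertools import combinations

# Eight small numbers and one large one: every triple meeting the
# threshold 3m must contain the large number, giving C(8,2) = 28 triples.
a = [0] * 8 + [9]          # average m = 1, so the threshold is 3m = 3
m = sum(a) / len(a)
A = sum(1 for t in combinations(a, 3) if sum(t) >= 3 * m)
print(A)  # 28
```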

**Example 4** **(IMO 2016)**

Find all positive integers $n$ for which each cell of an $n \times n$ table can be filled with one of the letters $I$, $M$ and $O$ in such a way that:

- in each row and each column, one third of the entries are $I$, one third are $M$ and one third are $O$; and
- in any diagonal, if the number of entries on the diagonal is a multiple of three, then one third of the entries are $I$, one third are $M$ and one third are $O$.
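As an aside, the diagonal condition is fiddly enough that a checker helps pin it down. Here is a sketch (my own helper with hypothetical names, not from the post) that verifies a candidate filling:

```
# Verify the two conditions for a grid given as a list of strings of I/M/O.
def balanced(cells):
    cells = list(cells)
    return cells.count("I") == cells.count("M") == cells.count("O")

def valid(grid):
    n = len(grid)
    rows = [list(r) for r in grid]
    cols = [[grid[i][j] for i in range(n)] for j in range(n)]
    diags = []
    for s in range(2 * n - 1):            # anti-diagonals: i + j = s
        diags.append([grid[i][s - i] for i in range(n) if 0 <= s - i < n])
    for d in range(-(n - 1), n):          # main diagonals: i - j = d
        diags.append([grid[i][i - d] for i in range(n) if 0 <= i - d < n])
    return (all(balanced(r) for r in rows + cols)
            and all(balanced(d) for d in diags if len(d) % 3 == 0))
```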

If you look at the solutions to these problems, they have the same “feeling” of adding everything up, even though the specific techniques are somewhat different (double-counting for the former, diagonals modulo $3$ for the latter). Nonetheless, my experience with problems similar to the former was immensely helpful for the latter, and it’s why I was able to solve the IMO problem.

This perspective also explains why I’m relatively bad at functional equations. There are *some* things I can say that may be useful (see my handouts), but much of the time these are just technical tricks. (When sorting functional equations in my head, I have a bucket called “standard fare” meaning that you “just do work”; as far as I can tell this bucket is pretty useless.) I always feel stupid teaching functional equations, because I never have many good insights to offer.

Part of the reason is that functional equations often don’t have a main idea at all. Consequently it’s hard for me to do useful taxonomy on them.

Then sometimes you run into something like the windmill problem, the solution of which is fairly “novel”, not being similar to problems that come up in training. I have yet to figure out a good way to train students to be able to solve windmill-like problems.

I’ll close by mentioning one common way I come up with a theme.

Sometimes I will run across an olympiad problem which I solve quickly, and think should be very easy, and yet once I start grading I find that the scores are much lower than I expected. Since the way I solve problems is by drawing on experience from similar previous problems, this must mean that I’ve subconsciously found a general framework for solving problems like this one, which is not yet obvious to my students. So if I can put my finger on what that framework is, then I have something new to say.

The most recent example I can think of when this happened was TSTST 2016/4 which was given last June (and was also a very elegant problem, at least in my opinion).

**Example 5** **(TSTST 2016, Linus Hamilton)**

Let $n > 1$ be a positive integer. Prove that we must apply the Euler $\varphi$ function at least $\log_3 n$ times before reaching $1$.

I solved this problem very quickly when we were drafting the TSTST exam, figuring out the solution while walking to dinner. So I was quite surprised when I looked at the scores for the problem and found out that empirically it was not that easy.

After I thought about this, I came up with a new tentative idea. You see, when doing this problem I really was thinking about “what does the $\varphi$ operation do?”. You can think of $n$ as an infinite tuple

\[ \left( \nu_2(n), \nu_3(n), \nu_5(n), \nu_7(n), \dots \right) \]

of prime exponents. Then $\varphi$ can be thought of as an operation which takes each nonzero component, decreases it by one, and then adds some particular vector back (the prime exponents of $p - 1$, for each prime $p$ dividing $n$). For example, if $\nu_7(n) \ge 1$ then $\nu_7$ is decreased by one and each of $\nu_2$ and $\nu_3$ are increased by one, since $7 - 1 = 6 = 2 \cdot 3$. In any case, if you look at this behavior for long enough you will see that the $\nu_2$ coordinate is a natural way to “track time” in successive $\varphi$ operations; once you figure this out, getting the bound of $\log_3 n$ is quite natural. (Details left as exercise to reader.)
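As a sanity check on the claimed bound (my own script; the trial-division totient is just for illustration):

```
from math import log

def phi(n):
    # Euler's totient via trial division.
    result, p, m = 1, 2, n
    while p * p <= m:
        if m % p == 0:
            k = 0
            while m % p == 0:
                m //= p
                k += 1
            result *= (p - 1) * p ** (k - 1)
        p += 1
    if m > 1:
        result *= m - 1
    return result

for n in range(2, 3000):
    steps, m = 0, n
    while m > 1:
        m = phi(m)
        steps += 1
    # The theorem asserts steps >= log_3(n).
    assert steps >= log(n, 3) - 1e-9, n
print("bound verified for n < 3000")
```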

Now when I read through the solutions, I found that many of them had not really tried to think of the problem in such a “structured” way, and had instead tried to attack the bound directly, for example by trying to prove an intermediate claim that turns out to be false, or something similar to this. I realized that had the students just ignored the task of proving the bound and spent some time getting a better understanding of the $\varphi$ structure, they would have had a much better chance at solving the problem. Why had I known that structural thinking would be helpful? I couldn’t quite explain it, but it had something to do with the fact that the “main object” of the question was “set in stone”; there were no “degrees of freedom” in it, and it was concrete enough that I felt like I could understand it. Once I understood how multiple $\varphi$ operations behaved, the bound $\log_3 n$ almost served as an “answer extraction” mechanism.

These thoughts led to the recent development of a class which I named **Rigid**, which is all about problems where the point is not to immediately try to prove what the question asks for, but to first step back and understand completely how a particular rigid structure (like the $\varphi$ operation in this problem) behaves, and to then solve the problem using this understanding.


Spoiler warnings: USAMO 2014/1, and hints for Putnam 2014 A4 and B2. You may want to work on these problems yourself before reading this post.

At last year’s USA IMO training camp, I prepared a handout on writing/style for the students at MOP. One of the things I talked about was the “ocean-crossing point”, which for our purposes you can think of as the discrete jump from a problem being “essentially not solved” ($0+$) to “essentially solved” ($7-$). The name comes from a Scott Aaronson post:

Suppose your friend in Boston blindfolded you, drove you around for twenty minutes, then took the blindfold off and claimed you were now in Beijing. Yes, you do see Chinese signs and pagoda roofs, and no, you can’t immediately disprove him — but based on your knowledge of both cars and geography, isn’t it more likely you’re just in Chinatown? . . .

We start in Boston, we end up in Beijing, and at no point is anything resembling an ocean ever crossed.

I then gave two examples of how to write a solution to the following example problem.

**Problem 1** **(USAMO 2014)**

Let $a$, $b$, $c$, $d$ be real numbers such that $b - d \ge 5$ and all zeros $x_1$, $x_2$, $x_3$, and $x_4$ of the polynomial $P(x) = x^4 + ax^3 + bx^2 + cx + d$ are real. Find the smallest value the product

\[ \left( x_1^2 + 1 \right)\left( x_2^2 + 1 \right)\left( x_3^2 + 1 \right)\left( x_4^2 + 1 \right) \]

can take.

*Proof:* (Not-so-good write-up) Since $x_j^2 + 1 = (x_j - i)(x_j + i)$ for every $j$ (where $i = \sqrt{-1}$), we get $\prod_j (x_j^2+1) = \prod_j (x_j - i) \prod_j (x_j + i)$ which equals to $P(i)P(-i) = (b-d-1)^2 + (a-c)^2$. If $x_1 = x_2 = x_3 = x_4 = 1$ this is $16$ and $b - d = 5$. Also, $b - d \ge 5$, this is $\ge 16$.

*Proof:* (Better write-up) The answer is $16$. This can be achieved by taking $P(x) = (x-1)^4$, whence the product is $16$, and $b - d = 5$.

Now, we prove this is a lower bound. Let $i = \sqrt{-1}$. The key observation is that

\[ \prod_{j=1}^4 \left( x_j^2 + 1 \right) = \prod_{j=1}^4 (x_j - i)(x_j + i) = P(i)P(-i). \]

Consequently, we have

\[ \prod_{j=1}^4 \left( x_j^2 + 1 \right) = (b - d - 1)^2 + (a - c)^2 \ge (5 - 1)^2 = 16. \]

This proves the lower bound.

You’ll notice that it’s much easier to see the key idea in the second solution: namely,

\[ \prod_{j=1}^4 \left( x_j^2 + 1 \right) = P(i)P(-i), \]

which allows you to use the enigmatic condition $b - d \ge 5$.
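The identity itself is easy to verify numerically; here is a quick check (mine, not part of the original post):

```
import random
import numpy as np

# For P(x) = x^4 + ax^3 + bx^2 + cx + d with real roots x_1..x_4,
# check prod (x_j^2 + 1) = P(i)P(-i) = (b - d - 1)^2 + (a - c)^2.
for _ in range(100):
    roots = [random.uniform(-5, 5) for _ in range(4)]
    _, a, b, c, d = np.poly(roots)          # monic coefficients from roots
    lhs = np.prod([x**2 + 1 for x in roots])
    rhs = (b - d - 1) ** 2 + (a - c) ** 2
    assert abs(lhs - rhs) < 1e-6 * max(1.0, abs(lhs))
print("identity verified")
```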

Unfortunately I have the following confession to make:

In practice, most solutions are written more like the first one than the second one.

The truth is that writing up solutions is sort of a chore that people never really want to do but have to — much like washing dishes. So most solutions won’t be written in a way that helps you learn from them. This means that when you read solutions, you should assume that the thing you really want (i.e., the ocean-crossing point) is buried somewhere amidst a haystack of other unimportant details.

But in practice even the “better write-up” I mentioned above still has too much information in it.

Suppose you were explaining how to solve this problem to a friend. You would probably not start your explanation by saying that the minimum is $16$, achieved by $P(x) = (x-1)^4$ — even though this is indeed a logically necessary part of the solution. Instead, the first thing you would probably tell them is to notice that

\[ \prod_{j=1}^4 \left( x_j^2 + 1 \right) = P(i)P(-i). \]

In fact, if your friend has been working on the problem for more than ten minutes, this is probably the *only* thing you need to tell them. They probably already figured out by themselves that there was a good chance the answer would be $16$, just based on the condition $b - d \ge 5$. This “one-liner” is all that they need to finish the problem. You don’t need to spell out to them the rest of the details.

When you explain a problem to a friend in this way, you’re communicating just the difference: the one or two sentences such that your friend could work out the rest of the details themselves with these directions. When reading the solution yourself, you should try to extract the main idea in the same way. Olympiad problems generally have only a few main ideas in them, from which the rest of the details can be derived. So reading the solution should feel much like searching for a needle in a haystack.

In particular: you should rarely read most of the words in the solution, and you should almost never read every word of the solution.

Whenever I read solutions to problems I didn’t solve, I often read less than 10% of the words in the solution. Instead I search aggressively for the one or two sentences which tell me the key step that I couldn’t find myself. (Functional equations are the glaring exception to this rule, since in these problems there sometimes isn’t any main idea other than “stumble around randomly”, and the steps really are all about equally important. But this is rarer than you might guess.)

I think a common mistake students make is to treat the solution as a sequence of logical steps: that is, reading the solution line by line, and then verifying that each line follows from the previous ones. This seems to entirely miss the point, because not all lines are created equal, and most lines can be easily *derived* once you figure out the main idea.

If you find that the only way that you can understand the solution is reading it step by step, then the problem may simply be too hard for you. This is because what counts as “details” and “main ideas” is relative to the absolute difficulty of the problem. Here’s an example of what I mean: the solution to a USAMO 3/6 level geometry problem, call it $P$, might look as follows.

*Proof:* First, we prove lemma $L_1$. (Proof of $L_1$, which is USAMO 1/4 level.)

Then, we prove lemma $L_2$. (Proof of $L_2$, which is USAMO 1/4 level.)

Finally, we remark that putting together $L_1$ and $L_2$ solves the problem.

Likely the main difficulty of $P$ is actually *finding* $L_1$ and $L_2$. So a very experienced student might think of the sub-proofs as “easy details”. But younger students might find $L_1$ and $L_2$ challenging in their own right, and be unable to solve the problem even after being told what the lemmas are: which is why it is hard for them to tell that $L_1$ and $L_2$ were the main ideas to begin with. In that case, the problem is probably way over their head.

This is also why it doesn’t make sense to read solutions to problems which you have not worked on at all — there are often details, natural steps and notation, et cetera which are obvious to you if and only if you have actually tried the problem for a little while yourself.

The earlier sections describe how to extract the main idea of an olympiad solution. This is neat because instead of having to remember an entire solution, you only need to remember a few sentences now, and it gives you a good understanding of the solution at hand.

But this still isn’t achieving your ultimate goal in learning: you are trying to maximize your scores on *future* problems. Unless you are extremely fortunate, you will probably never see the exact same problem on an exam again.

So one question you should often ask is:

“How could I have thought of that?”

(Or in my case, “how could I train a student to think of this?”.)

There are probably some surface-level skills that you can pick out of this. The lowest-hanging fruit consists of things that are technical. A small number of examples, with varying amounts of depth:

- This problem is “purely projective”, so we can take a projective transformation!
- This problem had a segment $AB$ with midpoint $M$, and a line $\ell$ parallel to $AB$, so I should consider projecting through a point on $\ell$.
- Drawing a grid of primes is the only real idea in this problem, and the rest of it is just calculations.
- This main claim is easy to guess since in some small cases, the frogs have “violating points” in a large circle.
- In this problem there are $n$ numbers on a circle, $n$ odd. The counterexamples for $n$ even alternate up and down, which motivates proving that no three consecutive numbers are in sorted order.
- This is a juggling problem!

(Brownie points if any contest enthusiasts can figure out which problems I’m talking about in this list!)

But now I want to point out that the best answers to the above question are often *not formalizable*. Lists of triggers and actions are “cheap forms of understanding”, because going through a list of methods will only get you so far.

On the other hand, the un-formalizable philosophy that you can extract from reading a question is part of that legendary “intuition” that people are always talking about: you can’t describe it in words, but it’s certainly there. Maybe it would even be better to reframe the question as:

“What does this problem feel like?”

So let’s talk about our feelings. Here is David Yang’s take on it:

Whenever you see a problem you really like, store it (and the solution) in your mind like a cherished memory . . . The point of this is that

you will see problems which will remind you of that problem despite having no obvious relation. You will not be able to say concretely what the relation is, but think a lot about it and give a name to the common aspect of the two problems. Eventually, you will see new problems for which you feel like could also be described by that name. Do this enough, and you will have a very powerful intuition that cannot be described easily concretely (and in particular, that nobody else will have).

This itself doesn’t make sense without an example, so here is an example of one philosophy I’ve developed. Here are two problems on Putnam 2014:

**Problem 2** **(Putnam 2014 A4)**

Suppose $X$ is a random variable that takes on only nonnegative integer values, with $E[X] = 1$, $E[X^2] = 2$, and $E[X^3] = 5$. Determine the smallest possible value of the probability of the event $X = 0$.

**Problem 3** **(Putnam 2014 B2)**

Suppose that $f$ is a function on the interval $[1, 3]$ such that $-1 \le f(x) \le 1$ for all $x$ and

\[ \int_1^3 f(x) \, dx = 0. \]

How large can $\int_1^3 \frac{f(x)}{x} \, dx$ be?

At a glance there seems to be nearly no connection between these problems. One of them is a combinatorics/algebra question, and the other is an integral. Moreover, if you read the official solutions or even my own write-ups, you will find very little in common joining them.

Yet it turns out that these two problems do have something in common to me, which I’ll try to describe below. My thought process in solving either question went as follows:

In both problems, I was able to quickly make a good guess as to what the optimal $X$ or $f$ was, and then come up with a heuristic explanation (not a proof) of why that guess had to be correct, namely, “by smoothing, you should put all the weight on the left”. Let me call this the conjectured optimum.

That conjectured optimum gave a numerical answer to the actual problem: but for both of these problems, it turns out that the numerical answer is completely uninteresting, as are the exact details of the optimum. It should philosophically be interpreted as “this is the number that happens to pop out when you plug in the optimal choice”. And indeed that’s what both solutions feel like. These solutions don’t actually care what the exact values of the optimum are; they only care about the properties that made me think it was optimal in the first place.
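For what it’s worth, the guess in the first problem can be corroborated by a small linear program (my own sketch; truncating the support to $\{0, \dots, 20\}$ is an assumption made for the computation):

```
import numpy as np
from scipy.optimize import linprog

# Minimize P(X = 0) subject to E[X] = 1, E[X^2] = 2, E[X^3] = 5,
# over distributions supported on {0, 1, ..., 20}.
ks = np.arange(21)
A_eq = np.vstack([ks**0, ks**1, ks**2, ks**3])  # total mass + three moments
b_eq = [1, 1, 2, 5]
c = np.zeros(21)
c[0] = 1                                        # objective: mass at zero
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 21)
print(res.fun)  # ~1/3, matching the optimum supported on {0, 1, 3}
```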

I gave this philosophy the name **Equality**, with poster description “problems where looking at the equality case is important”. This text description feels more or less useless to me; I suppose it’s the thought that counts. But ever since I came up with this name, it has helped me solve new problems that come up, because they would give me the same feeling that these two problems did.

Two more examples of these themes that I’ve come up with are **Global** and **Rigid**, which will be described in a future post on how I design training materials.


Let $f \colon U \to \mathbb{C}$ be a holomorphic function. A **holomorphic $n$th root** of $f$ is a function $g \colon U \to \mathbb{C}$ such that $f(z) = g(z)^n$ for all $z \in U$. A **logarithm** of $f$ is a function $g \colon U \to \mathbb{C}$ such that $f(z) = e^{g(z)}$ for all $z \in U$. The main question we’ll try to figure out is: when do these exist? In particular, what if $f = \mathrm{id}$?

To start us off, can we define $\sqrt{z}$ for any complex number $z$?

The first obvious problem that comes up is that for any $z$, there are *two* numbers $w$ such that $w^2 = z$. How can we pick one to use? For our ordinary square root function, we had a notion of “positive”, and so we simply took the positive root.

Let’s expand on this: given $z = r e^{i\theta}$ (here $r \ge 0$) we should take the root to be

\[ w = \sqrt{r} \cdot e^{i\alpha} \]

such that $2\alpha \equiv \theta \pmod{2\pi}$; there are two choices for $\alpha$, differing by $\pi$.

For complex numbers, we don’t have an obvious way to pick $\alpha$. Nonetheless, perhaps we can also get away with an arbitrary distinction: let’s see what happens if we just choose the $\alpha$ with $-\frac{\pi}{2} < \alpha \le \frac{\pi}{2}$.

Pictured below are some points $z_i$ (in red) and their images $w_i$ (in blue) under this “upper-half” square root. The condition on $\alpha$ means we are forcing the blue points to lie on the right-half plane.

Here, $w_i^2 = z_i$ for each $i$, and we are constraining the $w_i$ to lie in the right half of the complex plane. We see there is an obvious issue: there is a big discontinuity near the negative real axis! Nearby red points there have been mapped very far apart. This discontinuity occurs since the points on the negative real axis are at the “boundary”. For example, $-4$ is sent to $2i$, but we have hit the boundary: the argument $\alpha = \frac{\pi}{2}$ sits at the very edge of our interval $\left( -\frac{\pi}{2}, \frac{\pi}{2} \right]$.
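Here is a minimal sketch (my own code) of this “upper-half” square root and its jump across the negative real axis:

```
import cmath

# Pick the square root whose argument lies in [-pi/2, pi/2].
def upper_half_sqrt(z):
    r, theta = abs(z), cmath.phase(z)       # phase(z) lies in [-pi, pi]
    return cmath.rect(r ** 0.5, theta / 2)

eps = 1e-6
print(upper_half_sqrt(complex(-1,  eps)))   # ~ +1j  (just above the axis)
print(upper_half_sqrt(complex(-1, -eps)))   # ~ -1j  (just below: a jump!)
```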

The negative real axis that we must not touch is what we will later call a *branch cut*, but for now I call it a **ray of death**. It is a warning to the red points: if you cross this line, you will die! However, if we move the red circle just a little upwards (so that it misses the negative real axis) this issue is avoided entirely, and we get what seems to be a “nice” square root.

In fact, the ray of death is fairly arbitrary: it is the set of “boundary issues” that arose when we picked $-\frac{\pi}{2} < \alpha \le \frac{\pi}{2}$. Suppose we instead insisted on the interval $0 < \alpha \le \pi$; then the ray of death would be the *positive* real axis instead. The earlier circle we had now works just fine.

What we see is that picking a particular $\alpha$-interval leads to a different set of edge cases, and hence a different ray of death. The only thing these rays have in common is their starting point of zero. In other words, given a red circle and a restriction of $\alpha$, I can make a nice “square rooted” blue circle as long as the ray of death misses it.

So, what exactly is going on?

To get a picture of what’s happening, we would like to consider a more general problem: let $f \colon U \to \mathbb{C}$ be holomorphic. Then we want to decide whether there is a $g \colon U \to \mathbb{C}$ such that

\[ f(z) = g(z)^2. \]

Our previous discussion with $f = \mathrm{id}$ tells us we cannot hope to achieve this for $U = \mathbb{C}$; there is a “half-ray” which causes problems. However, there are certainly functions $f$ such that a $g$ exists. As a simplest example, $f(z) = z^2$ should definitely have a square root!

Now let’s see if we can fudge together a square root. Earlier, what we did was try to specify a rule to force one of the two choices at each point. This is unnecessarily strict. Perhaps we can do something like the following: start at a point $z_0 \in U$, pick a square root $w_0$ of $f(z_0)$, and then try to “fudge” from there the square roots of the other points. What do I mean by fudge? Well, suppose $z_1$ is a point very close to $z_0$, and we want to pick a square root $w_1$ of $f(z_1)$. While there are two choices for $w_1$, we would also expect $w_1$ to be close to $w_0$. Unless we are highly unlucky, this should tell us which choice of $w_1$ to pick. (Stupid concrete example: having chosen $2$ as the square root of $4$, if I ask you to continue this square root to $4.001$, you should pick the root near $2$ rather than the one near $-2$.)

There are two possible ways we could get unlucky in the scheme above: first, if $f(z_1) = 0$, then we’re sunk. But even if we avoid that, we have to worry about a situation where we run around a full loop in the complex plane, and then find that our continuous perturbation has left us in a different place than we started. For concreteness, consider the following situation, again with $f = \mathrm{id}$:

We started at a point, with one of its square roots chosen. We then wound a full red circle around the origin, only to find that at the end of it, the blue arc ends at a different place than where it started!

The interval construction from earlier doesn’t work either: no matter how we pick the interval for $\alpha$, any ray of death must hit our red circle. The problem somehow lies with the fact that we have enclosed the very special point $0$.

Nevertheless, we know that if we take $f(z) = z^2$, then we don’t run into any problems with our “make it up as you go” procedure. So, what exactly is going on?

By now, if you have read the part on algebraic topology, this should all seem very strangely familiar. The “fudging” procedure exactly describes the idea of a lifting.

More precisely, recall that there is a covering projection

\[ p \colon \mathbb{C} \setminus \{0\} \to \mathbb{C} \setminus \{0\}, \qquad z \mapsto z^2. \]

Assume $f$ has no zeros, so that it can be viewed as a map $f \colon U \to \mathbb{C} \setminus \{0\}$; a square root of $f$ is then exactly a lift of $f$ through $p$.

Then essentially, what we are trying to do is construct a lifting $g$ for the following diagram: our map $p$ can be described as “winding around twice”. From algebraic topology, we now know that this lifting exists if and only if

\[ f_\ast\left( \pi_1(U) \right) \]

is a subset of the image of $\pi_1\left( \mathbb{C} \setminus \{0\} \right)$ under $p_\ast$. Since the domain and codomain of $p$ are both punctured planes, we can identify their fundamental groups with $\mathbb{Z}$.

**Ques 1**

Show that the image under $p_\ast$ is exactly $2\mathbb{Z}$ once we identify $\pi_1\left( \mathbb{C} \setminus \{0\} \right) \cong \mathbb{Z}$.

That means that for any loop $\gamma$ in $U$, we need $f \circ \gamma$ to have an *even* winding number around $0$. This amounts to

\[ \frac{1}{2\pi i} \oint_\gamma \frac{f'(z)}{f(z)} \, dz \equiv 0 \pmod 2 \]

since $\frac{f'}{f}$ has no poles (recall $f$ is nonvanishing).

Replacing $2$ with $n$ and carrying over the discussion gives the first main result.

**Theorem 2** **(Existence of Holomorphic th Roots)**

Let $f \colon U \to \mathbb{C}$ be holomorphic and nowhere zero. Then $f$ has a holomorphic $n$th root if and only if

\[ \frac{1}{2\pi i} \oint_\gamma \frac{f'(z)}{f(z)} \, dz \equiv 0 \pmod n \]

for every contour $\gamma$ in $U$.
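The criterion can be probed numerically. The following sketch (mine; the discretized contour integral is a crude approximation, and `f` and its derivative `df` are supplied by hand) computes the winding number $\frac{1}{2\pi i} \oint f'/f \, dz$ on a circle:

```
import cmath

def winding_of_f(f, df, R=1.0, steps=20000):
    # Left-endpoint Riemann sum of (1/2*pi*i) * integral of f'/f dz
    # over the circle |z| = R.
    total = 0j
    for k in range(steps):
        t0 = 2 * cmath.pi * k / steps
        t1 = 2 * cmath.pi * (k + 1) / steps
        z = R * cmath.exp(1j * t0)
        dz = R * (cmath.exp(1j * t1) - cmath.exp(1j * t0))
        total += df(z) / f(z) * dz
    return total / (2j * cmath.pi)

print(winding_of_f(lambda z: z,     lambda z: 1))      # ~1: no square root
print(winding_of_f(lambda z: z * z, lambda z: 2 * z))  # ~2: square root exists
```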

The multivalued nature of the complex logarithm comes from the fact that

\[ \exp(z + 2\pi i) = \exp(z). \]

So if $e^w = z$, then any complex number $w + 2\pi i k$ (for $k \in \mathbb{Z}$) is also a solution.

We can handle this in the same way as before: it amounts to a lifting of the following diagram, this time through the covering projection $\exp \colon \mathbb{C} \to \mathbb{C} \setminus \{0\}$. There is no longer a need to treat the zeros of $f$ separately, since:

**Ques 3**

Show that if $f$ has any zeros then $g$ can’t possibly exist.

In fact, the map $\exp \colon \mathbb{C} \to \mathbb{C} \setminus \{0\}$ is a universal cover, since $\mathbb{C}$ is simply connected. Thus, the image of $\pi_1(\mathbb{C})$ is *trivial*. So in addition to being zero-free, $f$ cannot have any winding number around $0$ at all. In other words:

**Theorem 4** **(Existence of Logarithms)**

Let $f \colon U \to \mathbb{C}$ be holomorphic and nowhere zero. Then $f$ has a logarithm if and only if

\[ \frac{1}{2\pi i} \oint_\gamma \frac{f'(z)}{f(z)} \, dz = 0 \]

for every contour $\gamma$ in $U$.

The most common special case is

**Corollary 5** **(Nonvanishing Functions from Simply Connected Domains)**

Let $f \colon U \to \mathbb{C}$ be holomorphic, where $U$ is simply connected. If $f(z) \neq 0$ for every $z \in U$, then $f$ has both a logarithm and a holomorphic $n$th root for every $n$.

Finally, let’s return to the question of $f = \mathrm{id}$ from the very beginning. What’s the best domain $U$ on which we can define $\sqrt{z}$? Clearly $U = \mathbb{C}$ cannot be made to work, but we can do almost as well. For note that the only zero of $f(z) = z$ is at the origin. Thus if we want to make a logarithm exist, all we have to do is make an incision in the complex plane that renders it impossible to make a loop around the origin. The usual choice is to delete the negative half of the real axis, our very first ray of death; we call this a **branch cut**, with **branch point** at $z = 0$ (the point which we cannot circle around). This gives

**Theorem 6** **(Branch Cut Functions)**

There exist holomorphic functions

\[ \log \colon \mathbb{C} \setminus (-\infty, 0] \to \mathbb{C} \qquad\text{and}\qquad \sqrt[n]{\cdot} \colon \mathbb{C} \setminus (-\infty, 0] \to \mathbb{C} \]

satisfying the obvious properties.

There are many possible choices of such functions ($n$ choices for the $n$th root and infinitely many for $\log$); a choice of such a function is called a **branch**. So this is what is meant by a “branch” of a logarithm.

The **principal branch** is the “canonical” branch, analogous to the way we arbitrarily pick the positive branch to define $\sqrt{x}$ for real $x \ge 0$. For $\log$, we take the $w$ such that $e^w = z$ and the imaginary part of $w$ lies in $(-\pi, \pi]$ (since we can shift by integer multiples of $2\pi i$). Often, authors will write $\operatorname{Log}$ to emphasize this choice.
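For instance, Python’s `cmath.log` returns exactly this principal branch, with the branch cut along the negative real axis; a tiny demonstration (mine):

```
import cmath

print(cmath.log(-1))                   # ~ 0 + 3.14159j, i.e. i*pi
print(cmath.log(complex(-1,  1e-9)))   # imaginary part ~ +pi
print(cmath.log(complex(-1, -1e-9)))   # imaginary part ~ -pi: the cut's jump
```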

**Example 7**

Let $U$ be the complex plane minus the real interval $[0, 1]$. Then the function $f \colon U \to \mathbb{C}$ by $f(z) = z(z-1)$ has a holomorphic square root.

**Corollary 8**

A holomorphic function $f \colon U \to \mathbb{C}$ has a holomorphic $n$th root for every $n \ge 1$ if and only if it has a holomorphic logarithm.


Let $k = \mathbb{R}$ or $k = \mathbb{C}$, depending on taste.

**Definition 1**

A **Lie group** is a group $G$ which is also a $k$-manifold; the multiplication map $G \times G \to G$ (by $(g_1, g_2) \mapsto g_1 g_2$) and the inversion map $G \to G$ (by $g \mapsto g^{-1}$) are required to be smooth.

A **morphism of Lie groups** $G \to H$ is a map which is both a map of manifolds and a group homomorphism.

Throughout, we will let $e$ denote the identity, or $e_G$ if we need further emphasis.

Note that in particular, every group can be made into a Lie group by endowing it with the discrete topology. This is silly, so we usually restrict our focus to connected groups:

**Proposition 2** **(Reduction to connected Lie groups)**

Let $G$ be a Lie group and $G^0$ the connected component of $G$ which contains $e$. Then $G^0$ is a normal subgroup, itself a Lie group, and the quotient $G/G^0$ has the discrete topology.

In fact, we can also reduce this to the study of *simply connected* Lie groups as follows.

**Proposition 3** **(Reduction to simply connected Lie groups)**

If $G$ is connected, let $\widetilde{G}$ be its universal cover. Then $\widetilde{G}$ is a Lie group, the covering map $p \colon \widetilde{G} \to G$ is a morphism of Lie groups, and $\ker p \cong \pi_1(G)$.

Here are some examples of Lie groups.

**Example 4** **(Examples of Lie groups)**

- $\mathbb{R}$ under addition is a real one-dimensional Lie group.
- $\mathbb{C}$ under addition is a complex one-dimensional Lie group (and a two-dimensional real Lie group)!
- The unit circle $S^1 \subseteq \mathbb{C}$ is a real Lie group under multiplication.
- $\operatorname{GL}_n(k)$ is a Lie group of dimension $n^2$. This example becomes important for representation theory: a **representation** of a Lie group $G$ is a morphism of Lie groups $G \to \operatorname{GL}_n(k)$.
- $\operatorname{SL}_n(k)$ is a Lie group of dimension $n^2 - 1$.

As geometric objects, Lie groups enjoy a huge amount of symmetry. For example, any neighborhood of $e$ can be “copied over” to any other point $g \in G$ by the natural map $x \mapsto gx$. There is another theorem worth noting, which is that:

**Proposition 5**

If $G$ is a connected Lie group and $U$ is a neighborhood of the identity $e$, then $U$ generates $G$ as a group.

Recall the following result and its proof from representation theory:

**Claim 6**

For any finite group $G$, the category of finite-dimensional representations of $G$ is semisimple; all finite-dimensional representations decompose into irreducibles.

*Proof:* Take a representation $V$ and equip it with an arbitrary inner form $\langle \cdot, \cdot \rangle$. Then we can *average* it to obtain a new inner form

\[ \langle v, w \rangle' = \frac{1}{|G|} \sum_{g \in G} \langle g \cdot v, \; g \cdot w \rangle \]

which is $G$-invariant. Thus given a subrepresentation $W \subseteq V$ we can just take its orthogonal complement to decompose $V$.
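Here is a concrete instance of the averaging trick (my own toy example, with $G = \mathbb{Z}/5$ acting on $\mathbb{R}^5$ by cyclic shifts):

```
import numpy as np

n = 5
P = np.roll(np.eye(n), 1, axis=0)                 # cyclic shift permutation
group = [np.linalg.matrix_power(P, k) for k in range(n)]

rng = np.random.default_rng(0)
M = rng.standard_normal((n, n))
B = M.T @ M + n * np.eye(n)                       # an arbitrary inner form

# Average the form over the group; the result is G-invariant.
B_avg = sum(g.T @ B @ g for g in group) / len(group)
for g in group:
    assert np.allclose(g.T @ B_avg @ g, B_avg)
print("averaged form is invariant")
```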

We would like to repeat this type of proof with Lie groups. In this case the notion $\frac{1}{|G|} \sum_{g \in G}$ doesn’t make sense, so we want to replace the sum with an integral instead. In order to do this we use the following:

**Theorem 7** **(Haar measure)**

Let $G$ be a Lie group. Then there exists a unique Radon measure $\mu$ (up to scaling) on $G$ which is left-invariant, meaning

\[ \mu(gS) = \mu(S) \]

for any Borel subset $S \subseteq G$ and “translate” $gS$. This measure is called the **(left) Haar measure**.

**Example 8** **(Examples of Haar measures)**

- The Haar measure on $(\mathbb{R}, +)$ is the standard Lebesgue measure which assigns $b - a$ to the closed interval $[a, b]$. Of course $\mu(t + [a, b]) = \mu([a, b])$ for any $t \in \mathbb{R}$.
- The Haar measure on $(\mathbb{R}_{>0}, \times)$ is given by

  \[ \mu(S) = \int_S \frac{1}{x} \, dx. \]

  In particular, $\mu([a, b]) = \log(b/a)$. One sees the invariance under multiplication of these intervals.
- Let $G$ be a finite group (a zero-dimensional Lie group). Then a Haar measure is given by the counting measure $\mu(S) = |S|$.
- For the circle group $S^1 = \{ z \in \mathbb{C} : |z| = 1 \}$, consider a subset $S \subseteq S^1$. We can define

  \[ \mu(S) = \frac{1}{2\pi} \int_S d\theta \]

  across complex arguments $\theta$. The normalization factor of $\frac{1}{2\pi}$ ensures $\mu(S^1) = 1$.

Note that we have:

**Corollary 9**

If the Lie group $G$ is compact, there is a unique Haar measure on $G$ with $\mu(G) = 1$.

This follows by just noting that if $\mu$ is a Radon measure on the compact group $G$, then $\mu(G) < \infty$, so we may scale. This now lets us deduce that

**Corollary 10** **(Compact Lie groups are semisimple)**

The category of finite-dimensional representations is semisimple for any *compact* Lie group $G$.

Indeed, we can now consider

\[ \langle v, w \rangle' = \int_G \langle g \cdot v, \; g \cdot w \rangle \, d\mu(g) \]

as we described at the beginning.

In light of the previous comment about neighborhoods of $e$ generating $G$, we see that to get some information about the entire Lie group it actually suffices to just get “local” information of $G$ at the point $e$ (this is one formalization of the fact that Lie groups are super symmetric).

To do this one idea is to look at the **tangent space**. Let $G$ be an $n$-dimensional Lie group (over $k$) and consider the tangent space $\mathfrak{g}$ to $G$ at the identity $e \in G$. Naturally, this is a $k$-vector space of dimension $n$. We call it the **Lie algebra** associated to $G$.

**Example 11** **(Lie algebras corresponding to Lie groups)**

- $\mathbb{R}$ has a real Lie algebra isomorphic to $\mathbb{R}$.
- $\mathbb{C}$ has a complex Lie algebra isomorphic to $\mathbb{C}$.
- The unit circle $S^1$ has a real Lie algebra isomorphic to $\mathbb{R}$, which we think of as the “tangent line” at the point $1 \in S^1$.

**Example 12** **($\mathfrak{gl}_n(\mathbb{R})$)**

Let’s consider $G = \operatorname{GL}_n(\mathbb{R})$, an open subset of $\mathbb{R}^{n^2}$. Its tangent space should just be an $n^2$-dimensional $\mathbb{R}$-vector space. By identifying the components in the obvious way, we can think of this Lie algebra as just the set of all $n \times n$ real matrices.

This Lie algebra goes by the notation $\mathfrak{gl}_n(\mathbb{R})$.

**Example 13** **($\mathfrak{sl}_n(\mathbb{R})$)**

Recall $\operatorname{SL}_n(\mathbb{R})$ is a Lie group of dimension $n^2 - 1$, hence its Lie algebra should have dimension $n^2 - 1$. To see what it is, let’s look at the special case $n = 2$ first: then

\[ \operatorname{SL}_2(\mathbb{R}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : ad - bc = 1 \right\}. \]

Viewing this as a polynomial surface $ad - bc = 1$ in $\mathbb{R}^4$, we compute

\[ \nabla(ad - bc) = \langle d, \; -c, \; -b, \; a \rangle \]

and in particular the tangent space to the identity matrix is given by the orthogonal complement of the gradient

\[ \nabla(ad - bc) \Big|_{(a, b, c, d) = (1, 0, 0, 1)} = \langle 1, 0, 0, 1 \rangle. \]

Hence the tangent plane can be identified with matrices satisfying $a + d = 0$. In other words, we see that

\[ \mathfrak{sl}_2(\mathbb{R}) = \left\{ T \in \mathbb{R}^{2 \times 2} : \operatorname{Tr} T = 0 \right\}. \]

By repeating this example in greater generality, we discover

\[ \mathfrak{sl}_n(\mathbb{R}) = \left\{ T \in \mathbb{R}^{n \times n} : \operatorname{Tr} T = 0 \right\}. \]

Right now, $\mathfrak{g}$ is just a vector space. However, by using the group structure we can get a map from $\mathfrak{g}$ back into $G$. The trick is “differential equations”:

**Proposition 14** **(Differential equations for Lie theorists)**

Let $G$ be a Lie group over $k$ and $\mathfrak{g}$ its Lie algebra. Then for every $X \in \mathfrak{g}$ there is a *unique* homomorphism

\[ \gamma_X \colon k \to G \]

which is a morphism of Lie groups, such that

\[ \gamma_X'(0) = X. \]

We will write $\gamma_X(t)$ to emphasize the argument being thought of as “time”. Thus this proposition should be intuitively clear: the theory of differential equations guarantees that $\gamma_X$ is defined and unique in a small neighborhood of $0 \in k$. Then, the group structure allows us to extend $\gamma_X$ uniquely to the rest of $k$, giving a trajectory across all of $G$. This is sometimes called a **one-parameter subgroup** of $G$, but we won’t use this terminology anywhere in what follows.

This lets us define:

**Definition 15**

Retain the setting of the previous proposition. Then the **exponential map** $\exp \colon \mathfrak{g} \to G$ is defined by

\[ \exp(X) = \gamma_X(1). \]

The exponential map gets its name from the fact that for all the examples I discussed before, it is actually just the map $X \mapsto e^X$. Note that below, $e^M = \sum_{j \ge 0} \frac{M^j}{j!}$ for a matrix $M$; this is called the matrix exponential.
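One pleasant fact tying the examples below together, easy to check numerically (my own sketch): $\det(e^A) = e^{\operatorname{Tr} A}$, so trace-zero matrices exponentiate into $\operatorname{SL}_n$.

```
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
A -= np.trace(A) / 4 * np.eye(4)       # project onto trace zero, i.e. sl_4
assert np.isclose(np.linalg.det(expm(A)), 1.0)
print("det(exp(A)) = 1 for traceless A")
```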

**Example 16** **(Exponential Maps of Lie algebras)**

- If $G = (\mathbb{R}, +)$, then $\mathfrak{g} = \mathbb{R}$ too. We observe $\gamma_X(t) = tX$ (where $X \in \mathbb{R}$) is a morphism of Lie groups $k \to G$. Hence $\exp(X) = \gamma_X(1) = X$.
- Ditto for $(\mathbb{C}, +)$.
- For $G = S^1$ and $X \in \mathbb{R} \cong \mathfrak{g}$, the map $\gamma_X(t) = e^{itX}$ works. Hence $\exp(X) = e^{iX}$.
- For $G = \operatorname{GL}_n(\mathbb{R})$, the map $\gamma_X(t) = e^{tX}$ works nicely (now $X$ is an $n \times n$ matrix). (Note that we have to check $e^{tX}$ is actually invertible for this map to be well-defined.) Hence the exponential map is given by $\exp(X) = e^X$.
- Similarly,

  \[ \exp \colon \mathfrak{sl}_n(\mathbb{R}) \to \operatorname{SL}_n(\mathbb{R}), \qquad X \mapsto e^X. \]

  Here we had to check that if $X \in \mathfrak{sl}_n(\mathbb{R})$, meaning $\operatorname{Tr} X = 0$, then $\det\left( e^X \right) = 1$. This can be seen by writing $X$ in an upper triangular basis.

Actually, taking the tangent space at the identity is a functor. Consider a map $\phi \colon G \to H$ of Lie groups, with Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$. Because $\phi$ is a group homomorphism, $\phi(e_G) = e_H$. Now, by manifold theory we know that a map between manifolds gives a linear map between the corresponding tangent spaces. For us we obtain a linear map

\[ d\phi \colon \mathfrak{g} \to \mathfrak{h}. \]

In fact, this fits into a commutative diagram with the exponential maps: $\phi \circ \exp_G = \exp_H \circ \, d\phi$.

Here are a few more properties of $\exp$:

- $\exp(0) = e$, which is immediate by looking at the constant trajectory $\gamma_0(t) = e$.
- $d(\exp)_0 = \mathrm{id}$, i.e. the total derivative of $\exp$ at zero is the identity. This is again by construction.
- In particular, by the inverse function theorem this implies that $\exp$ is a diffeomorphism in a neighborhood of $0 \in \mathfrak{g}$, onto a neighborhood of $e \in G$.
- $\exp$ commutes with morphisms of Lie groups. (By the above diagram.)

Right now $\mathfrak{g}$ is *still* just a vector space, the tangent space. But now that there is a map $\exp \colon \mathfrak{g} \to G$, we can use it to put a new operation on $\mathfrak{g}$, the so-called *commutator*.

The idea is as follows: we want to “multiply” two elements of $\mathfrak{g}$. But $\mathfrak{g}$ is just a vector space, so we can’t do that. However, $G$ itself has a group multiplication, so we should pass to $G$ using $\exp$, use the multiplication *in $G$*, and then come back.

Here are the details. As we just mentioned, $\exp$ is a diffeomorphism near $0 \in \mathfrak{g}$. So for $X$, $Y$ close to the origin of $\mathfrak{g}$, we can look at $\exp(X)$ and $\exp(Y)$, which are two elements of $G$ close to $e$. Multiplying them gives an element still close to $e$, so it’s equal to $\exp(Z)$ for some unique $Z \in \mathfrak{g}$, call it $Z = \mu(X, Y)$.

One can show in fact that $\mu$ can be written as a Taylor series in two variables as

\[ \mu(X, Y) = X + Y + \frac{1}{2} \beta(X, Y) + (\text{higher-order terms}) \]

where $\beta \colon \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ is a *skew-symmetric* bilinear map, meaning $\beta(X, Y) = -\beta(Y, X)$. It will be more convenient to work with $\beta$ than $\mu$ itself, so we give it a name:

**Definition 17**

We write $[X, Y] := \beta(X, Y)$; this is called the **commutator** of $X$ and $Y$.

Now we know multiplication in $G$ is associative, so this should give us some nontrivial relation on the bracket $[\cdot, \cdot]$. Specifically, since

\[ \exp(X) \left( \exp(Y) \exp(Z) \right) = \left( \exp(X) \exp(Y) \right) \exp(Z) \]

we should have that $\mu(X, \mu(Y, Z)) = \mu(\mu(X, Y), Z)$, and this should tell us something. In fact, the claim is:

**Theorem 18**

The bracket satisfies the Jacobi identity

\[ [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0. \]

*Proof:* Although I won’t prove it, the third-order terms (and all the rest) in our definition of $\mu$ can be written out explicitly as well: for example, we actually have

\[ \mu(X, Y) = X + Y + \frac{1}{2} [X, Y] + \frac{1}{12} \left( [X, [X, Y]] + [Y, [Y, X]] \right) + \cdots. \]

The general formula is called the **Baker–Campbell–Hausdorff formula**.

Then we can force ourselves to expand the associativity relation using the first three terms of the BCH formula and then equate the degree-three terms. The left-hand side expands initially as $\mu(X, \mu(Y, Z))$, and the next step would be something ugly.

This computation is horrifying and painful, so I’ll pretend I did it and tell you the end result is as claimed.

There is a more natural way to see why this identity is the “right one”; see Qiaochu. However, with this proof I want to make the point that this Jacobi identity is not our decision: instead, the Jacobi identity is **forced upon us** by associativity in $G$.
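Numerically, the first BCH terms are easy to observe (my own sketch; the tolerance just reflects the size of the neglected third-order terms):

```
import numpy as np
from scipy.linalg import expm, logm

# For small X, Y we expect log(exp(X) exp(Y)) ~ X + Y + [X, Y]/2,
# with an error on the order of ||X||^3.
rng = np.random.default_rng(2)
t = 1e-2
X = t * rng.standard_normal((3, 3))
Y = t * rng.standard_normal((3, 3))

Z = logm(expm(X) @ expm(Y))
approx = X + Y + (X @ Y - Y @ X) / 2
print(np.linalg.norm(Z - approx))   # ~ t^3, i.e. around 1e-6
```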

**Example 19** **(Examples of commutators attached to Lie groups)**

- If $G$ is an abelian group, then $\mu(X, Y) = \mu(Y, X)$, so we have $[X, Y] = [Y, X]$ by symmetry and $[X, Y] = -[Y, X]$ from skew-symmetry. Thus $[X, Y] = 0$ in $\mathfrak{g}$ for any *abelian* Lie group $G$.
- In particular, the brackets for $\mathbb{R}$, $\mathbb{C}$, and $S^1$ are trivial.
- Let $G = \operatorname{GL}_n(\mathbb{R})$. Then one can show that

  \[ [X, Y] = XY - YX. \]
- Ditto for $\operatorname{SL}_n(\mathbb{R})$.

In any case, with the Jacobi identity we can define a *general* Lie algebra as an intrinsic object with a Jacobi-satisfying bracket:

**Definition 20**

A **Lie algebra** over $k$ is a $k$-vector space $\mathfrak{g}$ equipped with a skew-symmetric bilinear bracket $[\cdot, \cdot] \colon \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ satisfying the Jacobi identity.

A **morphism of Lie algebras** $\mathfrak{g} \to \mathfrak{h}$ is a linear map which preserves the bracket.

Note that a Lie algebra may even be infinite-dimensional (even though we are assuming our Lie groups are finite-dimensional, so such Lie algebras will never come up as a tangent space).

**Example 21** **(Associative algebra $\to$ Lie algebra)**

Any associative algebra $A$ over $k$ can be made into a Lie algebra by taking the same underlying vector space, and using the bracket $[a, b] = ab - ba$.

We finish this list of facts by stating the three “fundamental theorems” of Lie theory. They are based upon the functor

\[ G \mapsto \operatorname{Lie}(G) = \mathfrak{g} \]

we have described earlier, which is a functor

- from the category of Lie groups
- into the category of finite-dimensional Lie algebras.

The first theorem requires the following definition:

**Definition 22**

A **Lie subgroup** of a Lie group $G$ is a subgroup $H$ such that the inclusion map $H \hookrightarrow G$ is also an injective immersion.

A **Lie subalgebra** of a Lie algebra $\mathfrak{g}$ is a vector subspace $\mathfrak{h}$ preserved under the bracket (meaning that $[\mathfrak{h}, \mathfrak{h}] \subseteq \mathfrak{h}$).

**Theorem 23** **(Lie I)**

Let $G$ be a real or complex Lie group with Lie algebra $\mathfrak{g}$. Then given a connected Lie subgroup $H \subseteq G$, the map

\[ H \mapsto \operatorname{Lie}(H) \]

is a bijection between connected Lie subgroups of $G$ and Lie subalgebras of $\mathfrak{g}$.

**Theorem 24** **(The Lie functor is an equivalence of categories)**

Restrict $\operatorname{Lie}$ to a functor

- from the category of **simply connected** Lie groups over $k$,
- to the category of finite-dimensional Lie algebras over $k$.

Then

- (Lie II) $\operatorname{Lie}$ is fully faithful, and
- (Lie III) $\operatorname{Lie}$ is essentially surjective on objects.

If we drop the “simply connected” condition, we obtain a functor which is faithful and exact, but not full: non-isomorphic Lie groups can have isomorphic Lie algebras (one example is $\operatorname{SU}(2)$ and $\operatorname{SO}(3)$).


One of the most fundamental problems in graph theory is that of a *graph coloring*, in which one assigns a color to every vertex of a graph so that no two adjacent vertices have the same color. The most basic invariant related to the graph coloring is the chromatic number:

**Definition 1**

A simple graph $G$ is **$k$-colorable** if it’s possible to properly color its vertices with $k$ colors. The smallest such $k$ is the **chromatic number** $\chi(G)$.

In this exposition we study a more general notion in which the set of permitted colors is different for each vertex, as long as at least $k$ colors are listed at each vertex. This leads to the notion of a so-called choice number, which was introduced by Erdős, Rubin, and Taylor.

**Definition 2**

A simple graph $G$ is **$k$-choosable** if it’s possible to properly color its vertices given a list of $k$ colors at each vertex. The smallest such $k$ is the **choice number** $\operatorname{ch}(G)$.

**Example 3**

We have $\operatorname{ch}(C_{2n}) = \chi(C_{2n}) = 2$ for any integer $n$ (here $C_{2n}$ is the cycle graph on $2n$ vertices). To see this, we only have to show that given a list of two colors at each vertex of $C_{2n}$, we can select one of them at each vertex so that adjacent vertices get different colors.

- If the list of colors is the same at each vertex, then since $C_{2n}$ is bipartite, we are done.
- Otherwise, suppose adjacent vertices $v_1$, $v_{2n}$ are such that some color at $v_1$ is not in the list at $v_{2n}$. Select it at $v_1$, and then greedily color $v_2$, \dots, $v_{2n}$ in that order. (A brute-force check of the smallest case appears below.)
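Here is the promised brute-force check for the smallest case $C_4$ (my own script; restricting the palette to three colors is an assumption made so that the check is finite, not a proof of the general claim):

```
from itertools import combinations, product

n = 4
edges = [(i, (i + 1) % n) for i in range(n)]
two_lists = list(combinations(range(3), 2))   # all 2-element lists

for lists in product(two_lists, repeat=n):
    # Some choice from the lists must properly color the 4-cycle.
    ok = any(all(col[u] != col[v] for u, v in edges)
             for col in product(*lists))
    assert ok
print("every 2-list assignment of C_4 admits a proper coloring")
```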

We are thus naturally interested in how the choice number and the chromatic number are related. Of course we always have

\[ \operatorname{ch}(G) \ge \chi(G). \]

Naïvely one might expect that we in fact have an equality, since allowing the colors at vertices to be different seems like it should make the graph easier to color. However, the following example shows that this is not the case.

**Example 4** **(Erdős)**

We claim that for any integer $n$ we have

\[ \operatorname{ch}\left( K_{n^n, n^n} \right) > n \qquad\text{but}\qquad \chi\left( K_{n^n, n^n} \right) = 2. \]

The latter equality follows from $K_{n^n, n^n}$ being bipartite.

Now to see the first inequality, let $K_{n^n, n^n}$ have vertex set $U \cup V$, where $U$ is the set of functions $f \colon \{1, \dots, n\} \to \{1, \dots, n\}$ and similarly for $V$. Then consider colors $(i, j)$ for $1 \le i, j \le n$. On a vertex $u_f \in U$, we list colors $(1, f(1))$, $(2, f(2))$, \dots, $(n, f(n))$. On a vertex $v_g \in V$, we list colors $(g(1), 1)$, $(g(2), 2)$, \dots, $(g(n), n)$. By construction it is impossible to properly color $K_{n^n, n^n}$ with these lists.

The case $n = 2$ is illustrated in the figure below (image in public domain).
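For the $n = 2$ case one can also just verify the claim by brute force (my own script, using the lists described above):

```
from itertools import product

# Index each side of K_{4,4} by the four functions f : {1,2} -> {1,2},
# encoded as pairs (f(1), f(2)), and build the lists from Example 4.
fns = list(product((1, 2), repeat=2))
left_lists  = [[(1, f[0]), (2, f[1])] for f in fns]
right_lists = [[(g[0], 1), (g[1], 2)] for g in fns]

# In a complete bipartite graph, a proper coloring needs the chosen
# left colors and chosen right colors to be disjoint sets.
colorable = any(set(L).isdisjoint(R)
                for L in product(*left_lists)
                for R in product(*right_lists))
print(colorable)  # False, so ch(K_{4,4}) > 2
```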

This surprising behavior is the subject of much research: how can we bound the choice number of a graph as a function of its chromatic number and other properties of the graph? We see that the above example requires exponentially many vertices in terms of $n$.

**Theorem 5** **(Noel, West, Wu, Zhu)**

If $G$ is a graph with $n$ vertices then

\[ \operatorname{ch}(G) \le \max\left( \chi(G), \; \left\lceil \frac{n + \chi(G) - 1}{3} \right\rceil \right). \]

In particular, if $n \le 2\chi(G) + 1$ then $\operatorname{ch}(G) = \chi(G)$.

One of the major open problems in this direction is the following.

**Definition 6**

A **claw-free** graph is a graph with no induced $K_{1,3}$. For example, the line graph (also called edge graph) of any simple graph is claw-free.

If $G$ is a claw-free graph, then $\operatorname{ch}(G) = \chi(G)$. In particular, this conjecture implies that for *edge* coloring, the notions of “chromatic number” and “choice number” coincide.

In this exposition, we prove the following result of Alon.

**Theorem 7** **(Alon)**

A bipartite graph $G$ is $(d + 1)$-choosable, where

\[ d = \left\lceil \max_{H \subseteq G} \frac{\lvert E(H) \rvert}{\lvert V(H) \rvert} \right\rceil \]

is (up to rounding) half the maximum of the average degree of subgraphs $H \subseteq G$.

In particular, recall that a *planar* bipartite graph with $n \ge 3$ vertices contains at most $2n - 4$ edges. Thus for such graphs we have $d \le 2$ and deduce:

**Corollary 8**

A planar bipartite graph is $3$-choosable.

This corollary is sharp, as it applies to $K_{4,4}$, which we have seen in Example 4 has $\operatorname{ch}(K_{4,4}) = 3$.

The rest of the paper is divided as follows. First, we begin in §2 by stating Theorem 9, the famous combinatorial nullstellensatz of Alon. Then in §3 and §4, we provide descriptions of the so-called *graph polynomial*, to which we then apply the combinatorial nullstellensatz to deduce Theorem 18. Finally in §5, we show how to use Theorem 18 to prove Theorem 7.

The main tool we use is the Combinatorial Nullstellensatz of Alon.

**Theorem 9** **(Combinatorial Nullstellensatz)**

Let $F$ be a field, and let $f \in F[x_1, \dots, x_n]$ be a polynomial of degree $d_1 + d_2 + \dots + d_n$. Let $S_1, \dots, S_n \subseteq F$ be such that $\lvert S_i \rvert > d_i$ for all $i$.

Assume the coefficient of $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$ in $f$ is not zero. Then we can pick $s_1 \in S_1$, \dots, $s_n \in S_n$ such that

\[ f(s_1, s_2, \dots, s_n) \neq 0. \]

**Example 10**

Let us give a second proof that

\[ \operatorname{ch}\left( C_{2n} \right) = 2 \]

for every positive integer $n$. Our proof will be an application of the Nullstellensatz.

Regard the colors as real numbers, and let $S_i$ be the set of colors at vertex $i$ (hence $\lvert S_i \rvert = 2$, and we take $d_i = 1$). Consider the polynomial

\[ f = (x_1 - x_2)(x_2 - x_3) \cdots (x_{2n-1} - x_{2n})(x_{2n} - x_1). \]

The coefficient of $x_1 x_2 \cdots x_{2n}$ is $\pm 2 \neq 0$. Therefore, one can select a color $s_i$ from each $S_i$ so that $f(s_1, \dots, s_{2n})$ does not vanish, which is exactly a proper coloring.
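One can confirm the coefficient computation with a computer algebra system; here is a sketch (mine) for the case $2n = 4$ using sympy:

```
import sympy as sp

x = sp.symbols('x1:5')                        # (x1, x2, x3, x4)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
f = sp.expand(sp.prod(x[i] - x[j] for i, j in edges))
coeff = sp.Poly(f, *x).coeff_monomial(x[0] * x[1] * x[2] * x[3])
print(coeff)  # 2, in particular nonzero
```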

Motivated by Example 10, we wish to apply a similar technique to general graphs $G$. So in what follows, let $G$ be a (simple) graph with vertex set $\{1, 2, \dots, n\}$.

**Definition 11**

The **graph polynomial** of $G$ is defined by

\[ f_G(x_1, \dots, x_n) = \prod_{\substack{\{i, j\} \in E(G) \\ i < j}} (x_i - x_j). \]

We observe that the coefficients of $f_G$ correspond to differences in counts of certain directed orientations. To be precise, we introduce the notation:

**Definition 12**

Consider **orientations** on the graph $G$ with vertex set $\{1, \dots, n\}$, meaning we assign a direction to every edge of $G$ to make it into a directed graph. An oriented edge $i \to j$ is called **ascending** if $i < j$, i.e. the edge points from the smaller number to the larger one.

Then we say that an orientation is

- **even** if there are an even number of ascending edges, and
- **odd** if there are an odd number of ascending edges.

Finally, we define

- $\operatorname{DE}(d_1, \dots, d_n)$ to be the set of all even orientations of $G$ in which vertex $i$ has indegree $d_i$, and
- $\operatorname{DO}(d_1, \dots, d_n)$ to be the set of all odd orientations of $G$ in which vertex $i$ has indegree $d_i$.

Set $N(d_1, \dots, d_n) = \left\lvert \operatorname{DE}(d_1, \dots, d_n) \right\rvert - \left\lvert \operatorname{DO}(d_1, \dots, d_n) \right\rvert$.

**Example 13**

Consider the following orientation:

There are exactly two ascending edges, so this orientation is even. The indegrees $d_1, \dots, d_n$ of its vertices determine which set $\operatorname{DE}(d_1, \dots, d_n)$ it belongs to. In terms of $f_G$, this corresponds to the choice of one term from each factor $(x_i - x_j)$, which multiplies out to a $+x_1^{d_1} \cdots x_n^{d_n}$ term.

**Lemma 14**

The coefficient of $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$ in $f_G$ is $N(d_1, \dots, d_n)$.

*Proof:* Consider expanding $f_G$. Then each expanded term corresponds to a choice of $x_i$ or $-x_j$ from each factor $(x_i - x_j)$, as in Example 13: choosing $x_i$ orients the edge as $j \to i$, while choosing $-x_j$ orients it as $i \to j$ (an ascending edge, contributing a factor of $-1$). The term thus has coefficient $+1$ if the orientation is even, and $-1$ if the orientation is odd, as desired.

Thus we have an explicit combinatorial description of the coefficients in the graph polynomial $f_G$.

We now give a second description of the coefficients of $f_G$.

**Definition 15**

Let $D$ be an orientation of $G$, viewed as a directed graph. An **Eulerian suborientation** of $D$ is a subgraph of $D$ (not necessarily induced) in which every vertex has equal indegree and outdegree. We say that such a suborientation is

- **even** if it has an even number of edges, and
- **odd** if it has an odd number of edges.

Note that the empty suborientation is allowed. We denote the sets of even and odd Eulerian suborientations of $D$ by $\operatorname{EE}(D)$ and $\operatorname{EO}(D)$, respectively.

Eulerian suborientations are brought into the picture by the following lemma.

**Lemma 16**

Fix an orientation $D$ of $G$ in which vertex $i$ has indegree $d_i$. Then

\[ N(d_1, \dots, d_n) = \pm \left( \lvert \operatorname{EE}(D) \rvert - \lvert \operatorname{EO}(D) \rvert \right). \]

*Proof:* Consider any orientation $D'$ with the same indegrees $d_1, \dots, d_n$. Then we define a suborientation of $D$, denoted $D \setminus D'$, by including exactly the edges of $D$ whose orientation in $D'$ is in the opposite direction. It’s easy to see that this induces a bijection between such orientations $D'$ and the Eulerian suborientations of $D$.

Moreover, remark that

- $D'$ is even if $D$ and $D \setminus D'$ are either both even or both odd, and
- $D'$ is odd otherwise.

The lemma follows from this.

**Corollary 17**

The coefficient of $x_1^{d_1} \cdots x_n^{d_n}$ in $f_G$ is $\pm \left( \lvert \operatorname{EE}(D) \rvert - \lvert \operatorname{EO}(D) \rvert \right)$, for any orientation $D$ of $G$ with indegrees $d_1, \dots, d_n$.

*Proof:* Combine Lemma 14 and Lemma 16.

We now arrive at the main result:

**Theorem 18**

Let $G$ be a graph on $\{1, \dots, n\}$, and let $D$ be an orientation of $G$ in which vertex $i$ has indegree $d_i$. If $\lvert \operatorname{EE}(D) \rvert \neq \lvert \operatorname{EO}(D) \rvert$, then given a list of $d_i + 1$ colors at each vertex $i$ of $G$, there exists a proper coloring of the vertices of $G$ from these lists.

In particular, $G$ is $\left( 1 + \max_i d_i \right)$-choosable.

*Proof:* Combine Corollary 17 with Theorem 9.

Armed with Theorem 18, we are almost ready to prove Theorem 7. The last ingredient is that we need to find an orientation of $G$ in which the maximum indegree is not too large. This is accomplished by the following.

**Lemma 19**

Suppose every subgraph $H \subseteq G$ satisfies $\lvert E(H) \rvert \le d \cdot \lvert V(H) \rvert$. Then $G$ has an orientation in which every vertex has indegree at most $d$.

*Proof:* This is an application of Hall’s marriage theorem.

Construct a bipartite graph on

\[ E(G) \cup \left( V(G) \times \{1, \dots, d\} \right), \]

connecting $e \in E(G)$ and $(v, k)$ if $v$ is an endpoint of $e$. Since $\lvert E(H) \rvert \le d \cdot \lvert V(H) \rvert$ holds for all subgraphs $H \subseteq G$, we satisfy Hall’s condition and can match each edge in $E(G)$ to a (copy of some) vertex in $V(G) \times \{1, \dots, d\}$. Since there are exactly $d$ copies of each vertex, orienting each edge towards its matched endpoint gives indegree at most $d$, and the conclusion follows.

Now we can prove Theorem 7. *Proof:* According to Lemma 19, pick an orientation $D$ of $G$ in which every indegree is at most $d$, with $d$ as in Theorem 7. Since $G$ is bipartite, it cannot have any odd cycles, so every Eulerian suborientation of $D$ has an even number of edges; hence $\lvert \operatorname{EO}(D) \rvert = 0 < \lvert \operatorname{EE}(D) \rvert$. So Theorem 18 applies and we are done.


The `loadkeys` and `keyfuzz` approaches that are the first search entries don’t work for me, so some more sophisticated black magic was necessary.
This step is technically optional, but I did it because the function keys are a pain anyways. Normally on Apple keyboards one needs to hold the `Fn` key to get the function keys to behave as a normal `F<n>` keystroke. I prefer to reverse this behavior, so that the SysRq combination is `Alt+F13+F` rather than `Fn+Alt+F13+F`, say.

For this, the advice on the Arch Wiki worked, although it omits some points that I think should’ve been said. On newer kernels, one does this by creating the file `/etc/modprobe.d/hid_apple.conf` and writing

```
options hid_apple fnmode=2
```

Then I edited the file `/etc/mkinitcpio.conf` to include the new file:

```
...
BINARIES=""
# FILES
# This setting is similar to BINARIES above, however, files are added
# as-is and are not parsed in any way. This is useful for config files.
FILES="/etc/modprobe.d/hid_apple.conf"
# HOOKS
...
```

Finally, regenerate the initramfs for this change to take effect. On Arch Linux one can just do this by issuing the command

```
$ sudo pacman -S linux
```

which will reinstall the kernel package and regenerate the initramfs in the process.

Next, I needed to get the scancode of the key I wanted to turn into the SysRq key. For me, attempting `showkey -s` did not work, so I instead had to use `evtest`, as described on the Arch Wiki.

```
$ sudo pacman -S evtest
$ sudo evtest
No device specified, trying to scan all of /dev/input/event*
Available devices:
/dev/input/event0: Logitech USB Receiver
/dev/input/event1: Logitech USB Receiver
/dev/input/event2: Apple, Inc Apple Keyboard
/dev/input/event3: Apple, Inc Apple Keyboard
/dev/input/event4: Apple Computer, Inc. IR Receiver
/dev/input/event5: HDA NVidia Headphone
/dev/input/event6: HDA NVidia HDMI/DP,pcm=3
/dev/input/event7: Power Button
/dev/input/event8: Sleep Button
/dev/input/event9: Power Button
/dev/input/event10: Video Bus
/dev/input/event11: PC Speaker
/dev/input/event12: HDA NVidia HDMI/DP,pcm=7
/dev/input/event13: HDA NVidia HDMI/DP,pcm=8
Select the device event number [0-13]: 2
Input driver version is 1.0.1
Input device ID: bus 0x3 vendor 0x5ac product 0x220 version 0x111
Input device name: "Apple, Inc Apple Keyboard"
```

This is on my Mac Mini; the list of devices looks different on my laptop. After this, pressing the desired key yields something which looks like

```
Event: time 1456870457.844237, -------------- SYN_REPORT ------------
Event: time 1456870457.924097, type 4 (EV_MSC), code 4 (MSC_SCAN), value 70068
Event: time 1456870457.924097, type 1 (EV_KEY), code 183 (KEY_F13), value 1
```

This is the F13 key which I want to map into a SysRq — the scancode 70068 above (which is in fact a hex code) is the one I wanted.

Now that I had the scancode, I cd’ed to `/etc/udev/hwdb.d` and added a file `90-keyboard-sysrq.hwdb` with the content

```
evdev:input:b0003*
KEYBOARDKEY_70068=sysrq
```

One then updates `hwdb.bin` by running the commands

```
$ sudo udevadm hwdb --update
$ sudo udevadm trigger
```

The latter command makes the changes take effect immediately. You should be able to test this by running `sudo evtest` again; `evtest` should now report the new keycode (but the same scancode).

One can test the SysRq key by running Alt+SysRq+H, and then checking the `dmesg` output to see if anything happened:

```
$ dmesg | tail -n 1
[ 283.001240] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) ...
```

It remains to actually enable SysRQ, according to the bitmask described here. My system default was apparently 16:

```
$ sysctl kernel.sysrq
kernel.sysrq = 16
```

For my purposes, I then edited `/etc/sysctl.d/99-sysctl.conf` and added the line

```
kernel.sysrq=254
```

This gave me everything except the nicing of real-time tasks. Of course the choice of value here is just personal preference.
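To see which functions a given value enables, one can decode it against the bit meanings documented in the kernel’s `admin-guide/sysrq.rst`; here is a small helper (my own transcription of the documented bits):

```
bits = {
    2:   "console logging level control",
    4:   "keyboard control (SAK, unraw)",
    8:   "debugging dumps of processes",
    16:  "sync command",
    32:  "remount read-only",
    64:  "signalling processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all real-time tasks",
}
value = 254
for bit, what in bits.items():
    print(f"{what}: {'on' if value & bit else 'off'}")
# 254 enables everything above except nicing of real-time tasks (bit 256).
```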

Personally, my main use for this is killing Chromium, which has a bad habit of freezing up my computer (especially if Firefox is open too). I remedy the situation by repeatedly running Alt+SysRq+F to kill off the memory hogs. If this doesn’t work, just Alt+SysRq+K kills off all the processes in the current TTY.


In algebraic topology you (for example) associate every topological space $X$ with a group, like $\pi_1(X)$ or $H_n(X)$. All of these operations turn out to be *functors*. This isn’t surprising, because as far as I’m concerned the definition of a functor is “any time you take one type of object and naturally make another object”.

The surprise is that these objects also respect homotopy in a nice way; proving this is a fair amount of the “setup” work in algebraic topology.

Note that $H_n$ is a functor

\[ H_n \colon \mathsf{Top} \to \mathsf{Grp}, \]

i.e. to every space $X$ we can associate a group $H_n(X)$. (Of course, replace $n$ by the integer of your choice.) Recall that two maps $f, g \colon X \to Y$ are *homotopic*, written $f \simeq g$, if one can be continuously deformed into the other.

Thus for a map $f \colon X \to Y$ we can take its **homotopy class** $[f]$ (the equivalence class under this relationship). This has the nice property that $f \simeq f'$ and $g \simeq g'$ implies $g \circ f \simeq g' \circ f'$, and so on.

**Definition 2**

Two spaces $X$ and $Y$ are **homotopic** if there exists a pair of maps $f \colon X \to Y$ and $g \colon Y \to X$ such that $g \circ f \simeq \mathrm{id}_X$ and $f \circ g \simeq \mathrm{id}_Y$.

In light of this, we can define

**Definition 3**

The category $\mathsf{hTop}$ is defined as follows:

- The objects are topological spaces $X$.
- The morphisms are *homotopy classes* of continuous maps $X \to Y$.

**Remark 4**

Composition is well-defined since $f \simeq f'$ and $g \simeq g'$ implies $g \circ f \simeq g' \circ f'$. Two spaces are isomorphic in $\mathsf{hTop}$ if they are homotopic.

Then the big result is that:

**Theorem 6**

The induced map $f_\ast \colon H_n(X) \to H_n(Y)$ of a map $f \colon X \to Y$ depends only on the homotopy class of $f$. Thus $H_n$ is a functor

\[ H_n \colon \mathsf{hTop} \to \mathsf{Grp}. \]

The proof of this is geometric, using the so-called *prism operators*. In any case, as with all functors we deduce

**Corollary 7**

$H_n(X) \cong H_n(Y)$ if $X$ and $Y$ are homotopic.

In particular, the *contractible* spaces are those spaces which are homotopy equivalent to a point. In that case, $H_n(X) = 0$ for all $n \ge 1$.

In fact, we also defined relative homology groups

\[ H_n(X, A) \]

for subspaces $A \subseteq X$. We will now show this is functorial too.

**Definition 8**

Let $A \subseteq X$ and $B \subseteq Y$ be subspaces, and consider a map $f \colon X \to Y$. If $f(A) \subseteq B$ we write

\[ f \colon (X, A) \to (Y, B). \]

We say $f$ is a **map of pairs**, between the pairs $(X, A)$ and $(Y, B)$.

**Definition 9**

We say that $f, g \colon (X, A) \to (Y, B)$ are **pair-homotopic** if they are "homotopic through maps of pairs".

More formally, a **pair-homotopy** is a map $F \colon X \times [0,1] \to Y$, which we'll write as $F_t \colon X \to Y$, such that $F$ is a homotopy of the maps $f$ and $g$, and each $F_t$ is itself a map of pairs $(X, A) \to (Y, B)$.

Thus, we naturally arrive at two categories:

- $\mathsf{PairTop}$, the category of *pairs* of topological spaces, and
- $\mathsf{hPairTop}$, the same category except with maps only equivalent up to pair-homotopy.

**Definition 10**

As before, we say pairs $(X, A)$ and $(Y, B)$ are **pair-homotopy equivalent** if they are isomorphic in $\mathsf{hPairTop}$; an isomorphism of $\mathsf{hPairTop}$ is a **pair-homotopy equivalence**.

The prism operators then let us derive

**Theorem 11**

We have a functor

$$H_n \colon \mathsf{hPairTop} \to \mathsf{Grp}.$$

The usual corollaries apply.

Now, we want an analog of contractible spaces for our pairs: i.e. pairs of spaces $(X, A)$ such that $H_n(X, A) = 0$ for all $n$. The correct definition is:

**Definition 12**

Let $A \subseteq X$. We say that $A$ is a **deformation retract** of $X$ if there is a map of pairs $r \colon (X, A) \to (A, A)$ which is a pair-homotopy equivalence.

**Example 13** **(Examples of Deformation Retracts)**

- If a single point $p$ is a deformation retract of a space $X$, then $X$ is contractible, since the retraction $r \colon X \to \{p\}$ (when viewed as a map $X \to X$) is homotopic to the identity map $\mathrm{id}_X$.
- The punctured disk $D^2 \setminus \{0\}$ deformation retracts onto its boundary $S^1$.
- More generally, $D^n \setminus \{0\}$ deformation retracts onto its boundary $S^{n-1}$.
- Similarly, $\mathbb{R}^n \setminus \{0\}$ deformation retracts onto a sphere $S^{n-1}$.

Of course in this situation we have that

$$H_n(X, A) \cong H_n(A, A) = 0.$$

As a special case of the above, we define

**Definition 14**

The category $\mathsf{Top}_\ast$ is defined as follows:

- The objects are pairs $(X, x_0)$ of spaces with a distinguished basepoint $x_0 \in X$. We call these **pointed spaces**.
- The morphisms are maps $f \colon (X, x_0) \to (Y, y_0)$, meaning $f \colon X \to Y$ is continuous and $f(x_0) = y_0$.

Now again we mod out:

**Definition 15**

Two maps $f, g \colon (X, x_0) \to (Y, y_0)$ of **pointed spaces** are homotopic if there is a homotopy between them which also fixes the basepoints. We can then, in the same way as before, define the quotient category $\mathsf{hTop}_\ast$.

And lo and behold:

**Theorem 16**

We have a functor

$$\widetilde H_n \colon \mathsf{hTop}_\ast \to \mathsf{Grp}.$$

Same corollaries as before.


"Hmm, so hopefully this will be finished within the next 10 years." (An email of mine at the beginning of this project.)

My Euclidean geometry book was published last March or so. I thought I’d take the time to write about what the whole process of publishing this book was like, but I’ll start with the disclaimer that my process was probably not very typical and is unlikely to be representative of what everyone else does.

I'm trying to pinpoint exactly when this project changed from "daydream" to "let's do it", but I'm not quite sure; here's the best I can recount.

It was sometime in the fall of 2013, towards the start of the school year; I think late September. I was a senior in high school, and I was only enrolled in two classes. It was fantastic, because it meant I had lots of time to study math. The superintendent of the school eventually found out, though, and forced me to enroll as an "office assistant" for three periods a day. Nonetheless, office assistant is not a very busy job, and so I had lots of time, all the time, every day.

Anyways, I had written a bit of geometry material for my math club the previous year, which was intended to be a light introduction. But in doing so I realized that there was much, much more I wanted to say, and so somewhere on my mental to-do list I added "flesh these notes out". So one day, sitting in the office, after having spent another hour playing StarCraft, I finally got down to this item on the list. I hadn't meant it to be a book; I just wanted to finish what I had started the previous year. But sometimes your own projects spiral out of your control, and that's what happened to me.

Really, I hadn't come up with a brilliant idea that no one had thought of before. To my knowledge, no one had even *tried* yet. If I hadn't gone and decided to write this book, someone else would have done it; maybe not right away, but within a few years. Indeed, I was honestly surprised that I was the first one to make an attempt. The USAMO has been a serious contest since at least the 1990s, and the demand for this book certainly existed well before my time. Really, I think this all just goes to illustrate that the Efficient Market Hypothesis is not so true in these kinds of domains.

Initially, this text was titled *A Voyage in Euclidean Geometry*; the filename Voyage.pdf would persist for the entire project, even as the title itself changed over time.

The beginning of the writing was actually quite swift. Like everyone else, I started out with an empty LaTeX file. But it was different from the other blank screens I've had to deal with in my life; rather than staring in despair (think English essay mode), I exploded. I was *bursting* with things I wanted to write, the result of having years of competitive geometry bottled up in my head. In fact, I still have the version 0 of the table of contents that came to life as I started putting things together.

- Angle Chasing (include "Fact 5")
  - Centers of the Triangle
  - The Medial Triangle
  - The Euler Line
  - The Nine-Point Circle
- Circles
  - Incircles and Excircles
  - The Power of a Point
  - The Radical Axis
- Computational Geometry
  - All the Areas (include Extended Sine Law, Ceva/Menelaus)
  - Similar Triangles
  - Homothety
  - Stewart's Theorem
  - Ptolemy's Theorem
- Some More Configurations (include symmedians)
  - Simson lines
  - Incircles and Excenters, Revisited
  - Midpoints of Altitudes
- Circles Again
  - Inversion
  - Circles Inscribed in Segments
  - The Miquel Point (include Brokard, this could get long)
  - Spiral Similarity
- Projective Geometry
  - Harmonic Division
  - Brokard's Theorem
  - Pascal's Theorem
- Computational Techniques
  - Complex Numbers
  - Barycentric Coordinates
Of course the table of contents changed drastically over time, but that wasn’t important. The point of the initial skeleton was to provide a **bucket sort** for all the things that I wanted to cover. Often, I would have three different sections I wanted to write, but like all humans I can only write one thing at a time, so I would have to create section headers for the other two and try to get the first section done as quickly as I could so that I could go and write the other two as well.

I did take the time to do some things correctly, mostly on the LaTeX side. Some examples of things I did:

- Set up proper amsthm environments: earlier versions of the draft had “lemma”, “theorem”, “problem”, “exercise”, “proposition”, all distinct
- Set up an organized master LaTeX file with \include’s for the chapters, rather than having just one fat file.
- Set up shortcuts for setting up diagrams and so on.
- Set up a “hints” system where hints to the problems would be printed in random order at the end of the book.
- Set up a special command for new terms (\vocab). At the beginning all it did was make the text bold, but I suspected that later I might make it do other things (like indexing).

In other words, whenever possible I would pay O(1) cost to get back O(n) returns. Indeed the point of using LaTeX for a long document is so that you can “say what you mean”: you type \begin{theorem} … \end{theorem}, and all the formatting is taken care of for you. Decide you want to change it later, and you only have to change the relevant code in the beginning.
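As an illustration, a minimal preamble in this spirit might look like the following. This is a sketch of the ideas above rather than the book's actual source; in particular, the `\vocab` here is my reconstruction of the behavior described.

```
\documentclass{book}
\usepackage{amsthm}
\usepackage{makeidx}
\makeindex

% Distinct environments, so the source "says what it means";
% restyling later means editing only these few lines.
\newtheorem{theorem}{Theorem}[chapter]
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{proposition}[theorem]{Proposition}
\theoremstyle{definition}
\newtheorem{problem}[theorem]{Problem}
\newtheorem{exercise}[theorem]{Exercise}

% New terms: bold for now, plus an index entry; one place to
% change if \vocab should ever do more.
\newcommand{\vocab}[1]{\textbf{#1}\index{#1}}
```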

And so, for three hours a day, five days a week, I sat in the main office of Irvington High School, pounding out chapter after chapter. I was essentially typing up what had been four years of competition experience; when you’re 17 years old, that’s a big chunk of your life.

I spent surprisingly little time revising (before the first submission). Mostly I just fired away. I have always heard about how important it is to rewrite things and how first drafts are always terrible, but I'm glad I ignored that advice, at least at the beginning. It was immensely helpful to have the skeleton of the book laid out in a tangible form that I could actually see. That's one thing I really like about writing; it helps you collect your thoughts.

It’s possible that this is part of my writing style; compared to what everyone says I should do, I don’t do very much rewriting. My first and final drafts tend to look pretty similar. I think this is just because when I write something that’s not an English essay, I already have a reasonably good idea what I want to say, and that the process of writing it out does much of the polishing for me. I’m also typically pretty hesitant when I write things: I do a lot of pausing for a few minutes deciding whether this sentence is really what I want before actually writing it down, even in drafts.

By late October, I had about 80 or so pages of content written. Not that impressive if you think about it; I think it works out to something like 4 pages per day. In fact, looking through my data, I'm pretty sure I had a pretty consistent writing rate of about 30 minutes per page. It didn't matter, since I had so much time.

At this point the book was coming out reasonably well, so I was beginning to think about possibly publishing it. It was a bit embarrassing, since as far as I could tell, publishing books was done by people who were actually professionals in some way or another. So I reached out to a couple of teachers of mine (not high school) who I knew had published textbooks in one form or another; I politely asked them what their thoughts were, and if they had any advice. I got some gentle encouragement, but also a pointer to self-publishing: it turns out in this day and age, there are services like Lulu or CreateSpace that will just let you publish… whatever you want. This gave me the guts to keep working on this, because it meant that there was a minimal floor: even if I couldn't get a traditional publisher, the worst case was self-publishing through Amazon, which was at any rate strictly better than the plan of uploading a PDF somewhere.

So I kept writing. The seasons turned, and by February, the draft was 200 pages strong. In April, I had staked out a whopping 333 pages.

I was finally beginning to run out of things I wanted to add, after about six months of endless typing. So I decided to reach out again; this time I contacted a professor (henceforth Z) whom I knew well from my time at the Berkeley Math Circle. After some discussion, Z agreed to look briefly at an early draft of the manuscript to get a feel for what it was like. I must have exceeded his expectations, because Z responded enthusiastically suggesting that I submit it to the Problem Book Series of the MAA. As it turns out, he was on the editorial board, so in just a few days my book was in the official queue.

This was all in April. The review process was scheduled to begin in June, and likely take the entirety of the summer. I was told that if I had a more revised draft before the review that I should also send it in.

It was then I decided I needed to get some feedback. So, I reached out to a few of my close friends asking them if they’d be willing to review drafts of the manuscript. This turned out to not go quite as well as I hoped, since

- Many people agreed eagerly, but then didn't actually follow through with reading it chapter by chapter.
- I was stupid enough to send the entire manuscript rather than excerpts, and thus ran a huge risk of getting the text leaked. Fortunately, I have good friends, but it nagged at me for quite a while. Learned my lesson there.

That’s not to say it was completely useless; I did get some typos fixed. But just not as many as I hoped.

Not very much happened for the rest of the summer while I waited impatiently; it was a long four-month wait for me. Finally, at the end of August 2014, I got the comments from the board; I remember I was practicing the piano at Harvard when I saw the email.

There had been six reviews. While I won’t quote the exact reviews, I’ll briefly summarize them.

- There is too much idiosyncratic terminology.
- This is pretty impressive, but will need careful editing.
- This project is fantastic; the author should be encouraged to continue.
- This is well developed; may need some editing of contents since some topics are very advanced.
- Overall I like this project. That said, it could benefit from some reading and editing. For example, here are some passages in particular that aren’t clear.
- This manuscript reads well, written at a fairly high level. The motivation provided is especially good. It would be nice if there were some solutions or at least longer hints for the (many) problems in the text. Overall the author should be encouraged to continue.

The most surprising thing was how short the comments were. I had expected that, given the review had consumed the entire summer, the reviewers would at least have read the manuscript in detail. But it turned out that mostly all that had been obtained were cursory impressions from the board members: the first four reviews were only a few sentences long! The fifth review was more detailed, but it was essentially a "spot check".

I admit, I was really at a loss for how I should proceed. The comments were not terribly specific, and the only real actionable items were to use less extravagant terminology in response to the first comment (I originally had "configuration", "exercise" vs "problem", etc.) and to add solutions (in response to the last one). When I showed the comments to Z, he commented that while they were positive, they seemed to suggest that publication might not be anytime soon. So I decided to try submitting a second draft to the MAA, but if that didn't work I would fall back on the self-publishing route.

The reviewers had commented about finding a few typos, so I again enlisted the help of some friends of mine to eliminate them. This time I was a lot smarter. First, I only sent the relevant excerpts that I wanted them to read, and watermarked the PDFs with the names of the recipients. Secondly, this time I paid them as well: specifically, a certain number of dollars for each chapter read, increasing with the number of errors found. I also gave a much clearer "I need this done by X" deadline. This worked significantly better than my first round of edits. Note to self: people feel more obliged to do a good job if you pay them!

All in all my friends probably eliminated about 500 errors.

I worked as rapidly as I could, and within four weeks I had the new version. The changes that I made were:

- In response to the first board comment, I eliminated some of the most extravagant terminology (“demonstration”, “configuration”, etc.) in favor of more conventional terms (“example”, “lemma”).
- I picked about 5-10 problems from each chapter and added full solutions for them. This inflated the manuscript by another 70 pages, for a new total of 400 pages.
- Many typos were corrected and revisions made, thanks to my team of readers.
- Some formatting changes; most notably, I got the idea to put theorems and lemmas in boxes using mdframed (most of my recent olympiad handouts have the same boxes).
- Added several references.

I sent this out and sat back.

What followed was another long waiting process for what again ended up being cursory comments. The delay between the first and second review was definitely the most frustrating part; there seemed to be nothing I could do other than sit and wait. I seriously considered dropping the MAA and self-publishing during this time.

I had been told to expect comments back in the spring. Finally, in early April I poked the editorial board again asking whether there had been any progress, and was horrified to find out that the process hadn't even started, due to a miscommunication. Fortunately, the editor was apologetic enough about the error that she asked the board to try to expedite the process a little. The comments then arrived in mid-May, six weeks afterwards.

There were eight reviewers this time. In addition to some suggested stylistic changes (e.g. avoiding contractions), here are some of the main comments.

- The main complaint was that I had been a bit too informal. They were right on all accounts here: in the draft I had sent, the chapters opened with quotes from years of MOP (which confused the board, for obvious reasons) and I had some snarky comments about high school geometry (since I happen to despise the way Euclidean geometry is taught in high school). I found it amusing that no one had brought it up until now, and happily obliged to fix them.
- Some reviewers pointed out that some of the topics were very advanced. In fact, one of the reviewers actually recommended *against* the publication of the book on account that no one would want to buy it. Fortunately, the book ended up getting accepted anyways.
- In that vein, there were some remarks that this book, although it serves its target audience well, is written at a fairly advanced level.

Some of the reviews were cursory like before, but some of them were line-by-line readings of a random chapter, and so this time I had something more tangible to work with.

So I proceeded to make the changes. For the first time, I finally had the brains to start using **git** to track the changes I made to the book. This was an enormously good idea, and I wish I had done so earlier.

Here are some selected changes that were made (the full list of changes is quite long).

- Eliminated a bunch of snarky comments about high school, and the MOP quotes.
- Eliminated about 250 contractions.
- Eliminated about 50 instances of unnecessary future tense.
- Eliminated the real product from the text.
- Added about seven new problems.
- Added to and significantly improved the index of the book, making it far more complete.
- Fixed more references.
- Changed the title to "Euclidean Geometry in Mathematical Olympiads" (it was originally "Geometra Galactica").
- Changed the name of Part II from "Dark Arts" to "Analytic Techniques". (Hehe.)
- Added people to the acknowledgments.
- Changes in formatting: most notably I changed the font size from 11pt to 10pt to decrease the page count, since my book was already twice as long as many of the other books in the series. This dropped me from about 400 pages back to about 350 pages.
- Fixed about 200 more typos. Thanks to those of you who found them!

I sent out the third draft just as June started, about three weeks after I had received the comments. (I like to work fast.)

There were another two rounds afterwards. In late June, I got a small set of about three pages of additional typos and clarifying suggestions; I sent back a revised draft one day later.

Six days later, I got back a list of four remaining edits to make. I sent an updated fourth draft 17 minutes after receiving those comments. Unfortunately, it then took another five weeks for the four changes I made to be acknowledged. Finally, in early August, the changes were approved and the editorial board forwarded an official recommendation to MAA to publish the book.

In summary, the timeline of the review process was

- First draft submitted: April 6, 2014
- Feedback received: August 28, 2014
- Second draft submitted: November 5, 2014
- Feedback received: May 19, 2015
- Third draft submitted: June 23, 2015
- Feedback received: June 29, 2015
- Fourth draft submitted: June 29, 2015
- Official recommendation to MAA made: August 2015

I think with traditional publishers there is a lot of waiting; my understanding is that the editorial board largely consists of volunteers, so this seems inevitable.

On September 3, 2015, I got the long-awaited message:

It is a pleasure to inform you that the MAA Council on Books has approved the recommendation of the MAA Problem Books editorial board to publish your manuscript,

Euclidean Geometry in Mathematical Olympiads.

I got a fairly standard royalty contract from the publisher, which I signed without much thought.

I had a total of zero math editors and one copy editor provided. This shows in the enormous list of errors (and this is *after* all the mistakes my friends helped me find).

Fortunately, my copy editor was quite good (and I have a lot of sympathy for this poor soul, who had to read every word of the entire manuscript). My Git history indicates that approximately 1000 corrections were made; on average, that is about 2 per page, which sounds about right. I got the corrections on hard copy in the mail: the entire printout of my book, well marked with red ink.

Many of the changes fell into a few general categories:

- Capitalization. I was unwittingly inconsistent with "Law of Cosines" versus "Law of cosines" versus "law of cosines", etc., and my copy editor noticed every one of these. Similarly, the cases of section and chapter titles were often not consistent; should I use "Angle Chasing" or "Angle chasing"? The main point is to pick one convention and stick with it.
- My copy editor pointed out every time I used "Problems for this section" and had only one problem.
- Several unnecessary "quotes" and *italics* were deleted.
- Oxford commas. My god, so many Oxford commas. You just don't notice when the IMO Shortlist says "the circle through the points E, G, and H" but the European Girls' Olympiad says "show that KH, EM and BC are concurrent". I swear there were at least 100 of these in the book. I tried to write a regular expression to find such mistakes, but there were lots of edge cases that came up, and I still had to do many of these manually.
- Inconsistency between em dashes and en dashes. This one worked better with regular expressions.

But of course there were plenty of other mistakes like missing spaces, missing degree spaces, punctuation errors, etc.

The cover design was handled for me by the publisher: they gave me a choice of five or so designs and I picked one I liked.

(If you are self-publishing, this is actually one of the hardest parts of the publishing logistics; you need to design the cover on your own.)

It turns out that, after all the hard work I had spent formatting the draft, the MAA has a standard template, and the production team re-typeset the entire book in it. Fortunately, the publisher's format is pretty similar to mine, and so there were no huge cosmetic changes.

At this point I got the proofs, which are essentially the penultimate draft of the book as it will be sent to the printers.

There was a bit more back-and-forth with the publisher towards the end. For example, they asked me if I would like my affiliation to be listed as MIT or to not have an affiliation; I chose the latter. I also sent them a bio and photograph, and an author questionnaire asking me for some standard details.

Marketing was handled by the publisher based on these details.

Without warning, I got an email on March 25 announcing that the PDF versions of my book were now available on the MAA website. The hard copies followed a few months afterwards. That marked the end of my publication process.

If I were to do this sort of thing again, I guess the main decision would be whether to self-publish or go through a formal publisher. The main disadvantages of the latter seem to be the time delay, and possibly also that the royalties are lower than in self-publishing. On the flip side, the advantages of a formal publisher were:

- Having a real copy editor read through the entire manuscript.
- Having a committee of outsiders knock some common sense into me (e.g. not calling the book “Geometra Galactica”).
- Having cover art and marketing completely done for me.
- It’s more prestigious; having a real published book is (for whatever reason) a very nice CV item.

Overall I think publishing formally was the right thing to do for this book, but your mileage may vary.

Other advice I would give to my past self, mentioned above already: keep paying O(1) for O(n), use git to keep track of all versions, and be conscious about which grammatical conventions to use (in particular, stay consistent).

Here's a better concluding question: what surprised me about the process, i.e., what was different from what I expected? Here's a partial list of answers:

- It took even longer than I was expecting. Large committees are inherently slow; this is no slight to the MAA, it is just how these sorts of things work.
- I was surprised that at no point did anyone really check the manuscript for mathematical accuracy. In hindsight this should have been obvious; I expect reading the entire book properly takes at least 1-2 years.
- I was astounded by how many errors there were in the text, whether mathematical, grammatical, or otherwise. During the entire process something like 2000 errors were corrected (admittedly several were minor, like Oxford commas). Yet even as I published the book, I *knew* that there had to be errors left. But it was still irritating to hear about them post-publication.

All in all, the entire process started in September 2013 and ended in March 2016, which is 30 months. The time was roughly 30% writing, 50% review, and 20% production.


`/etc/hosts`, oops.)
For the whole process, useful commands to test with are:

- `nslookup hmmt.co` will tell you the IP used and the server from which it came.
- `dig www.hmmt.co` gives much more detailed information to this effect. (From `bind-tools`.)
- `dig @127.0.0.1 www.hmmt.co` lets you query a specific DNS server (in this case 127.0.0.1).
- `drill @127.0.0.1 www.hmmt.co` behaves similarly.

First, `pacman -S pdnsd dnscrypt-proxy` (with `sudo` ostensibly, but I'll leave that out here and henceforth).

Run `systemctl edit dnscrypt-proxy.socket` and fill in `override.conf` with

```
[Socket]
ListenStream=
ListenDatagram=
ListenStream=127.0.0.1:40
ListenDatagram=127.0.0.1:40
```

Optionally, one can also specify which DNS server `dnscrypt-proxy` should use with `systemctl edit dnscrypt-proxy.service`. For example, for cs-uswest I write

```
[Service]
ExecStart=
ExecStart=/usr/bin/dnscrypt-proxy \
-R cs-uswest
```

The empty `ExecStart=` is necessary, since otherwise `systemctl` will complain about multiple ExecStart commands.

This thus configures `dnscrypt-proxy` to listen on 127.0.0.1, port 40.

Now we configure `pdnsd` to listen on port 53 (the default) for cache, and relay cache misses to `dnscrypt-proxy`. This is accomplished by using the following for `/etc/pdnsd.conf`:

```
global {
perm_cache = 1024;
cache_dir = "/var/cache/pdnsd";
run_as = "pdnsd";
server_ip = 127.0.0.1;
status_ctl = on;
query_method = udp_tcp;
min_ttl = 15m; # Retain cached entries at least 15 minutes.
max_ttl = 1w; # One week.
timeout = 10; # Global timeout option (10 seconds).
neg_domain_pol = on;
udpbufsize = 1024; # Upper limit on the size of UDP messages.
}
server {
label = "dnscrypt-proxy";
ip = 127.0.0.1;
port = 40;
timeout = 4;
proxy_only = on;
}
source {
owner = localhost;
file = "/etc/hosts";
}
```

Now it remains to change the DNS server from whatever default is used into 127.0.0.1. For NetworkManager users, it is necessary to edit `/etc/NetworkManager/NetworkManager.conf` to prevent it from overriding `/etc/resolv.conf`:

```
[main]
...
dns=none
```

This will cause `resolv.conf` to be written as an empty file by NetworkManager; in this case, the default 127.0.0.1 is used as the nameserver, which is what we want.

Needless to say, one finishes with

```
systemctl enable dnscrypt-proxy
systemctl start dnscrypt-proxy
systemctl enable pdnsd
systemctl start pdnsd
```
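To check that the whole chain works, one can query the local resolver twice: the second time, `pdnsd` should answer from its cache, so the reported query time should drop to (nearly) zero. The transcript below is hypothetical; exact timings will vary.

```
$ drill @127.0.0.1 hmmt.co | grep "Query time"
;; Query time: 152 msec
$ drill @127.0.0.1 hmmt.co | grep "Query time"
;; Query time: 0 msec
```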


This post is an overview of the proof of:

**Theorem 1** **(Green-Tao)**

The prime numbers contain arbitrarily long arithmetic progressions.

Here, Szemerédi's theorem isn't strong enough, because the primes have density approaching zero. Instead, one can try to prove the following "relative" result.

**Theorem** **(Relative Szemerédi)**

Let $S$ be a sparse "pseudorandom" set of integers. Then subsets of $S$ with positive density (relative to $S$) have arbitrarily long arithmetic progressions.

In order to do this, we have to accomplish the following.

- Make precise the notion of “pseudorandom”.
- Prove the Relative Szemerédi theorem, and then
- Exhibit a “pseudorandom” set which subsumes the prime numbers.

This post will use the graph-theoretic approach to Szemerédi as in the exposition of David Conlon, Jacob Fox, and Yufei Zhao. In order to motivate the notion of pseudorandomness, we return to the graph-theoretic approach to Roth's theorem, i.e. the $k = 3$ case of Szemerédi's theorem.

Roth’s theorem can be phrased in two ways. The first is the “set-theoretic” formulation:

**Theorem 2** **(Roth, set version)**

If $A \subseteq \mathbb{Z}_N$ is 3-AP-free, then $|A| = o(N)$.

The second is a “weighted” version

**Theorem 3** **(Roth, weighted version)**

Fix $\delta > 0$. Let $f \colon \mathbb{Z}_N \to [0, 1]$ with $\mathbb{E}[f] \ge \delta$. Then

$$\mathbb{E}_{x, d \in \mathbb{Z}_N}\left[ f(x) f(x+d) f(x+2d) \right] \ge c(\delta) - o(1).$$

We sketch the idea of a graph-theoretic proof of the first theorem. We construct a tripartite graph $G_A$ on vertices $X \sqcup Y \sqcup Z$, where $X = Y = Z = \mathbb{Z}_N$. Then one creates the edges

- $(x, y) \in X \times Y$ if $2x + y \in A$,
- $(x, z) \in X \times Z$ if $x - z \in A$, and
- $(y, z) \in Y \times Z$ if $-(y + 2z) \in A$.

This construction is selected so that arithmetic progressions in $A$ correspond to triangles in the graph $G_A$. As a result, if $A$ has no 3-AP's (except trivial ones, where $d = 0$), the graph $G_A$ has exactly one triangle for every edge. Then, we can use the theorem of Ruzsa-Szemerédi, which states that such a graph has $o(N^2)$ edges; since $G_A$ has exactly $3N|A|$ edges, this gives $|A| = o(N)$.
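To spell out the correspondence (a quick verification, added for completeness): a triangle $(x, y, z)$ yields the three elements $2x + y$, $x - z$, $-(y + 2z)$ of $A$, and

$$(2x + y) + \left( -(y + 2z) \right) = 2(x - z),$$

so they form a 3-AP with common difference $-(x + y + z)$; conversely, every 3-AP in $A$ arises this way.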

Now for the generalized version, we start with the second version of Roth's theorem. Instead of a set $A$, we consider a function

$$\nu \colon \mathbb{Z}_N \to [0, \infty)$$

which we call a majorizing measure. Since we are now dealing with sets of low density, we normalize so that

$$\mathbb{E}[\nu] = 1 + o(1).$$

Our goal is now to show a result of the form:

**Theorem** **(Relative Roth, informally, weighted version)**

If $0 \le f \le \nu$, $\mathbb{E}[f] \ge \delta$, and $\nu$ satisfies a "pseudorandom" condition, then $\mathbb{E}_{x, d}\left[ f(x) f(x+d) f(x+2d) \right] \ge c(\delta) - o(1)$.

The prototypical example of course is that if $A \subseteq S$ for a sparse pseudorandom set $S$, then we let $f = \mathbf{1}_A$ and take $\nu$ to be a suitably normalized multiple of $\mathbf{1}_S$.

So, what should the pseudorandomness condition be? Initially, consider the tripartite graph $G_S$ defined earlier, and let $p = |S| / N$; since $S$ is sparse we expect $p$ to be small. The main idea that turns out to be correct is: the number of embeddings of $K_{2,2,2}$ in $G_S$ should be "as expected", namely $(1 + o(1)) p^{12} N^6$. Here $K_{2,2,2}$ is actually the $2$-blow-up of a triangle. This condition thus gives us control over the distribution of triangles in the sparse graph $G_S$: knowing that we have approximately the correct count for $K_{2,2,2}$ is enough to control the distribution of triangles.

For technical reasons, in fact we want this to be true not only for $K_{2,2,2}$ but for all of its subgraphs $H$.

Now, let's move on to the weighted version. Let's consider a weighted tripartite graph $G$, which we can think of as a collection of three functions

$$\nu_{XY} \colon X \times Y \to [0, \infty), \qquad \nu_{XZ} \colon X \times Z \to [0, \infty), \qquad \nu_{YZ} \colon Y \times Z \to [0, \infty).$$

We think of $G$ as normalized so that $\mathbb{E}[\nu_{XY}] = \mathbb{E}[\nu_{XZ}] = \mathbb{E}[\nu_{YZ}] = 1 + o(1)$. Then we can define

**Definition 4**

A weighted tripartite graph $G$ satisfies the **3-linear forms condition** if

$$\mathbb{E}_{\substack{x_0, x_1 \in X \\ y_0, y_1 \in Y \\ z_0, z_1 \in Z}} \left[ \prod_{i, j \in \{0,1\}} \nu_{XY}(x_i, y_j) \prod_{i, k \in \{0,1\}} \nu_{XZ}(x_i, z_k) \prod_{j, k \in \{0,1\}} \nu_{YZ}(y_j, z_k) \right] = 1 + o(1),$$

and similarly if any subset of the twelve factors is deleted.

The pseudorandomness condition for $\nu$ is then defined according to the graph we constructed above:

**Definition 5**

A function $\nu \colon \mathbb{Z}_N \to [0, \infty)$ satisfies the **3-linear forms condition** if $\mathbb{E}[\nu] = 1 + o(1)$, and the weighted tripartite graph defined by

$$\nu_{XY}(x, y) = \nu(2x + y), \qquad \nu_{XZ}(x, z) = \nu(x - z), \qquad \nu_{YZ}(y, z) = \nu(-(y + 2z))$$

satisfies the 3-linear forms condition.

Finally, the relative version of Roth’s theorem which we seek is:

**Theorem 6** **(Relative Roth)**

Suppose $\nu \colon \mathbb{Z}_N \to [0, \infty)$ satisfies the 3-linear forms condition. Then for any $f \colon \mathbb{Z}_N \to [0, \infty)$ bounded above by $\nu$ and satisfying $\mathbb{E}[f] \ge \delta > 0$, we have

$$\mathbb{E}_{x, d}\left[ f(x) f(x+d) f(x+2d) \right] \ge c(\delta) - o(1).$$

We of course have:

**Theorem 7** **(Szemerédi)**

Suppose $k \ge 3$, and let $f \colon \mathbb{Z}_N \to [0, 1]$ with $\mathbb{E}[f] \ge \delta$. Then

$$\mathbb{E}_{x, d}\left[ f(x) f(x+d) \cdots f(x + (k-1)d) \right] \ge c(k, \delta) - o(1).$$

For $k > 3$, rather than considering weighted tripartite graphs, we consider a $(k-1)$-uniform $k$-partite hypergraph. For example, given $k = 4$ with $\nu \colon \mathbb{Z}_N \to [0, \infty)$ and vertex sets $W = X = Y = Z = \mathbb{Z}_N$, we use the construction

$$\begin{aligned} \nu_{WXY}(w, x, y) &= \nu(3w + 2x + y) \\ \nu_{WXZ}(w, x, z) &= \nu(2w + x - z) \\ \nu_{WYZ}(w, y, z) &= \nu(w - y - 2z) \\ \nu_{XYZ}(x, y, z) &= \nu(-(x + 2y + 3z)). \end{aligned}$$

Thus 4-AP's correspond to copies of the simplex $K_4^{(3)}$ (i.e. a tetrahedron). We then consider the two-blow-up of the simplex, and require the same uniformity on subgraphs of it.

Here is the compiled version:

**Definition 8**

A $(k-1)$-uniform $k$-partite weighted hypergraph on $X_1 \cup \dots \cup X_k$, viewed as a tuple of functions $\nu_j \colon \prod_{i \ne j} X_i \to [0, \infty)$, satisfies the **$k$-linear forms condition** if

$$\mathbb{E}_{\substack{x_i^{(0)}, x_i^{(1)} \in X_i \\ 1 \le i \le k}} \left[ \prod_{j=1}^{k} \prod_{\omega \in \{0,1\}^{[k] \setminus \{j\}}} \nu_j\left( x^{(\omega)} \right)^{n_{j, \omega}} \right] = 1 + o(1)$$

for all exponents $n_{j, \omega} \in \{0, 1\}$. (Here $x^{(\omega)}$ denotes the tuple whose $i$-th coordinate is $x_i^{(\omega_i)}$, for $i \ne j$.)

**Definition 9**

A function $\nu \colon \mathbb{Z}_N \to [0, \infty)$ satisfies the **$k$-linear forms condition** if $\mathbb{E}[\nu] = 1 + o(1)$, and

$$\mathbb{E}_{\substack{x_i^{(0)}, x_i^{(1)} \in \mathbb{Z}_N \\ 1 \le i \le k}} \left[ \prod_{j=1}^{k} \prod_{\omega \in \{0,1\}^{[k] \setminus \{j\}}} \nu\left( \sum_{i \ne j} (j - i) \, x_i^{(\omega_i)} \right)^{n_{j, \omega}} \right] = 1 + o(1)$$

for all exponents $n_{j, \omega} \in \{0, 1\}$. This is just the previous condition with the natural weighted hypergraph induced by $\nu$.
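As a sanity check (my own verification): for $k = 3$ the linear forms $\sum_{i \ne j} (j - i) x_i$ recover exactly the tripartite construction from before, namely

$$j = 3: \quad 2x_1 + x_2, \qquad j = 2: \quad x_1 - x_3, \qquad j = 1: \quad -x_2 - 2x_3,$$

which match $\nu(2x + y)$, $\nu(x - z)$, $\nu(-(y + 2z))$ after renaming the variables.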

The natural generalization of relative Szemerédi is then:

**Theorem 10** **(Relative Szemerédi)**

Suppose $k \ge 3$, and $\nu \colon \mathbb{Z}_N \to [0, \infty)$ satisfies the $k$-linear forms condition. Let $f \colon \mathbb{Z}_N \to [0, \infty)$ with $0 \le f \le \nu$ and $\mathbb{E}[f] \ge \delta$. Then

$$\mathbb{E}_{x, d}\left[ f(x) f(x+d) \cdots f(x + (k-1)d) \right] \ge c(k, \delta) - o(1).$$

The proof of Relative Szemerédi uses two key facts. First, one replaces $f$ with a bounded function $g$ which is close to it:

**Theorem 11** **(Dense model)**

Let $\varepsilon > 0$. There exists $\varepsilon' > 0$ such that if:

- $\nu \colon \mathbb{Z}_N \to [0, \infty)$ satisfies $\lVert \nu - 1 \rVert_\square \le \varepsilon'$, and
- $0 \le f \le \nu$,

then there exists a function $g \colon \mathbb{Z}_N \to [0, 1]$ such that $\lVert f - g \rVert_\square \le \varepsilon$.

Here we have a new norm, called the **cut norm**, defined by

$$\lVert f \rVert_\square = \sup_{A_1, \dots, A_k \subseteq \mathbb{Z}_N^{k-1}} \left\lvert \mathbb{E}_{x_1, \dots, x_k \in \mathbb{Z}_N} \left[ f(x_1 + \dots + x_k) \prod_{i=1}^{k} \mathbf{1}_{A_i}\left( x_1, \dots, \widehat{x_i}, \dots, x_k \right) \right] \right\rvert,$$

where $\widehat{x_i}$ means the coordinate $x_i$ is omitted. This is actually an extension of the cut norm defined on a $k$-uniform $k$-partite hypergraph (*not* $(k-1)$-uniform like before!): if $g \colon X_1 \times \dots \times X_k \to \mathbb{R}$ is such a weighted graph, we let

$$\lVert g \rVert_\square = \sup_{A_1, \dots, A_k} \left\lvert \mathbb{E} \left[ g(x_1, \dots, x_k) \prod_{i=1}^{k} \mathbf{1}_{A_i}\left( x_1, \dots, \widehat{x_i}, \dots, x_k \right) \right] \right\rvert.$$

Taking $g(x_1, \dots, x_k) = f(x_1 + \dots + x_k)$, $X_i = \mathbb{Z}_N$ gives the analogy.

For the second theorem, we compare weighted hypergraphs piece by piece, using the cut norm above on each component $g_j$.

**Theorem 12** **(Relative simplex counting lemma)**

Let $\nu$, $g$, $\widetilde g$ be $(k-1)$-uniform $k$-partite weighted hypergraphs on $X_1 \cup \dots \cup X_k$. Assume that $\nu$ satisfies the $k$-linear forms condition, and for all $j$, $0 \le g_j \le \nu_j$ and $0 \le \widetilde g_j \le 1$. If $\lVert g_j - \widetilde g_j \rVert_\square = o(1)$ for each $j$, then

$$\mathbb{E}\left[ \prod_{j=1}^{k} g_j \right] = \mathbb{E}\left[ \prod_{j=1}^{k} \widetilde g_j \right] + o(1).$$

One then combines these two results to prove Relative Szemerédi, as follows. Start with the $f$ and $\nu$ in the theorem. The $k$-linear forms condition turns out to imply $\lVert \nu - 1 \rVert_\square = o(1)$. So we can find a nearby $g \colon \mathbb{Z}_N \to [0, 1]$ by the dense model theorem. Then, we induce weighted hypergraphs from $\nu$, $f$, $g$ respectively. The counting lemma then reduces the bounding of $\mathbb{E}_{x,d}\left[ f(x) \cdots f(x + (k-1)d) \right]$ to the bounding of $\mathbb{E}_{x,d}\left[ g(x) \cdots g(x + (k-1)d) \right]$, which is at least $c(k, \delta) - o(1)$ by the usual Szemerédi theorem.

We now sketch how to obtain Green-Tao from Relative Szemerédi. As expected, we need to use the von Mangoldt function $\Lambda$, defined by $\Lambda(n) = \log p$ if $n$ is a power of the prime $p$, and $0$ otherwise.

Unfortunately, $\Lambda$ is biased modulo small primes (e.g. "all decent primes are odd"). To get around this, we let $w$ tend to infinity slowly with $N$, and define

$$W = \prod_{\substack{p \le w \\ p \text{ prime}}} p.$$

In the **$W$-trick** we consider only primes $\equiv 1 \pmod{W}$. The modified von Mangoldt function is then defined by

$$\widetilde\Lambda(n) = \begin{cases} \dfrac{\varphi(W)}{W} \log(Wn + 1) & Wn + 1 \text{ prime} \\ 0 & \text{otherwise}. \end{cases}$$

In accordance with Dirichlet, we have $\mathbb{E}[\widetilde\Lambda] = 1 + o(1)$.
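As a tiny illustration of the $W$-trick (my own example): if $w = 5$ then $W = 2 \cdot 3 \cdot 5 = 30$, and $\widetilde\Lambda$ is supported on those $n$ for which $30n + 1$ is prime, e.g. $n = 1, 2, 5, 6, 7$ (the primes $31, 61, 151, 181, 211$). The factor $\frac{\varphi(W)}{W}$ is exactly the correction needed to make $\mathbb{E}[\widetilde\Lambda] = 1 + o(1)$, since the primes distribute evenly among the $\varphi(W)$ residue classes coprime to $W$.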

So, we now need to show that:

**Proposition 13**

Fix $k \ge 3$. We can find $\delta = \delta(k) > 0$ such that for $N$ a sufficiently large prime, we can find $\nu \colon \mathbb{Z}_N \to [0, \infty)$ which satisfies the $k$-linear forms condition as well as

$$\nu(n) \ge \delta \widetilde\Lambda(n)$$

for $N/2 \le n < N$.

In that case, we can let

$$f(n) = \delta \widetilde\Lambda(n) \, \mathbf{1}_{N/2 \le n < N}.$$

Then $\mathbb{E}[f] \ge \frac{\delta}{2} - o(1)$. The presence of the indicator allows us to avoid "wrap-around issues" that arise from using $\mathbb{Z}_N$ instead of $\{1, \dots, N\}$. Relative Szemerédi then yields the result.

For completeness, we state the construction of $\nu$. Let $\chi \colon \mathbb{R} \to [0, 1]$ be a smooth function supported on $[-1, 1]$ with $\chi(0) = 1$, and define the normalizing constant

$$c_\chi = \int_0^1 \lvert \chi'(x) \rvert^2 \, dx.$$

Inspired by the identity $\Lambda(n) = \sum_{d \mid n} \mu(d) \log(n/d)$, we define a truncated $\Lambda_{\chi, R}$ by

$$\Lambda_{\chi, R}(n) = \log R \sum_{d \mid n} \mu(d) \chi\left( \frac{\log d}{\log R} \right).$$

Let $R = N^{c_k}$ for a suitably small constant $c_k > 0$. Now, we define $\nu \colon \mathbb{Z}_N \to [0, \infty)$ by

$$\nu(n) = \begin{cases} \dfrac{\varphi(W)}{W} \cdot \dfrac{\Lambda_{\chi, R}(Wn + 1)^2}{c_\chi \log R} & N/2 \le n < N \\ 1 & \text{otherwise}. \end{cases}$$

This turns out to work, provided $w$ grows sufficiently slowly in $N$.
