DRMacIver's Notebook
A brief introduction to mathematics with one example many times
A brief introduction to mathematics with one example many times
There’s a story repeated so often that it’s a cliche of Gauss’s mathematics teacher setting the class the task of adding up the numbers 1 to 100, supposedly to waste their time and give them a break, only for Gauss to almost immediately give the correct answer.
I think it’s a usefully illustrative example to look at for learning some of the basics of mathematics. Partly this is because it illustrates one of the key uses and methods of mathematics: Getting an answer that you could in theory work out laboriously much faster through logical deducation.
Mostly, though, I’m going to use it to showcase a bunch of different ways of thinking through a mathematical problem, and use it to teach some common mathematical notation.
So, let’s figure out how to do this.
The first thing we need to figure out is to how to write it down. Here are two obvious ways:
- “The sum of all numbers from 1 to 100.”
- 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20 + 21 + 22 + 23 + 24 + 25 + 26 + 27 + 28 + 29 + 30 + 31 + 32 + 33 + 34 + 35 + 36 + 37 + 38 + 39 + 40 + 41 + 42 + 43 + 44 + 45 + 46 + 47 + 48 + 49 + 50 + 51 + 52 + 53 + 54 + 55 + 56 + 57 + 58 + 59 + 60 + 61 + 62 + 63 + 64 + 65 + 66 + 67 + 68 + 69 + 70 + 71 + 72 + 73 + 74 + 75 + 76 + 77 + 78 + 79 + 80 + 81 + 82 + 83 + 84 + 85 + 86 + 87 + 88 + 89 + 90 + 91 + 92 + 93 + 94 + 95 + 96 + 97 + 98 + 99 + 100
Neither of these are actually very good.
One of the first lessons of doing mathematics: A lot of mathematics is about developing good notation for representing the problem. The problem with the first, plain English, one is that it’s hard to manipulate. We want to be able to do algebra with it, and it’s hard to do algebra with plain English language. Modern mathematics owes a lot to Robert Recorde for giving us the equals sign (and popularising the plus and minus signs), because trying to manipulate sums with plain English language is an absolute pain.
The second one is bad for two main reasons:
- It’s very clunky and unwieldy. If you’re going to be repeating the expression a lot, you don’t want to write that out over and over again. Also, are you sure you’re reading it correctly? Would you have noticed if I’d dropped 31 from the middle, or repeated 23 twice?
- It doesn’t generalise well. Suppose we wanted instead to add up all the numbers from 1 to 50, or 1 to 1000, we’d have to write an entirely new expression for it. There’s no way to do algebra where the end point is variable with this expression.
So let me give you two other ways to write it:
- \(1 + \ldots + 100\)
- \(\sum\limits_{i = 1}^{100} i\)
In the first, the \(\ldots\) just means “and so on”. You’re expected to figure out what it means from context, but you’re also expected not to write it unless it’s obvious from context what you mean. If we’d wanted to make it a bit more obvious we could have written it \(1 + 2 + \ldots + 99 + 100\).
The second is a much more precise notation and looks completely inscrutable when you first encounter it, but has a fairly straightforward meaning: The \(\sum\) represents a sum of an expression, where the expression references a variable \(i\) that varies over some range. The \(i = 1\) at the bottom indicates that the sum is over a range of values of \(i\) starting with \(1\), and the \(100\) at the top indicates that \(100\) is the last element. The \(i\) to the right means that the expression we are summing is \(i\).
So, for example, \(\sum\limits_{i = 1}^3 i = 1 + 2 + 3\), or \(\sum\limits_{i = 2}^3 i^2 = 2^2 + 3^2\).
The \(\sum\) notation is very convenient because it allows us to compactly represent complicated expressions. It’s also useful for making generalisations. e.g. if we wanted to work out the answer to this for any number, we could write \(\sum\limits_{i=1}^n i\) to indicate that the end point is an arbitrary integer \(n\).There are some implicit notational conventions here by the way. \(i\) is usually an index variable. \(i\) and \(n\) are both expected to be integers by that choice. This isn’t required, but it’s conventional. However, you would only ever use integers as sum indexed or bounds like this, and that’s not conventional, it doesn’t really make sense to use non-integers here.
To start with though, we’ll keep working with the \(\ldots\) notation.
Now, let’s solve the problem.
We can write \(1 + \ldots + 100\) as \((1 + \ldots 50) + (51 + \ldots 100)\), because we can break up the sum however we like.Technically speaking we are using what’s called the associativity rule of addition here.
We can also reverse a sum if we want, so we could write the second one equally well as \(100 + \ldots + 51\).
Now, there are exactly 50 elements in each sum, so we can pair them up and rearrange the sum as follows. \((1 + 100) + (2 + 99) + \ldots + (49 + 52) + (50 + 51)\). Now, notice that each of these terms adds up to exactly 101. You can check that \(1 + 100 = 101, 2 + 99 = 101\) etc but you can also this because each time you move right, you add one to the first term and subtract one from the second, so the value stays constant. This means that this sum is exactly 50 copies of 101, so the sum is \(50 \times 101 = 5050\).
I think a pretty reasonable question for you to ask at this point is how on earth you’re expected to come up with a trick like that. If you’ve seen this problem before it’s old hat, but if you had to come up with it on the spot, how would you do that?
The honest answer is I don’t know. A lot of things like this you just figure out by fiddling with the problem and getting an instinct for what works. This sort of approach is what I think of as a “trick”, and sometimes you spot the trick which lets you massively simplify the problem, and sometimes you don’t. As you do more mathematics, you develop both a familiar bag of tricks, and you learn to apply them in new circumstances. The trick here is something like “try grouping different parts of the sum together to see if they add up to something convenient”.
For example, here’s another way you could use a similar trick. Suppose I asked you to calculate \(1 - 2 + 3 - 4 + \ldots + 99 - 100\) (you could also write this as \(\sum\limits_{i = 1}^{100} (-1)^{i - 1} i\) ) - that is, it’s the same sum as before, but this time the signs alternate between positive and negative.
Well, here you could group it as follows: \((1 - 2) + (3 - 4) + \ldots + (99 - 100)\). Each of those expressions in brackets is \(-1\), because each time the number on the right is exactly one more than the number on the left. As before we have exactly 50 elements, so the result is \((-1) \times 50 = -50\).
This is basically the same trick, but with a different grouping.
Mathematics doesn’t necessarily rely on knowing and spotting things like this, but it’s a very common part of mathematical practice. Sometimes they’re essential and you can’t figure out any other way to solve the problem without some sort of trick like this, but often they’re just a way of bypassing some tedious calculation or another.
Now, one problem with the way we solved this is that it doesn’t generalise all that well. For example, suppose I asked you to add up all the numbers between \(1\) and \(101\), how would you do it?Well you’d add 101 to 5050 and get 5151 of course, because you just worked out the sum from 1 to 100. But suppose you hadn’t done that.
The previous trick doesn’t work because it relied on there being an even number of values in the sum. If we tried to do the same thing as before we’d have to break the sum up as something like \((1 + \ldots + 50) + 51 + (52 + \ldots 101)\), because there’s a number left over in the middle. We can then do the same pairing as before and get \((1 + 101) + (2 + 100) + \ldots + (50 + 52)\) for the paired sums, and the result is \(51 + 102 \times 50 = 51 + 5100 = 5151\). Which is, conveniently, \(5050 + 101\), so we’ve not messed up the calculation somewhere.
This works but is unsatisfying. In particular, let’s try to work it out for a general number \(n\). In order to do this as above, we’ll have to consider two special cases: First, let’s assume that \(n\) is an even number. Let’s say \(n = 2m\).
Then we can write \(1 + \ldots n = (1 + \ldots m) + ((m + 1) + \ldots n)\) as before, and use the same pairing trick to get \((1 + n) + (2 + (n - 1)) + \ldots (m + (m + 1))\). As before, each of these individual bracketed expressions are always equal to \(n + 1\), and we have \(m\) copies of the expression, so the answer is \(m (n + 1) = \frac{n (n + 1)}{2}\) (because \(m = \frac{n}{2}\)).
Now, suppose \(n\) is odd. We could repeat the pairing trick, but we can also use \(1 + \ldots + n = (1 + \ldots + (n - 1)) + n\), and if \(n\) is odd then \(n - 1\) is even, so we can use the previous expression for the sum when \(n\) is even, plugging in \(n - 1\) in place of the \(n\). This gives us \(\frac{(n - 1)(n + 1 - 1)}{2} = \frac{(n - 1)n}{2}\) for the sum up to \(n - 1\), and thus \(\frac{(n - 1)n}{2} + n\) for the whole sum.
Now, let’s make that expression a bit nicer. \(\frac{(n - 1)n}{2} + n = \frac{n^2 - n + 2n}{2} = \frac{n^2 + n}{2} = \frac{n(n + 1)}{2}\).
Something worth noting here is that although we had to handle the odd and even cases differently, we got the same expression out in both cases. This should make us a little suspicious that we’re missing something and that there’s a nicer proof here.
Another thing that should make us suspicious but that is actually fine is that the answer to this question should be an integer, as we’ve only added up integers, but we’ve got a fraction in the answer. It’s worth doing a sanity check that this is fine, but it is: Exactly one of \(n\) and \(n + 1\) must be even (because if you add one to an even number you get an odd number, and vice versa), so the expression on top is always even, and dividing it by two gets you an even number.It’s very easy to make mistakes in mathematics, so it’s good to have a habit of doing little cross checks like this where you look at the answer you came up with and ask if it could possibly be correct.
Anyway, there’s another trick, and that fraction is a clue. We made this work by pairing up numbers together, and there’s a 2 in the end, so perhaps we would find it easier to work out twice the answer than we do to work out the answer directly.
We can write
\[\begin{align*} 2 \times (1 + \ldots + n) & = (1 + \ldots + n) + (1 + \ldots + n) \\ & = (1 + \ldots + n) + (n + \ldots + 1) \\ & = (n + 1) + \ldots + (1 + n) \\ \end{align*}\]
This gives us \(n\) copies of \(n + 1\), so twice the answer is \(n (n + 1))\) and the answer is \(\frac{n(n + 1)}{2}\) as desired. This is a much neater proof, but I think a slightly harder trick to spot.
I think this is a good time to write some Greek.
We can write \(\sum\limits_{i = 1}^n i = \sum\limits_{i = 1}^n (n + 1 - i)\). This is basically “running the sum backwards” like we did above. The first element is \(n + 1 - 1 = n\), then \(n + 1 - 2 = n - 1\), etc. all the way to the end where it’s \(n + 1 - n = 1\).
This means that \(2 \sum\limits_{i = 1}^n i = \sum\limits_{i = 1}^n i + \sum\limits_{i = 1}^n i = \sum\limits_{i = 1}^n i + \sum\limits_{i = 1}^n (n + 1 - i)\).
Now, because these sums are over the same range, we can group the expressions inside them together, so \(2 \sum\limits_{i = 1}^n i = \sum\limits_{i = 1}^n i + (n + 1 - i) = \sum\limits_{i = 1}^n (n + 1) = n (n + 1)\).
If you’re unused to dense mathematical notation this may not seem obviously clearer or better than the above, but it has the advantage of being compact and easy to manipulate, and it lets you result a lot of complex reasoning to fairly simple mechanical manipulation.
So now we’ve seen how to work out this answer. But why is it true?
This is another key feature of doing mathematics:I wrote about this in algebra and insight but that example only works if you’re already a moderately advanced mathematician. Often you should be unsatisfied with mere answers, and looking at the problem from another angle will help you really understand it.
Is this particular example worth really understanding? Well… probably not. I think if you find yourself doing a lot of this sort of sumsApplications include “being a mathematician in the early to mid 20th century”, “taking a maths exam”, “for the sheer joy of it”, and “if you do a really large amount of mathematics every now and then you will run into a case where you really do have a use for this and you will be so delighted that you’ll ignore how rusty you’ve got at it”. Also probably combinatorics or something like that., it’s worth playing around with a lot of simple examples like this in order to get a really good intuition for how everything fits together.
But right now it’s worth really understanding this because it’s a nice illustrative example of the general mathematical phenomenon.
Imagine, or drawI really should include a diagram here but I can’t be bothered, sorry. If you’re wondering how to draw \(n\) boxes, the answer is you don’t you draw some fixed small number and assume that it generalises. This is why I don’t like visual proofs that much. the numbers as rows of blocks stacked on top of each other. On the bottom row you’ve got \(n\) numbers, then \(n - 1\), and so on, until you’ve got \(1\) block on the final row. The result is a sort of ragged triangle.
Now, take a copy of that, and flip it upside down, and stack it on top of the other one. The jagged edges will line up, and you’ll get a rectangle. How big is the rectangle? Well, it’s got \(n\) items on the base, because that was your bottom row, and it’s \(n + 1\) tall, because your triangle is \(n\) tall and you had to stack an extra row on top. So there are \(n(n + 1)\) blocks in the rectangle, of which exactly half must have come from our original triangle, so there were \(\frac{n(n + 1)}{2}\) blocks in the original triangle, and \(1 + \ldots + n = \frac{n(n + 1)}{2}\) as desired.
This is, in some sense, the same proof as we had before, but the geometrical representation of it seems helpful for people, and for more complicated examples it can be genuinely very useful.
Here is, I promise, one finalFor the purpose of this post I mean. There are infinitely many ways to do this, and at least several of the other ones are interesting in their own right. way to do it, which is also often useful: We take a guess, and prove that that guess was correct. For example, suppose that we guessed the answer was \(\frac{n(n + 1)}{2}\). How might we prove that we were right?
The answer is to use something called “proof by induction”Any philosophers reading this, don’t panic! It’s a completely different meaning of the word “induction” that has nothing to do with the philosophical notion of induction.
The way a proof by induction of some claim (your “induction property”) works is that you establish:
- It’s true for \(1\) (or anywhere else you want to start - your “base case”)
- If it’s true for \(n - 1\) then it’s true for \(n\)
And this is sufficient to prove that it’s true for every integer greater than or equal to your base case.
Why? Well, because of a proof by contradiction. A proof by contradiction is when you assume something that you’re not sure is true, work through the consequences, and conclude something false, which means your original assumption must be false.
To see that proof by induction works, assume the opposite: Suppose there were some \(n\) for which the induction property were false. Then there must be some smallest such \(n\) for which the property is false. If so, then the property must be true for \(n - 1\), as the property holds for \(1\), so we must have \(n > 1\) so \(n - 1\) is still an integer greater than or equal to \(1\), and it’s smaller than \(n\), which we picked to be the smallest integer for which the property is false. But the second part of our induction conditions was that it’s true for \(n - 1\) then it’s true for \(n\), which contradicts our claim that it was false for \(n\).
Anyway, proof by induction.
Our inductive property is that \(\sum\limits_{i = 1}^n i = \frac{n(n + 1)}{2}\). This is true for \(1\), you can just check that by calculating \(\frac{2}{2}\).
Suppose it’s true for \(n - 1\). That is, \(\sum\limits_{i = 1}^{n - 1} i = \frac{(n - 1)n}{2}\). ThenThis is in fact the same calculation we did in the even/odd case above.
\[ \begin{align*} \sum\limits_{i = 1}^n i & = n + \sum\limits_{i = 1}^{n - 1} i \\ & = n + \frac{n(n -1)}{2} \\ & = \frac{n(n + 1)}{2} \\ \end{align*} \]
So the property holds for \(n\) as well, and the result is proved by induction.
This does, of course, invite the question of how on earth we would have made this guess if we didn’t already know the answer.
I think there are roughly three ways that can come about.
The first is that you can just look at the first couple elements of the sequence and try to puzzle out a formula that works. \(1, 3, 6, 10, 15, \ldots\) seems plausible that there should be a factor of \(n\) in there, so what happens if we divide by \(n\)? We get \(1, \frac{3}{2}, 2, \frac{5}{2}, 3, \ldots\), so that looks like it’s going up by half each time, so \(\frac{n + 1}{2}\), and that gives us our guess.
The second is that sometimes you just have a hunch. I think it would be weird to have a hunch in this case and not be able to spot the better proof, but in other cases it will often be the case. This is particularly true when think it’s true because of some sort of “heuristic argument” that doesn’t actually work rigorously, but does give you a pretty good idea of what the answer should be.This is particularly common when you’re trying to show something a bit weaker than strict equality, such as e.g. that some quantity never gets too large.
The third common way is that you proved it another way first, but that proof was long and complicated or involved some advanced concepts you don’t want to introduce here, so you “guessed” in the sense that you already knew the right answer, but it was easier to prove it by induction than convince the reader of it another way.
OK, that’s enough adding numbers.
What should you take away from this post?
The first thing I’d like you to take away from it is a certain familiarity with the notation. Being able to read this sort of notation is entirely essential if you’re to read any mathematics at all. Mathematics is built on notation, and the notation I’ve used here is very common.
Another thing I’d like you to take away from this is that there are many ways to get the answer and they all get the same answer. There isn’t a single right way to do it, and there can’t be. Mathematics is a discipline built on solving problems, and if there were always a single correct way to do it, every problem would be trivial. Doing mathematics is about developing facility with mathematical problems, and getting a sense of how to solve them.
Another is that figuring out which mathematical tool to use is a skilled practice like any other - through repeated encounters with mathematics you develop intuitions about what works and what to try.
Reading other people’s mathematics is a good way to start with this but importantly the proof that people show you is often not the proof they used to come up with the idea. This means that often when reading someone’s proof you will have some reaction like “Oh, that’s clever, but I have no idea how you’d come up with that.” or “What the fuck is going on here? I have no intuition for any of this”. If you’re lucky you can just ask them how they came up with it, but if not then I think it’s worth trying to prove it yourself by figuring out how all the different pieces fit together and see how you’d do it for yourself.
One reason these multiple avenues of proof are important is that mathematics is not just a discipline of problem solving, it’s a discipline of concept formation, and one of the things that drives concepts is developing a certain aesthetic sense and desire for intuition around the problems you solve. Many concepts are discovered because you look at a proof and go “I don’t like that proof. Can I do better?”