The sum of the first n squares

I took the Physics 1 AP test yesterday and finished with quite a bit of time left. Instead of double-checking my answers, though, I decided to mess around with some math concepts on my scratch paper and accidentally found this pretty elegant way to find the sum of the first \(n\) square numbers. This article is just going to go through my thought process while finding this method, along with some other interesting results that I found along the way.

Let's start by looking at Pascal's triangle.

\(\displaystyle \begin{array}{cc} & & & & & & 1 \\ & & & & & 1 & & 1 \\ & & & & 1 & & 2 & & 1 \\ & & & 1 & & 3 & & 3 & & 1 \\ & & 1 & & 4 & & 6 & & 4 & & 1 \\ & 1 & & 5 & & 10& & 10& & 5 & & 1 \\ 1 & & 6 & & 15& & 20& & 15& & 6 & & 1 \end{array} \)

Pascal's triangle is formed by two rules:

The top number is a one.
Every other number is the sum of the two numbers above it. For example, the 10s in the 6th row are formed by adding 6+4.

Pascal's triangle contains a lot of patterns, and I could talk about them for a very long time, but in this blog post I'm really just going to go over two of them.

Lemma 1: The diagonals of Pascal's triangle accumulate each other

Here's Pascal's triangle with the third diagonal bolded:

\(\displaystyle \begin{array}{cc} & & & & & & 1 \\ & & & & & 1 & & 1 \\ & & & & 1 & & 2 & & \bold 1 \\ & & & 1 & & 3 & & \bold 3 & & 1 \\ & & 1 & & 4 & & \bold 6 & & 4 & & 1 \\ & 1 & & 5 & & \bold {10}& & 10& & 5 & & 1 \\ 1 & & 6 & & \bold {15}& & 20& & 15& & 6 & & 1 \end{array} \)

Note that this diagonal contains the triangular number sequence; the first number is 1, the second is 1+2, the third is 1+2+3, the fourth is 1+2+3+4,and so on. The reason for this is pretty obvious, the diagonal just above this one contains the integers, and when we calculate this diagonal we just add up all the numbers in the diagonal above it.

Lemma 2: A "doped" Pascal's triangle gets us the sum of the first n squares

Here's a modified Pascal's triangle, where I've changed a 1 into a 2:

\(\displaystyle \begin{array}{cc} & & & & & & 1 \\ & & & & & 2 & & 1 \\ & & & & 2 & & 3 & & 1 \\ & & & 2 & & 5 & & 4 & & 1 \\ & & 2 & & 7 & & 9 & & 5 & & 1 \\ & 2 & & 9 & & 16& & 14& & 6 & & 1 \\ 2 & & 11& & 25& & 30& & 20& & 7 & & 1 \end{array} \)

EDIT: This article originally contained a typo, where the 30 at the bottom of the pyramid was written as 36. I've since corrected this.

The first diagonal is mostly twos, the second is the odd numbers, the third is the squares, and the fourth is the sum of the squares.

Lemma 3: The doped Pascal's triangle is the sum of two regular Pascal's triangles

Here's the doped Pascal's triangle minus the regular one:

\(\displaystyle \begin{array}{cc} & & & & & & 0 \\ & & & & & 1 & & 0 \\ & & & & 1 & & 1 & & 0 \\ & & & 1 & & 2 & & 1 & & 0 \\ & & 1 & & 3 & & 3 & & 1 & & 0 \\ & 1 & & 4 & & 6 & & 4 & & 1 & & 0 \\ 1 & & 5 & & 10& & 10& & 5 & & 1 & & 0 \end{array} \)

This is just another Pascal's triangle, just shifted 1 place. This happens because the doped Pascal's triangle was formed by adding to a regular Pascal's triangle.

Lemma 4: Pascal's triangle can be calculated with the choose function

Let's say I'm making a salad. I have six different ingredients to choose from and I want to use exactly two of them. How many possible salads can I make?

We can figure out the answer by looking at Pascal's triangle. We have six ingredients and can pick two. We add one to both of those numbers to get \(7, 3\). Then, we look at the third number in the seventh row; we can make 15 different salads with two of six ingredients.

This interesting result comes from the way that Pascal's triangle is constructed. If I want to pick 2 items from six ingredients, I have two options: I can either pick 1 of five ingredients and include the sixth one, or I can pick 2 of five ingredients and exclude the sixth one. If we know how many ways there are to pick 1 of five ingredients or 2 of five ingredients, we can add those up to figure out how many ways there are to pick 2 of six ingredients. This is the same rule that constructs Pascal's triangle, if we know the second and third numbers in row six, we can add them up to find the third number in row seven.

Lemma 5: With a constant k, the choose function reduces to a polynomial

If you've taken a pre-calculus class, you may have seen this formula for the choose function:

\(\displaystyle {n \choose k} = \frac{n!}{(n-k)!k!} \)

We can rearrange this equation like this:

\(\displaystyle {n \choose k} = \frac{1}{k!}\left(\frac{n!}{(n-k)!}\right) \)

We can expand the factorials

\(\displaystyle {n \choose k} = \frac{1}{k!}\left(\frac{n(n-1)(n-2)\cdots(n-k+1)(n-k)(n-k-1)\cdots(3)(2)(1)}{(n-k)(n-k-1)(n-k-2)\cdots(3)(2)(1)}\right) \)

Many of these terms cancel out

\(\displaystyle {n \choose k} = \frac{1}{k!}\left(\frac{n(n-1)(n-2)\cdots(n-k+2)(n-k+1)}{1}\right) \)

We can rearrange again

\(\displaystyle {n \choose k} = \frac{n(n-1)(n-2)\cdots(n-k+2)(n-k+1)}{k!} \)

If \(k\) is a constant, then we have a constant number of terms in the numerator, then this entire equation just reduces to a constant. For example, if \(k=3\), then

\(\displaystyle {n \choose 3} = \\ \frac{n(n-1)(n-2)}{3!} = \\ \frac{n(n-1)(n-2)}{6} = \\ \frac{n(n^2 - 3n + 2)}{6} = \\ \frac{n^3 - 3n^2 + 2n}{6} = \\ \frac 1 6 n^3 - \frac 1 2 n^2 + \frac 1 3 n \\ \)

Yes, I realize that this looks terrible, I don't understand LaTeX

Lemma 6: Any given diagonal of a Pascal's triangle can be calculated in constant time with a polynomial

A diagonal in Pascal's triangle is just the set of all values within Pascal's triangle with the same column number, which we've just shown is equivalent to a polynomial.

Lemma 7: Each diagonal on a doped Pascal's triangle is also a polynomial

Since the doped Pascal's triangle is just the sum of two regular Pascal's triangles, a diagonal can be calculated by adding two polynomials.

Conclusion: The sum of the first n squares can be calculated with a polynomial

If the sum of the first \(n\) squares is just the sum of two shifted polynomials, then that's just another polynomial. We can actually figure out that polynomial pretty easily:

\(\displaystyle p(n) = \frac 1 6 n^3 - \frac 1 2 n^2 + \frac 1 3 n \\ f(n) = p(n) + p(n-1) \)

I'm too lazy to expand that out, but this function would give us an answer in O(1) time.