15 March 2017

When students come out of middle school, or even high school, a lot of them have a mistaken idea of the point of math. Of course, the students aren't to blame. It's conceivable, likely even, that their teachers have the same idea. Math is fundamentally about communicating ideas, often very precise ideas. You might also be familiar with the argument that math is art, but of course art is also fundamentally about the communication of ideas from the artist.

The problem, if you can call it a problem, is that the baseline of math is so old and so well tested that to a young student it is indistinguishable from fact. Instead of thinking of math in terms of communication, these students think of math in terms of right and wrong answers. This misunderstanding leads to more misunderstandings down the road, and I want to highlight a couple of examples.

1 is not prime

When you learned about prime numbers, you were probably told something like "A number is prime if it is only divisible by 1 and itself." So numbers like 2, 3, 5, and 7 are prime. On the other hand, 1 also seems to fit this definition, as it is not divisible by anything other than 1 and itself (although the two are the same in this case). You might've even gotten a bullshit reason like "Oh, primes have to have exactly two factors" for why 1 is not prime.

The real reason that 1 is not prime is that it is more convenient to say that 1 is not prime. It is true that 1 has a lot of properties that primes have, but it also has a lot of properties that primes do not have, and there are a huge number of situations where we would want to separate 1 out. Here are the usual definitions of a few terms in commutative ring theory (here I use the word "number" to mean "element of a commutative ring"):

A unit is a number \( u \) with an inverse, i.e. there is some number \( v \) such that \( uv = 1 \). A prime is a nonzero nonunit number \( p \) such that if \( p \) divides \( ab \) then either \( p \) divides \( a \) or \( p \) divides \( b \) (or both). An irreducible is a nonzero nonunit number \( p \) such that if \( p = ab \) then either \( a \) or \( b \) is a unit.

For the integers, the units are 1 and -1. The concepts of prime and irreducible both correspond to what we usually think of as primes, but also include the negative primes -2, -3, -5, etc. Notice the sneaky "nonunit" in the definitions of both primes and irreducibles. In fact, units satisfy the remaining criterion in each definition: a unit divides everything, and if a unit equals \( ab \) then both \( a \) and \( b \) are units. The word "nonunit" is the only thing keeping them out.
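To make the role of "nonunit" concrete, here is a rough Python sketch that tests each criterion separately on the integers. The function names and the search bound are my own choices for illustration, and the prime test only checks pairs \( a, b \) up to a finite bound, so it is a spot check rather than a proof.

```python
from itertools import product


def divides(d: int, n: int) -> bool:
    """True if d divides n in the integers (0 divides only 0)."""
    return n == 0 if d == 0 else n % d == 0


def is_unit(u: int) -> bool:
    # In the integers, only 1 and -1 have a multiplicative inverse.
    return u in (1, -1)


def passes_prime_test(p: int, bound: int = 30) -> bool:
    """Divisibility criterion alone: p | ab implies p | a or p | b.

    Only pairs with |a|, |b| <= bound are checked (an assumption of this sketch).
    """
    rng = range(-bound, bound + 1)
    return all(
        divides(p, a) or divides(p, b)
        for a, b in product(rng, rng)
        if divides(p, a * b)
    )


def passes_irreducible_test(p: int) -> bool:
    """Factorization criterion alone: p = ab implies a or b is a unit."""
    if p == 0:
        return False  # 0 = 0 * 0, and 0 is not a unit
    return all(
        is_unit(a) or is_unit(p // a)
        for a in range(-abs(p), abs(p) + 1)
        if a != 0 and p % a == 0
    )


for n in (1, -1, 2, -3, 4, 6):
    print(n, is_unit(n), passes_prime_test(n), passes_irreducible_test(n))
# 1 True True True
# -1 True True True
# 2 False True True
# -3 False True True
# 4 False False False
# 6 False False False
```

The first two rows are the point: 1 and -1 pass both tests, so it is only the "nonunit" clause that stops them from being called prime.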

So why do mathematicians not want to include units in the list of primes? Well, for example, when we define a unique factorization domain, we define a factorization of a number \( n \) to be a product \( n = u p_1^{e_1} p_2^{e_2}\cdots p_k^{e_k} \) where \( u \) is a unit and \( p_1, \ldots, p_k \) are primes.
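For the integers, that "unit times prime powers" shape can be sketched in a few lines of Python. This is a minimal trial-division sketch; the name `factor` and the choice to always use positive primes (pushing the sign into the unit) are my own conventions for illustration.

```python
def factor(n: int):
    """Return (u, powers) with n == u * product(p**e), where u is 1 or -1
    and powers maps each positive prime p to its exponent e."""
    if n == 0:
        raise ValueError("0 has no factorization into primes")
    unit = 1 if n > 0 else -1
    n = abs(n)
    powers = {}
    p = 2
    while p * p <= n:
        # Strip out every copy of p before moving on, so only primes get recorded.
        while n % p == 0:
            powers[p] = powers.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever is left over is itself prime.
        powers[n] = powers.get(n, 0) + 1
    return unit, powers


print(factor(-360))  # (-1, {2: 3, 3: 2, 5: 1})
print(factor(1))     # (1, {})
```

Notice that the unit gets its own slot. If 1 were allowed in the dictionary of primes, then `{2: 1, 3: 1}` and `{1: 5, 2: 1, 3: 1}` would both describe 6, and you would have to add "ignoring 1s" to every statement about factorizations.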

If units were included in primes, then instead of "unit" and "prime", we'd be saying "unit" and "nonunit prime". When you look at the interesting theorems in ring theory and the topics based on it, the prime-without-unit idea shows up in a huge number of places, and the prime-with-unit idea shows up essentially not at all. It's like the question of whether a hotdog is a sandwich. Maybe you could come up with an argument of why hotdogs should technically be sandwiches, but the practical result is that if you want a sandwich you'd have to ask for a "non-hotdog sandwich" instead. So it's more convenient to just say that hotdogs are not sandwiches.

0.999... = 1

This one is common to the point of absurdity. It is absolutely and unquestionably true that the real number denoted by 0.999... and the real number denoted by 1 are the same number. There are plenty of basic "proofs" of this fact, that look something like the following:

  1. 1/3 = 0.3333...
  2. 3*1/3 = 0.999...
  3. 1 = 0.999...

The problem with these proofs is that they don't actually answer the fundamental misunderstandings of the people who believe that 0.999... and 1 are different. These people usually have the following gaps:

  1. They don't know what a real number is.
  2. They don't know what 0.999... means.

If you've seen enough of these discussions, I'm sure that you've seen cases where the person accepts the given "proof" but still believes that 0.999... and 1 are different, falling back on one of the following resolutions:

  • Well, 0.333... is very close to 1/3, but not exactly.
  • The sum \( \sum_{n=1}^{\infty} \frac{9}{10^n} \) is 1 but 0.999... is something different.

These outcomes happen because nobody addressed the fundamental gaps in knowledge. The real numbers aren't usually formally defined until you take a real analysis course, which means either college or never for most people. Instead, most people have a fuzzy understanding that the real numbers include the integers and the rational numbers, as well as numbers like \( \sqrt{2} \) and \( \pi \) and \( e \), but not things like \( \sqrt{-1} \).

There are two typical definitions of real numbers that you are likely to see in a real analysis course. The first is in terms of Cauchy sequences. A sequence \( x_1, x_2, \ldots \) of rational numbers is said to be a Cauchy sequence if for every rational \( \varepsilon > 0 \) there is some \( N \) such that for any \( n, m > N \), we have \( |x_n - x_m| < \varepsilon \). The intuition here is that every sequence with this property ought to converge, because we can find smaller and smaller intervals narrowing down what the limit should be. But if the rational numbers are all we have, then when we narrow it down "all the way", the point that we want might be missing!
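Here is a small Python sketch of a Cauchy sequence of rationals whose limit is "missing". The sequence is my own choice of example (it is not from the discussion above): start at 1 and repeatedly replace \( x \) with \( (x + 2/x)/2 \), which homes in on a number that squares to 2. Using `fractions.Fraction` keeps every term an exact rational.

```python
from fractions import Fraction


def sqrt2_approximations(count: int):
    """First `count` terms of x_1 = 1, x_{k+1} = (x_k + 2 / x_k) / 2, as exact rationals."""
    x = Fraction(1)
    terms = []
    for _ in range(count):
        terms.append(x)
        x = (x + 2 / x) / 2
    return terms


def tail_spread(terms, start: int) -> Fraction:
    """Largest |x_n - x_m| over all computed terms with index >= start."""
    tail = terms[start:]
    return max(tail) - min(tail)


terms = sqrt2_approximations(7)
print(terms[:4])  # [Fraction(1, 1), Fraction(3, 2), Fraction(17, 12), Fraction(577, 408)]

# The terms bunch together: the spread of the tail shrinks as we start later.
for start in range(5):
    print(start, float(tail_spread(terms, start)))

# But no term ever squares to exactly 2.
print(any(t * t == 2 for t in terms))  # False
```

This only looks at the first few terms, so it illustrates the definition rather than proving anything. Still, it shows the shape of the problem: the terms squeeze together as tightly as you like, yet the thing they are squeezing toward is not a rational number.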

However, some sequences should converge to the same number. For example, we don't want the sequence \( 0, 1, 1, 1, 1, \ldots \) to be considered different from \( 1, 1, 1, 1, \ldots \). They should both simply represent the number 1. Essentially, two sequences should converge to the same value if they eventually become arbitrarily close: for any rational \( \varepsilon > 0 \), there is some \( N \) such that for any \( n > N \), we have \( |x_n - y_n| < \varepsilon \). This notion of "should converge to the same value" is an equivalence relation on Cauchy sequences, and we call one of these equivalence classes a real number.

The second definition is in terms of Dedekind cuts. If you have a totally ordered set \( S \), you can always "cut" it at any element (call it \( x \)) into two sets, \( A = \{ y \in S\ |\ y < x \} \) and \( B = \{ y \in S\ |\ y \geq x \} \). But there are some pairs of sets that look like cuts and yet don't come from any element of \( S \). For example, if \( S \) is the set of rational numbers, then \( A = \{ y \in \mathbb{Q}\ |\ y < 0 \text{ or } y^2 < 2 \} \), \( B = \{ y \in \mathbb{Q}\ |\ y > 0 \text{ and } y^2 \geq 2 \} \) looks like a cut that should correspond to a number that squares to 2, but no rational number squares to 2. In the Dedekind cut construction of real numbers, each of these pairs of sets is called a real number.
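The cut for the square root of 2 can be poked at directly in Python. In this sketch, `in_lower` and `in_upper` are just the two set definitions above written as membership tests, and `nudge` is a helper of my own using the formula \( y \mapsto (2y+2)/(y+2) \), which moves a nonnegative rational closer to the gap without crossing it. That formula is not from the text above; it is one standard way to show that the lower set has no largest element.

```python
from fractions import Fraction


def in_lower(y: Fraction) -> bool:
    """Membership in A = { y in Q : y < 0 or y^2 < 2 }."""
    return y < 0 or y * y < 2


def in_upper(y: Fraction) -> bool:
    """Membership in B = { y in Q : y > 0 and y^2 >= 2 }."""
    return y > 0 and y * y >= 2


def nudge(y: Fraction) -> Fraction:
    """Move a nonnegative rational toward the gap, staying on the same side of it."""
    return (2 * y + 2) / (y + 2)


# Every sampled rational lands in exactly one of the two sets.
samples = [Fraction(n, d) for n in range(-30, 31) for d in range(1, 11)]
print(all(in_lower(y) != in_upper(y) for y in samples))  # True

# Starting inside A, we can keep finding bigger elements of A.
y = Fraction(7, 5)
for _ in range(4):
    bigger = nudge(y)
    print(y, "<", bigger, in_lower(bigger))
    y = bigger
```

If the cut came from a rational number \( x \), then \( B \) would have a smallest element, namely \( x \) itself. Running `nudge` on elements of \( B \) produces smaller and smaller elements of \( B \) in the same way, which is another way of seeing that no rational number sits at this cut.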

I have left out a lot of details, such as the full definition of a Dedekind cut, how arithmetic is defined, why these are sensible definitions of what we think of as "real numbers", why the two definitions give the same result, and so on. However, if you trust me that these definitions work, it's not too difficult to see why 0.999... = 1.

If we're working with the Cauchy sequence construction, when we write 0.999... we probably mean the Cauchy sequence \( 0, 0.9, 0.99, 0.999, \ldots \). It is straightforward to check that this is in the same equivalence class as the sequence \( 1, 1, 1, 1, \ldots \), which is clearly the number 1.
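The "straightforward to check" step can be spelled out with exact rational arithmetic. In the sketch below, `nines` and `index_for` are names I made up: `nines(n)` is the \( n \)-th term of \( 0, 0.9, 0.99, \ldots \), and `index_for(eps)` finds an \( N \) that works for a given \( \varepsilon \) when comparing against the constant sequence \( 1, 1, 1, \ldots \).

```python
from fractions import Fraction


def nines(n: int) -> Fraction:
    """n-th term of 0, 0.9, 0.99, ... (so nines(1) == 0), which equals 1 - 10**-(n-1)."""
    return 1 - Fraction(1, 10 ** (n - 1))


def index_for(eps: Fraction) -> int:
    """Smallest N >= 0 such that |nines(n) - 1| < eps for every n > N."""
    N = 0
    # For n > N the gap |nines(n) - 1| is at most 10**-N, so we need 10**-N < eps.
    while Fraction(1, 10 ** N) >= eps:
        N += 1
    return N


# Each term is also a partial sum of 9/10 + 9/100 + ...
print(nines(4) == sum(Fraction(9, 10 ** k) for k in range(1, 4)))  # True

eps = Fraction(1, 1000)
N = index_for(eps)
print(N)                                                 # 4
print(all(1 - nines(n) < eps for n in range(N + 1, N + 50)))  # True
```

However small you make `eps`, `index_for` hands back an \( N \), because \( 10^{-N} \) eventually drops below any positive rational. That is exactly what it means for the two sequences to be in the same equivalence class.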

If we're working with the Dedekind cut construction, when we write 0.999... we probably mean the cut \( A = \{ y \in \mathbb{Q}\ |\ \exists n.\ y < 1 - \frac{1}{10^n} \} \), \( B = \{ y \in \mathbb{Q}\ |\ \forall n.\ y \geq 1 - \frac{1}{10^n} \} \). It is not too difficult to check that \( A = \{ y \in \mathbb{Q}\ |\ y < 1 \} \) and \( B = \{ y \in \mathbb{Q}\ |\ y \geq 1 \} \), which is the cut corresponding to the number 1.
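Here is a rough Python spot check of that last claim. A program can't search over every \( n \), so the hypothetical helper `witness` below only tries \( n \) up to a fixed bound (my assumption, set to 60). That means it would wrongly give up on a rational that is less than 1 but absurdly close to it. For ordinary test values it agrees with the plain comparison \( y < 1 \).

```python
from fractions import Fraction


def witness(y: Fraction, max_n: int = 60):
    """Return the first n <= max_n with y < 1 - 10**-n, or None if there is none.

    Finding a witness means y is in the lower set A of the cut for 0.999...
    The bound max_n is an assumption of this sketch, not part of the definition.
    """
    for n in range(max_n + 1):
        if y < 1 - Fraction(1, 10 ** n):
            return n
    return None


print(witness(Fraction(1, 2)))        # 1
print(witness(Fraction(999, 1000)))   # 4
print(witness(Fraction(1)))           # None
print(witness(Fraction(3, 2)))        # None

# On a grid of sample rationals, "has a witness" matches "is less than 1".
samples = [Fraction(n, 1000) for n in range(-2000, 2001)]
print(all((witness(y) is not None) == (y < 1) for y in samples))  # True
```

The reasoning behind the check is short. If \( y < 1 \), then \( 1 - y \) is a positive rational, so some \( 10^{-n} \) is smaller than it, and that \( n \) is a witness. If \( y \geq 1 \), then \( y \) is bigger than every \( 1 - 10^{-n} \), so no witness exists.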