That which we call an identity

The Aperiodical 2019-11-27

I’m grateful to Jemma Sherwood and Rob Low for reading an early draft of this and for their comments thereon. All opinions are, of course, my own.

This post is inspired by something that I see crop up now and again in discussions with other Maths teachers. It usually manifests itself as a rallying cry to use ≡ in place of = in identities and reserve = for equations. My standard response is to mutter something about identities being equations and leave it at that. But in the latest round, Jemma Sherwood challenged me, in the nicest possible way, to explain a bit further. This is that explanation.

Although I’m going to state my case here, I’m well aware that there are different opinions. In matters of opinion, such as this, agreement and disagreement are less important than that all sides think. So if what I write seems to you wrong, that’s fine so long as it makes you think about why you think that it is wrong.

I’m actually going to give two answers to the question “Should we use ≡ for identities?”. Both are “No”, but for different reasons:

No, because it is trying to solve the wrong problem.
No, because, in the words of Inigo Montoya: “You keep using that word. I don’t think it means what you think it means.”

The second answer is the one that I usually mutter about when I come across this idea of using ≡ but it’s the first that is the important one.

In preparation for writing this I posted a poll on Twitter with four mathematical statements and asked which of them were identities. The four statements were:

$\sin (180 n) = 0$,
$a^{2} + b^{2} = c^{2}$,
$ (x + y)^{2} = x^{2} + 2 x y + y^{2}$,
if $2 x + 6 = 10$ then $x = 2$.

You may wish to ponder what your answer would be before continuing.

For Some Values of True

From the discussion that ensues whenever anyone posts about ≡, the rationale for insisting on it would seem to be that students find it difficult to distinguish between identities and equations so using notation to clarify the difference would be a good idea.

Seems reasonable. But to my mind, it’s trying to solve the wrong problem.

In the comments around my Twitter poll, someone linked to the Wikipedia entry on Mathematical Identity which starts (emphasis mine):

In mathematics an identity is an equality relation $A = B$, such that $A$ and $B$ contain some variables and $A$ and $B$ produce the same value as each other regardless of what values (usually numbers) are substituted for the variables.

Another person gave a similar criterion for an identity which involved, as I understood it, putting “$\forall x$” at the start (or whatever unbound variables existed in the expressions).

The poll wasn’t long published before someone made a comment that slightly let the cat out of the bag. They queried the $\sin (180 n) = 0$ and said that it would be okay if $n$ was an integer but that I hadn’t made that clear. (Actually, they also queried the fact that I’d written $180$ rather than $180^{\circ }$; I must confess that one was due to me not being bothered to hunt down a Unicode degree symbol but it really just underlines my point.) After that, some others remarked that they wanted to change their vote as they hadn’t noticed that.

So just putting $\forall x$ or $\forall n$ in front of an expression and seeing if it is still true isn’t a valid test of anything. We have to provide a context for the variables, and that allows me the freedom to make any of my equations into an identity or not.

$\sin (180^{\circ }n) = 0$ is an identity with $\forall n \in \mathbb{N}$ but not with $\forall n \in \mathbb{R}$.
$a^{2} + b^{2} = c^{2}$ is an identity with “$\forall a,b,c \in \mathbb{R}$ where $a$, $b$, $c$ are the sides of a right-angled triangle with $c$ the hypotenuse”, but is not an identity with just $\forall a,b,c \in \mathbb{R}$.
$(x + y)^{2} = x^{2} + 2 x y + y^{2}$ is an identity with $\forall x,y \in \mathbb{R}$, but is not an identity with $\forall x, y \in M_{2}(\mathbb{R})$, the space of $2 \times 2$ matrices, because matrix multiplication need not commute and so $x y + y x$ need not equal $2 x y$.
“If $2 x + 6 = 10$ then $x = 2$” might surprise you: it is actually an identity with $\forall x \in \mathbb{R}$ since it then asserts that for any real number $x$, if $x$ satisfies $2 x + 6 = 10$ then $x = 2$. However, it is not an identity in $\mathbb{Z}/12\mathbb{Z}$ since both $2$ and $8$ satisfy $2 x + 6 = 10$.
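
These context-dependent claims can be checked mechanically. What follows is only a minimal sketch in Python, which is my choice of language for illustration and not part of the original discussion. The helper names `mat_add` and `mat_mul`, and the particular matrices, are assumptions chosen to give a small counterexample. The sine check uses a tolerance because floating-point arithmetic only approximates $\sin (180^{\circ }n)$.

```python
import math

# Statement 1: sin(180 degrees * n) = 0 for whole numbers n (up to rounding error)...
whole_ok = all(abs(math.sin(math.radians(180 * n))) < 1e-9 for n in range(50))
# ...but not for every real n: at n = 0.5 the value is sin(90 degrees) = 1.
half_value = math.sin(math.radians(180 * 0.5))

# Statement 3 with 2x2 matrices stored as nested lists (illustrative helpers).
def mat_add(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]

def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x = [[0, 1], [0, 0]]
y = [[0, 0], [1, 0]]
s = mat_add(x, y)
lhs = mat_mul(s, s)                      # (x + y)^2
xy = mat_mul(x, y)
rhs = mat_add(mat_add(mat_mul(x, x), mat_add(xy, xy)), mat_mul(y, y))  # x^2 + 2xy + y^2

# Statement 4 in Z/12Z: every residue x with 2x + 6 = 10 (mod 12).
solutions_mod_12 = [r for r in range(12) if (2 * r + 6) % 12 == 10]

print(whole_ok, half_value)
print(lhs == rhs)
print(solutions_mod_12)
```

In each case the formula is unchanged. Only the set that the variables range over differs, and that alone decides whether the statement holds for every value.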

To be a valid mathematical sentence, an identity requires a context. My contention is that the real problem behind the equation vs identity debate is that students are filling in the missing context for themselves and often getting it wrong. And once the context is made explicit, we no longer think of the identity as anything special and no longer need special notation for it.

I would also contend that the distinction between a double and triple line is not sufficient. If someone is having difficulty with the difference between an equation and an identity then an extra horizontal line will not make it clear.

None other than the great Don Knuth once said that in a mathematical document it should be possible to replace all the bits of maths by “blah” and for it to still make grammatical sense. I strongly suspect that my students do the opposite and replace all non-maths by “blah”. For example, fill in the “blah”s in these two questions and consider how the different possibilities would lead you down different routes to an answer:

Blah $x^{2} + 5 x + 6 = 0$
Blah $x^{2} + 5 x + 6$

Then add in the fact that a novice learner is likely to overlook the fact that the second doesn’t have an “$= 0$” in it and try to “solve” that quadratic.

If we make the context clearer, we are lessening the work that the student has to do to understand what they are being asked to do. And this is not an artificial weakening: context becomes more and more important the deeper one goes into mathematics. In school, certainly pre-16, it is a safe assumption that the context is “numbers”. It is only later that students learn that the context could be vectors, functions, matrices, sets, objects, morphisms, groups, rings, fields, manifolds, sheaves, schemes, … if I missed your favourite, I apologise.

But even a context of “numbers” can be misconstrued. How many students look at an answer with extreme puzzlement when it turns out to be a fraction? They were expecting a whole number.

And wouldn’t it set up expectations for quadratics and trigonometry much better if we consistently said “Find all (real) numbers $x$ for which …” instead of just “Solve”? And “Show that for all real numbers $x$ …” instead of just “Show that”?

The language doesn’t even have to be that formal (we don’t need $\forall $ or $\exists $ in Y7), but it should make clear the context. It can even be something like “I’m thinking of a real number, call it $x$; it satisfies $2 x + 6 = 10$. What is it?”

So What, Exactly, is an Identity?

I have very few memories of my own time at school, but one that I do recall very vividly is my A-level Chemistry teacher announcing at the start of the course that everything we’d been told up to then had been a lie. “Sodium,” he declared, “doesn’t want to lose an electron. It doesn’t want anything.”

It was dramatic, I’ll give him that, but it did make me lose a bit of faith in Chemistry. For all I knew, everything I was going to be told in A-level would also be a lie (spoiler: it was).

I try my utmost not to do the same in my own teaching.

Of course, I can’t tell my students the whole truth. When teaching about negative integers, for example, I don’t set up an equivalence relation on pairs of positive integers and prove that the operations of arithmetic descend through the relation. What I aim for is the following thought experiment: suppose that one of my students did go on to do a mathematics degree, possibly even further, and encountered some fancy part of mathematics that recast something that they’d learnt in school. What I would hope is that they would feel that the recasting fitted in with the story that they already knew. That if they ever came back to visit, they’d say, “Now I understand why you told the story that way.”

So when I consider something like identities, I think about how the concept is used later on and try to use that to inform how I talk about it in school.

And that’s a bit tricky with identities because, in my mathematical experience, they all but disappear. The Wikipedia page does rather give the game away when it says (emphasis mine):

In other words, $A = B$ is an identity if $A$ and $B$ define the same functions. This means that an identity is an equality between functions that are differently defined.

Thus once we are happy talking about functions, the need for the word identity disappears.
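
To make the “equality of functions” reading concrete, here is a small sketch in Python. The function names `f` and `g` and the sample range are assumptions for illustration only. Agreement on a sample is evidence rather than proof in general, though two polynomials of degree at most $2$ that agree at three or more points are indeed the same polynomial.

```python
# Two differently defined functions that an "identity" claims are equal.
def f(x):
    return (x + 1) ** 2

def g(x):
    return x ** 2 + 2 * x + 1

sample = range(-100, 101)  # integer inputs, so the arithmetic is exact

# The identity (x + 1)^2 = x^2 + 2x + 1: f and g agree at every sampled input.
agree_everywhere = all(f(x) == g(x) for x in sample)

# The equation 2x + 6 = 10: the two sides agree only at particular inputs.
agree_points = [x for x in sample if 2 * x + 6 == 10]

print(agree_everywhere)
print(agree_points)
```

Seen this way, the identity says that two functions are the same function, while the equation asks where two different functions happen to take the same value.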

When I think of the word identity, the first concept that springs to mind is the identity function (or, rather, the identity functions since there are rather a lot of them), which might happen to be representable by the identity matrix. There’s also the identity element in a group or ring.

The closest I get to the concept of identity under discussion here is in a topic called universal algebra. Very briefly, this is the area of mathematics that studies operations like $+$ or $\times $ in the abstract. Such operations satisfy relations which are sometimes called identities. These are things like $x + y = y + x$. The catch is that in this area, the identities are imposed. They don’t occur by accident but by design.

This idea of imposing identities also chimes with where I see the ≡ sign used. I don’t think of it as “is identically equal to” but as “is equivalent to in this context”. The classic situation is in modular arithmetic, where I will happily write things like $4 \equiv 1 \pmod{3}$, by which I mean that in the context where I ignore multiples of $3$ I can view $4$ as equivalent to $1$. In the wider context of the integers I know that $4$ and $1$ are different, but in the smaller context of modular arithmetic I can consider them equivalent.
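
The two contexts can be put side by side in a short Python sketch. The helper name `equivalent_mod` is hypothetical, introduced only for this illustration.

```python
def equivalent_mod(a, b, modulus):
    """True when a and b differ by a multiple of modulus."""
    return (a - b) % modulus == 0

# As integers, 4 and 1 are different...
equal_as_integers = (4 == 1)
# ...but once multiples of 3 are ignored they are equivalent.
equivalent_mod_3 = equivalent_mod(4, 1, 3)

print(equal_as_integers, equivalent_mod_3)
```

The same pair of numbers gives different answers because the relation being tested has changed, which is the sense of ≡ that I have in mind.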

So I feel that I should exercise caution in using the term “identity” to refer to what is really an equality of functions, especially as the term is used differently later on. All the more so because, as I argue above, using ≡ is unlikely to solve the underlying issue of establishing context.