Topos Theory (Part 1)

Azimuth 2020-01-05

I’m teaching an introduction to topos theory this quarter, loosely based on Mac Lane and Moerdijk’s Sheaves in Geometry and Logic.

I’m teaching one and a half hours each week for 10 weeks, so we probably won’t make it very far through this 629-page book. I may continue for the next quarter, but still, to make good progress I’ll have to do various things.

First, I’ll assume basic knowledge of category theory, a lot of which is explained in the Categorical Preliminaries and Chapter 1 of this book. I’ll start in with Chapter 2. Feel free to ask questions!

Second, I’ll skip a lot of proofs and focus on stating definitions and theorems, and explaining what they mean and why they’re interesting.

These notes to myself will be compressed versions of what I will later write on the whiteboard.

Sheaves

Topos theory emerged from Grothendieck’s work on algebraic geometry; he developed it as part of his plan to prove the Weil Conjectures. It was really just one of many linked innovations in algebraic geometry that emerged from the French school, and it makes the most sense if you examine the whole package. Unfortunately algebraic geometry takes a long time to explain! But later Lawvere and Tierney realized that topos theory could serve as a grand generalization of logic and set theory. This logical approach is more self-contained, and easier to explain, but also a bit more dry—at least to me. I will try to steer a middle course, and the title Sheaves in Geometry and Logic shows that Mac Lane and Moerdijk were trying to do this too.

The basic idea of algebraic geometry is to associate to a space the commutative ring of functions on that space, and study the geometry and topology of this space using that ring. For example, if X is a compact Hausdorff space there’s a ring C(X) consisting of all continuous real-valued functions on X, and you can recover X from this ring. But algebraic geometers often deal with situations where there aren’t enough everywhere-defined functions (of the sort they want to consider) on a space. For example, the only analytic functions on the Riemann sphere are constant functions. That’s not good enough! Most analytic functions on the Riemann sphere have poles, and are only defined away from these poles. (I’m giving an example from complex analysis, in hopes that more people will get what I’m talking about, but there are plenty of purely algebraic examples.)

This forced algebraic geometers to invent ‘sheaves’, around 1945 or so. The idea of a sheaf is that instead of only considering functions defined everywhere, we look at functions defined on open sets.

So, let X be a topological space and let \mathcal{O}(X) be the collection of open subsets of X. This is a poset with inclusion as the partial ordering, and thus it is a category. A presheaf is a functor

F \colon \mathcal{O}(X)^{\mathrm{op}} \to \mathsf{Set}

So, a presheaf assigns to each open set U a set F U. It allows us to restrict an element of F U to any smaller open set U' \subseteq U, and a couple of axioms hold, which are encoded in the word ‘functor’. Note the ‘op’: that’s what lets us restrict elements of F U to smaller open sets.

The example to keep in mind is where F U consists of functions on U (that is, functions of the sort we want to consider, such as continuous or smooth or analytic functions). However, other examples are important too.
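To make the word ‘functor’ concrete, here is a minimal Python sketch on a small finite space. The space, its topology `OPENS`, the value set `VALUES` and all function names are assumptions chosen for illustration, not anything from the book. F U is taken to be all functions from U to `VALUES`, and the two functor laws are checked by brute force.

```python
from itertools import product

# Hypothetical finite space: points {0, 1, 2} with a hand-picked topology.
X = frozenset({0, 1, 2})
OPENS = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 2}), X]
VALUES = (0, 1)  # the functions we consider take values in this set


def sections(U):
    """F U: all functions U -> VALUES, each stored as a frozenset of (point, value) pairs."""
    pts = sorted(U)
    return {frozenset(zip(pts, vals)) for vals in product(VALUES, repeat=len(pts))}


def restrict(s, V):
    """Restriction map F U -> F V for V contained in U: forget the points outside V."""
    return frozenset((p, v) for (p, v) in s if p in V)


def check_functor_laws():
    """Check that restriction preserves identities and composition."""
    for U in OPENS:
        for s in sections(U):
            if restrict(s, U) != s:  # restricting to U itself does nothing
                return False
            for V in OPENS:
                if not V <= U:
                    continue
                if restrict(s, V) not in sections(V):  # lands in F V
                    return False
                for W in OPENS:
                    # restricting in two steps equals restricting in one step
                    if W <= V and restrict(restrict(s, V), W) != restrict(s, W):
                        return False
    return True


print(check_functor_laws())
```

Note that `sections(frozenset())` has exactly one element, the empty function, so F assigns a one-element set to the empty open set.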

In many of these examples something nice happens. First, suppose we have s \in F U and an open cover of U by open sets U_i. Then we can restrict s to U_i, getting something we can call s|_{U_i}. We can then further restrict this to U_i \cap U_j. And by the definition of presheaf, we have

(s|_{U_i})|_{U_i \cap U_j} = (s|_{U_j})|_{U_i \cap U_j}

In other words, if we take a guy in F U and restrict it to a bunch of open sets covering U, the resulting guys agree on the overlaps U_i \cap U_j. Check that this follows from the definition of functor and some other obvious facts!

This is true for any presheaf. A presheaf is a sheaf if we can start the other way around, with a bunch of guys s_i \in F U_i that agree on overlaps:

s_i|_{U_i \cap U_j} = s_j|_{U_i \cap U_j}

and get a unique s \in F U that restricts to all these guys:

s|_{U_i} = s_i

Note this definition secretly has two clauses: I’m saying that in this situation s exists and is unique. If we have uniqueness but not necessarily existence, we say our presheaf is a separated presheaf.
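For the presheaf of all functions, the gluing step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: open sets are modelled as finite `frozenset`s, functions as dictionaries, and the names `agree_on_overlaps` and `glue` are made up for this sketch.

```python
def agree_on_overlaps(family):
    """family maps each open set U_i (a frozenset) to a function s_i (a dict defined on U_i)."""
    for U, s in family.items():
        for V, t in family.items():
            if any(s[p] != t[p] for p in U & V):
                return False
    return True


def glue(family):
    """Return the unique s on the union of the U_i with s|U_i = s_i, or None if overlaps disagree."""
    if not agree_on_overlaps(family):
        return None
    glued = {}
    for s in family.values():
        glued.update(s)  # safe: the s_i agree wherever their domains overlap
    return glued


U1, U2 = frozenset({0, 1}), frozenset({1, 2})
s1 = {0: "a", 1: "b"}
s2 = {1: "b", 2: "c"}

print(glue({U1: s1, U2: s2}))                 # a function defined on {0, 1, 2}
print(glue({U1: s1, U2: {1: "x", 2: "c"}}))   # None: the two functions disagree at the point 1
```

Existence holds here because the glued dictionary is always a legitimate function, and uniqueness holds because a function is determined by its values at points. For a more restrictive class of functions, the glued function might fail to belong to the class, which is where existence can break.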

The point of a sheaf is that you can tell if something is in F U by examining it locally. These examples explain what I mean:

Puzzle. Let X = \mathbb{R} and for each open set U \subseteq \mathbb{R} take F U to be the set of continuous real-valued functions on U. Show that with the usual concept of restriction of functions, F is a presheaf and in fact a sheaf.

Puzzle. Let X = \mathbb{R} and for each open set U \subseteq \mathbb{R} take F U to be the set of bounded continuous real-valued functions on U. Show that with the usual concept of restriction of functions, F is a separated presheaf but not a sheaf.

The problem is that a function can be bounded on each open set in an open cover of U yet not bounded on U. You can tell if a function is continuous by examining it locally, but you can’t tell if it’s bounded!
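Boundedness on the real line can't be brute-forced, but there is a finite analogue that can: on a discrete two-point space, being constant is a property you can't check locally. The Python sketch below is an assumption-laden illustration (the space, the value set and every function name are made up here), and it tests only a single cover rather than all covers. For each compatible family it counts how many elements of F U restrict to it: exactly one each time is what a sheaf needs, at most one is what a separated presheaf needs.

```python
from itertools import product

# Hypothetical example: the discrete space on two points.
X = frozenset({0, 1})
VALUES = ("a", "b")


def all_functions(U):
    """F U for the presheaf of all functions U -> VALUES."""
    pts = sorted(U)
    return [dict(zip(pts, vals)) for vals in product(VALUES, repeat=len(pts))]


def constant_functions(U):
    """F U for the presheaf of constant functions U -> VALUES."""
    return [s for s in all_functions(U) if len(set(s.values())) <= 1]


def restrict(s, V):
    return {p: v for p, v in s.items() if p in V}


def gluings(F, U, cover):
    """For each compatible family (s_i in F U_i), count the s in F U with s|U_i = s_i."""
    counts = []
    for family in product(*(F(Ui) for Ui in cover)):
        compatible = all(
            restrict(si, Ui & Uj) == restrict(sj, Ui & Uj)
            for Ui, si in zip(cover, family)
            for Uj, sj in zip(cover, family)
        )
        if compatible:
            counts.append(sum(
                all(restrict(s, Ui) == si for Ui, si in zip(cover, family))
                for s in F(U)
            ))
    return counts


cover = [frozenset({0}), frozenset({1})]
print(gluings(all_functions, X, cover))       # [1, 1, 1, 1]: always exactly one gluing
print(gluings(constant_functions, X, cover))  # [1, 0, 0, 1]: never two, sometimes none
```

The second list shows uniqueness without existence: the family consisting of the constant "a" on one point and the constant "b" on the other is compatible, since the overlap is empty, but no constant function on the whole space restricts to it.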

So, in a sense that should gradually become clear, sheaves are about ‘local truth’.

The category of sheaves on a space

There’s a category of presheaves on any topological space X. Since a presheaf on X is a functor

F \colon \mathcal{O}(X)^{\mathrm{op}} \to \mathsf{Set},

a morphism between presheaves is a natural transformation between such functors.
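Spelled out, a natural transformation gives a map F U \to G U for each open set U, and these maps commute with restriction. Here is a small Python sketch of that condition, with everything in it assumed for illustration: a two-point space with three open sets, F U the functions from U to {0, 1, 2, 3}, G U the functions from U to {0, 1}, and the morphism given by reducing values mod 2.

```python
from itertools import product

# Hypothetical two-point space with open sets {}, {0}, {0, 1}.
X = frozenset({0, 1})
OPENS = [frozenset(), frozenset({0}), X]


def functions(U, values):
    pts = sorted(U)
    return [dict(zip(pts, vals)) for vals in product(values, repeat=len(pts))]


def F(U):
    return functions(U, (0, 1, 2, 3))


def G(U):
    return functions(U, (0, 1))


def restrict(s, V):
    return {p: v for p, v in s.items() if p in V}


def eta(s):
    """Component F U -> G U of the morphism: reduce each value mod 2."""
    return {p: v % 2 for p, v in s.items()}


def is_natural():
    """Check eta(s|V) == eta(s)|V for all open V contained in U and all s in F U."""
    for U in OPENS:
        for V in OPENS:
            if not V <= U:
                continue
            for s in F(U):
                if eta(s) not in G(U):
                    return False
                if eta(restrict(s, V)) != restrict(eta(s), V):
                    return False
    return True


print(is_natural())
```

Any map of value sets gives a morphism this way, because composing with it after restricting is the same as restricting after composing with it.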

Remember, if \mathsf{C} and \mathsf{D} are categories, we use \mathsf{C}^{\mathsf{D}} to stand for the category where the objects are functors from \mathsf{D} to \mathsf{C}, and the morphisms are natural transformations. This is called a functor category.

So, a category of presheaves is just an example of a functor category, and the category of presheaves on X is called

\mathsf{Set}^{\mathcal{O}(X)^{\mathrm{op}}}

But this name is rather ungainly, so we make an abbreviation

\widehat{\mathsf{C}} = \mathsf{Set}^{\mathsf{C}^{\mathrm{op}}}

Then the category of presheaves on X is called

\widehat{\mathcal{O}(X)}

Sheaves are subtler, but we define morphisms of sheaves the exact same way. Every sheaf has an underlying presheaf, so we define a morphism between sheaves to be a morphism between their underlying presheaves. This gives the category of sheaves on X, which we call \mathsf{Sh}(X).

By how we’ve set things up, \mathsf{Sh}(X) is a full subcategory of \widehat{\mathcal{O}(X)}.

Now, what Grothendieck realized is that \mathsf{Sh}(X) acts a whole lot like the category of sets. For example, in the category of sets we can define ‘commutative rings’, but we can copy the definition in \mathsf{Sh}(X) and get ‘sheaves of commutative rings’, and so on. The point is that we’re copying ordinary math, but doing it locally, in a topological space.

Elementary topoi

Lawvere and Tierney summarized what was going on here by inventing the concept of ‘elementary topos’. I’ll throw the definition at you now and explain all the pieces in future classes:

Definition. An elementary topos, or topos for short, is a category with finite limits and colimits, exponentials and a subobject classifier.

I hope you know limits and colimits, since that’s the kind of basic category theory definition I’m assuming. Given two objects x and y in a category, their exponential is an object x^y that acts like the object of all maps from y to x. I’ll give the actual definition later. A subobject classifier is, roughly, an object \Omega that generalizes the usual set of truth values

2 = \{0,1\}

Namely, subobjects of any object x are in one-to-one correspondence with morphisms from x to \Omega, which serve as ‘characteristic functions’. Again, this is just a sketch: I’ll give the actual definition later, or you can look it up and read it now.
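In the category of sets this correspondence is the familiar one between subsets and characteristic functions, and for a finite set it can be checked by brute force. The Python sketch below uses a made-up three-element set and made-up function names; it says nothing about how \Omega looks in other topoi.

```python
from itertools import product

X = ("a", "b", "c")  # a hypothetical finite set
OMEGA = (0, 1)       # the usual set of truth values


def all_subsets(X):
    return [frozenset(x for x, keep in zip(X, bits) if keep)
            for bits in product((False, True), repeat=len(X))]


def all_maps_to_omega(X):
    return [dict(zip(X, vals)) for vals in product(OMEGA, repeat=len(X))]


def characteristic(S):
    """The characteristic function X -> OMEGA of a subset S of X."""
    return {x: 1 if x in S else 0 for x in X}


def classified_subset(chi):
    """The subset of X classified by a map chi: X -> OMEGA."""
    return frozenset(x for x, v in chi.items() if v == 1)


# The two constructions are inverse to each other.
print(all(classified_subset(characteristic(S)) == S for S in all_subsets(X)))
print(all(characteristic(classified_subset(chi)) == chi for chi in all_maps_to_omega(X)))
```

Both lists have 2^3 = 8 elements, as the bijection requires.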

The point is that an elementary topos has enough bells and whistles that we can ‘do mathematics inside it’. It’s like an alternative universe, a variant of our usual category of sets and functions, where mathematicians can live. But beware: in general, the kind of mathematics we do in an elementary topos is finitistic mathematics using intuitionistic logic.

You see, the category of finite sets is an elementary topos, so you can’t expect to have ‘infinite objects’ like the set of natural numbers in an elementary topos—unless you decree that you want them (which people often do).

Also, we will see that while 2 = \{0,1\} is a Boolean algebra, the subobject classifier of an elementary topos need only be a ‘Heyting algebra’: a generalization of a Boolean algebra in which the law of excluded middle can fail. This is actually not weird: it’s connected to the fact that a category of sheaves lets us reason ‘locally’. For example, we don’t just care if two functions are equal or not, we care if they’re equal or not in each open set. So we need a subtler form of logic than classical Boolean logic.
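A standard example of a Heyting algebra is the poset of open sets of a topological space, where the negation of an open set U is the interior of its complement. The Python sketch below works this out for the Sierpinski space, a two-point space with three open sets. The choice of space and the function names are assumptions made for illustration, and the precise link between open sets and the subobject classifier of \mathsf{Sh}(X) is left for later.

```python
# Sierpinski space: points {0, 1}, with open sets {}, {1}, {0, 1}.
X = frozenset({0, 1})
OPENS = [frozenset(), frozenset({1}), X]


def interior(S):
    """Largest open set contained in S: the union of all open sets inside S."""
    return frozenset().union(*[O for O in OPENS if O <= S])


def neg(U):
    """Heyting negation of an open set: the interior of its complement."""
    return interior(X - U)


def implies(U, V):
    """Heyting implication: the largest open W with W & U contained in V."""
    return interior((X - U) | V)


U = frozenset({1})
print(neg(U))            # frozenset(): the complement {0} has empty interior
print((U | neg(U)) == X) # False: 'U or not U' is not the whole space
print(neg(neg(U)) == U)  # False: double negation gives all of X, not U
```

So excluded middle and double negation elimination both fail for this open set, even though the space has only two points.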

There’s a lot more to say, and I’m just sketching out the territory now, but one of the first big theorems we’re aiming for is this:

Theorem. For any topological space X, \mathsf{Sh}(X) is an elementary topos.

Grothendieck topoi

You’ll notice that sheaves on X were defined starting with the poset \mathcal{O}(X) of open sets of X. In fact, to define them we never used anything about X except this poset! This suggests that we could define sheaves more generally starting from any poset.

And that’s true—but Grothendieck went further: he defined sheaves starting from any category, as long as that category was equipped with some extra structure saying when a bunch of morphisms f_i \colon x_i \to x serve to ‘cover’ the object x. This extra data is called a ‘coverage’ or more often (rather confusingly) a ‘Grothendieck topology’. A category equipped with a Grothendieck topology is called a ‘site’.

So, Grothendieck figured out how to talk about the category of sheaves \mathsf{Sh}(\mathsf{C}) on any site \mathsf{C}. He did this before Lawvere and Tierney came along, and this was his definition of a topos. So, nowadays we say a category of sheaves on a site is a Grothendieck topos. However:

Theorem. Any Grothendieck topos is an elementary topos.

So, Lawvere and Tierney’s approach subsumes Grothendieck’s, in a sense. Not every elementary topos is a Grothendieck topos, though! For example, the category of finite sets is an elementary topos but not a Grothendieck topos. (Any Grothendieck topos has, not just finite limits and colimits, but arbitrary small limits and colimits.) So both concepts of topos are important and still used. But when I say just ‘topos’, I’ll mean ‘elementary topos’.

Why did Grothendieck bother to generalize the concept of sheaves from sheaves on a topological space to sheaves on a site? He wasn’t just doing it for fun: it was a crucial step in his attempt to prove the Weil Conjectures!

Basically, when you’re dealing with spaces that algebraic geometers like—say, algebraic varieties—there aren’t enough open sets to do everything we want, so we need to use covering spaces as a generalization of open covers. So, instead of defining sheaves using the poset of open subsets of our space X, Grothendieck needed to use the category of covering spaces of X.

That’s the rough idea, anyway.

Geometric morphisms

As you probably know if you’re reading this, category theory is all about the morphisms. This is true not just within a category, but between them. The point of topos theory is not just to study one topos, but many. We don’t want merely to do mathematics in alternative universes: we want to be able to translate mathematics from one alternative universe to another!

So, what are the morphisms between topoi?

First, if you have a continuous map f \colon X \to Y between topological spaces, you can take the ‘direct image’ of a presheaf on X to get a presheaf on Y. Here’s how this works.

The inverse image of any open set is open, so we get an inverse image map

f^{-1} \colon \mathcal{O}(Y) \to \mathcal{O}(X)

sending each open set V \subseteq Y to the open set

f^{-1} V = \{x \in X :\; f(x) \in V \} \subseteq X

Given a presheaf F on X, we define its direct image to be the presheaf on Y given by

(f_\ast F)(V) = F(f^{-1} V)

Note the double reversal here: f maps points in X to points in Y, but open sets in Y give open sets in X, and then presheaves on X give presheaves on Y.
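The formula (f_\ast F)(V) = F(f^{-1} V) is easy to try out on finite spaces. In the Python sketch below, the spaces, their topologies, the map `f` and all function names are hypothetical choices for illustration; F is the presheaf of all functions into a two-element set.

```python
from itertools import product

# Hypothetical finite spaces and a map between them.
X = frozenset({0, 1, 2})
X_OPENS = [frozenset(), frozenset({0}), frozenset({0, 1}), X]
Y = frozenset({"a", "b"})
Y_OPENS = [frozenset(), frozenset({"a"}), Y]
f = {0: "a", 1: "a", 2: "b"}
VALUES = (0, 1)


def preimage(V):
    """f^{-1} V: the points of X that f sends into V."""
    return frozenset(x for x in X if f[x] in V)


def is_continuous():
    """f is continuous when the preimage of every open set is open."""
    return all(preimage(V) in X_OPENS for V in Y_OPENS)


def F(U):
    """A presheaf on X: all functions U -> VALUES, as dicts."""
    pts = sorted(U)
    return [dict(zip(pts, vals)) for vals in product(VALUES, repeat=len(pts))]


def restrict(s, U):
    return {p: v for p, v in s.items() if p in U}


def pushforward(V):
    """(f_* F)(V) = F(f^{-1} V)."""
    return F(preimage(V))


def pushforward_restrict(s, V_small):
    """Restrict an element of (f_* F)(V) to an open V_small inside V, using restriction in F."""
    return restrict(s, preimage(V_small))


A = frozenset({"a"})
print(is_continuous())
print(len(pushforward(A)))  # 4: functions on f^{-1}{a} = {0, 1} with two possible values
print(all(pushforward_restrict(s, A) in pushforward(A) for s in pushforward(Y)))
```

Restriction in f_\ast F is just restriction in F along f^{-1} V' \subseteq f^{-1} V, which is why continuity of f is all that's needed for the definition to make sense.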

Of course we need to check that it works:

Puzzle. Show that f_\ast F is a presheaf. That is, explain how we can restrict an element of (f_\ast F)(V) to any open set contained in V, and check that we get a presheaf this way.

In fact it works very nicely:

Puzzle. Show that taking direct images gives a functor from the category of presheaves on X to the category of presheaves on Y.

Puzzle. Show that if F is a sheaf on X, its direct image f_\ast F is a sheaf on Y.

The upshot of all this is that a continuous map between topological spaces

f \colon X \to Y

gives a functor between sheaf categories

f_\ast \colon \mathsf{Sh}(X) \to \mathsf{Sh}(Y)

And this functor turns out to be very nice! This is another big theorem we aim to prove later:

Theorem. If f \colon X \to Y is a continuous map between topological spaces, the functor

f_\ast \colon \mathsf{Sh}(X) \to \mathsf{Sh}(Y)

has a left adjoint

f^\ast \colon \mathsf{Sh}(Y) \to \mathsf{Sh}(X)

that preserves finite limits.

This left adjoint is called the inverse image map. Note that because f_\ast has a left adjoint, it is a right adjoint, so it preserves limits. Because f^\ast is a left adjoint, it preserves colimits. The fact that f^\ast preserves finite limits is extra gravy on top of an already nice situation!

We bundle all this niceness into a definition:

Definition. A functor f_\ast \colon \mathsf{T} \to \mathsf{T'} between topoi is a geometric morphism if it has a left adjoint that preserves finite limits.

And this is the most important kind of morphism between topoi. It’s not a very obvious definition, but it’s extracted straight from what happens in examples.

To wrap up, I should add that people usually call the pair consisting of f_\ast \colon \mathsf{T} \to \mathsf{T'} and its left adjoint f^\ast \colon \mathsf{T'} \to \mathsf{T} a geometric morphism. A functor has at most one left adjoint, up to natural isomorphism, so my definition is at least tolerable. But I’ll probably switch to the standard one when we get serious about geometric morphisms.

And we will eventually see that geometric morphisms let us translate mathematics from one alternative universe to another!

Conclusion

If this seemed like too much too soon, fear not, I’ll go over it again and actually define a lot of the concepts I merely sketched, like ‘exponentials’, ‘subobject classifier’, ‘Heyting algebra’, ‘Grothendieck topology’, and ‘Grothendieck topos’. I just wanted to get a lot of the main concepts on the table quickly. You should do the puzzles to see if you understand what I wanted you to understand. Unless I made a mistake, all of these are straightforward definition-pushing if you’re comfortable with some basic category theory.

For more background on topos theory I highly recommend this:

• Colin McLarty, The uses and abuses of the history of topos theory.

The view that toposes originated as generalized set theory is a figment of set theoretically educated common sense. This false history obstructs understanding of category theory and especially of categorical foundations for mathematics. Problems in geometry, topology, and related algebra led to categories and toposes. Elementary toposes arose when Lawvere’s interest in the foundations of physics and Tierney’s in the foundations of topology led both to study Grothendieck’s foundations for algebraic geometry. I end with remarks on a categorical view of the history of set theory, including a false history plausible from that point of view that would make it helpful to introduce toposes as a generalization from set theory.