The Space of Physical Frameworks (Part 4)

Azimuth 2024-09-16

In Part 1, I explained my hopes that classical statistical mechanics reduces to thermodynamics in the limit where Boltzmann’s constant k approaches zero. In Part 2, I explained exactly what I mean by ‘thermodynamics’. I also showed how, in this framework, a quantity called ‘negative free entropy’ arises as the Legendre transform of entropy.

In Part 3, I showed how a Legendre transform can arise as a limit of something like a Laplace transform.

Today I’ll put all the puzzle pieces together. I’ll explain exactly what I mean by ‘classical statistical mechanics’, and how negative free entropy is defined in this framework. Its definition involves a Laplace transform. Finally, using the result from Part 3, I’ll show that as k \to 0, negative free entropy in classical statistical mechanics approaches the negative free entropy we’ve already seen in thermodynamics!

Thermodynamics versus statistical mechanics

In a certain important approach to thermodynamics, called classical thermodynamics, we only study relations between the ‘macroscopic observables’ of a system. These are the things you can measure at human-sized distance scales, like the energy, temperature, volume and pressure of a canister of gas. We don’t think about individual atoms and molecules! We say the values of all the macroscopic observables specify the system’s macrostate. So when I formalized thermodynamics using ‘thermostatic systems’ in Part 2, the ‘space of states’ X was really a space of macrostates. Real-valued functions on X were macroscopic observables.

I focused on the simple case where the macrostate is completely characterized by a single macroscopic observable called its energy E \in [0,\infty). In this case the space of macrostates is X = [0,\infty). If we can understand this case, we can generalize later.

In classical statistical mechanics we go further and consider the set \Omega of microstates of a system. The microstate specifies all the microscopic details of a system! For example, if our system is a canister of helium, a microstate specifies the position and momentum of each atom. Thus, the space of microstates is typically a high-dimensional manifold — where by ‘high’ I mean something like 10^{23}. On the other hand, the space of macrostates is often low-dimensional — where by ‘low’ I mean something between 1 and 10.

To connect thermodynamics to classical statistical mechanics, we need to connect macrostates to microstates. The relation is that each macrostate is a probability distribution of microstates: a probability distribution that maximizes entropy subject to constraints on the expected values of macroscopic observables.

To see in detail how this works, let’s focus on the simple case where our only macroscopic observable is energy.

Classical statistical mechanical systems

Definition. A classical statistical mechanical system is a measure space (\Omega,\mu) equipped with a measurable function

H \colon \Omega \to [0,\infty)

We call \Omega the set of microstates, call H the Hamiltonian, and call H(x) the energy of the microstate x \in \Omega.

It gets tiring to say ‘classical statistical mechanical system’, so I’ll abbreviate this as classical stat mech system.

When we macroscopically measure the energy of a classical stat mech system to be E, what’s really going on is that the system is in a probability distribution of microstates for which the expected value of energy is E. A probability distribution is defined to be a measurable function

p \colon \Omega \to [0,\infty)

with

\displaystyle{ \int_\Omega p(x) \, d\mu(x) = 1 }

The expected energy in this probability distribution is defined to be

\displaystyle{ \langle H \rangle = \int_\Omega H(x) \, p(x) \, d \mu(x) }

So what I’m saying is that p must have

\langle H \rangle = E

But lots of probability distributions have \langle H \rangle = E. Which one is the physically correct one? It’s the one that maximizes the Gibbs entropy:

\displaystyle{  S = - k \int_\Omega p(x) \, \ln p(x) \, d\mu(x) }

Here k is a unit of entropy called Boltzmann’s constant. Its value doesn’t affect which probability distribution maximizes the entropy! But it will affect other things to come.
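
If you like numerical sanity checks, here's a quick one for that last claim. It's purely my own toy, not part of the argument: a made-up system with four microstates and energies picked out of thin air. We maximize the Gibbs entropy subject to the constraints for two very different values of k and get the same distribution both times:

import numpy as np
from scipy.optimize import minimize

# A made-up system: four microstates with energies of my choosing,
# constrained to expected energy E = 1.2.
H = np.array([0.0, 1.0, 2.0, 3.0])
E = 1.2

def entropy_maximizer(k):
    # Minimize minus the Gibbs entropy, S = -k * sum(p ln p).
    neg_entropy = lambda p: k * np.sum(p * np.log(p))
    constraints = [
        {'type': 'eq', 'fun': lambda p: np.sum(p) - 1},  # p is a probability distribution
        {'type': 'eq', 'fun': lambda p: p @ H - E},      # expected energy is E
    ]
    result = minimize(neg_entropy, x0=np.full(4, 0.25),
                      bounds=[(1e-10, 1)] * 4, constraints=constraints,
                      options={'ftol': 1e-12, 'maxiter': 1000})
    return result.x

print(entropy_maximizer(1.0))   # the same distribution both times:
print(entropy_maximizer(50.0))  # k only rescales the entropy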

Now, there may not exist a probability distribution p that maximizes S subject to the constraint \langle H \rangle = E, but there often is — and when there is, we can compute what it is. If you haven’t seen this computation, you can find it in my book What is Entropy? starting on page 24. The answer is the Boltzmann distribution:

\displaystyle{  p(x) = \frac{e^{-C H(x)/k}}{\int_\Omega e^{-C H(y)/k} \, d \mu(y)} }

Here C is a number called the inverse temperature. We have to cleverly choose its value to ensure \langle H \rangle = E. That might not even be possible. But if we get that to happen, p will be the probability distribution we seek.
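
Here's a sketch of how that choice of C can be done numerically, again for a made-up system: \Omega = [0,\infty) with Lebesgue measure and H(x) = x. For this Hamiltonian a short calculation gives \langle H \rangle = k/C, so the exact answer is C = k/E, and we can check that root-finding recovers it:

import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

k, E = 0.5, 2.0    # made-up values
H = lambda x: x    # toy Hamiltonian on Omega = [0, oo) with Lebesgue measure

def expected_energy(C):
    Z, _ = quad(lambda x: np.exp(-C * H(x) / k), 0, np.inf)
    mean, _ = quad(lambda x: H(x) * np.exp(-C * H(x) / k) / Z, 0, np.inf)
    return mean

# Choose C so that <H> = E; the exact answer here is C = k/E = 0.25.
C = brentq(lambda c: expected_energy(c) - E, 1e-3, 10.0)
print(C, expected_energy(C))   # approximately 0.25 and 2.0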

The normalizing factor in the Boltzmann distribution is called the partition function:

Z_k(C) = \int_\Omega e^{-C H(x)/k} \, d\mu(x)

and it turns out to be important in its own right. The integral may not converge, but when it doesn’t, we’ll just say it equals +\infty, so we get

Z_k \colon [0,\infty) \to [0,\infty]

One reason the partition function is important is that

- k \ln Z_k(C)  = C \langle H \rangle - S

where \langle H \rangle and S are computed using the Boltzmann distribution for the given value of C. For a proof, see pages 67–71 of my book, though beware that I use different notation there. The quantity above is called the negative free entropy of our classical stat mech system. In my book I focus on a closely related quantity called the ‘free energy’, which is the negative free entropy divided by C. Also, I talk about \beta = 1/kT instead of the inverse temperature C = 1/T.

Let’s call the negative free entropy \Psi_k(C), so

\displaystyle{ \Psi_k(C) = -k \ln Z_k(C) = - k \ln  \int_\Omega e^{-C H(x)/k} \, d\mu(x) }
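
For the same made-up system as before (\Omega = [0,\infty), H(x) = x) the partition function is Z_k(C) = k/C exactly, so \Psi_k(C) = -k \ln (k/C), and a two-line check confirms it:

import numpy as np
from scipy.integrate import quad

k, C = 0.5, 1.3   # made-up values
Z, _ = quad(lambda x: np.exp(-C * x / k), 0, np.inf)  # partition function; exactly k/C here
print(-k * np.log(Z), -k * np.log(k / C))             # the two values of Psi_k(C) agree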

I’ve already discussed negative free entropy in Part 2, but that was for thermostatic systems, and it was defined using a Legendre transform. This new version of negative free entropy applies to classical stat mech systems, and we’ll see it’s defined using a Laplace transform. But they’re related: we’ll see the limit of the new one as k \to 0 is the old one!

The limit as k \to 0

To compute the limit of the negative free entropy \Psi_k(C) as k \to 0 it will help to introduce some additional concepts.

First, given a classical stat mech system with measure space (\Omega, \mu) and Hamiltonian H \colon \Omega \to [0,\infty), let

\nu(E) = \mu(\{x \in \Omega \; \vert \; H(x) \le E \})

be the measure of the set of microstates with energy \le E. This is an increasing function of E \in \mathbb{R} which is right-continuous, so it defines a Lebesgue–Stieltjes measure \nu on the real line. Yes, I taught real analysis for decades and always wondered when I’d actually use this concept in my own work: today is the day!

The reason I care about this measure \nu is that it lets us rewrite the partition function as an integral over the nonnegative real numbers:

\displaystyle{ Z_k(C) = \int_0^\infty e^{-C E/k} \, d\nu(E) }

Very often the measure \nu is absolutely continuous, which means that

d\nu(E) = g(E) \, d E

for some locally integrable function g \colon \mathbb{R} \to \mathbb{R}. I will assume this from now on. We thus have

\displaystyle{ Z_k(C) = \int_0^\infty e^{-C E/k} \, g(E) \, d E }

Physicists call g the density of states because if we integrate it over some interval [E, E + \Delta E] we get ‘the number of states’ in that energy range. At least that’s what physicists say. What we actually get is the measure of the set

\{x \in \Omega: \; E \le H(x) \le E + \Delta E \}
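
To make \nu and g concrete, take the classic example of a single harmonic oscillator (the code below is just my sketch of it): \Omega = \mathbb{R}^2 with Lebesgue measure and H(q,p) = (q^2 + p^2)/2. The set of microstates with H \le E is a disk of radius \sqrt{2E}, so \nu(E) = 2\pi E and the density of states is the constant g(E) = 2\pi. Integrating over microstates and integrating over energies then give the same partition function:

import numpy as np
from scipy.integrate import quad, dblquad

k, C = 1.0, 0.7   # made-up values

# Partition function as an integral over microstates (q, p) in R^2:
Z_micro, _ = dblquad(lambda p, q: np.exp(-C * (q**2 + p**2) / (2 * k)),
                     -np.inf, np.inf, -np.inf, np.inf)

# The same integral over energies, using the density of states g(E) = 2*pi:
Z_energy, _ = quad(lambda E: np.exp(-C * E / k) * 2 * np.pi, 0, np.inf)

print(Z_micro, Z_energy, 2 * np.pi * k / C)   # all three agree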

Before moving on, a word about dimensional analysis. I’m doing physics, so my quantities have dimensions. In particular, E and d E have units of energy, while the measure d\nu(E) is dimensionless, so the density of states g(E) has units of energy^{-1}.

This matters because right now I want to take the logarithm of g(E), yet the rules of dimensional analysis include a stern finger-wagging prohibition against taking the logarithm of a quantity unless it’s dimensionless. There are legitimate ways to bend these rules, but I won’t. Instead I’ll follow most physicists and introduce a constant with dimensions of energy, w, called the energy width. It’s wise to think of this as an arbitrary small unit of energy. Using this we can make all the calculations to come obey the rules of dimensional analysis. If you find that ridiculous, you can mentally set w equal to 1.

With that said, now let’s introduce the so-called microcanonical entropy, often called the Boltzmann entropy:

S_{\mathrm{micro}}(E) = k \ln (w g(E))

Here we are taking Boltzmann’s old idea of entropy as k times the logarithm of the number of states and applying it to the density of states. This allows us to define an entropy of our system at a specific fixed energy E. Physicists call the set of microstates with energy exactly equal to some number E the microcanonical ensemble, and they say the microcanonical entropy is the entropy of the microcanonical ensemble. This is a bit odd, because the set of microstates with energy exactly E typically has measure zero. But it’s a useful way of thinking.

In terms of the microcanonical entropy, we have

\displaystyle{ g(E) = \frac{1}{w} e^{S_{\mathrm{micro}}(E)/k} }

Combining this with our earlier formula

\displaystyle{ Z_k(C) = \int_0^\infty e^{-C E/k} g(E) \, d E }

we get this formula for the partition function:

\displaystyle{ Z_k(C) = \int_0^\infty e^{-(C E - S_{\mathrm{micro}}(E))/k} \, \frac{d E}{w} }

Now things are getting interesting!

First, the quantity C E - S_{\mathrm{micro}}(E) should remind you of the formula we saw in Part 2 for the negative free entropy of a thermostatic system. Remember, that formula was

\Psi(C) = \inf_E (C E - S(E))

Second, we instantly get a beautiful formula for the negative free entropy of a classical stat mech system:

\displaystyle{  \Psi_k(C) = - k \ln Z_k(C) = - k \ln  \int_0^\infty e^{-(C E - S_{\mathrm{micro}}(E))/k} \, \frac{d E}{w} }

Using this we can show the following cool fact:

Main Result. Suppose S_{\mathrm{micro}} \colon [0,\infty) \to \mathbb{R} is a concave function with continuous second derivative. Suppose that for some C > 0 the quantity C E - S_{\mathrm{micro}}(E) has a unique minimum as a function of E, and S''_{\mathrm{micro}} < 0 at that minimum. Then

\displaystyle{ \lim_{k \to 0}  \Psi_k(C) \quad = \quad \inf_E \left(C E - S_{\mathrm{micro}}(E)\right) }

The quantity at right deserves to be called the microcanonical negative free entropy. So, when the hypotheses hold,

As k \to 0, the free entropy of a classical statistical mechanical system approaches its microcanonical free entropy!

Here I’ve left off the word ‘negative’ twice, which is harmless: the limit holds for the negative free entropies if and only if it holds for their negatives, the free entropies. But this sentence still sounds like a mouthful. Don’t feel bad if you find it confusing. But it could be the result we need to see how classical statistical mechanics approaches classical thermodynamics as k \to 0. So I plan to study this result further, and hope to explain it much better!
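
Before the proof, here's a numerical illustration with a microcanonical entropy I made up just for testing: S_{\mathrm{micro}}(E) = 2\sqrt{E}, which is concave with negative second derivative. Taking C = 1, the quantity C E - S_{\mathrm{micro}}(E) has its minimum at E = 1/C^2, where it equals -1/C = -1. Setting w = 1 and computing \Psi_k(C) for shrinking k:

import numpy as np
from scipy.integrate import quad

C = 1.0
f = lambda E: C * E - 2.0 * np.sqrt(E)   # C E - S_micro(E) with S_micro(E) = 2 sqrt(E)
f_min = -1.0 / C                         # minimum value, attained at E = 1/C**2

for k in [1.0, 0.1, 0.01, 0.001]:
    # Shift the exponent by f_min so the integrand peaks at 1 (numerical stability);
    # beyond E = 100 the integrand is utterly negligible.
    I, _ = quad(lambda E: np.exp(-(f(E) - f_min) / k), 0, 100, points=[1.0 / C**2])
    print(k, f_min - k * np.log(I))   # Psi_k(C), approaching f_min = -1.0

You should see the printed values creep toward -1 at a rate of roughly k \ln k, which is what the Laplace-method heuristic from Part 3 predicts.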

But today I’ll just prove the main result and quit. I figure it’s good to get the math done before talking more about what it means.

Proof of the main result

Suppose all the hypotheses of the main result hold. Spelling out the definition of the negative free entropy \Psi_k(C), what we need to show is

\displaystyle{ \lim_{k \to 0} - k \ln  \int_0^\infty e^{-(C E - S_{\mathrm{micro}}(E))/k} \, \frac{d E}{w} \; = \; \inf_E \left(C E - S_{\mathrm{micro}}(E)\right) }

You’ll notice that the left hand side involves the energy width w. In fact it involves the energy width twice: once in a visible way, and once in a concealed way, since S_{\mathrm{micro}}(E) = k \ln (w g(E)). These two occurrences of w cancel out, so that the left hand side is independent of w. You can either check this directly, or note that the negative free entropy is -k \ln Z_k(C) and the partition function Z_k was originally defined in a way that didn’t involve w.

So, we are allowed to let w be any positive number we want, and from now on I’ll take w = 1.

Next, we need a theorem from Part 3. My argument for that theorem was not a full mathematical proof — I explained the hole I still need to fill — so I cautiously called it an ‘almost proved theorem’. Here it is:

Almost Proved Theorem. Suppose that f \colon [0,\infty) \to \mathbb{R} is a concave function with continuous second derivative. Suppose that for some s > 0 the function s x - f(x) has a unique minimum at x_0, and f''(x_0) < 0. Then

\displaystyle{ \lim_{\beta \to +\infty} -\frac{1}{\beta} \ln \int_0^\infty e^{-\beta (s x - f(x))} \, d x  \; = \; \inf_x \left( s x - f(x)\right) }

Now let’s use this to prove our main result! To do this, take

s = C, \quad x = E, \quad f(x) = S_{\mathrm{micro}}(E), \quad \beta = 1/k

Then we get

\displaystyle{ \lim_{k \to 0} - k \ln \int_0^\infty e^{-(C E - S_{\mathrm{micro}}(E))/k} \, d E \; = \; \inf_E \left(C E - S_{\mathrm{micro}}(E) \right) }

and this is exactly what we want… at least in the case w = 1. But we’ve already seen that case suffices, since \Psi_k(C) doesn’t depend on w.       ∎