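Given a threshold $y > 1$, a $y$-smooth number (or $y$-friable number) is a natural number $n$ whose prime factors are all at most $y$. We use $\Psi(x,y)$ to denote the number of $y$-smooth numbers up to $x$. In studying the asymptotic behavior of $\Psi(x,y)$, it is customary to write $y$ as $x^{1/u}$ (or $x$ as $y^u$) for some $u > 0$. For small values of $u$, the behavior is straightforward: for instance if $0 < u < 1$, then all numbers up to $x$ are automatically $y$-smooth, so

$$\Psi(x,y) \sim x$$

in this case. If $1 < u < 2$, the only numbers up to $x$ that are not $y$-smooth are the multiples of primes $p$ between $y$ and $x$, so

$$\begin{aligned}
\Psi(x,y) &\sim x - \sum_{y < p \leq x} \left(\frac{x}{p} + O(1)\right) \\
&\sim x - x(\log\log x - \log\log y) \\
&\sim x(1 - \log u)
\end{aligned}$$

where we have employed Mertens’ second theorem. For $2 < u < 3$, there is an additional correction coming from multiples of two primes between $y$ and $x$; a straightforward inclusion-exclusion argument (which we omit here) eventually gives

$$\Psi(x,y) \sim x\left(1 - \log u + \int_2^u \frac{\log(t-1)}{t}\,dt\right)$$

in this case.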
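As a sanity check on the range $1 < u < 2$, one can simply count smooth numbers by brute force. The following Python sketch is not part of the argument above; the function name `count_smooth` and the sample values of $x$ and $y$ are illustrative assumptions. It sieves the largest prime factor of every integer up to $x$ and compares $\Psi(x,y)/x$ with $1 - \log u$.

```python
import math


def count_smooth(x: int, y: int) -> int:
    """Return Psi(x, y): the number of integers 1 <= n <= x with no prime factor above y."""
    # After the sieve, largest_pf[n] is the largest prime factor of n (and 1 for n = 1).
    largest_pf = [1] * (x + 1)
    for p in range(2, x + 1):
        if largest_pf[p] == 1:  # untouched by any smaller prime, so p is prime
            for m in range(p, x + 1, p):
                largest_pf[m] = p  # later (larger) primes overwrite earlier ones
    return sum(1 for n in range(1, x + 1) if largest_pf[n] <= y)


if __name__ == "__main__":
    x, y = 200_000, 2_000  # illustrative values only, giving 1 < u < 2
    u = math.log(x) / math.log(y)
    print(f"u = {u:.3f}")
    print(f"Psi(x, y) / x = {count_smooth(x, y) / x:.4f}")
    print(f"1 - log u     = {1 - math.log(u):.4f}")
```

One should only expect rough agreement at sizes like these: the discarded $O(1)$ terms alone can contribute as much as $O(x/\log x)$, so the relative error decays only logarithmically in $x$.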
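More generally, for any fixed $u > 0$, de Bruijn showed that

$$\Psi(x,y) \sim \rho(u) x$$

where $\rho(u)$ is the Dickman function. This function is a piecewise smooth, decreasing function of $u$, defined by the delay differential equation

$$u \rho'(u) + \rho(u-1) = 0$$

with initial condition $\rho(u) = 1$ for $0 \leq u \leq 1$.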
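The delay differential equation is easy to integrate numerically, because the delay is exactly one unit. Here is a minimal sketch, assuming a uniform grid with `steps_per_unit` points per unit interval and the trapezoid rule (both are choices made for this illustration, as are the function names); it can be checked against the exact value $\rho(2) = 1 - \log 2$ from the previous discussion.

```python
import math


def dickman_table(u_max: float, steps_per_unit: int = 1000) -> list[float]:
    """Tabulate rho on the grid k / steps_per_unit for 0 <= k <= u_max * steps_per_unit."""
    n = steps_per_unit
    h = 1.0 / n
    total = max(int(round(u_max * n)), n)
    rho = [1.0] * (total + 1)  # initial condition: rho = 1 on [0, 1]
    for k in range(n, total):
        # rho'(u) = -rho(u - 1) / u; the delay is exactly n grid steps,
        # so both delayed values below have already been computed.
        g0 = rho[k - n] / (k * h)
        g1 = rho[k + 1 - n] / ((k + 1) * h)
        rho[k + 1] = rho[k] - 0.5 * h * (g0 + g1)  # trapezoid rule
    return rho


def dickman(u: float, steps_per_unit: int = 1000) -> float:
    """Approximate rho(u) for u >= 0 at the nearest grid point."""
    table = dickman_table(u, steps_per_unit)
    return table[int(round(u * steps_per_unit))]


if __name__ == "__main__":
    for u in (1.0, 2.0, 3.0, 5.0, 10.0):
        print(f"rho({u}) ~ {dickman(u):.6e}")
    print(f"1 - log 2 = {1 - math.log(2):.6e}")
```

Aligning the grid with the integers keeps the kinks of $\rho$ at $u = 1, 2, \dots$ on grid points, so the trapezoid rule retains its second-order accuracy.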
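The asymptotic behavior of $\rho(u)$ as $u \to \infty$ is rather complicated. Very roughly speaking, it has inverse factorial behavior; there is a general upper bound $\rho(u) \leq 1/\Gamma(u+1)$, and a crude asymptotic

$$\rho(u) = u^{-u+o(u)} = \exp(-u \log u + o(u \log u)).$$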
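With a more careful analysis one can refine this to

$$\rho(u) = \exp(-u \log u - u \log\log u + u + o(u)); \qquad (1)$$

and with a very careful application of the Laplace inversion formula one can in fact show that

$$\rho(u) \sim \sqrt{\frac{\xi'(u)}{2\pi}} \exp\left(\gamma - u \xi(u) + \int_0^{\xi(u)} \frac{e^s - 1}{s}\,ds\right) \qquad (2)$$

where $\gamma$ is the Euler-Mascheroni constant and $\xi$ is defined implicitly by the equation

$$e^{\xi(u)} - 1 = u \xi(u). \qquad (3)$$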
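One cannot write $\xi(u)$ in closed form using elementary functions, but one can express it in terms of the Lambert $W$ function as $\xi(u) = -W\left(-\frac{1}{u} e^{-1/u}\right) - \frac{1}{u}$ (for $u > 1$ one must take the non-principal real branch $W_{-1}$, since the principal branch only returns the trivial solution $\xi = 0$). This is not a particularly enlightening expression, though. A more productive approach is to work with approximations. It is not hard to get the initial approximation $\xi(u) \approx \log u$ for large $u$, which can then be re-inserted back into (3) to obtain the more accurate approximation

$$\xi(u) = \log u + \log\log u + O(1)$$

and inserted once again to obtain the refinement

$$\xi(u) = \log u + \log\log u + O\left(\frac{\log\log u}{\log u}\right).$$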
We can now see that (2) is consistent with previous asymptotics such as (1), after comparing the integral $\int_0^{\xi(u)} \frac{e^s - 1}{s}\,ds$ to

$$\int_0^{\xi(u)} \frac{e^s - 1}{\xi(u)}\,ds = u - 1.$$

For more details of these results, one can see for instance this survey by Granville.
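The quantities in (2) and (3) are straightforward to evaluate numerically. The sketch below is an illustration under stated assumptions rather than a reference implementation: it solves (3) by Newton's method (my choice of solver, started to the right of the root), obtains $\xi'(u) = \xi(u)/(e^{\xi(u)} - u)$ by differentiating (3) implicitly, and sums the power series $\sum_{k \geq 1} z^k/(k \cdot k!)$ for the integral of $(e^s-1)/s$. All function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def xi(u: float, tol: float = 1e-14) -> float:
    """Solve exp(xi) - 1 = u * xi for the positive root, assuming u > 1."""
    if u <= 1:
        raise ValueError("this sketch only treats u > 1")
    s = 2.0 * math.log(u) + 1.0  # lies to the right of the root, where F > 0
    for _ in range(200):
        # Newton step for the convex function F(s) = exp(s) - 1 - u * s.
        step = (math.exp(s) - 1.0 - u * s) / (math.exp(s) - u)
        s -= step
        if abs(step) <= tol * s:
            break
    return s


def ein(z: float) -> float:
    """Integral of (exp(s) - 1) / s over [0, z] for z >= 0, via its power series."""
    total, term, k = 0.0, 1.0, 0
    while True:
        k += 1
        term *= z / k  # term = z**k / k!
        add = term / k
        total += add
        if add <= 1e-17 * total:
            break
    return total


def rho_asymptotic(u: float) -> float:
    """Evaluate the right-hand side of (2)."""
    s = xi(u)
    xi_prime = s / (math.exp(s) - u)  # from differentiating (3) implicitly
    prefactor = math.sqrt(xi_prime / (2.0 * math.pi))
    return prefactor * math.exp(EULER_GAMMA - u * s + ein(s))


if __name__ == "__main__":
    for u in (2.0, 5.0, 10.0, 100.0):
        first_two_terms = math.log(u) + math.log(math.log(u))
        print(f"u={u}: xi={xi(u):.6f}, log u + log log u={first_two_terms:.6f}, "
              f"rhs of (2)={rho_asymptotic(u):.6e}")
```

Already at $u = 2$ the right-hand side of (2) lands within a few percent of the exact value $\rho(2) = 1 - \log 2$, even though (2) is only claimed in the limit $u \to \infty$.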
This asymptotic (2) is quite complicated, and so one does not expect there to be any simple argument that could recover it without extensive computation. However, it turns out that one can use a “maximum entropy” analysis to get a reasonably good heuristic approximation to (2), that at least reveals the role of the mysterious function $\xi(u)$. The purpose of this blog post is to give this heuristic.
Viewing $x = y^u$, the task is to try to count the number of $y$-smooth numbers of magnitude $x$. We will propose a probabilistic model to generate $y$-smooth numbers as follows: for each prime $p \leq y$, select the prime $p$ with an independent probability $c_p/p$ for some coefficient $c_p$, and then multiply all the selected primes together. This will clearly generate a random $y$-smooth number $n$, and by the law of large numbers, the (log-)magnitude of this number should be approximately

$$\log n \approx \sum_{p \leq y} \frac{c_p}{p} \log p \qquad (4)$$

(where we will be vague about what “$\approx$” means here), so to obtain a number of magnitude about $x$, we should impose the constraint

$$\sum_{p \leq y} \frac{c_p}{p} \log p = \log x. \qquad (5)$$
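The indicator $1_{p \mid n}$ of the event that $p$ divides this number is a Bernoulli random variable with mean $c_p/p$, so the Shannon entropy of this random variable is

$$\mathbf{H}(1_{p \mid n}) = -\frac{c_p}{p} \log \frac{c_p}{p} - \left(1 - \frac{c_p}{p}\right) \log\left(1 - \frac{c_p}{p}\right).$$

If $c_p$ is not too large, then Taylor expansion gives the approximation

$$\mathbf{H}(1_{p \mid n}) \approx \frac{c_p}{p} \log p - \frac{c_p}{p} \log c_p + \frac{c_p}{p}.$$

Because of independence, the total entropy of this random variable $n$ is

$$\mathbf{H}(n) = \sum_{p \leq y} \mathbf{H}(1_{p \mid n});$$

inserting the previous approximation as well as (5), we obtain the heuristic approximation

$$\mathbf{H}(n) \approx \log x - \sum_{p \leq y} \frac{c_p \log c_p - c_p}{p}.$$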
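The asymptotic equipartition property of entropy, relating entropy to microstates, then suggests that the set of numbers $n$ that are typically generated by this random process should have cardinality approximately

$$\exp(\mathbf{H}(n)) \approx x \exp\left(-\sum_{p \leq y} \frac{c_p \log c_p - c_p}{p}\right).$$

Using the principle of maximum entropy, one is now led to the approximation

$$\rho(u) \approx \exp\left(-\sum_{p \leq y} \frac{c_p \log c_p - c_p}{p}\right) \qquad (6)$$

where the weights $c_p$ are chosen to maximize the right-hand side subject to the constraint (5).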
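The only analytic approximation made so far is the Taylor expansion $-(1-q)\log(1-q) \approx q$ with $q = c_p/p$, whose error is of size about $q^2/2$. A quick numerical check, with hypothetical function names and arbitrarily chosen sample values of $c_p$ and $p$, might look like this:

```python
import math


def bernoulli_entropy(q: float) -> float:
    """Shannon entropy (in nats) of a Bernoulli variable with mean 0 < q < 1."""
    return -q * math.log(q) - (1.0 - q) * math.log1p(-q)


def entropy_approx(c: float, p: float) -> float:
    """First-order approximation (c/p) log p - (c/p) log c + c/p from the text."""
    q = c / p
    return q * math.log(p) - q * math.log(c) + q


if __name__ == "__main__":
    # Sample (c_p, p) pairs chosen only for illustration.
    for c, p in ((1.0, 101), (3.0, 10007), (20.0, 10007), (20.0, 101)):
        exact = bernoulli_entropy(c / p)
        approx = entropy_approx(c, p)
        print(f"c={c}, p={p}: exact={exact:.6e}, approx={approx:.6e}, "
              f"difference={exact - approx:.2e}")
```

The last sample pair, where $c_p/p$ is about $0.2$, gives a feel for what “not too large” should mean in practice.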
One could solve this constrained optimization problem directly using Lagrange multipliers, but we simplify things a bit by passing to a continuous limit. We take a continuous ansatz $c_p = f(\log p / \log y)$, where $f \colon [0,1] \to \mathbb{R}^+$ is a smooth function. Using Mertens’ theorem, the constraint (5) then heuristically becomes

$$\int_0^1 f(t)\,dt = u \qquad (7)$$

and the expression (6) simplifies to

$$\rho(u) \approx \exp\left(-\int_0^1 \frac{f(t) \log f(t) - f(t)}{t}\,dt\right). \qquad (8)$$

So the entropy maximization problem has now been reduced to the problem of minimizing the functional

$$\int_0^1 \frac{f(t) \log f(t) - f(t)}{t}\,dt$$

subject to the constraint (7). The astute reader may notice that the integral in (8) might diverge at $t = 0$, but we shall ignore this technicality for the sake of the heuristic arguments.
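The Mertens step used here can be spelled out as follows; this is a heuristic sketch in the same spirit as the rest of the argument, with $F$ denoting an arbitrary test function (a symbol introduced only for this illustration). Mertens’ theorems suggest that sums over primes $p \leq y$ can be replaced by integrals against $ds/\log s$, and the substitution $s = y^t$ then gives

```latex
\begin{align*}
\sum_{p \leq y} \frac{1}{p}\, F\!\left(\frac{\log p}{\log y}\right)
  &\approx \int_1^y F\!\left(\frac{\log s}{\log y}\right) \frac{ds}{s \log s}
   = \int_0^1 F(t)\, \frac{dt}{t}, \\
\sum_{p \leq y} \frac{\log p}{p}\, F\!\left(\frac{\log p}{\log y}\right)
  &\approx \int_1^y F\!\left(\frac{\log s}{\log y}\right) \frac{ds}{s}
   = \log y \int_0^1 F(t)\, dt.
\end{align*}
```

Taking $F = f$ in the second line and dividing (5) by $\log y$ gives (7), since $\log x = u \log y$; taking $F = f \log f - f$ in the first line turns (6) into (8).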
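This is a standard calculus of variations problem. The Euler-Lagrange equation for this problem can be easily worked out to be

$$\frac{\log f(t)}{t} = \lambda$$

for some Lagrange multiplier $\lambda$; in other words, the optimal $f$ should have an exponential form $f(t) = e^{\lambda t}$. The constraint (7) then becomes

$$\frac{e^\lambda - 1}{\lambda} = u$$

and so the Lagrange multiplier $\lambda$ is precisely the mysterious quantity $\xi(u)$ appearing in (2)! The formula (8) can now be evaluated as

$$\begin{aligned}
\rho(u) &\approx \exp\left(-\int_0^1 \frac{e^{\xi(u) t}\, \xi(u) t - e^{\xi(u) t}}{t}\,dt\right) \\
&\approx \exp\left(-e^{\xi(u)} + 1 + \int_0^1 \frac{e^{\xi(u) t} - 1}{t}\,dt + \int_0^1 \frac{1}{t}\,dt\right) \\
&\approx \exp\left(-u \xi(u) + \int_0^{\xi(u)} \frac{e^s - 1}{s}\,ds + C\right)
\end{aligned}$$

where $C$ is the divergent constant

$$C = \int_0^1 \frac{1}{t}\,dt.$$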
This recovers a large fraction of (2)! It is not completely accurate for multiple reasons. One is that the hypothesis of joint independence on the events $p \mid n$ is unrealistic when trying to confine $n$ to a single scale $x$; this comes down ultimately to the subtle differences between the Poisson and Poisson-Dirichlet processes, as discussed in this previous blog post, and is also responsible for the otherwise mysterious $e^\gamma$ factor in Mertens’ third theorem; it also morally explains the presence of the same $e^\gamma$ factor in (2). A related issue is that the law of large numbers (4) is not exact, but admits Gaussian fluctuations as per the central limit theorem; morally, this is the main cause of the $\sqrt{\frac{\xi'(u)}{2\pi}}$ prefactor in (2).
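As a cross-check on the passage to the continuous limit, one can also solve the discrete problem of maximizing (6) subject to (5) directly. Differentiating in $c_p$ with a Lagrange multiplier (called `lam` below, a name chosen for this illustration) gives $\log c_p = \lambda \log p$, that is $c_p = p^\lambda$, which matches the exponential ansatz with $\lambda \log y$ playing the role of $\xi(u)$. The following sketch, with illustrative values of $y$ and $u$ and assuming $u \geq 1$, finds the multiplier by bisection over actual primes and reports the value of $(e^\Lambda - 1)/\Lambda$ for $\Lambda = \lambda \log y$, which the continuous analysis predicts should be close to $u$.

```python
import math


def primes_up_to(y: int) -> list[int]:
    """Sieve of Eratosthenes for y >= 2."""
    is_prime = bytearray([1]) * (y + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(y) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, y + 1, p)))
    return [p for p in range(2, y + 1) if is_prime[p]]


def constraint_sum(lam: float, primes: list[int]) -> float:
    """Left-hand side of (5) with the weights c_p = p**lam."""
    return sum(p ** (lam - 1.0) * math.log(p) for p in primes)


def solve_multiplier(y: int, u: float) -> float:
    """Bisect for lam with sum_{p <= y} p**(lam - 1) * log p = u * log y (u >= 1)."""
    primes = primes_up_to(y)
    target = u * math.log(y)
    lo, hi = 0.0, 1.0
    while constraint_sum(hi, primes) < target:  # the sum increases with lam
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if constraint_sum(mid, primes) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    y, u = 10**5, 4.0  # illustrative values only
    big_lambda = solve_multiplier(y, u) * math.log(y)
    print(f"lam * log y = {big_lambda:.4f}")
    print(f"(e^L - 1) / L = {(math.exp(big_lambda) - 1) / big_lambda:.4f}  (target u = {u})")
```

The agreement is only approximate at this size of $y$, because the replacement of prime sums by integrals carries $O(1)$ errors of the kind appearing in Mertens’ theorems.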
Nevertheless, this demonstrates that the maximum entropy method can achieve a reasonably good heuristic understanding of smooth numbers. In fact we also gain some insight into the “anatomy of integers” of such numbers: the above analysis suggests that a typical $y$-smooth number $n$ will be divisible by a given prime $p \leq y$ with probability about $e^{\xi(u) \log p / \log y}/p$. Thus, for $p = y^{1 - O(1/\log u)}$, the probability of being divisible by $p$ is elevated by a factor of about $e^{\xi(u)} \asymp u \log u$ over the baseline probability $1/p$ of an arbitrary (non-smooth) number being divisible by $p$; so (by Mertens’ theorem) a typical $y$-smooth number is actually largely comprised of something like $\asymp u$ prime factors all of size about $y^{1 - O(1/\log u)}$, with the smaller primes contributing a lower order factor. This is in marked contrast with the anatomy of a typical (non-smooth) number $n$, which typically has $O(1)$ prime factors in each hyperdyadic scale $[\exp\exp(k), \exp\exp(k+1)]$ in $[1,n]$, as per Mertens’ theorem.
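To make this concentration near $y$ a little more concrete, one can ask what share of $\log n$ the model attributes to primes $p \geq y^a$ for a given $0 \leq a \leq 1$. With the optimal profile $f(t) = e^{\xi(u) t}$ and the normalization (7), the continuous heuristic suggests a share of $\frac{1}{u} \int_a^1 e^{\xi(u) t}\,dt = \frac{e^{\xi(u)} - e^{a \xi(u)}}{u \xi(u)}$. The sketch below evaluates this; the function names are hypothetical, and $\xi(u)$ is computed by the fixed-point iteration $s \mapsto \log(1 + us)$, a choice made here for brevity that assumes $u > 1$ and converges slowly when $u$ is close to $1$.

```python
import math


def xi(u: float, iterations: int = 200) -> float:
    """Positive solution of exp(s) - 1 = u * s for u > 1, by fixed-point iteration."""
    if u <= 1:
        raise ValueError("this sketch only treats u > 1")
    s = math.log(u) + 1.0
    for _ in range(iterations):
        s = math.log1p(u * s)  # rewrite (3) as s = log(1 + u * s)
    return s


def log_mass_share(u: float, a: float) -> float:
    """Heuristic share of log n carried by primes p >= y**a, for 0 <= a <= 1."""
    s = xi(u)
    return (math.exp(s) - math.exp(a * s)) / (u * s)


if __name__ == "__main__":
    for u in (3.0, 10.0, 100.0):
        shares = ", ".join(f"a={a}: {log_mass_share(u, a):.3f}" for a in (0.25, 0.5, 0.75, 0.9))
        print(f"u={u}: {shares}")
```

For instance, at $u = 10$ the model places well over half of $\log n$ on primes larger than $\sqrt{y}$, in line with the qualitative picture described above.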