Primitive sets and von Mangoldt chains: Erdős Problem #1196 and beyond
What's new 2026-05-04
Boris Alexeev, Kevin Barreto, Yanyang Li, Jared Duker Lichtman, Liam Price, Jibran Iqbal Shah, Quanyu Tang, and I have just uploaded to the arXiv our paper Primitive sets and von Mangoldt chains: Erdős Problem \#1196 and beyond. This paper (which is a work in progress) represents our efforts to digest and document the recent flurry of developments around the following problem of Erdős, Sárközy, and Szemerédi on primitive sets:
Conjecture 1 (Erdős problem #1196) Suppose thatis a primitive set, which means that no element of
divides another. Then
as
.
One can show that the upper bound of is best possible up to the
error by taking
to be the set of products of
primes for some suitable parameter
. This was one of the most well-known open problems in the study of primitive sets, and had attracted some number of partial results (for instance, Lichtman was able to show the upper bound of
). It was thus notable that this problem was first solved by an autonomous AI query (by the fifth author) a few weeks ago. This solution introduced a proof technique – based on Markov chains in the divisibility poset – which in retrospect is very natural for controlling primitive sets, but which had not been explicitly used in previous literature, though in retrospect many of the arguments in that literature involved a specific Markov chain which we call the downwards Mertens chain. The proof instead revolved around a different Markov chain, which we call the downwards von Mangoldt chain, which manages to neatly avoid the “
” type losses in the previous Mertens-based arguments, and resolve Conjecture 1. In this paper we develop the Markov chain approach more systematically, and show that it settles several further conjectures concerning primitive sets, and also provides simpler proofs of some previous results in the literature. More precisely, in addition to Conjecture 1, we establish the following:
Theorem 2 (Erdős primitive set conjecture, #164) For any primitiveconsisting of numbers greater than
,
Theorem 3 (Odd Banks–Martin) Letand suppose
is a primitive set consisting of odd numbers with at most
prime factors. Then
where
denotes the primes appearing as factors of elements of
, and
is the collection of products of
primes from
.
Theorem 4 (is Erdős-strong) If
is a primitive set consisting of even numbers, then
Theorem 5 (Ahlswede–Khachatrian–Sárközy) Ifis a primitive set, then
whenever
.
Theorem 6 (Erdős–Sárközy–Szemerédi, #1217) Letbe such that the upper doubly logarithmic density
is positive. Then there exists a strictly increasing infinite divisibility chain
in
such that
Theorem 2 and Theorem 5 had been previously established by Lichtman and Ahlswede–Khachatrian–Sárközy respectively, but the Markov chain formalism gives shorter (and more unified) proofs of both. Theorems 3, 4, 6 were open conjectures that can now be settled by this method. These results were obtained with varying levels of AI involvement, ranging from completely autonomous AI queries to traditional pen-and-paper calculations, to various hybrid approaches (for instance, with humans suggesting key inequalities that could then be rapidly tested numerically or even proved by various AI tools).
— 1. Chain/antichain duality and Markov chains —
I’ll now discuss the basic method of proof and try to motivate the main ideas, which have become much clearer in retrospect. Primitive sets can be viewed as antichains in the divisibility poset , in which the partial ordering is given by the divisibility relation
. So, one can pose the following more abstract question: given a general poset
and a weight function
, what is the maximal value of
as
ranges over all antichains in
?
One can attack this problem using the well known duality between antichains and chains (totally ordered subsets of
): every antichain and chain can meet in at most one point, thus one has
In fact this duality is completely tight:
Proposition 7 (Chain/antichain duality) Letbe a poset, let
be a weight function, and let
. Then the following are equivalent:
- (i)
for all antichains
.
- (ii) There exists a measure
of total mass at most
on the space of chains, such that (1) holds for all
.
Proof: We have already indicated how (ii) implies (i). Now we need to show that (i) implies (ii). A standard compactness argument allows us to reduce to the case when is finite. If (i) holds, then we also have
Thus, the “universal” problem of obtaining a uniform upper bound on for all antichains
is replaced with the equivalent “existential” dual problem of exhibiting a single measure
on chains of controlled mass, which hits each element
of the poset with a mass of at least the original weight
. Thus, such problems are now reduced to that of finding a sufficiently clever construction of such a measure
. If the mass of
was normalized to equal
, this becomes a probability problem: find a random chain process in the poset
that hits each element
of the poset with a sufficiently high probability. (Though in our paper we found it more convenient for technical reasons to not normalize the measure, and allow the mass to take values other than
.)
It turns out (in a manner that was not explicitly appreciated in past literature) that particularly good choices of random chain to use here can come from Markov chains. (Here, the term “chain” is now being used in two different ways, but fortunately the order-theoretic concept of a chain and the Markov process-theoretic concept of a chain will be quite compatible in this discussion.) There will be two types of Markov chains on the poset that will be relevant: downwards Markov chains and upwards Markov chains. Here is our notation for a downwards Markov chain:
Definition 8 (Downwards Markov chain) Letbe a poset, and suppose we designate some subset
of
to be the “absorbing states” (in practice these will be the minimal elements of
, although they do not have to be). A downwards Markov chain on
with absorbing states
is a collection of transition probabilities
for
obeying the following axioms:
(Thus for instance one has
- (i)
, with equality unless
and
, or if
and
.
- (ii) For any
, one has
.
for any absorbing state
.)
Given such a downwards Markov chain and an initial state , one can generate a random decreasing sequence
by having each
transition to
with probability
after conditioning on the past history
of the chain. This sequence will (almost surely) be strictly decreasing until it hits an absorbing state, in which case it stays there forever, although if the descending chain condition is not satisfied it is also possible for the sequence to be strictly decreasing indefinitely. We let
denote the law of this decreasing sequence. This construction is already enough to recover the Lubell-Yamomoto-Meshalkin (LYM) inequality:
Theorem 9 (LYM inequality) Ifis the power set of
with the inclusion partial ordering, and
is an antichain in this poset (i.e., a Sperner family), then
Proof: We introduce the downwards Markov chain with absorbing state in which each non-empty subset of
of some cardinality
transitions to a
-element subset chosen uniformly at random (i.e., with probability
of transitioning to each). If we start the descending sequence from the maximal element
of
, then one can easily check that each
is hit with probability
. Applying Proposition 7 with
,
, and
, we obtain the claim.
In the above argument we fixed the initial location of the Markov chain, but more generally one can start with any source mass
and work with the measure
for the purposes of applying Proposition 7.
One can also define upwards Markov chains in exact analogy with downwards Markov chains
(reversing the order in the poset), which now generate random increasing sequences
rather than decreasing sequences. There is a useful adjoint construction that can convert a downwards Markov chain into an upwards Markov chain: if we have a positive weight
which is invariant under the chain in the sense that
A downwards or upwards Markov chain, when equipped with an invariant or sub-invariant measure, also induces a flow network on the poset, in which an edge from to
is assigned a flow capacity of
— 2. The Mertens and von Mangoldt chains —
For the purpose of analyzing primitive sets, there are two downward chains on the natural numbers (with absorbing state )that, in retrospect, are particularly natural to use:
- The downwards Mertens chain, in which each
transitions deterministically to
, where
is the largest prime factor of
; and
- the downwards von Mangoldt chain, in which each
transitions to
with probability
for each
dividing
, with
the von Mangoldt function.
The von Mangoldt weight is a natural choice here thanks to the fundamental identity
The Mertens chain generates deterministic downward divisibility chains
The key innovation (which was uncovered by the AI-assisted proofs, though not quite in the notation and framework presented here) is to switch to the von Mangoldt chain, which removes the bias towards numbers whose largest prime factor is small. Indeed, the weight now turns out to be sub-invariant (after removing
) under this chain, and there is a modification
One slight defect of the von Mangoldt chain, as compared to the Mertens chain, is that it can “jump over” primitive sets (such as the set of products of primes) due to the fact that it will sometimes multiply or divide by a power of a prime rather than a prime itself. This turns out to be a technical difficulty for many of our applications, resulting in a need to make various small ad hoc modifications to the von Mangoldt chain to eliminate this type of jump.
In order to establish some crucial sub-invariance properties, it turns out (after some standard manipulations) to be useful to obtain good bounds on the negative log-derivative
— 3. Further directions —
Will Sawin and Ofir Gorodetsky have obtained analogues of several of the above results for function field models or permutation models respectively; we briefly discuss these in the paper, although we do not plan to cover these models in depth. We also note another recent use of this technique to solve a separate Erdős problem (\#858) relating to antichains in a variant of the divisibility poset.
The zeta process that we have discovered hints at an emerging theory of the “developmental anatomy of integers”, which differs from the existing topic of anatomy of integers in that it views a large integer (and its prime factor “organs”) not as a static entity, but rather as an evolving process in which primes (or powers of primes, in the case of the von Mangoldt process) are added or removed to the integer over time. With this perspective, primitive sets can be viewed as singular moments in such a developmental process, which are only encountered at most once in the life cycle of a given integer. It seems of interest to study this developmental perspective further.
The paper is currently a work in progress; we have released an early version due to the public interest in this problem. We plan to explore some further applications, and also to formalize more of the above results in Lean (currently two of the six main theorems are formalized), before submitting the paper for publication. The situation here highlights a distinction I have recently made between three components of the problem solving process in mathematics, namely proof generation, proof verification, and proof digestion. In this particular case, the first two steps were extremely rapid due to modern AI tools; however, properly digesting the AI-generated proofs into a coherent exposition that places the arguments in context with both past literature and future directions remains a slower process that requires expert human attention.