Planar point sets with forbidden four-point patterns and few distinct distances
What's new 2024-09-04
I’ve just uploaded to the arXiv my paper “Planar point sets with forbidden -point patterns and few distinct distance“. This (very) short paper was a byproduct of my recent explorations of the Erdös problem website in recent months, with a vague emerging plan to locate a suitable problem that might be suitable for some combination of a crowdsourced “Polymath” style project and/or a test case for emerging AI tools. The question below was one potential candidate; however, upon reviewing the literature on the problem, I noticed that the existing techniques only needed one additional tweak to fully resolve the problem. So I ended up writing this note instead to close off the problem.
I’ve arranged this post so that this additional trick is postponed to below the fold, so that the reader can, if desired, try to guess for themselves what the final missing ingredient needed to solve the problem was. Here is the problem (Erdös problem #135), which was asked multiple times by Erdös over more than two decades (and who even offered a small prize for the solution on one of these occasions):
Problem 1 (Erdös #135) Letbe a set of
points such that any four points in the set determine at least five distinct distances. Must
determine
many distances?
This is a cousin of the significantly more famous Erdös distinct distances problem (Erdös problem #89), which asks what is the minimum number of distances determined by a set of
points in the plane, without the restriction on four-point configurations. The example of a square grid
(assuming for sake of argument that
is a perfect square), together with some standard analytic number theory calculations, shows that
can determine
distances, and it is conjectured that this is best possible up to constants. A celebrated result of Guth and Katz, discussed in this previous blog post, shows that
will determine at least
distances. Note that the lower bound
here is far larger, and in fact comparable to the total number
of distances available, thus expressing the belief that the “local” condition that every four points determine at least five distances forces the global collection distances to be almost completely distinct. In fact, in one of the papers posing the problem, Erdös made the even stronger conjecture that the set
must contain a subset
of cardinality
for which all the
distances generated by
are distinct.
A paper of Dumitrescu came close to resolving this problem. Firstly, the number of ways in which four points could fail to determine five distinct distances was classified in that paper, with the four-point configurations necessarily being one of the following eight patterns:
-
: An equilateral triangle plus an arbitrary vertex.
-
: A parallelogram.
-
: An isosceles trapezoid (four points on a line,
, where
, form a degenerate isosceles trapezoid).
-
: A star with three edges of the same length.
-
: A path with three edges of the same length.
-
: A kite.
-
: An isosceles triangle plus an edge incident to a base endpoint, and whose length equals the length of the base.
-
: An isosceles triangle plus an edge incident to the apex, and whose length equals the length of the base.
Given that the grid determine only
distances, one could seek a counterexample to this by finding a set of
points in the grid
that avoided all of the eight patterns
.
Dumitrescu then counted how often each of the patterns occured inside the grid
. The answer is:
-
does not occur at all. (This is related to the irrationality of
.)
-
occurs
times.
-
occurs
times.
-
occurs
times.
-
occurs
times.
-
occurs
times.
-
occurs
times.
-
occurs
times.
Using this and a standard probabilistic argument, Dumitrescu then established the following “near miss” to a negative answer to the above problem:
Theorem 2 (First near miss) Ifis sufficiently large, then there exists a subset of
of cardinality
which avoids all of the patterms
.
In particular, this generates a set of points with
distances that avoids seven out of the eight required forbidden patterns; it is only the parallelograms
that are not avoided, and are the only remaining obstacle to a negative answer to the problem.
Proof: Let be a small constant, and let
be a random subset of
, formed by placing each element of
with an independent probability of
. A standard application of Hoeffding’s inequality (or even the second moment method) shows that this set
will have cardinality
with high probability if
is large enough. On the other hand, each of the
patterns
has a probability
of lying inside
, so by linearity of expectation, the total number of such patterns inside
is
on the average. In particular, by Markov’s inequality, we can find a set
of cardinality
with only
such patterns. Deleting all of these patterns from
, we obtain a set
of cardinality
, which is
if
is a sufficiently small constant. This establishes the claim.
Unfortunately, this random set contains far too many parallelograms (
such parallelograms, in fact) for this deletion argument to work. On the other hand, in earlier work of Thiele and of Dumitrescu, a separate construction of a set of
points in
that avoids all of the parallelograms
was given:
Theorem 3 (Second near miss) Forlarge, there exists a subset
of
of cardinality
which contains no parallelograms
. Furthermore, this set is in general position: no three points in
are collinear, and no four are concyclic. As a consequence, this set
in fact avoids the three patterns
(the pattern in
is concyclic, and the pattern
does not occur at all in the grid).
Proof: One uses an explicit algebraic construction, going back to an old paper of Erdös and Turán involving constructions of Sidon sets. Namely, one considers the set
whereGiven that we have one “near-miss” in the literature that avoids , and another “near-miss” that avoids
, it is natural to try to combine these two constructions to obtain a set that avoids all eight patterns
. This inspired the following problem of Dumitrescu (see Problem 2 of this paper):
Problem 4 Does the setin (1) contain a subset of cardinality
that avoids all eight of the patterns
?
Unfortunately, this problem looked difficult, as the number-theoretic task of counting the patterns in
looked quite daunting.
This ends the survey of the prior literature on this problem. Can you guess the missing ingredient needed to resolve the problem? I will place the answer below the fold.
The missing ingredient is to randomize the parabola appearing in (1). The crucial property of being free of parallelograms is preserved under affine transformations of the finite field plane , so we apply a random invertible affine transformation to the parabola to create the candidate set
Once one has this calculation, the deletion argument finishes the job. Indeed, the expected number of patterns in
is
. If we refine further by an additional factor of
as in the proof of Theorem 2, we obtain (with high probability) a set of cardinality
that contains
forbidden patterns. Deleting these, we have finally obtained a set of cardinality
in the grid
(and thus generating
distances) that avoid all of the eight patterns
, and thus give a negative answer to the original problem of Erdös.
There are still open problems in the area. One is the following: what is the best lower bound on the number of distances determined by points in the plane, for which every four points determine at least five distances (i.e., one avoids all eight patterns
)? The Guth-Katz theorem gives the lower bound of
, but in this case we can get the better bound of
by the following trivial argument: if
is one point in the set, then the other
points determine at least
distinct distances to
, because the same distance cannot occur three times as this would create a star
. Can one do better than this? Specifically, can one achieve a super-linear bound? This was posed as Problem 3 in Dumitrescu’s paper. I do not know how to make progress on this question, other than a vague suspicion that the polynomial method might be relevant here, and that one should somehow try to capture many of the points in a set that only has a linear number of distances in a reasonably low degree algebraic curve.