Leaps and Bounds: Practice Meets Theory

Gödel’s Lost Letter and P=NP 2019-08-20

Solving the runtime selection problem


Brendan Lucier and Csaba Szepesvári were consecutive speakers at this week’s workshop at the Toyota Technological Institute in Chicago on “Automated Algorithm Design.”

Today I will discuss their talks, which I enjoyed greatly at the workshop.

The workshop’s general theme was the use of machine-learning techniques to improve the tuning of algorithms. Indeed, the goal is to have the algorithm tune itself by selecting strategies from a collection of possible ones. Besides better performance with less human work, this promises better logical reliability than hand-tuning and debugging programs.

A common component of this work used in both talks is an interesting oracle model. Since oracles are familiar to many of us theorists, this will give us an avenue into the talks and the two recent papers they were based on.

Runtime Oracles

The oracle has a nonnegative matrix {R} of shape {n \times M}. Think of the entry {R(i,j)} as the runtime of the {i^{th}} program on the {j^{th}} input. You know {n} and {M}, but must ask the oracle for information about {R(i,j)}. You may ask the oracle a question {(i,j,t)}:

Is the entry {R(i,j)} less than or equal to {t}?

Here {t} is a time bound. Further they assume that the oracle charges you

\displaystyle  \min(R(i,j),t).

Thus an algorithm is charged in total

\displaystyle  \sum_{(i,j,t)} \min(R(i,j),t),

where the sum is over all questions {(i,j,t)} you asked. The rationale is that the oracle can run a program {i} on a task {j} and stop after at most {t} steps. Thus returning the minimum is realistic. Since their research is motivated by practical problems, this is a useful model for studying questions about program performance.
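To make the charging rule concrete, here is a minimal Python sketch of such an oracle. The class name and interface are my own choices for illustration, but the behavior—yes/no answers and a charge of min(R(i,j), t) per question—is exactly the model above.

import numpy as np

class RuntimeOracle:
    """Answers "is R(i,j) <= t?" and charges min(R(i,j), t) per question."""

    def __init__(self, R):
        self.R = np.asarray(R, dtype=float)   # the hidden n x M runtime matrix
        self.total_charge = 0.0               # accumulated oracle charges

    def ask(self, i, j, t):
        # Charge as if program i were run on input j for at most t steps.
        self.total_charge += min(self.R[i, j], t)
        return bool(self.R[i, j] <= t)

# Example: R(0,1) = 5, so asking with t = 2 answers "no" and charges min(5, 2) = 2.
oracle = RuntimeOracle([[1, 5], [3, 1]])
print(oracle.ask(0, 1, 2))       # False
print(oracle.total_charge)       # 2.0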

In passing, I believe the reason this oracle model is new is that theorists do not often think about running arbitrary programs. Well, those in recursion theory did, but those designing classic algorithms do not.

A Selection Problem

This model is used to study a selection problem. Assume {R} is as above. The task is to find the row with the smallest row sum {s_{i}}:

\displaystyle  s_{i} = \sum_{j=1}^{M} R(i,j).

That is, to find the program that takes the least total time on the inputs, and hence the least average time.
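If the matrix could be read directly, the target is just the argmin of the row sums. Here is a toy illustration of the quantity {s_{i}}; it ignores the oracle and its charges entirely.

import numpy as np

R = np.array([[1.0, 4.0, 2.0],    # a toy 2 x 3 runtime matrix
              [3.0, 1.0, 1.0]])
s = R.sum(axis=1)                  # row sums s_i = total time of program i
best = int(np.argmin(s))
print(best, s[best])               # 1 5.0  (program 1 is fastest on average)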

The trouble is that in the worst case this can require examining all the entries. So they introduce approximations that make the problem tractable while keeping it useful in practice:

  1. The returned row’s sum need only be a {(1+\epsilon)} approximation to the smallest row sum.

  2. The entries can be assumed to be capped at a value {\tau}. That is, you can assume that

    \displaystyle  R(i,j) \le \tau.

  3. The governing algorithm can be randomized and need only meet the performance goal with high probability.

Now we can state the problem formally. It is parameterized by {(\epsilon, \tau)}:

Runtime Selection Problem (RSP): Given a matrix {R} with all entries bounded by {\tau}, find a row {i} so that

\displaystyle  s_{i} \le (1 + \epsilon)s_{k},

for all {k}, while minimizing the oracle charges for all questions “is {R(i,j) \leq t}?” that are asked.

The talks gave a variety of randomized algorithms that solve such problems for various parameter assumptions.

Fast and Slow

See their papers for details on the results. Their algorithms are interesting not only from a theory viewpoint but also in practice. Indeed, they not only prove theorems but also give experimental timings.

In order to establish some intuition, let’s look at the following simple case. Assume that all entries {R(i,j)} are either {1} or some {\alpha \gg 1}. This is the “fast-or-slow” running time stipulation. As theorists we are often drawn to binary values, so this might be a good case to look at initially.

The first observation is that we will always ask questions of the form

\displaystyle  (i, j, 1.0001).

This gives us the value of the entry {R(i,j)} for almost unit cost: a “yes” answer means {R(i,j) = 1}, a “no” answer means {R(i,j) = \alpha}, and in the slow case we are only charged {1.0001}. Thus, we have reduced the original problem to a classic oracle problem. There is no complicated oracle cost measure: the cost is essentially just the number of entries we read. Well, okay, the cost is slightly higher, but we can make it as close to that as we wish.
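Here is a sketch of this trick, assuming the RuntimeOracle class sketched earlier; in the fast-or-slow case one cheap question determines the entry.

def reveal_entry(oracle, i, j, alpha, t=1.0001):
    # Fast-or-slow case: a "yes" means R(i,j) = 1, a "no" means R(i,j) = alpha.
    # Either way the charge is min(R(i,j), t) <= 1.0001.
    return 1.0 if oracle.ask(i, j, t) else alpha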

The selection problem then comes down to this:

  • We have an {n \times M} nonnegative matrix {R} whose entries are all either {1} or {\alpha}.

  • We must find a row whose sum is within a factor of {1+\epsilon} of the smallest row sum.

I believe that the analysis of this problem should be generally known. In any event, it seems clear that randomly sampling entries from each row is a good start.
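As an illustration of that start, here is a hedged sketch of the naive sampling baseline: estimate each row sum from {k} uniformly sampled columns, read through the cheap queries above, and return the row with the smallest estimate. This is not the algorithm from the papers; the sample size {k} is a knob one would tune or analyze with standard concentration bounds.

import random

def select_by_sampling(oracle, n, M, alpha, k=50, seed=0):
    """Pick the row with the smallest estimated row sum.

    Uses reveal_entry from the fast-or-slow sketch, so each sample costs
    at most 1.0001 in oracle charges, for roughly n*k total charge.
    """
    rng = random.Random(seed)
    best_row, best_estimate = None, float("inf")
    for i in range(n):
        cols = [rng.randrange(M) for _ in range(k)]   # columns sampled with replacement
        estimate = (M / k) * sum(reveal_entry(oracle, i, j, alpha) for j in cols)
        if estimate < best_estimate:
            best_row, best_estimate = i, estimate
    return best_row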

Open Problems

Are there other natural problems where the runtime cost oracle is useful?
