Disorder and determinism

composition.al 2014-06-07

My research on LVars is all about deterministic parallel programming: writing programs so that they can be scheduled onto parallel hardware with the goal of running faster, and doing so in such a way that the observable results of a program are guaranteed to be the same on every run.

But, as much time as I spend insisting that LVars enforce determinism, there’s actually a huge amount of nondeterminism in LVar programs! For instance, the order in which writes to an LVar occur isn’t deterministic at all. Writes can happen in any order, and, in fact, that’s the whole point.

So, then, it seems that the purpose of LVars is not to rule out all nondeterminism. Rather, it’s to create an abstraction barrier through which no nondeterminism can leak, such that what’s observable from the outside of that barrier is deterministic. But the funny thing about putting such a barrier in place is that, once the barrier is there, programs can go absolutely hog-wild with nondeterminism on the other side of it, without anyone else caring or even knowing!

I like to say that all programs are nondeterministic to a greater or lesser degree, because determinism and nondeterminism are a matter of what one’s allowed to observe. If I write a “deterministic” program, will the running time be exactly the same on every run? Will the processor always generate the same amount of heat when it runs? No, not likely — it’s just that, usually, those things don’t count as “observable results” of the program. Rather than just saying that a program is deterministic or isn’t, then, we should define determinism with respect to a notion of what is observable. Our abstraction barrier neatly captures that notion of observability: nondeterminism on the outside of it counts, and nondeterminism on the inside doesn’t.

Disorderly programming

The folks who work on the Bloom programming language say that programming in Bloom is “disorderly” programming, a slogan that could apply to programming with LVars as well.

Both LVars and Bloom bring order-theoretic concepts to bear on the problems they’re trying to solve. At first glance, “disorderly” might seem like a strange word to use to describe a system based on order theory. I find that an analogy to compilers is helpful here. The role of a data dependence analysis in a compiler is to determine which program statements depend on other ones happening first, so that the compiler can arrange them in an order that makes sense. In other words, it creates ordering constraints.

But, as Allen & Kennedy tell us in the first chapter of their book, the job of a parallelizing compiler is actually to take away the ordering constraints that the programmer implicitly introduced by writing sequential code!

In order to do so, the compiler must do a data dependence analysis to figure out which of those constraints really are essential — but the compiler is only doing this so that it can throw away the rest.

Likewise, LVars and Bloom, and other systems like them, use order-theoretic concepts for enforcing a minimal set of ordering constraints, so that they can safely ignore all the rest.