Bright and early next Monday morning I’m giving the keynote talk at PODS, the annual database theory conference. The topic: (a) to summarize seven years of experience using logic to build distributed systems and network protocols (including P2, DSN, and recent BOOM work), and (b) to set out some ideas about the foundations of distributed and parallel programming that fell out of that experience.
I posted the paper underlying the talk, called The Declarative Imperative: Experiences and Conjectures in Distributed Logic. It’s written for database theoreticians, and in a spirit of academic fun it’s maybe a little over the top. But I’m hopeful that the main ideas can clarify how we think about the practice of building distributed systems, and the languages we design for that purpose. The talk will be streamed live and archived (along with keynotes from the SIGMOD and SOCC conferences later in the week).
Below the break is a preview of the big ideas. I’ll post about them at more length over the next few weeks, hopefully in more practical/approachable terms than I’m using for PODS.
- The CALM Conjecture. CALM stands for “Consistency And Logical Monotonicity”. The conjecture maps out the systems that can be implemented in an eventually consistent fashion without any coordination or waiting across sites. The answer? Systems that can be expressed in monotonic logic. As a somewhat simplified rubric, this includes systems that (a) do append-only updates (no overwriting) and (b) never try to aggregate data across sites via counting, summing, voting, finding a max or min, etc. (This is more restrictive than what CALM intends: these behaviors can in fact be “used” monotonically in special cases. More on that another day.) Interestingly, cross-site joins and recursion are included: these are monotonic operations and can easily be implemented in an eventually consistent, streaming fashion.
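To make the last point concrete, here is a toy sketch (in Python rather than a logic language; the class and names are mine, not from the paper) of a cross-site join evaluated in a streaming, eventually consistent fashion. Because the join's output only ever grows as facts arrive, every delivery order converges to the same final answer:

```python
class StreamingJoin:
    """Toy monotonic streaming join of two append-only relations on a key.

    Facts may arrive from different sites in any order; since results are
    only ever *added* to the output (never retracted), all orders converge
    to the same final set -- the CALM intuition for why joins need no
    coordination."""

    def __init__(self):
        self.left = {}      # key -> set of left-relation payloads seen so far
        self.right = {}     # key -> set of right-relation payloads seen so far
        self.output = set() # join results emitted so far (grows monotonically)

    def insert_left(self, key, val):
        self.left.setdefault(key, set()).add(val)
        for r in self.right.get(key, ()):
            self.output.add((key, val, r))

    def insert_right(self, key, val):
        self.right.setdefault(key, set()).add(val)
        for l in self.left.get(key, ()):
            self.output.add((key, l, val))
```

Feeding the same facts in two different arrival orders yields identical outputs; by contrast, asking for a count of the output midway through would expose a transient, order-dependent answer, which is exactly the kind of non-monotonic step that forces coordination.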
- The CRON Conjecture. CRON stands for “Causality Required Only for Non-Monotonicity”. It’s an extension or corollary of CALM to address classic distributed systems complexities of coordinating clocks across machines (Lamport’s famous foundations and follow-ons). It basically says that you can toss out all of that distributed clocks stuff if your program is monotonic (a corollary of CALM), and also says that you need that stuff if the results you ship between machines are involved in non-monotonic operations—again, aggregation and state overwrites are the standard examples. As a side-note, it also argues that monotonic systems will work correctly even if messages are delivered at an earlier time than they are sent! (If you think this sounds nuts, think about restarting a process but reusing its message logs from earlier runs to save work.)
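A small illustration of the CRON intuition (again a Python sketch of my own, not the paper's formalism): accumulating append-only facts via set union is commutative, associative, and idempotent, so any delivery order converges, even one where a message replayed from an earlier run's log arrives "before" it was logically sent. An overwritten register, by contrast, is fate-bound to causal order:

```python
import itertools

def deliver_monotonic(msgs):
    """Accumulate append-only facts. Set union is commutative, associative,
    and idempotent, so every delivery order -- including replayed or
    duplicated messages -- reaches the same final state."""
    state = set()
    for m in msgs:
        state |= m
    return state

def deliver_overwrite(msgs):
    """Non-monotonic register: last writer wins, so the final state
    depends on the causal order in which messages are delivered."""
    state = None
    for m in msgs:
        state = m
    return state
```

Checking every permutation of a message batch shows one outcome for the monotonic accumulator and multiple outcomes for the overwriting register; the latter is where Lamport-style clock machinery earns its keep.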
- The Fateful Time Conjecture. This one is more cosmic — it tries to explain the underlying purpose of Time in computing (and maybe in general). The basic idea is that Time (meaning both the sequentiality of program steps in a single “thread”, and coordination of steps across threads/machines) is needed for only one purpose: to prevent multiple possible states from co-occurring. I.e. the purpose of time is to seal fate at each instantaneous moment. What happens if you don’t seal fate? One of two things: (a) logically impossible things like “P and Not P” co-occur, or (b) multiple non-deterministic choices among valid possibilities co-occur (so-called “multiple worlds”). I’m claiming that if you are using synchronization in a program — i.e. preventing things from happening in parallel — then you should convince yourself that the parallelism you avoided would have led to those conditions. Otherwise you are Wasting Time. (Did you think through this the last time you wrote two consecutive lines in Java/Python/C? I didn’t think so. Hopefully Bloom will help automate that thinking for you!)
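The “Wasting Time” test can be run mechanically on a toy example (a Python sketch with illustrative names of my choosing): enumerate every interleaving of a set of operations and count the distinct final states. If there is only one, the sequencing you wrote was unnecessary; if there are several, ordering is genuinely sealing fate:

```python
import itertools

def run(schedule):
    """Execute a sequence of operations against a fresh state."""
    state = {}
    for op in schedule:
        op(state)
    return state

def outcomes(ops):
    """All final states reachable under every possible ordering of ops."""
    return {tuple(sorted(run(p).items())) for p in itertools.permutations(ops)}

# Two independent writes: every order agrees, so sequencing them wastes time.
def set_x(s): s['x'] = 1
def set_y(s): s['y'] = 2

# Two conflicting writes to the same cell: without an imposed order,
# "multiple worlds" co-occur -- here, time is doing real work.
def x_one(s): s['x'] = 1
def x_two(s): s['x'] = 2
```

The two consecutive lines in your Java/Python/C program are almost always the first case; the conjecture says the only lines that deserve sequencing are the second.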