Skip navigation

It’s been about 6 years now that we’ve been working on declarative programming for distributed systems — starting with routing protocols, then network overlays, query optimizers, sensor network stacks, and more recently scalable analytics and consensus protocols.

Through that time, we’ve struggled to find a useful middle ground between the pure logic roots of classical declarative languages like Datalog, and the practical needs of real systems managing state across networks. Our compromises over the years allowed us to move forward, build real things, and learn many lessons. But they also led to some semantic confusion — as noted in papers by colleagues at Max Planck and AT&T.

Well, no more. We recently released a tech report on Dedalus, a formal logic language that can serve as a clean foundation for declarative programming going forward.  The Dedalus work is fairly theoretical, but having tackled it we’re in a strong position to define an approachable and appealing language that will let programmers get their work done in distributed environments. That’s the goal of our Bloom language.

The key insight in Dedalus is roughly this:

Time is essential; space is a detail.

Said differently, the important aspect of distributed programs is not that they run on physically dispersed hardware, but rather that the programmer needs to think carefully about the time-oriented issues that result –particularly delays and atomicity. By focusing on time as a main consideration in the language, we enable and encourage the programmer to focus on the semantic issues that affect correctness. Issues of spatial layout are set aside where they belong: as performance details that need to be addressed as part of tuning, or managed by an optimizer.

This outlook allowed us to solve the two key semantic problems in our prior work:

  1. modeling persistence and state update: it may surprise some people that Datalog, the favored language of database theory, had no way to model updates and deletes.  By incorporating time into the Dedalus logic, updates and deletes can be captured as deduction of logically timestamped (versioned) data.
  2. modeling delay and failure: using a formal notion of non-determinism in logic, we capture network delay and component failure as non-determinism in the derivation of timestamp.  More simply, we have the programmer distinguish between state that is derived now, next, or asynchronously. Asynch derivations may be delayed arbitrarily — even infinitely, in the case of failure — and the program semantics capture that possibility formally.

Having incorporated these issues into a formal logic, we are able to extend years of theoretical research on Datalog to the case of live, distributed systems that evolve over time.  We are also hopeful that we can use this as a “PODS-PODC” bridge: a linkage between database theory and distributed systems theory.

If you’ve read this far, please read the tech report.  And let us know your thoughts.

About these ads

3 Comments

  1. I noticed your paper references a bunch of Paxos papers, so you may have a use for PaxosLease, a diskless Paxos variant for distributed leases (eg. master leases). See http://scalien.com/whitepapers

  2. Hi, from your tech report I’m having a hard time imagining what building real systems in the wild would look like. Do you have any doc at that level?

    thanks

    • Hi Todd:
      Despite the formalism, Dedalus itself is not that far from Overlog, our earlier language. We’ve built many things in Overlog .. a good thing to look at might be BOOM-FS, our HDFS-like distributed filesystem. There’s a paper on that, and code that you can browse online. It’s all working code, but still a bit tough to read. One goal for Bloom is to be a lot more readable and familiar to the average programmer than Overlog… give us 6 months or so and we should get there!


One Trackback/Pingback

  1. […] into a system called OTP.See Joe Hellerstein's recent blog post about Dedalus at http://databeta.wordpress.com/20….Twitter has released an open source project called Gizzard which attempts to factor out the […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 63 other followers

%d bloggers like this: