Skip navigation

Photo Credit: Karthick R via Compfight cc

We just finished writing up an overview of our most recent thinking about distributed consistency. The paper is entitled Consistency Without Borders, and it’s going to appear in the ACM SoCC conference next month in Silicon Valley.

It starts with two things we all know:

  1. Strong distributed consistency can be expensive and dangerous. (My favorite exposition: the LADIS ’08 conference writeup by Birman, Chockler and van Renesse. See especially the quotes from James Hamilton and Randy Shoup. And note that recent work like Spanner changes little: throughput of 10’s to 100’s of updates per second is only useful at the fringes.)
  2. Managing coordination in application logic is fraught with software engineering peril: you have to spec, build, test and maintain special-case, cross-stack distributed reasoning over time. Here be dragons.

The point of the paper is to try to reorient the community to explore the design space in between these extremes. Distributed consistency is one of the biggest CS problems of our day, and the technical community is spending way too much of its energy at these two ends of the design space.

We’ll be curious to hear feedback here, and at the conference.

Advertisements

3 Comments

  1. Hi Joe, interesting paper. I hope it generates a lot of discussion. Your examples of garbage collection and deadlock detection did prompt me to wonder–how much analysis have you done on industrial applications? I see a lot of applications that run multi-master on relational DBMS systems, primarily MySQL. It’s kind of surprising how often this works, given that distributed consistency is so hard to solve in general. You can avoid collisions by mastering shards in only one location at a time, or automatically generating off-set keys that bake the site ID in, or dropping constraints. It seems these tricks work because the data access patterns are fairly simple. For instance, manufacturing test record systems are a common distributed app that mostly do inserts as contractors log certifications of equipment prior to shipment. Their behavior is largely monotonic just by the nature of the application.

    • Interesting thoughts Robert. A bunch of our recent work on the Blazes tool referenced in the paper focuses on partitioning (sharding) keys as well. This is an important pattern that programmers and their tools need to think through — though it’s often stymied by derived data such as secondary indexes and materialized views, something we’ve also been looking at. This relates to your comment about constraints: derived data has implicit constraints with the source data. As you know, relaxing constraints puts burden back at the application layer to either restore them over time, or introduce logic to tolerate constraint failures. I like your thinking here — developers should be able to sidestep these constraints at the storage layer, *and* tools should be available to help them manage the resulting complexity at application level.

      Would be fun to follow up with you on some of the applications you’re looking at, and the SW stack above MySQL you see people using.

      • Hi Joe, thanks. I would be glad to follow up and will email you separately. It’s possible we may be able to share information from mechanical scanning of DBMS apps that would help with that analysis. We’re starting to think in terms of tools like rough-and-ready schema scanners to identify and mark out particular constructs like unique non-primary keys that tend to cause trouble with multi-master. (I think we could also get at update patterns as well by scanning the logs.) Understanding the prevalence of these in different types of applications could be quite interesting for all of us.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: