
Category Archives: cloud

The recent July 2011 issue of Communications of the ACM includes our article on the technical aspects of the search for Jim Gray’s boat Tenacious.  This was a hard article to write, for both technical and personal reasons. It took far too long to finish, so at some point it was time to just pack it in (at which point the CACM folks informed us it had to be cut in length by half, which delayed things further.  The longer version is up as a Berkeley tech report.)

Meanwhile, some of the experience is even more relevant to current technology trends than it was 4 years ago, so hopefully folks interested in social computing, software engineering, image processing, crisis response, and other related areas will find something of use in there.

For those of you whose work is represented (or underrepresented) by the article, my apologies for its shortcomings.  I still don’t have the full picture of what happened; nobody does, really.  As a result, I decided not to use volunteers’ personal names in general, so as not to attribute credit unevenly. I know the result seems oddly impersonal.  Setting the tone of the article was as hard as capturing the content.

Meanwhile, I encourage you to add corrections and perspective to the article in the comment box at the end of the CACM link above. Comments are welcome here too, but they might not get as well-viewed or -archived.

Today was a big day in the BOOM group: we launched the alpha version of Bud: Bloom Under Development. If you’re new to this blog, Bloom is our new programming language for cloud computing and other distributed systems settings. Bud is the first fully-functional release of Bloom, implemented as a DSL in Ruby.

I’ve written a lot about Bloom in research papers and on the new Bloom website, and I have lots to say about distributed programming that I won’t recap. Instead, I want to focus here on the tangible: working code. If you’re looking for something serious, check out the walkthrough of the bfs distributed filesystem, a GFS clone. But to get the flavor, consider the following two lines of code, which implement what you might consider to be “hello, world” for distributed systems: a chat server.

nodelist <= connect.payloads
mcast <~ (mcast * nodelist).pairs { |m,n| [n.key, m.val] }

That’s it.

The first line says “if you get a message on a channel called ‘connect’, remember the payload in a table called ‘nodelist’”. The second says “if you get a message on the ‘mcast’ channel, then forward its contents to each address stored in ‘nodelist’”. That’s all that’s needed for a bare-bones chat server.  Nice, right?
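
For readers who don't have Bud installed, here is a rough plain-Ruby simulation of what those two rules do. It is only a sketch: the class name `ChatServerSketch`, the methods `on_connect` and `on_mcast`, and the addresses are all made up for illustration, there is no networking, and none of this is Bud's actual API.

```ruby
require 'set'

# Plain-Ruby simulation of the two chat-server rules.
# Hypothetical names throughout; this is not Bud code and sends nothing over a network.
class ChatServerSketch
  attr_reader :nodelist

  def initialize
    @nodelist = Set.new # stands in for the 'nodelist' table
  end

  # Rule 1: remember the payload of every 'connect' message.
  def on_connect(address)
    @nodelist << address
  end

  # Rule 2: pair an incoming 'mcast' message with every stored address
  # and return the [address, text] messages that would be forwarded.
  def on_mcast(text)
    @nodelist.map { |address| [address, text] }
  end
end

server = ChatServerSketch.new
server.on_connect("alice:5001")
server.on_connect("bob:5002")
outbound = server.on_mcast("hello")
```

Here `outbound` holds one `[address, "hello"]` pair per connected client, which is the join-and-forward behavior the second rule describes.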

Read More »

In today’s episode of the Twilight Zone, a young William Shatner stumbles into a time machine and travels back into the past. Cornered in a dark alley, he is threatened by a teenage hooligan waving a loaded pistol. A tussle ensues, and in trying to wrest the gun from his assailant, Shatner fires, killing him dead. Examining the contents of the dead youth’s wallet, Bill comes to a shocking conclusion: he has just killed his own grandfather. Tight focus: Shatner howling soundlessly as he stares at his own hand flickering in and out of view.

Shatner? Or Not(Shatner)? Having now changed history, he could not have been born, meaning he could not have traveled back in time and changed history, meaning he was indeed born, meaning…?

You see where this goes.  It’s the old grandfather paradox, a hoary chestnut of SciFi and AI.  Personally I side with Captain Kirk: I don’t like mysteries. They give me a bellyache. But whether or not you think a discussion of “p if Not(p)” is news that’s fit to print, it is something to avoid in your software.  This is particularly tricky in distributed programming, where multiple machines have different clock settings, and those clocks may even turn backward on occasion. The theory of Distributed Systems is built on the notion of Causality, which enables programmers and programs to avoid doing unusual things like executing instructions in orders that could not have been specified by the program that generated them. Causality is established by distributed clock protocols. These protocols are often used to enforce causal orderings–i.e. to make machines wait for messages. And waiting for messages, as we know, is bad.

So I’m here to tell you today that Causality is overrated, and we can often skip the wait. To hell with distributed clocks: time travel can be fine. In many cases it’s even fine to change history. Here’s the thing: Causality is Required Only to control Non-monotonicity. I call this the CRON principle.
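
To make the distinction concrete, here is a small Ruby sketch. It assumes a simplified reading of "monotonic" (state that only ever grows) and uses made-up helper names, `accumulate` and `absent_after?`. It delivers the same three messages in every possible order and compares a monotonic computation with a non-monotonic one.

```ruby
require 'set'

# Three hypothetical messages that may arrive in any order.
messages = [[:add, "a"], [:add, "b"], [:add, "c"]]

# Monotonic: the set of items seen so far only ever grows.
def accumulate(msgs)
  msgs.each_with_object(Set.new) { |(_, item), seen| seen << item }
end

# Non-monotonic: a test for absence, asked after only n messages have arrived.
# A later message can retract a "yes, it is absent" answer.
def absent_after?(msgs, item, n)
  !accumulate(msgs.first(n)).include?(item)
end

orders = messages.permutation.to_a

# Final accumulated state for each delivery order, as a sorted array.
final_states = orders.map { |order| accumulate(order).sort }.uniq

# Answer to "is c absent after two messages?" for each delivery order.
absent_answers = orders.map { |order| absent_after?(order, "c", 2) }.uniq
```

Every delivery order reaches the same final state, so `final_states` has a single entry and no ordering protocol was needed. The absence test gives different answers depending on arrival order, so `absent_answers` has two entries; that is the kind of spot where waiting for messages still earns its keep.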

Read More »

12/16/2010: final version of CALM/Bloom paper for CIDR now posted

Conventional Wisdom:
In large distributed systems, perfect data consistency is too expensive to guarantee in general. “Eventually consistent” approaches are often a better choice, since temporary inconsistencies work out in most cases. Consistency mechanisms (transactions, quorums, etc.) should be reserved for infrequent, small-scale, mission-critical tasks.

Most computer systems designers agree on this at some level (once you get past the NoSQL vs. ACID sloganeering). But like lots of well-intentioned design maxims, it’s not so easy to translate into practice — all kinds of unavoidable tactical questions pop up:

Questions:

  • Exactly where in my multifaceted system is eventual consistency “good enough”?
  • How do I know that my “mission-critical” software isn’t tainted by my “best effort” components?
  • How do I maintain my design maxim as software evolves? For example, how can the junior programmer in year n of a project reason about whether their piece of the code maintains the system’s overall consistency requirements?

If you think you have answers to those questions, I’d love to hear them. And then I’ll raise the stakes, because I have a better challenge for you: can you write down your answers in an algorithm?

Challenge:
Write a program checker that will either “bless” your code’s inconsistency as provably acceptable, or identify the locations of unacceptable consistency bugs.

The CALM Conjecture is my initial answer to that challenge.
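
As a toy picture of what such a checker might look like, here is a Ruby sketch. It rests on a loud assumption: that a checker can work by flagging rules that use non-monotonic operations. The `Rule` struct, the operator names, the file locations, and the `check` function are all hypothetical, and nothing here is the analysis from the paper.

```ruby
# Assumed list of operations treated as non-monotonic (illustrative only).
NON_MONOTONIC_OPS = %i[negation aggregation deletion].freeze

# Hypothetical rule representation: where the rule lives and which operators it uses.
Rule = Struct.new(:location, :ops)

# Either "bless" the program or report the locations that need a closer look.
def check(rules)
  suspects = rules.select { |rule| (rule.ops & NON_MONOTONIC_OPS).any? }
  { blessed: suspects.empty?, suspect_locations: suspects.map(&:location) }
end

rules = [
  Rule.new("chat.rb:3", %i[join projection]),
  Rule.new("cart.rb:9", %i[join aggregation])
]
report = check(rules)
```

With these made-up rules, `report[:blessed]` is false and `report[:suspect_locations]` names only `cart.rb:9`, the one rule that uses an aggregation.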

Read More »

I don’t usually post about business deals on my blog. But today’s acquisition of Greenplum by EMC is too close to home not to comment. I’ve been involved as a technical advisor at Greenplum for almost three years, and joined the EMC technical advisory board this spring — so I have some interest in the deal.
Below is my take on things from the technical side. Note that I’m not privy to any private information about the deal, and I’m generally more interested in the tech than the finance. No need to try and read financial tea leaves here — there aren’t any. This is a computer scientist’s view of the technology implications.  Here goes:

Bright and early next Monday morning I’m giving the keynote talk at PODS, the annual database theory conference.  The topic: (a) to summarize seven years of experience using logic to build distributed systems and network protocols (including P2, DSN, and recent BOOM work), and (b) to set out some ideas about the foundations of distributed and parallel programming that fell out from that experience.

I posted the paper underlying the talk, called The Declarative Imperative: Experiences and Conjectures in Distributed Logic. It’s written for database theoreticians, and in a spirit of academic fun it’s maybe a little over the top.  But I’m hopeful that the main ideas can clarify how we think about the practice of building distributed systems, and the languages we design for that purpose.  The talk will be streamed live and archived (along with keynotes from the SIGMOD and SOCC conferences later in the week.)

Below the break is a preview of the big ideas.  I’ll post about them at more length over the next few weeks, hopefully in more practical/approachable terms than I’m using for PODS.

Read More »

We were happy to find out this week that our BOOM project and Bloom language have been selected by Technology Review magazine as one of the TR10, their “annual list of the emerging technologies that will have the biggest impact on our world.” This was news to us: we knew they were going to run an article, but weren’t aware of the TR10 distinction. Pretty neat.

I’ve been getting a lot of questions since the article launched about the project and language. So while folks are paying attention, here’s a quick FAQ to answer what the project is all about and its status.

Read More »


Saw a fun talk at the Eurosys conference today on Otherworld, a facility that allows applications to recover after an OS crash. It included a demo of an editor running over Linux running over a virtual machine. A fault is injected into Linux, which crashes and reboots, and the editor is restored to its previous state. Nifty. The authors point out how nice this would be for a very stateful app, say MySQL running on in-memory files. (Fast and easy, right? Though clearly not a good idea relative to a real main memory db with smart logging, e.g. TimesTen, in terms of performance or reliability.)

But this got me thinking about the datacenter software stack we’re starting to take for granted, and I no longer see why we need an operating system at all.

Read More »

It’s been about 6 years now that we’ve been working on declarative programming for distributed systems — starting with routing protocols, then network overlays, query optimizers, sensor network stacks, and more recently scalable analytics and consensus protocols.

Through that time, we’ve struggled to find a useful middle ground between the pure logic roots of classical declarative languages like Datalog, and the practical needs of real systems managing state across networks. Our compromises over the years allowed us to move forward, build real things, and learn many lessons. But they also led to some semantic confusion — as noted in papers by colleagues at Max Planck and AT&T.

Well, no more. We recently released a tech report on Dedalus, a formal logic language that can serve as a clean foundation for declarative programming going forward.  The Dedalus work is fairly theoretical, but having tackled it we’re in a strong position to define an approachable and appealing language that will let programmers get their work done in distributed environments. That’s the goal of our Bloom language.

The key insight in Dedalus is roughly this:

Time is essential; space is a detail.

Read More »

It’s official: the name of the programming language for the BOOM project is:  Lincoln Bloom.

I didn’t intend to post about Bloom until it was cooked, but two things happened this week that changed my plans.  The first was the completion of a tech report on Dedalus, our new logic language that forms the foundation of Bloom.  The second was more of a surprise: Technology Review decided to run an article on our work, and Bloom was the natural way to talk about it.

More soon on our initial Dedalus results.
