Monthly Archives: October 2009

Hadoop MapReduce is a batch-processing system.  Why?  Because that’s the way Google described their MapReduce implementation.

But it doesn’t have to be that way. Introducing HOP: the Hadoop Online Prototype [updated link to final NSDI ’10 version]. With modest changes to the structure of Hadoop, we were able to convert it from a batch-processing system to an interactive, online system that can provide features like “early returns” from big jobs, and continuous data stream processing, while preserving the simple MapReduce programming and fault tolerance models popularized by Google and Hadoop.  And by the way, it exposes pipeline parallelism that can even make batch jobs finish faster.  This is a project led by Tyson Condie, in collaboration with folks at Berkeley and Yahoo! Research.
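
To make the "early returns" idea concrete, here is a toy, single-process sketch in Python. It is not HOP code and does not use any Hadoop API; the function names and the `snapshot_every` parameter are made up for illustration. The only point it shows is that when map output flows to the reducer as it is produced, rather than after the whole map phase finishes, the reducer can publish running snapshots of the answer while the job is still in progress.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def pipelined_word_count(
    lines: Iterable[str], snapshot_every: int
) -> Iterator[Tuple[int, dict]]:
    """Feed map output to the reducer as it is produced, yielding snapshots.

    Hypothetical helper, not part of HOP or Hadoop. Yields
    (lines_consumed, counts_so_far) every `snapshot_every` lines, plus a
    final snapshot if the input did not end exactly on a snapshot boundary.
    """
    counts: Counter = Counter()
    consumed = 0
    for line in lines:
        for word, one in map_words(line):
            counts[word] += one  # reduce step applied incrementally
        consumed += 1
        if consumed % snapshot_every == 0:
            yield consumed, dict(counts)  # early return: partial answer
    if consumed % snapshot_every != 0:
        yield consumed, dict(counts)  # final answer over all input


if __name__ == "__main__":
    data = ["a b a", "b c", "a c c"]
    for consumed, snapshot in pipelined_word_count(data, snapshot_every=2):
        print(consumed, snapshot)
```

A batch version would return only the last snapshot. Because `lines` can be any iterable, the same loop also works over an unbounded stream, which is roughly the continuous-processing case, though a real system has to handle distribution and fault tolerance that this sketch ignores.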

Thanks to Boon Thau Loo and Stefan Saroiu for a very interesting workshop on Networking Meets Databases (NetDB), and especially for inviting a high-octane panel to debate the success and directions of Declarative Networking.

The panel members included:

  • Fred Baker, Cisco
  • Joe Hellerstein, Berkeley
  • Eddie Kohler, UCLA and Meraki
  • Arvind Krishnamurthy, U Washington
  • Petros Maniatis, Intel Research
  • Timothy Roscoe, ETH Zurich

Butler Lampson made numerous comments from the audience, and given his insight and stature was viewed by most as something of an additional panelist.

I was happy to see a very vigorous debate!  Lots of interesting points made, no punches pulled.  My slides are posted here, and include an ad hoc manifesto for how to move forward.

Agreement Protocol

Headline: We now have a robust declarative implementation of MultiPaxos with leader election, which is radically simpler than most existing implementations.  It’s compact, surprisingly readable (as Paxos implementations go!) and live.  It forms a key part of our Boom Analytics implementation of a high-availability Hadoop File System.

Maybe more interesting are the lessons we learned about how distributed protocols and declarative languages go together, and the design patterns that emerged.  We’re using this to ground the design of our new language, code-named Lincoln.  A paper on the topic is being presented this Wednesday at NetDB 2009, after SOSP.