I increasingly believe my own story that data-centric programming is the future of parallel computing at the high end. I’m starting to hear it echoed back at me from real people.
I attended the Greenplum customer advisory board meeting this week, including a public briefing in San Francisco for analysts and potential customers. The Greenplum folks asked me to speak at the briefing about parallelism and analytics in the large, outside the scope of Greenplum per se. I cooked up a little slide deck for the occasion on why and whither parallelism and analytics. A familiar story about how the future is parallel, and the practical future is dataflow parallelism. (Familiar yes, but with some nice Flickr clip-art and approachable analogies to explain it.)
The big aha moment occurred for me during our panel discussion, which included Luke Lonergan from Greenplum, Roger Magoulas from O’Reilly, and Brian Dolan from Fox Interactive Media (which runs MySpace among other web properties).
At HPTS 2001 I gave a quick seat-of-the-pants talk called We Lose, which argued that database software and research weren’t targeting the hacker community, and were therefore dooming themselves to irrelevance. This thing, which I cooked up in about 10 minutes, still gets me a bunch of feedback. (The talk included a pitch for an easy-to-use dataflow framework that could harness textual data from files, as part of our original Telegraph work. MapReduce, anyone?)
This issue is decidedly back on the table as different approaches are being explored for Cloud development platforms. So I gave a similar pitch at CIDR this year, to try and get the data-centric experts to work on the most important piece of the Cloud: the programming model. I’m hoping this time some folks other than us will bite.