Tag Archives: parallelism

chum in the water

One more post on MapReduce and parallel SQL, this time for the folks at O’Reilly Radar.

Just for the record, I think MapReduce is fine, but not especially interesting, technology.  The “teachable moment” it presents is really great stuff, though, because it is bringing people toward data-centric parallel programming.  So it’s good for the data-centric research business in general, and especially for data-centric approaches to parallelism.

I.e. chum in the water for our research on Lincoln…

The first of two invited posts at GigaOm is up.  These are not researchy; they’re intended to be informative to a broad audience.  They describe the state of affairs in data parallelism, and some of the reasons why this is an increasingly hot topic.

This started out as an exercise for Greenplum, a company I advise that sells a massively parallel DBMS based on PostgreSQL.  I’ve been helping them with their recent launch of a MapReduce interface to their system.  That’s been an interesting project. I’ll write about it more soon.

Along the way, they asked if I’d write a blog post for them about parallelism, SQL and MapReduce to put things into perspective.  I sat down to write a few paragraphs on the subject and ended up with a seven-page essay.  That was too long for a blog post, so I just turned it into a Tech Report (a.k.a. a white paper, in industrial terms).  We excerpted it for GigaOm to run in a couple of posts.  The original is more nuanced and playful, but hey, blogging isn’t about seven-page essays.  I’ll try to control myself here too, and stick with a few paragraphs per post.  And if that causes me to write more tech reports, so be it; I’ll link them in.