A major source of frustration in distributed programming is that contemporary software tools—think compilers and debuggers—have little to say about the really tricky bugs that distributed systems developers face. Sure, compilers can find type and memory errors, and debuggers can single-step you through sequential code snippets. But how do they help with distributed systems issues? In some sense, they don’t help at all with the stuff that matters—things like:
- Concurrency: Does your code have bugs due to race conditions? Don’t forget that a distributed system is a parallel system!
- Consistency: Are there potential consistency errors in your program due to replicated state? Can you get undesirable non-deterministic outcomes based on network delays? What about the potential for the awful “split-brain” scenario where the state of multiple machines gets irrevocably out of sync?
- Coordination performance: Do you have performance issues due to overly-aggressive coordination or locking? Can you avoid expensive coordination without incurring bugs like the ones above?
These questions are especially tricky if you use services or libraries, where you don’t necessarily know how state and communication are managed. What code can you trust, and what about that code do you need to know to trust it?
Peter Alvaro has been doing groundbreaking work in the space, and recently started taking the veil off his results. This is a big deal. Read More »
We just finished writing up an overview of our most recent thinking about distributed consistency. The paper is entitled Consistency Without Borders, and it’s going to appear in the ACM SoCC conference next month in Silicon Valley.
It starts with two things we all know:
- Strong distributed consistency can be expensive and dangerous. (My favorite exposition: the LADIS ’08 conference writeup by Birman, Chockler and van Renesse. See especially the quotes from James Hamilton and Randy Shoup. And note that recent work like Spanner changes little: throughput of 10’s to 100’s of updates per second is only useful at the fringes.)
- Managing coordination in application logic is fraught with software engineering peril: you have to spec, build, test and maintain special-case, cross-stack distributed reasoning over time. Here be dragons.
The point of the paper is to try to reorient the community to explore the design space in between these extremes. Distributed consistency is one of the biggest CS problems of our day, and the technical community is spending way too much of its energy at these two ends of the design space.
We’ll be curious to hear feedback here, and at the conference.
The big news around here today is the public announcement of Trifacta, a company I’ve been quietly cooking over the last few months with colleagues Jeff Heer and Sean Kandel of Stanford. Trifacta is taking on an important and satisfying challenge: to build a new generation of user-centric data management software that is beautiful, powerful, and eminently useful.
Before I talk more about the background let me say this: We Are Hiring. We’re looking for people with passion and talent in Interaction Design, Data Visualization, Databases, Distributed Systems, Languages, and Machine Learning. We’re looking for folks who want to reach across specialties, and work together to build integrated, rich, and deeply satisfying software. We’ve got top-shelf funding and a sun-soaked office in the heart of SOMA in San Francisco, and we’re building a company with clear, tangible value. It’s early days and the fun is ahead. If you ever considered joining a data startup, this is the one. Get in touch.
Read More »
Bill Marczak, right, in NY Times
Bill Marczak, a PhD student in my group, does interesting research on algebraic programming languages, which I hope to describe in more detail here soon.
But Bill has recently received significant attention for work he did in his spare time—a dramatically successful cyber-espionage effort to expose government misuse of commercial surveillance software in Bahrain, the nation where Bill attended high school. The story picked up major press coverage in venues including Bloomberg and the New York Times, which also ran a more detailed article in their Bits Blog.
I’m always happy to see the press pick up on my students’ work, but this one is special.
Matt Welsh of Google—formerly of Harvard, Berkeley and Cornell—is a deservedly well-read blogger in the computing community. He’s also somebody I’ve admired since his early days in grad school as a smart, authentic person.
Matt’s been working through his transition from Harvard Professor to Googler in public over the last year or so, and it’s been interesting to watch what he says, and the discussion it provokes. His latest post was a little more acid than usual though, with respect to the value of academic computer science. My response got pretty long, and in the end I figured it’d be better to toss it up in my own space.
Rather than run down work you don’t like—including maybe your own prior work, as assessed on one of your dark days—think about the academic work over the last 50 years that you admire the hell out of. I know you could name a few heroes. I bet a bunch of your blog’s readers could get together and name a whole lot more. Now imagine the university system hadn’t been around and reasonably well-funded at the time, because it was considered “inefficient when it comes to producing real products that shape the world”. It’s sad to consider.
Here’s another thing you and your readers should consider: Forget efficiency. At least, forget it on the timescale you measure in your current job. Instead, aspire to do work that is as groundbreaking and important as the best work in the history of the field. And at the same time, inspire generations of brilliant students to do work that is even better—better than your very best. That’s what great universities are for, Matt. Remember? Sure you do. And yes—it’s goddamn audacious. As well it should be.
Read More »
When the folks at ACM SIGMOD asked me to be a guest blogger this month, I figured I should highlight the most community-facing work I’m involved with. So I wrote up a discussion of MADlib, and that the fact that this open-source in-database analytics library is now open to community contributions. (A bunch of us recently wrote a paper on the design and use of MADlib, which made my writing job a bit easier.) I’m optimistic about MADlib closing a gap between algorithm researchers and working data scientists, using familiar SQL as a vector for adoption on both fronts.
Read More »
If you follow this blog, you know that my BOOM group has spent a lot of time in the past couple years formalizing eventual consistency (EC) for distributed programs, via the CALM theorem and practical tools for analyzing Bloom programs.
In recent months, my student Peter Bailis and his teammate Shivaram Venkataraman took a different tack on the whole EC analysis problem which they call PBS: Probabilistically Bounded Staleness. The results are interesting, and extremely relevant to current practice. (See, for example, the very nice blog post by folks at DataStax).
Many people today deal with EC in the specific context of replica consistency, particularly in distributed NoSQL-style Key-Value Stores (KVSs). It is typical to configure these stores with so-called “partial” quorum replication, to get a comfortable mix of low latency with reasonable availability. The term “partial” signifies that you are not guaranteed consistency of writes by these configurations — at best they guarantee a form of eventual consistency of final writes, but readers may well read stale data along the way. Lots of people are deploying these configurations in the field, but there’s little information on how often the approach messes up, and how badly.
Jumping off from earlier theoretical work on probabilistic quorum systems, Peter and Shivaram answered two natural questions about how these systems should perform in current practice:
- How many versions ago? On expectation, if you do a read in a partial-quorum KVS, how many versions behind are you? Peter and Shivaram answer this one definitively, via a closed-form mathematical analysis.
- How stale on the (wall-)clock? On expectation, if you do a read in a partial-quorum KVS, how out-of-date will your version be in terms of wall-clock time? Answering this one requires modeling a read/write workload in wall-clock time, as well as system parameters like replica propagation (“anti-entropy”). Peter and Shivaram address this with a Monte Carlo model, and run the model with parameters grounded in real-world performance numbers generously provided by two of our most excellent colleagues: Alex Feinberg at LinkedIn and Coda Hale at Yammer (both of whom also guest-lectured in my Programming the Cloud course last fall.) Peter and Shivaram validated their models in practice using Cassandra, a widely-used KVS.
On the whole, PBS shows that being sloppy about consistency doesn’t bite you often or badly — especially if you’re in a single datacenter and you use SSDs. But things get more complex with magnetic disks, garbage collection delays (grr), and wide-area replication.
Interested in more detail? You can check out two things:
For a class I’m teaching, I’d like to collect a list of favorite “maxims” or “aphorisms” for computer systems.
I’d be very grateful if you would add your favorites below to the comments, preferably with a link to a source that either introduces or references the maxim. It’s OK to agree or disagree with the maxim.
I’d enjoy seeing people’s support/critiques for these below as well — may merit more focused posts another day.
I sat at Berkeley CS faculty lunch this past week with Brian Harvey and Dan Garcia, two guys who think hard about teaching computing to undergraduates. I was waxing philosophical about how we need to get data-centric thinking embedded deep into the initial CS courses—not just as an application of traditional programming, but as a key frame of reference for how to think about computing per se.
Dan pointed out that he and Brian and others took steps in this direction years ago at Berkeley, by introducing MapReduce and Hadoop in our initial 61A course. I have argued elsewhere that this is a Good Thing, because it gets people used to the kind of disorderly thinking needed for scaling distributed and data-centric systems.
But as a matter of both pedagogy and system design, I have begun to think that Google’s MapReduce model is not healthy for beginning students. The basic issue is that Google’s narrow MapReduce API conflates logical semantics (define a function over all items in a collection) with an expensive physical implementation (utilize a parallel barrier). As it happens, many common cluster-wide operations over a collection of items do not require a barrier even though they may require all-to-all communication. But there’s no way to tell the API whether a particular Reduce method has that property, so the runtime always does the most expensive thing imaginable in distributed coordination: global synchronization.
From an architectural point of view, a good language for parallelism should expose pipelining, and MapReduce hides it. Brian suggested I expand on this point somewhere so people could talk about it. So here we go.
Read More »
MADlib is an open-source statistical analytics package for SQL that I kicked off last year with friends at EMC-Greenplum. Last Friday we saw it graduate from alpha, to the first beta release version, 0.20beta. Hats off the MADlib team!
Forget your previous associations with low-tech SQL analytics, including so-called “business intelligence”, “olap”, “data cubes” and the like. This is the real deal: statistical and machine learning methods running at scale within the database, massively parallel, close to the data. Much of the code is written in SQL (a language that doesn’t get enough credit as a basis for parallel statistics), with key extensions in C/C++ for performance, and the occasional Python glue code. The suite of methods in the beta includes:
- standard statistical methods like multi-variate linear and logistic regressions,
- supervised learning methods including support-vector machines, naive Bayes, and decision trees
- unsupervised methods including k-means clustering, association rules and Latent Dirichlet Allocation
- descriptive statistics and data profiling, including one-pass Flajolet-Martin and CountMin sketch methods (my personal contributions to the library) to compute distinct counts, range-counts, quantiles, various types of histograms, and frequent-value identification
- statistical support routines including an efficient sparse vector library and array operations, and conjugate gradiant optimization.
More methods are planned for future releases. Myself, I’m working with Daisy Wang on merging her SQL-based Conditional Random Fields and Bayesian inference implementations into the library for an upcoming release, to support sophisticated text processing.
Read More »