Don't Get Fooled Again
Surprisingly, I caught a little flak for this comment I made several weeks ago:
"If all you need are LOBs, you probably don’t need a relational database. And if you need a relational database, you probably shouldn’t use LOBs."
The surprising part was the question raised: "why would you ever use a relational database these days?" It was argued from some familiar and simple high-level reasons (trends, having to write SQL, taking the car apart to put it in the garage, etc.), and came from a NoSQL fan.
This objection seemed anachronistic to me, since we just finished migrating one of our large products from a non-relational model to using DB2 for all data storage. Why would we do such a thing? Lots of reasons, but it only took one: paying customers wanted it. Further, data mining has become so popular and successful lately (on the wings of relational databases) that it's hard to imagine tossing aside that shiny new benefit.
The NoSQL revolution has taken on a reactionary bent (just consider the name), which is odd for a new movement. Chesterton reminds us that "the business of progressives is to go on making mistakes, and the business of the conservatives is to prevent the mistakes from being corrected." NoSQL advertises itself like a progressive movement, but actually falls to the right of conservative. BigTable, Cassandra, and HBase are no newer concepts than the ancient things grandpa tells war stories about: flat files, hash files, partition stores, CAM, BDAM, ISAM, etc. So blindly applying NoSQL is making the new mistake of rejecting well-established corrections to old mistakes.
When making architectural decisions like whether to use a relational database or not, I often start with a blank 8 1/2 x 11 sheet of paper, turned landscape. At the top I write the relevant business requirements (real, not imagined, ones). I then do a simple left-and-right table of pros and cons. At the bottom I write areas where architectural prototypes are needed, if any. This, of course, helps me weigh the options and make a decision, but it also means I "own" the cons. If I decide to use an RDBMS, I accept the costs. If I decide not to use an RDBMS, I willingly reject the benefits it offers with eyes wide open.
Yet the war of words for and against NoSQL rages on, often without fully or correctly acknowledging the cons, or the simple fact that you don't do both at the same time. Many problems are best solved with a relational database and many are best solved without one.
In the early '90s, I would joke that there existed a parallel universe where client/server had fallen out of fashion and the new, cool buzz was mainframes and 3270 terminals. Funny that we soon joined that parallel universe when web apps and cloud computing became our trend. Andrew Tanenbaum notes that, while recapitulation doesn't really happen in biology, computer history does rightfully repeat itself. Old solutions should be applied to new technologies; at the very least it keeps consultants gainfully employed. Let the pendulum swing, but truly grok all the pros and cons as history repeats itself.
Unlike Ted Dziuba, I don't want NoSQL to die. By definition, it can't, and that's arguing a strawman anyway. I just want it to be used where it best fits. And the same goes for relational databases. Repeat after me: "no golden hammers; pick the best tool for the job." Just don't be a lemming. And don't get fooled again.