Alternatives to SQL for high-volume data


Dr. Dobb's has an interesting article discussing the current trend towards “NoSQL” data stores – technologies which do not rely on traditional SQL-based relational data models. More and more software architects (and businesses) are realising the once-unthinkable; namely, that:

[…] Not every app requires rigid data consistency

What? Surely a solid database in 3NF (with some denormalisations for speed) is the way to store data? Well, it turns out that data which is “eventually consistent” or even “almost consistent” is not only good enough in many situations, but sometimes even preferable to fully consistent data.

The two approaches can be characterised as:

  • SQL – ACID (atomicity, consistency, isolation, durability) – properties of the data updates
  • NoSQL – BASE (basically available, soft state, eventually consistent) – properties of the data itself
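
To make the BASE side concrete, here is a minimal sketch of “eventually consistent” behaviour. It is a toy in-memory model, not how any particular NoSQL product works; the class names (`Replica`, `EventuallyConsistentStore`) and the `propagate` step are invented purely for illustration.

```python
from collections import deque


class Replica:
    """One copy of the data: a plain key/value map (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class EventuallyConsistentStore:
    """Toy store: a write lands on one replica and reaches the others later."""

    def __init__(self, replica_names):
        self.replicas = [Replica(n) for n in replica_names]
        self.pending = deque()  # updates not yet applied to a replica

    def write(self, key, value):
        # Acknowledge as soon as the first replica has the value
        # ("basically available").
        self.replicas[0].data[key] = value
        for replica in self.replicas[1:]:
            self.pending.append((replica, key, value))

    def read(self, key, replica_index):
        # May return stale data ("soft state") until propagation has run.
        return self.replicas[replica_index].data.get(key)

    def propagate(self):
        # Stands in for background replication; afterwards all replicas agree
        # ("eventually consistent").
        while self.pending:
            replica, key, value = self.pending.popleft()
            replica.data[key] = value


store = EventuallyConsistentStore(["a", "b"])
store.write("photo:1", "v1")
stale = store.read("photo:1", replica_index=1)  # replica "b" has not caught up yet
store.propagate()
fresh = store.read("photo:1", replica_index=1)  # replica "b" now has the value
```

Between `write` and `propagate`, a reader hitting the second replica sees nothing at all. An ACID database would not allow that window; a BASE store accepts it in exchange for answering the write quickly.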

These differences are one reason non-relational NoSQL data stores, document-centric databases, and column stores have gained traction. They’re more like specialized tools than the Swiss Army knife that SQL platforms provide.

Priocept has recently been working with Apache Solr, a kind of “Lucene on steroids”, which comes close to being a NoSQL database when used for applications such as geodata and automatic data categorisation (as used on eBay). In our case, we are using Solr to deal with geographical data for a large travel operator. Apache Cassandra also seems to be doing very well (serving the odd Facebook photo, for example).
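
As a rough illustration of the kind of geographical query Solr can answer, the sketch below builds a request URL using Solr's `geofilt` filter, which keeps only documents within a given distance of a point. Everything specific here is an assumption: the host, the `hotels` core, the `location` field (which would need to be indexed as a spatial field type) and the coordinates are all made up, the Solr version in use would need to support `geofilt`, and this is not the query we run in production.

```python
from urllib.parse import urlencode


def solr_geo_query_url(base_url, field, lat, lon, radius_km):
    """Build a select URL that keeps only documents within radius_km of a point."""
    params = {
        "q": "*:*",  # match everything, then narrow down with the filter query
        # geofilt: sfield = spatial field, pt = "lat,lon", d = distance in km
        "fq": f"{{!geofilt sfield={field} pt={lat},{lon} d={radius_km}}}",
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core and field names, for illustration only.
url = solr_geo_query_url(
    "http://localhost:8983/solr/hotels", "location", 51.5074, -0.1278, 10
)
```

The function only constructs the URL; sending it to a running Solr instance would return the matching documents as JSON.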

SQL RDBMSs are not going away any time soon, but the proliferation of NoSQL alternatives is opening up some interesting opportunities for novel approaches to data storage and processing.