If you think databases are all of the relational SQL kind (as in MySQL, MS SQL and PostgreSQL), think again. Better still: think NoSQL, as in non-relational, distributed databases that do not require fixed-table schemas, typically scale horizontally, and provide superior data replication to SQL databases.
While not exactly new -- the NoSQL concept has been around for 10 years or so -- NoSQL has been attracting a lot of attention in recent years, primarily due to big-name production implementations. Two of the best known implementations are Amazon’s Dynamo and Google’s BigTable.
However, there are also many publicly available open source variants, such as Cassandra, CouchDB, HBase, MongoDB, Redis, and Riak.
In the past year, the NoSQL phenomenon has blossomed into something of a movement, popularized by events in this country and Europe, and by leading companies in diverse industries adopting NoSQL technologies.
Driving interest in NoSQL databases are the limitations of RDBMSes, which were originally built for single users on single machines doing single operations. RDBMSes weren’t designed for today’s computing world of thousands, even millions, of users simultaneously accessing a database full of images, audio, video and other digital data.
NoSQL is very much a user-led phenomenon highlighted by the likes of Google, Amazon, Facebook, LinkedIn and Twitter, which created their own distributed data management technologies, says Matthew Aslett, an analyst at the 451 Group.
Aslett says Google and the other companies above chose NoSQL to deliver the performance and scalability benefits that traditional database products cannot match.
Companies and developers choose NoSQL because it gives them solutions unavailable with SQL, says Damien Katz, co-founder and CEO, Couchio, developer of CouchDB.
“NoSQL technologies give people a choice of tools that SQL cannot,” says Justin Sheehy, CTO, Basho Technologies. Basho is the creator of Riak, a distributed data store that aims to combine high availability and powerful partitioning.
Riak’s high availability means that applications built using Riak remain available for both reads and writes under almost any conditions, says Sheehy.
Basho’s customers include Comcast and Electronic Arts. The latter uses Basho infrastructure to support seven million daily users of Warhammer Online on Facebook, saving each player’s status every half-minute.
While high availability is the main benefit of Riak, replication is the main benefit of CouchDB, a distributed, fault-tolerant, and schema-free document-oriented database.
Unlike SQL databases, which are designed to store and report on highly structured, interrelated data, CouchDB (still in Beta) stores and reports on large amounts of semi-structured, document-oriented data.
“CouchDB greatly simplifies the development of document-oriented applications, which make up the bulk of collaborative web applications,” says Katz.
“One of our customers is the BBC, which is using CouchDB to handle 150 million requests per day in two clusters of machines,” says Katz.
The BBC, like other customers, selected CouchDB because of its superior replication and easy upgrade capabilities, says Katz.
He explains that NoSQL databases handle upgrades much better than SQL ones.
“In a SQL database, updates involve updating the schema and the stored data,” says Katz. “This often causes problems as new needs arise that weren't anticipated in the initial database designs. With CouchDB, no schema is enforced, so new document types with new meaning can be safely added alongside the old.”
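To make Katz's point concrete, here is a minimal sketch of schema-free storage. It is an illustration only: the in-memory `store`, the `save` and `docs_of_type` helpers, and the `type` field convention are all assumptions made for this example and are not CouchDB's actual API.

```python
import json

# Hypothetical in-memory stand-in for a document store; NOT CouchDB's API.
store = {}

def save(doc_id, doc):
    # Round-trip through JSON to mimic storing a self-describing document.
    store[doc_id] = json.loads(json.dumps(doc))

def docs_of_type(doc_type):
    # No schema is enforced, so readers filter on a field they agree on.
    return [d for d in store.values() if d.get("type") == doc_type]

# A document written by the first version of an application.
save("user:1", {"type": "user", "name": "Ada"})

# A later version adds a field to users and a brand-new document type,
# without migrating or rewriting the document stored earlier.
save("user:2", {"type": "user", "name": "Grace", "email": "grace@example.com"})
save("comment:1", {"type": "comment", "author": "user:1", "text": "Hello"})

users = docs_of_type("user")
# Older documents simply lack the new field; readers must tolerate that.
emails = [u.get("email") for u in users]
```

The trade-off in this sketch is that the application, not the database, has to cope with documents of different shapes, which is why the reader uses `get` rather than assuming every field exists.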