've often heard it said that database management systems are archaic technology and that XML and object oriented databases are the wave of the future. I'm no curmudgeon; I like new technologies as much as anyone else, but I'm also a pragmatista philosophy that requires me to question the value of going with a newer technology just because it's the hottest thing on the block.
Just how old are these technologies? The object-oriented trend started with Smalltalk, which was first released in 1964. By comparison, SQL databases are new, having been developed in 1977 and first productized by Oracle in 1979 and by IBM in 1981. There is no such thing as 'your grandfather's SQL database' unless your grandfather is young enough to have experienced feathered haircuts, Duran Duran, and 'Frankie say' T-shirts.
So if age isn't a factor, what is the problem with SQL databases? For the conscientious developer looking for a stable platform on which to develop data-driven applications, SQL databases still have a lot of unexploited potential. Here's a short list of things that make SQL an entrenched technology, and one not deserving of abandonment.
What SQL Does Best
If you have a multi-table database schema where rows in one table relate to rows in another table, you run the risk of creating errors if related rows do not exist. It's the database equivalent of a dangling pointer error.
When referential integrity is not maintained, the database is called 'corrupted' and often needs to be fixed by hand. But by using foreign key relationships in a database schema you convey to the database engine that there is a relationship between two tables and it must be maintained properly. Referential integrity is perhaps the most important value a modern database provides to a developer.
You might think that this kind of corruption doesn't happen often enough to justify the hassle, but in fact it can be fairly common. Why? By default connections to the database wrap a transaction around each command. So each INSERT and UPDATE command is its own transaction. If you insert data into one table and then lose connection before inserting into the other related table you will have a dangling pointer, or unreferenced data. On high load systems, losing a database connection is far from unheard of. Referential integrity can insulate you from errors caused by lost connections between transactions.
Transactions are at the core of a solid SQL database system. The acronym ACID is used to describe the properties that transactions in the database system provide to the client system. ACID stands for 'atomicity,' 'consistency,' 'isolation,' and 'durability.' By understanding these you will see how much a database system is really doing for you.
Atomicity guarantees that a set of commands can be maintained as a transaction and applied as a single atomic unit. If any one of the commands fails then the entire atomic transaction is rolled back, regardless of size. This means that during a complex series of inserts and updates if any one of them fails then all of them will be undone uniformly, thus leaving the entire data set intact.
Consistency deals with the relationship rules I talked about earlierthat specify how rows in different tables relate to each other. It's the responsibility of the database to maintain these relationships. If you are a fan of strong typing in languages then the value of consistency should be readily apparent to you. By using the consistency features you are giving the database more information about the structure of your data and allowing it to do run-time error checking for you.
Isolation is important when you have multiple users adding, updating, and querying records within the database simultaneously. It means that during a single transaction, be it a query or a set of modifying commands, your transaction lives in its own virtual self-contained copy of the database state. For example, if two users of a banking system look at the same account simultaneously, one person adding some money and the other checking the balance, the one transaction will not overlap or corrupt the other.
The durability property ensures that all transactions to the database are logged and maintained such that the database can be restored to the last transaction if a system failure occurs.
When you look at the sum total of these four properties it is clear just how much a database server is doing for us and how far it is from a simplistic data store. So how do we take advantage of these properties? One of the first methods is through stored procedures.
Every transaction with a database should be atomic. For example, adding a Person entity to the database, where that entity is made up of five separate rows in three tables, should be a single transaction. Another example is a reportall of the queries that make up a report should be within a transaction so that all of the values add up properly even if other users are transacting against the data at the same time the report is being generated.
One way of doing this is to use the transaction mechanism in your database connection. This informs the database about the start and end of a transaction. Another is to use stored procedures. A stored procedure is a function that is created by a developer that resides in the database and can be called in the same way as an SQL command (e.g. SELECT, INSERT, UPDATE, etc.). Stored procedures are written in a Turing-complete programming language that changes slightly from vendor to vendor.
There are several reasons to use stored procedures:
- They are transaction-safe at the database level.
- They provide a level of business rules at the database level. If the logic to add a Person record is implemented as a stored procedure, then the procedurelet's say AddPersontakes all of the information about a person and implements all of the logic to add rows to make it happen in a safe manner. This means regardless of the client system, Person records will always be added properly.
- They are fast. In the case of adding a Person you may be sending 10 commands across the database connection to create transactions and add records. With a stored procedure you are only sending one command to the database. All of the hard work of the transaction is done within the database itself by the compiled stored procedure.
On some database systems you can restrict all database access to just stored procedures. In other words, all queries and commands against the database must be done using stored procedures.
So far I have discussed the safety and reliability aspects of SQL database servers but I haven't covered any of advanced features you can use to aid development.
Advanced SQL Database Features
Today's SQL database servers provide not only database storage and access through conventional table structures, they have extended functionality that can aid in the development of advanced systems. Some of the more important features are listed below:
- TriggersTriggers are fired on the database server when selected events occur. An example is the addition of a Person entity database. The server can be programmed with stored procedures to trigger on that event and fire off a set of asynchronous tasks. This means that the web server process that added the record isn't blocking on this asynchronous server-based processing.
- Materialized ViewsAll SQL databases support views, which are virtual tables that are the product of data collected from several tables through a query. Some servers allow for materialized views, which are views cached on disk to avoid the performance overhead of running the query during each use of the view. This allows for extremely sophisticated reports to be run on-demand with little overhead.
- Message queuesMessage queues, like those from Tibco, allow different applications to communicate with each other in an abstracted and asynchronous manner. For example, imagine a front end data entry application sending messages to a reporting or business rules application sitting on the other end of the message queue. The message queue will send out mail or generate reports in an asynchronous manner so that the front end is not held up waiting for these processes to finish. Message queues are directly supported by Oracle and can be implemented within other SQL database server frameworks.
So with all of these features to like about SQL, why is everyone so eager to leave the SQL database model behind?