‘ve often heard it said that database management systems are archaic technology and that XML and object oriented databases are the wave of the future. I’m no curmudgeon; I like new technologies as much as anyone else, but I’m also a pragmatist?a philosophy that requires me to question the value of going with a newer technology just because it’s the hottest thing on the block.
Just how old are these technologies? The object-oriented trend started with Smalltalk, which was first released in 1964. By comparison, SQL databases are new, having been developed in 1977 and first productized by Oracle in 1979 and by IBM in 1981. There is no such thing as ‘your grandfather’s SQL database’ unless your grandfather is young enough to have experienced feathered haircuts, Duran Duran, and ‘Frankie say’ T-shirts.
So if age isn’t a factor, what is the problem with SQL databases? For the conscientious developer looking for a stable platform on which to develop data-driven applications, SQL databases still have a lot of unexploited potential. Here’s a short list of things that make SQL an entrenched technology, and one not deserving of abandonment.
What SQL Does Best
If you have a multi-table database schema where rows in one table relate to rows in another table, you run the risk of creating errors if related rows do not exist. It’s the database equivalent of a dangling pointer error.
When referential integrity is not maintained, the database is called ‘corrupted’ and often needs to be fixed by hand. But by using foreign key relationships in a database schema you convey to the database engine that there is a relationship between two tables and it must be maintained properly. Referential integrity is perhaps the most important value a modern database provides to a developer.
You might think that this kind of corruption doesn’t happen often enough to justify the hassle, but in fact it can be fairly common. Why? By default connections to the database wrap a transaction around each command. So each INSERT and UPDATE command is its own transaction. If you insert data into one table and then lose connection before inserting into the other related table you will have a dangling pointer, or unreferenced data. On high load systems, losing a database connection is far from unheard of. Referential integrity can insulate you from errors caused by lost connections between transactions.
Transactions are at the core of a solid SQL database system. The acronym ACID is used to describe the properties that transactions in the database system provide to the client system. ACID stands for ‘atomicity,’ ‘consistency,’ ‘isolation,’ and ‘durability.’ By understanding these you will see how much a database system is really doing for you.
Atomicity guarantees that a set of commands can be maintained as a transaction and applied as a single atomic unit. If any one of the commands fails then the entire atomic transaction is rolled back, regardless of size. This means that during a complex series of inserts and updates if any one of them fails then all of them will be undone uniformly, thus leaving the entire data set intact.
Consistency deals with the relationship rules I talked about earlier?that specify how rows in different tables relate to each other. It’s the responsibility of the database to maintain these relationships. If you are a fan of strong typing in languages then the value of consistency should be readily apparent to you. By using the consistency features you are giving the database more information about the structure of your data and allowing it to do run-time error checking for you.
Isolation is important when you have multiple users adding, updating, and querying records within the database simultaneously. It means that during a single transaction, be it a query or a set of modifying commands, your transaction lives in its own virtual self-contained copy of the database state. For example, if two users of a banking system look at the same account simultaneously, one person adding some money and the other checking the balance, the one transaction will not overlap or corrupt the other.
The durability property ensures that all transactions to the database are logged and maintained such that the database can be restored to the last transaction if a system failure occurs.
When you look at the sum total of these four properties it is clear just how much a database server is doing for us and how far it is from a simplistic data store. So how do we take advantage of these properties? One of the first methods is through stored procedures.
Every transaction with a database should be atomic. For example, adding a Person entity to the database, where that entity is made up of five separate rows in three tables, should be a single transaction. Another example is a report?all of the queries that make up a report should be within a transaction so that all of the values add up properly even if other users are transacting against the data at the same time the report is being generated.
One way of doing this is to use the transaction mechanism in your database connection. This informs the database about the start and end of a transaction. Another is to use stored procedures. A stored procedure is a function that is created by a developer that resides in the database and can be called in the same way as an SQL command (e.g. SELECT, INSERT, UPDATE, etc.). Stored procedures are written in a Turing-complete programming language that changes slightly from vendor to vendor.
There are several reasons to use stored procedures:
- They are transaction-safe at the database level.
- They provide a level of business rules at the database level. If the logic to add a Person record is implemented as a stored procedure, then the procedure?let’s say AddPerson?takes all of the information about a person and implements all of the logic to add rows to make it happen in a safe manner. This means regardless of the client system, Person records will always be added properly.
- They are fast. In the case of adding a Person you may be sending 10 commands across the database connection to create transactions and add records. With a stored procedure you are only sending one command to the database. All of the hard work of the transaction is done within the database itself by the compiled stored procedure.
On some database systems you can restrict all database access to just stored procedures. In other words, all queries and commands against the database must be done using stored procedures.
So far I have discussed the safety and reliability aspects of SQL database servers but I haven’t covered any of advanced features you can use to aid development.
Advanced SQL Database Features
Today’s SQL database servers provide not only database storage and access through conventional table structures, they have extended functionality that can aid in the development of advanced systems. Some of the more important features are listed below:
- Triggers?Triggers are fired on the database server when selected events occur. An example is the addition of a Person entity database. The server can be programmed with stored procedures to trigger on that event and fire off a set of asynchronous tasks. This means that the web server process that added the record isn’t blocking on this asynchronous server-based processing.
- Materialized Views?All SQL databases support views, which are virtual tables that are the product of data collected from several tables through a query. Some servers allow for materialized views, which are views cached on disk to avoid the performance overhead of running the query during each use of the view. This allows for extremely sophisticated reports to be run on-demand with little overhead.
- Message queues?Message queues, like those from Tibco, allow different applications to communicate with each other in an abstracted and asynchronous manner. For example, imagine a front end data entry application sending messages to a reporting or business rules application sitting on the other end of the message queue. The message queue will send out mail or generate reports in an asynchronous manner so that the front end is not held up waiting for these processes to finish. Message queues are directly supported by Oracle and can be implemented within other SQL database server frameworks.
So with all of these features to like about SQL, why is everyone so eager to leave the SQL database model behind?
Object vs. Relational
There has always been friction between the object world and the relational database world. This friction manifests itself in code as the Object/Relational mapping layer (O/R layer). This layer allows programmers to read and write data from a relational database in an ‘objective’ manner. The first and primary value of an object database is the removal of this layer from the program. Because the database can store object data directly there is no requirement for a layer to translate the object world into the relational world.
The second advantage of an object database is that the variable types in the object database are the same as in the host programming language. This includes complex data types such as structures and classes. Large and complex object and data structures can be stored and retrieved from the database with relative ease in comparison to the same set of operations applied to an SQL database.
The third advantage is the absence of a new syntax for storing and retrieving data. Using an SQL database means learning SQL syntax and optionally a stored procedure syntax. Object databases have storage and retrieval methods available within the host language context. Therefore, comparatively little extra knowledge is required. Certainly nothing along the lines of learning a new language.
One of the downsides of an object database is its direct linkage to the object format and data types of the host language. This causes problems when other languages attempt to query data from the store. An ideal use for an ODBMS, even with this caveat, is as an application-tier temporary object store.
A second alternative to the SQL database is the Native XML Database (NXD) which can store and query XML documents in their hierarchal form. This makes storing complex data forms much simpler than it would be in an SQL database. Some SQL databases have support for XML as a native datatype but it is not directly integrated with the SQL implementation and the performance issues can be prohibitive.
So there is a value in the alternatives to the SQL model, but that value needs to be weighed against the cost.
A Little Respect for SQL?
Whether it’s elaborate O/R mapping layers, OODBMS or XML databases, engineers are looking to other technologies to solve problems that have solutions already built into their SQL DBMS.
Many in the Java community use SQL complexity as support for an argument to move on to other technologies. For example:
- At O’Reilly OnJava, Dion Almaer writes: “This is a new world, where developers can focus on their business needs and work with objects without having to be SQL gurus.”
- At The Server Side, Basil Tchurilov writes: “This method is labor intensive, error prone, requires SQL expertise, and does not have caching capability. So it will be much more appropriate to use ODBMS as storage for Java components.”
SQL isn’t difficult. You don’t have to become a guru to use it. You just have to take some time. Besides, if you are writing and data-driven application and using a SQL database then your application will be using SQL even if you don’t know about it and could be suffering from performance problems simply because of inefficient SQL.
Author and DevX reader Norman Richards criticized my last commentary in his blog this way:
“The conclusion: after all these years of usage, relational databases have completely failed to provide a reasonable level of standardization or interoperability. What is worse is that most programmers I meet have never worked with better technologies and don’t seem to realize how bad off the situation is.”
I agree that there are vendor incompatibilities that create difficulty, but the situation is no better in the OODBMS or XML database world. The value in SQL is that there are standards and those standards are dedicated to the support of the robust and solid database platforms upon which we can build applications.
SQL databases aren’t glamorous. They aren’t all that new, though they aren’t all that old. And the emphasis on reliability, data integrity, and durability isn’t very sexy. But these systems work, and they work very well.
It’s time to stand up and give deserved credit to the database engines that have been at the heart of our data-driven applications for years. It’s also time to use them properly, which means using the referential integrity features, using the stored procedures, and writing quality SQL that makes efficient use of the database.
|The author would like to thank Darryl Hasieber, a U.K.-based MCSD and MCDBA working as a software developer and database design consultant, for his contribution to this article.|