First, XML is very good at representing very hierarchical data. Navigating a hierarchy of arbitrary depth is a real hassle in SQL Server 2000. As an example of hierarchical data, consider the blueprint of a house. Various structural elements support each other in complex ways: The roof sits on beams, posts, and walls, each of which is supported by one or more structural elements below (see Listing 3). Users may want to run queries, but relationships between members are arbitrary so this kind of data doesn't map well to tables and columns.
XML is also gaining popularity for representing unstructured data. These are often prose documents, and markup helps to store them in a display-neutral fashion, so you can easily transform your document into HTML for a Web site or into a PDF document to send to the printers. Marking up prose can also add semantic transparency, making document searches more precise. Listing 4 shows a SQL script for querying unstructured data.
The example below stores an email message as XML. Not only are the standard to, from, and subject fields marked up, but the text of the message itself contains custom tags as well. These custom tags make searching through text a less ambiguous exercise.
<?xml version="1.0" encoding="utf-8"?>
<Message>
<To>Developers@example.com</To>
<From>Management@example.com</From>
<Subject>Eyes on the Prize</Subject>
<Text>
Remember the <Event>California rollout </Event> happens <Deadline>Monday 12/14/2004.</Deadline>
Management.
</Text>
</Message>
SQL Server gives you the best of both worlds for storing hierarchical and unstructured data. Now, you can query and index documents that would previously be searchable only through full text indexing.
Implementation Details: Four more Nails
Depending on your situation, XML can also be the best choice for storing even highly structured data. Here are a few practical reasons to consider storing data in a field of type XML:
- Repeated shredding or publishingOn-demand transformations carry a performance penalty. If you have to shred or publish the same document over and over again, consider storing it natively as XML. You can always expose it to relational consumers with an XML view.
- Rapidly changing data structuresWhen modeled correctly, XML lives up to its name: It's extensible. Developers can add new pieces of dataeven new hierarchiesto a schema without compromising existing software. Extensibility is an extra advantage when prototyping, or when working with rapidly changing problem domains such as bioinformatics.
- Atomic dataSometimes, you'll have XML data that's never consumed except as a whole. Think of this as logical atomicityif you never access the parts individually, you might as well store it in one big chunk.
- DebuggingEspecially for new releases, it can be a good idea to tuck away a copy of your XML imports. The data may be redundant, but keeping the original makes tracking down problems a whole lot easier.
Microsoft's database support for XML has come a long way with SQL Server 2005. So consider these more as guidelines than as rules. There are enough nails out there; you don't need to look for any more.