What's New in XSLT 2.0?
Like its predecessor, XSLT 2.0 relies heavily on XPath (now XPath 2.0) for many of its core features. XPath 2.0 is itself intertwined with yet another emerging standard, XQuery 1.0, which relies on XPath 2.0 so much that after mastering XPath 2.0 you'll have a pretty good idea of how XQuery works. It's important to note, though, that XSLT 2.0 does not rely on XQuery. XQuery is a language for querying XML documents and is already finding substantial support in most of the native XML databases, such as Xindice, Ipedo, XHive, and others.
In addition, Microsoft and one other vendor plan to support XQuery in their next major releases, which will have native XML database capabilities. Eric Brown, Microsoft's Product Manager for SQL Server, says that the next release of SQL Server, code-named Yukon, will support the following XML features:
- Native XML Storage
- XQuery Support
- Cross domain querying between relational and XML data
Some developers are even suggesting that XQuery will supplant XSLT as the primary XML processing language. Other than their common bond to XPath, however, there is no direct relationship between XSLT 2.0 and XQuery.
XSLT 2.0 Adds New Data Types
Author's Note: While reading this, you should be aware that XSLT 2.0 and XPath 2.0 are not yet stable specifications. Anything you find here is liable to change, some of it quite significantly.
In XSLT 1.0 (and XPath 1.0), there were four kinds of data types:
- Node-sets
- Strings
- Numbers
- Booleans
Node-sets, of course, contain nodes, each of which has its own properties. There are seven types of nodes: document, element, attribute, text, namespace, processing instruction, and comment.
XPath 2.0 has a much richer data model. At the very top of the list is the sequence, an ordered list whose items can be nodes or values of XML Schema simple types such as xs:int or xs:date. The addition of XML Schema data types is the biggest change. XML Schema defines 19 primitive simple data types, along with a set of built-in types derived from them, and XQuery provides functional access to them.
Everything starts with sequences, so let's take a look at those first, since they're a new concept for many of us.
"Sequence" is a new term, but one you'll need to get familiar with if you're serious about the next version of XSLT and XPath. A sequence is the result of an expression. An expression, in turn, is constructed from a combination of keywords, symbols, and operands.
If you understand XML at the core level, you'll understand exactly what a sequence is, and you may even slap yourself on the forehead and say, "Wait, that's not new at all; it's the essence of XML!" So besides looking at the XPath and XSLT definition of a sequence, consider the one fundamental truth behind XML and how best to define it. I won't do it myself; I'll let Mike Brown, through his excellent tutorial on XML and character encoding, do it for me:
An XML document is a UCS character sequence that follows certain patterns. These patterns provide a means of representing a logical hierarchy (a tree) of data.
That's all XML is. I'm sure you've seen other definitions of XML, and while some of them may have some truth to them, if they don't include something like this simple statement, they're leading you astray. UCS, by the way, is, in essence, sort of Unicode (not technically, but close enough for our purposes here), which is, sort of, the mother of all character sets for XML. I say "sort of" twice because the intricacies of Unicode are somewhat esoteric. If you really want to know more about it, you can't do better than the Skew tutorial.
So, what's a sequence in XPath 2.0? The core capabilities of XPath 2.0, like those of XPath 1.0, are built around expressions. Expressions always yield results, and in XPath 2.0 these results are expressed as a sequence, which is an ordered list of zero or more items. Each item can be either a node, as in XPath 1.0, or (and this is new) a value of a simple XML Schema data type.
At its most basic, a sequence is the result of an expression like this:
(7, 1, 2, 3)
This results in the following sequence:
7, 1, 2, 3
This expression yields a sequence containing all foo element children of the context node (which is the node from which evaluation starts):
foo
A sequence can be empty, so this:
()
yields an empty sequence.
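To try these sequence expressions yourself, here is a minimal stylesheet sketch. It assumes a conforming XSLT 2.0 processor and that the syntax shown (in particular the separator attribute on xsl:value-of and the xs:date constructor function) survives into the final specification. The literal values are only illustrative, and any well-formed XML document can serve as input, since the template ignores it.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- A literal sequence of four integers, written out with a separator -->
    <xsl:value-of select="(7, 1, 2, 3)" separator=", "/>
    <xsl:text>&#10;</xsl:text>

    <!-- The empty sequence contains zero items -->
    <xsl:value-of select="count(())"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Items can be atomic values of different XML Schema types -->
    <xsl:value-of select="count((xs:date('2003-01-15'), 42, 'text'))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input, this should print 7, 1, 2, 3 on the first line, 0 on the second (the empty sequence has no items), and 3 on the third (a date, an integer, and a string sit side by side in one sequence).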
Sequences in XPath 2.0 are ordered and never nested. The same item can also appear more than once within a single sequence. For example, consider the source document shown in Listing 1. Suppose you want to extract some of the <quantity> information from each product. To do that, here's a simple for-each statement that outputs some literal results:
<xsl:value-of select="." /><br />
Starting with the source document's <product> node as the context node, the preceding for-each statement contains an expression that yields the following results (or sequence):
This example highlights the difference between the old and new XSLT models. For one thing, you couldn't use commas as node delimiters in the previous version of XSLT or XPath. Now commas are a legitimate way to separate items in ordered sequences.
Note also the parentheses, a useful way to make the code more readable. But even more interesting is how easy it is to duplicate a node as part of the sequence. You can see that the first and third items in the expression are the same, and so they yield the same result in the output.
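As a rough illustration of the comma operator and duplicated items, the sketch below iterates over a three-item sequence in which the first and third items are the same node. The input document described in the comment is hypothetical (it is not Listing 1), and the element names name and quantity are assumptions made for the example.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input document for this sketch:
     <product>
       <name>Widget</name>
       <quantity>12</quantity>
     </product>
-->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="product">
    <!-- The comma operator builds an ordered sequence and keeps duplicates,
         so quantity is visited twice: first and last -->
    <xsl:for-each select="(quantity, name, quantity)">
      <xsl:value-of select="."/><br/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the hypothetical input, the loop should output the quantity, then the name, then the quantity again, each followed by a line break. Compare this with the union operator: quantity | name | quantity returns each node only once, in document order. Note too that extra parentheses never create nesting; (1, (2, 3)) is the same three-item sequence as (1, 2, 3).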
If you've ever been confused by the difference between what you can put in a match pattern (in, for example, an xsl:template match attribute) and what you can put in an expression (in, for example, a select attribute), a good way to think about it is that matches never yield results like expressions do. They're merely patterns for determining whether or not a node meets certain criteria.
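The contrast between patterns and expressions can be sketched in a single template rule. The product, name, and quantity element names below are assumptions for illustration, as is the threshold of 10; the stylesheet presumes an XSLT 2.0 processor and an input containing one or more product elements.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- match holds a pattern: it only tests whether a node qualifies
       for this rule and never returns anything itself -->
  <xsl:template match="product[quantity &gt; 10]">
    <!-- select holds an expression: it is evaluated against the
         matched node and yields a sequence -->
    <xsl:value-of select="(name, quantity)" separator=": "/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Products that fail the test above fall through to this
       lower-priority rule and produce no output -->
  <xsl:template match="product"/>
</xsl:stylesheet>
```

The pattern in match is checked against each product as the processor walks the tree, while the expression in select is what actually produces the name and quantity. A comma-separated sequence such as (name, quantity) is fine in select but is not a legal pattern, which is one practical way to tell the two apart.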