

Occasional XSLT for Experienced Software Developers : Page 2

Although using XSLT to process XML is increasingly common, most developers still use it only occasionally—and often treat it as just another procedural language. But that's not the best way to use XSLT. Learn how to simplify and improve your XSLT processing using event-driven and declarative techniques.





Event-driven XSLT
Fortunately, you can make the transformation much simpler by using matched templates. A matched template is one the XSLT processor triggers when the pattern in its "match" attribute matches the current (context) node. The pattern can be as simple as an element name or as complex as a multi-step XPath-style expression. For example, the processor will trigger the following template whenever the context node is an element that has a "lang" attribute (the at sign, @, denotes an attribute node rather than an element node).

<xsl:template match="*[@lang]"> <xsl:text>This element has the following language id: </xsl:text> <xsl:value-of select="@lang"/> </xsl:template>
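To try the pattern in isolation, the template can be wrapped in a small standalone stylesheet. The following is only a sketch: the text output method, the use of name() to report the element, and the empty text() template (which keeps unrelated text out of the output) are assumptions added for illustration, not part of the original example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Fires for every element that carries a lang attribute -->
  <xsl:template match="*[@lang]">
    <xsl:text>Element </xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:text> has language id: </xsl:text>
    <xsl:value-of select="@lang"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Keep descending so nested elements with a lang attribute are reported too -->
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <!-- Assumption for this sketch: discard all text content of the input -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Applied to a hypothetical input such as <doc lang="en"><p lang="fr">Bonjour</p><p>Hello</p></doc>, this sketch reports the doc element and the first p element, one per line, and stays silent about the second p because it has no lang attribute.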

By processing the file through matched templates, the code makes as few assumptions as possible about the format of the input file. For example, the following stylesheet outputs exactly the same result for both input files, even though their hierarchical formats differ significantly. Here's the revised stylesheet:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output indent="yes" encoding="utf-8"/>
  <xsl:template match="/">
    <xsl:element name="titles">
      <xsl:apply-templates select="node()"/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="book">
    <xsl:copy-of select="title"/>
  </xsl:template>
</xsl:stylesheet>

This event-driven version matches the root node of the document, regardless of what the document element is named, by using the single forward slash (/) pattern. Next, it outputs the <titles> document element and, through the apply-templates call, instructs the processor to continue with the children of the current or context node (the root node in this case).

If you apply this stylesheet to the second input file, you'll get the following result:

<?xml version="1.0"?> <titles> <title>JavaScript: The Definitive Guide</title> <title>JavaScript: The Definitive Guide</title> <title>Photoshop 6 for Professionals</title> </titles>

The output is indeed the same as for the first input file, except for one minor annoyance. There are some gratuitous carriage returns before and after the <title> tags that cause the extra white space in the output.

After trying to determine the cause of these extra carriage returns, an occasional XSLT programmer might just drop the simple event-driven approach altogether in favor of the more complex flow-driven one. But if you instead explore the XSLT specification, you'll find a built-in template that copies text nodes (and any attribute nodes selected for processing) through to the output, and thus outputs the carriage returns:

<xsl:template match="text()|@*"> <xsl:value-of select="."/> </xsl:template>

In the example above, the carriage returns stem from the inside of the <section>, <row>, and <book> tags of the input document, one for each tag.

To correct that, you can add one line to the event-driven stylesheet that matches text() nodes as follows:

<xsl:template match="text()"/>

That line gets rid of the carriage returns by overriding the built-in text template using a custom version that produces no output.
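Put together, the event-driven stylesheet with the override in place would look roughly like this. It is simply the stylesheet shown earlier plus the one empty template; where the empty template sits among the other templates is a choice made for this sketch and does not affect the result.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output indent="yes" encoding="utf-8"/>

  <!-- Root node: emit the wrapper element, then process the whole document -->
  <xsl:template match="/">
    <xsl:element name="titles">
      <xsl:apply-templates select="node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Each book, wherever it sits in the hierarchy, contributes its title -->
  <xsl:template match="book">
    <xsl:copy-of select="title"/>
  </xsl:template>

  <!-- Override the built-in text rule so stray whitespace is not copied -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Because the book template copies the title with copy-of rather than apply-templates, the empty text() template never touches the title text itself; it only silences text nodes that the built-in rules would otherwise reach.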

The key point to take away here is that almost any useful XSLT stylesheet should override at least two of the built-in templates: the one for text, shown above, and the one that matches the root node and all element nodes, which is:

<xsl:template match="*|/"> <xsl:apply-templates/> </xsl:template>

The built-in template for elements and the root node copies nothing to the output itself but, by invoking <xsl:apply-templates/>, lets other templates match the children of the current node. In other words, any XSLT stylesheet visits every element and text node in the input document by default. Attribute nodes are not children, so they are processed only when a template selects them explicitly.
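One way to see the built-in rules at work is to run a stylesheet that declares no templates at all. The sketch below assumes the text output method so that only the copied text appears in the result.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <!-- No templates declared: the built-in rules walk the root node and every
       element, and the built-in text rule copies each text node it reaches -->
</xsl:stylesheet>
```

Run against any input document, this should produce the concatenated text content of the document in document order, including the whitespace between tags, with no markup. Every template you add replaces part of that default behavior for the nodes it matches.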

Author's Note: You can gain fine-grained control over extra whitespace characters in the XSLT output by using the <xsl:preserve-space> and <xsl:strip-space> constructs in the stylesheet, or by using the xml:space attribute on XML tags in the input files.
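As a sketch of the first option in the note, the titles stylesheet could strip whitespace-only text nodes from the input instead of overriding the text template. The choice of elements="*", which applies stripping to every element, is an assumption made for this illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output indent="yes" encoding="utf-8"/>

  <!-- Remove whitespace-only text nodes from every element of the input -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:element name="titles">
      <xsl:apply-templates select="node()"/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="book">
    <xsl:copy-of select="title"/>
  </xsl:template>
</xsl:stylesheet>
```

The two approaches are not equivalent. The empty text() template discards all text that the built-in rules reach, while strip-space removes only text nodes that consist entirely of whitespace, so any real text outside the book elements would still be copied to the output.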
