

Occasional XSLT for Experienced Software Developers : Page 4

Although using XSLT to process XML is increasingly common, most developers still use it only occasionally—and often treat it as just another procedural language. But that's not the best way to use XSLT. Learn how to simplify and improve your XSLT processing using event-driven and declarative techniques.





Key Indexing in XSLT
You can simplify a fair portion of XSLT processing if you understand how to use keys. Keys in XSLT have more or less the same meaning that indexes have in relational databases, except that in XSLT, keys index hierarchical structure rather than relational structure. It's easiest to explain keys with an example.

Imagine that you need to count the number of book copies available for each book title and display them in an HTML table, where each row looks like this:

...
<tr>
  <td>JavaScript: The Definitive Guide</td>
  <td>2</td>
</tr>
...

Here's a possible solution that illustrates the use of keys:

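<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output indent="yes" encoding="utf-8"/>

  <!-- Index every book element by the value of its title child -->
  <xsl:key name="kbook" match="book" use="title"/>

  <!-- Wrap all generated rows in a single table -->
  <xsl:template match="/">
    <table>
      <xsl:apply-templates select="node()|@*"/>
    </table>
  </xsl:template>

  <!-- One row per book: its title and the number of books sharing that title -->
  <xsl:template match="book">
    <tr>
      <td>
        <xsl:value-of select="title"/>
      </td><td>
        <xsl:value-of select="count(key('kbook',title))"/>
      </td>
    </tr>
  </xsl:template>

  <!-- Suppress text nodes that the built-in rules would otherwise copy -->
  <xsl:template match="text()"/>
</xsl:stylesheet>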

In the preceding example, the key declaration has three parts: the name attribute, which is how you refer to the key later in the code; the match attribute, a pattern that identifies which nodes of the input data (elements or attributes) to index; and the use attribute, an XPath expression that is evaluated relative to each matched node to produce the value under which that node is indexed. XPath is a language for addressing parts of an XML document, designed to be used by XSLT and XPointer. See the XPath language specification for more information.

In this particular case, the declaration <xsl:key name="kbook" match="book" use="title"/> literally means: create a key named kbook over all book elements and index each one by the value of its title child.
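One way to read this declaration is as an indexed version of a path search. The following is a minimal sketch, not part of the original stylesheet: it assumes an input in which book elements each have a title child, and it introduces a hypothetical parameter named wanted. It counts the books with one title twice, once through key() and once through an equivalent unindexed predicate. Both expressions select the same nodes, so both lines of output report the same count; the difference is that a processor can build the key's lookup table once instead of rescanning the document for every lookup.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Same key declaration as in the article -->
  <xsl:key name="kbook" match="book" use="title"/>

  <!-- Hypothetical parameter (not in the original): the title to look up -->
  <xsl:param name="wanted" select="'JavaScript: The Definitive Guide'"/>

  <xsl:template match="/">
    <!-- Indexed lookup: all book nodes whose title equals $wanted -->
    <xsl:text>via key(): </xsl:text>
    <xsl:value-of select="count(key('kbook', $wanted))"/>

    <!-- Equivalent full scan of the document, without the key -->
    <xsl:text>&#10;via scan:  </xsl:text>
    <xsl:value-of select="count(//book[title = $wanted])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```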

The "book" template uses the key by calling the key() function with two arguments: the name of the key and the value to look up, which is compared against the values produced by the use attribute of the key declaration. Here the lookup value is simply title, the child of the context <book> node. As you would expect, this stylesheet produces two identical rows for the book "JavaScript: The Definitive Guide," as shown below.

<?xml version="1.0" encoding="utf-8"?>
<table>
  <tr>
    <td>JavaScript: The Definitive Guide</td>
    <td>2</td>
  </tr>
  <tr>
    <td>JavaScript: The Definitive Guide</td>
    <td>2</td>
  </tr>
  <tr>
    <td>Photoshop 6 for Professionals</td>
    <td>1</td>
  </tr>
</table>
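The input document is not reproduced on this page. As an illustration only, a minimal input that would yield the three rows above could look like the following. The root element name library is a hypothetical choice; the stylesheet itself relies only on book elements and their title children, and the real input may well carry more detail per book.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input: only the book and title names are taken from the stylesheet -->
<library>
  <book><title>JavaScript: The Definitive Guide</title></book>
  <book><title>JavaScript: The Definitive Guide</title></book>
  <book><title>Photoshop 6 for Professionals</title></book>
</library>
```

Each book element triggers the "book" template once, which is why the title with two copies appears in two rows, each showing a count of 2.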

That leads to another common XSLT problem: removing duplicates.

Removing Duplicates: the Muenchian Method
Because XSLT is a declarative language that is almost free of side effects, removing duplicates becomes surprisingly awkward, even though it is a trivial task in imperative languages such as C++ or Java: a template cannot update a variable to record which titles it has already seen. Fortunately, an elegant solution exists. It is unexpected enough to have earned its own name, the "Muenchian" method, because Steve Muench was reportedly the first to discover it.

<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output indent="yes" encoding="utf-8"/>

  <!-- Index every book element by the value of its title child -->
  <xsl:key name="kbook" match="book" use="title"/>

  <xsl:template match="/">
    <table>
      <xsl:apply-templates select="node()|@*"/>
    </table>
  </xsl:template>

  <xsl:template match="book">
    <!-- Emit a row only if this book is the first one indexed under its title -->
    <xsl:if test="generate-id()= generate-id(key('kbook',title)[1])">
      <tr>
        <td>
          <xsl:value-of select="title"/>
        </td><td>
          <xsl:value-of select="count(key('kbook',title))"/>
        </td>
      </tr>
    </xsl:if>
  </xsl:template>

  <!-- Suppress text nodes that the built-in rules would otherwise copy -->
  <xsl:template match="text()"/>
</xsl:stylesheet>

Notice that the key declaration in this example is identical to the one in the previous example. The generate-id() function returns a string that uniquely identifies a node; within a single transformation, passing in the same <book> node always yields the same ID. The ID value depends on which XSLT processor implementation you're using, but typically it is something like n1n1 or d1md1 or some other meaningless string.

This example uses the key in a conditional expression that compares the ID of the current node with the ID of the first node returned by the key lookup for the current node's title. For example, the key lookup for "JavaScript: The Definitive Guide" returns two nodes in document order; call them node 1 and node 2. During execution, the template matching <book> is applied to each of those same two nodes. When processing node 1, the ID of the current node equals the ID of key('kbook','JavaScript: The Definitive Guide')[1], so the condition is true; when processing node 2, the condition is false. Thus, the stylesheet outputs only one row for the title "JavaScript: The Definitive Guide."
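The same comparison can also be moved out of the template body and into the select expression, so that only the first book of each title is ever dispatched to the template. The following is a sketch of that common variant rather than the stylesheet discussed above; it assumes that book elements may appear at any depth in the input, hence the //book path.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output indent="yes" encoding="utf-8"/>

  <!-- Same key declaration as before -->
  <xsl:key name="kbook" match="book" use="title"/>

  <xsl:template match="/">
    <table>
      <!-- Select only the first book in each group of equal titles -->
      <xsl:apply-templates
          select="//book[generate-id() = generate-id(key('kbook', title)[1])]"/>
    </table>
  </xsl:template>

  <!-- Runs once per distinct title, so no xsl:if is needed here -->
  <xsl:template match="book">
    <tr>
      <td><xsl:value-of select="title"/></td>
      <td><xsl:value-of select="count(key('kbook', title))"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

Because templates are applied only to the selected book elements, the empty text() template is no longer needed. For an input with two copies of one title and one copy of another, this variant produces one row per title, with counts of 2 and 1.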
