

Introduction to XQuery (Part 2 of 4) : Page 4

The XQuery data model recognizes seven types of nodes. Learn how to take advantage of the data model in XQuery expressions.





Path Expressions (XPath 2.0)
A path expression is an XPath 2.0 expression. XQuery is a superset of XPath 2.0 in the sense that any valid XPath 2.0 expression is also a valid XQuery 1.0 expression and returns the same results. Here's a quick tutorial on XPath 2.0, intended to give you just enough of an introduction to follow the sample queries in this series.

A path expression identifies a set of nodes (one or more nodes) in an XML document. As described previously, the XQuery 1.0/XPath 2.0 data model views an XML document as a tree of nodes. A path expression traces a path through that tree, identifying all the nodes to be retrieved by the expression. The result of evaluating a path expression is typically a sequence of nodes, a single node, or simple values such as strings and numbers.

All samples in this section are based on the sample document PO.xml. The tree view of this document has a document node, whose child is the document element <polist>. The <polist> element has two <po> child elements and so on.

Here's an example. Suppose you wanted to retrieve a list of all line items from all purchase orders. The query you would compose looks like this:

# see XQuery211.ixq in samples.zip
document("data/PO.xml")/ns:polist/po/lineitems/lineitem

Let's dissect the expression. Path expressions contain one or more location steps, separated by axis specifiers such as the forward slash (/). To trace a path through the tree, the expression starts at the document node, which is the root node for the document PO.xml. The document() function returns the document node. The axis specifier that follows the document() function is the forward slash (/), which is shorthand for the child axis. In other words, after identifying the document node, you want to search its child nodes. So the query next steps down to the document element, which is the <polist> element. The ns: is a namespace prefix, shorthand for the full namespace URI "http://www.ipedo.com/XQueryExample". How the prefix-to-URI mapping is provided to a query processor is processor dependent; however, if you use the path expression as part of an XQuery query, you can provide the namespace mapping using a namespace clause.
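To make the prefix-to-URI mapping concrete, here is a minimal sketch in Python that uses the standard library's xml.etree.ElementTree as a stand-in processor. ElementTree implements only a small XPath subset and is not an XQuery engine, so treat this purely as an illustration. The sample XML is a hypothetical approximation of PO.xml (only the element names, the id attribute, and the namespace URI come from this article), and load_document() is an invented helper that mimics the document node returned by document().

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for PO.xml; the item values are invented.
PO_XML = """\
<ns:polist xmlns:ns="http://www.ipedo.com/XQueryExample">
  <po id="0001">
    <lineitems>
      <lineitem>widget</lineitem>
      <lineitem>gadget</lineitem>
    </lineitems>
  </po>
  <po id="0002">
    <lineitems>
      <lineitem>sprocket</lineitem>
    </lineitems>
  </po>
</ns:polist>
"""

# The prefix-to-URI mapping handed to the processor (here, ElementTree).
NAMESPACES = {"ns": "http://www.ipedo.com/XQueryExample"}


def load_document(xml_text):
    """Return a stand-in document node whose only child is the document element."""
    document_node = ET.Element("document-node")  # invented wrapper, not part of the XML
    document_node.append(ET.fromstring(xml_text))
    return document_node


doc = load_document(PO_XML)

# The prefixed step and the fully expanded {URI}name form select the same element.
by_prefix = doc.findall("ns:polist", NAMESPACES)
by_uri = doc.findall("{http://www.ipedo.com/XQueryExample}polist")

print(len(by_prefix), by_prefix == by_uri)
```

The prefix itself carries no meaning; only the URI it maps to is compared, which is why both lookups find the same <polist> element.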

Next, the query continues down the tree (because the next axis specifier is again /) to the <po> elements. For each <po> element, the query continues down to the <lineitems> elements. For each <lineitems> element selected, the query finds all the <lineitem> child elements. At that point, the expression ends, so the query processor returns a sequence of <lineitem> elements.
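The step-by-step walk described above can be sketched in Python with the standard library's xml.etree.ElementTree, which supports a small XPath subset (it is not an XQuery processor). The sample XML is a hypothetical approximation of PO.xml, and load_document() and line_items_step_by_step() are invented helpers: the first mimics the document node, the second spells out each child-axis step as a loop so it can be compared with the single path expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for PO.xml; the item values are invented.
PO_XML = """\
<ns:polist xmlns:ns="http://www.ipedo.com/XQueryExample">
  <po id="0001">
    <lineitems>
      <lineitem>widget</lineitem>
      <lineitem>gadget</lineitem>
    </lineitems>
  </po>
  <po id="0002">
    <lineitems>
      <lineitem>sprocket</lineitem>
    </lineitems>
  </po>
</ns:polist>
"""

NAMESPACES = {"ns": "http://www.ipedo.com/XQueryExample"}


def load_document(xml_text):
    """Return a stand-in document node whose only child is the document element."""
    document_node = ET.Element("document-node")  # invented wrapper, not part of the XML
    document_node.append(ET.fromstring(xml_text))
    return document_node


def line_items_step_by_step(document_node):
    """Follow the child axis one location step at a time."""
    result = []
    for polist in document_node.findall("ns:polist", NAMESPACES):
        for po in polist.findall("po"):
            for lineitems in po.findall("lineitems"):
                result.extend(lineitems.findall("lineitem"))
    return result


doc = load_document(PO_XML)

# The whole path as one expression, relative to the stand-in document node.
via_path = doc.findall("ns:polist/po/lineitems/lineitem", NAMESPACES)
via_steps = line_items_step_by_step(doc)

print([item.text for item in via_path])
print(via_path == via_steps)
```

Both approaches return the <lineitem> elements in document order; the path expression simply states the same traversal declaratively.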

You can make XPath queries more specific by adding a predicate to any location step. A predicate is a condition enclosed in square brackets following a location step. For example, to retrieve just the <lineitem> elements for the purchase order with the id 0001, you can write this expression:

# see XQuery212.ixq in samples.zip
document("data/PO.xml")/ns:polist/po[@id="0001"]/lineitems/lineitem

The preceding query applies the predicate to the location step that selects the <po> elements. To be included in the query path, a <po> element must have an attribute named id that has the value 0001 (you refer to attributes by preceding their names with the @ symbol).
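As a rough illustration of how the predicate narrows the <po> step, the following Python sketch uses the standard library's xml.etree.ElementTree, whose limited XPath subset includes attribute-equality predicates. The sample XML is a hypothetical approximation of PO.xml, and load_document() is an invented helper that mimics the document node.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for PO.xml; the item values are invented.
PO_XML = """\
<ns:polist xmlns:ns="http://www.ipedo.com/XQueryExample">
  <po id="0001">
    <lineitems>
      <lineitem>widget</lineitem>
      <lineitem>gadget</lineitem>
    </lineitems>
  </po>
  <po id="0002">
    <lineitems>
      <lineitem>sprocket</lineitem>
    </lineitems>
  </po>
</ns:polist>
"""

NAMESPACES = {"ns": "http://www.ipedo.com/XQueryExample"}


def load_document(xml_text):
    """Return a stand-in document node whose only child is the document element."""
    document_node = ET.Element("document-node")  # invented wrapper, not part of the XML
    document_node.append(ET.fromstring(xml_text))
    return document_node


doc = load_document(PO_XML)

# Predicate on the <po> step: only purchase orders whose id attribute is 0001.
filtered = doc.findall("ns:polist/po[@id='0001']/lineitems/lineitem", NAMESPACES)

# The same filter written out by hand, to show what the predicate tests.
manual = [
    item
    for po in doc.findall("ns:polist/po", NAMESPACES)
    if po.get("id") == "0001"
    for item in po.findall("lineitems/lineitem")
]

print([item.text for item in filtered])
print(filtered == manual)
```

Because the predicate sits on the <po> step, the later steps only ever see children of the matching purchase order.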

You can take shortcuts when specifying path expressions. For example, to retrieve all the <lineitem> elements in the document PO.xml, you can use the axis specifier //, which is shorthand for the descendant axis (strictly, // abbreviates /descendant-or-self::node()/) and selects all descendant nodes of the nodes selected in the previous step in the path. This query selects all the <lineitem> descendants of the document node in the document "PO.xml":

# see XQuery213.ixq in samples.zip
document("data/PO.xml")//lineitem

In general, you should avoid using the descendant axis and try to specify the node path as specifically as possible. Not only are generalized descendant queries expensive, because they often select more data than you need, but they're also relatively slow, because the processor can't optimize the search in the absence of appropriate indexes.
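The point about selecting more data than you need can be sketched in Python with the standard library's xml.etree.ElementTree (a limited XPath subset, not an XQuery processor). The sample XML is a hypothetical variant of PO.xml: the <returns> element is invented here solely to show a <lineitem> that lives outside <lineitems>, and load_document() is an invented helper that mimics the document node.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of PO.xml; <returns> is invented for this illustration.
PO_XML = """\
<ns:polist xmlns:ns="http://www.ipedo.com/XQueryExample">
  <po id="0001">
    <lineitems>
      <lineitem>widget</lineitem>
      <lineitem>gadget</lineitem>
    </lineitems>
    <returns>
      <lineitem>broken widget</lineitem>
    </returns>
  </po>
  <po id="0002">
    <lineitems>
      <lineitem>sprocket</lineitem>
    </lineitems>
  </po>
</ns:polist>
"""

NAMESPACES = {"ns": "http://www.ipedo.com/XQueryExample"}


def load_document(xml_text):
    """Return a stand-in document node whose only child is the document element."""
    document_node = ET.Element("document-node")  # invented wrapper, not part of the XML
    document_node.append(ET.fromstring(xml_text))
    return document_node


doc = load_document(PO_XML)

# Descendant search: every <lineitem> anywhere below the document node.
anywhere = doc.findall(".//lineitem")

# Explicit path: only <lineitem> children of <lineitems> inside a <po>.
explicit = doc.findall("ns:polist/po/lineitems/lineitem", NAMESPACES)

print([item.text for item in anywhere])
print([item.text for item in explicit])
```

In this invented document the descendant search also picks up the <lineitem> under <returns>, while the explicit path returns only the ordered items. Whether a real processor is slower on // depends on its indexes and optimizer, which this sketch does not measure.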

Now that you've seen a high-level overview of the XQuery 1.0 and XPath 2.0 data model, and looked at XQuery and XPath expressions in greater detail, you're ready to move on. Part III of this Introduction to XQuery series will introduce one of the most versatile XQuery expression types, FLWR expressions, and will also look at element constructors and the different syntaxes for element and attribute construction.

Srinivas Pandrangi is an architect at Ipedo and a member of the W3C XML Query working group, where he is working on the standardization of XQuery. His article on XML performance techniques was published earlier this year in XML Journal. You can reach him at srinivas@ipedo.com.