Path Expressions (XPath 2.0)
A path expression is an XPath 2.0 expression. XQuery is a superset of XPath 2.0 in the sense that any valid XPath 2.0 expression is also a valid XQuery 1.0 expression and returns the same results. Here's a quick tutorial on XPath 2.0, intended to give you just enough of an introduction to follow the sample queries in this series.
A path expression identifies a set of nodes in an XML document. As described previously, the XQuery 1.0/XPath 2.0 data model views an XML document as a tree of nodes. A path expression traces a path through that tree, identifying all the nodes to be retrieved by the expression. The result of evaluating a path expression is typically a sequence of nodes, a single node, or simple values such as strings and numbers.
All samples in this section are based on the sample document PO.xml. The tree view of this document has a document node, whose child is the document element <polist>. The <polist> element has two <po> child elements, and so on.
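The tree shape described above can be made concrete with a small sketch. PO.xml itself is not reproduced here, so the document below is a hypothetical stand-in: only the element names, the id attribute, and the namespace URI come from this article, while the second id value and the line item contents are invented for illustration. The sketch uses Python's standard xml.etree.ElementTree module rather than an XQuery processor.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.ipedo.com/XQueryExample"

# Hypothetical stand-in for PO.xml. Only the element names, the id attribute,
# and the namespace URI come from the article; everything else is invented.
PO_XML = f"""<ns:polist xmlns:ns="{NS_URI}">
  <po id="0001">
    <lineitems>
      <lineitem>pen</lineitem>
      <lineitem>paper</lineitem>
    </lineitems>
  </po>
  <po id="0002">
    <lineitems>
      <lineitem>ink</lineitem>
    </lineitems>
  </po>
</ns:polist>
"""

# fromstring() returns the document element, here <polist>.
root = ET.fromstring(PO_XML)

# The direct children of the document element are the two <po> elements.
child_tags = [child.tag for child in root]

# ElementTree writes a namespaced element name as "{uri}local-name".
print(root.tag)
print(child_tags)
```

Note that the unprefixed <po> children are in no namespace in this stand-in, which matches the way the sample queries prefix only the polist step.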
Here's an example. Suppose you wanted to retrieve a list of all line items from all purchase orders. The query you would compose looks like this:
# see XQuery211.ixq in samples.zip
document("data/PO.xml")/ns:polist/po/lineitems/lineitem
Let's dissect the expression. Path expressions contain one or more location steps, separated by an axis specifier such as the forward slash (/). To trace a path through a tree, the expression starts at the document node, which is the root node for the document PO.xml. The document() function returns the document node. The axis specifier that follows the document() function is the forward slash (/), which is shorthand for the child axis. In other words, after identifying the document node, you want to search its child nodes. So the query next steps down to the document element, which is the <polist> element. The ns: is a namespace prefix, a shorthand notation for the full namespace URI "http://www.ipedo.com/XQueryExample". How the prefix-to-URI mapping is provided to a query processor is processor dependent; however, if you use the path expression as part of an XQuery query, you can provide the namespace mapping using a namespace clause.
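As one illustration of a processor-dependent prefix mapping, here is a minimal sketch using Python's standard xml.etree.ElementTree, which takes the mapping as a dictionary argument. It is not the mechanism of any particular XQuery processor. The one-element document and the wrapper element that stands in for the document node are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.ipedo.com/XQueryExample"

# Hypothetical one-element stand-in for PO.xml.
root = ET.fromstring(f'<ns:polist xmlns:ns="{NS_URI}"><po id="0001"/></ns:polist>')

# ElementTree has no document node, so a wrapper element plays that role here.
doc = ET.Element("document-node")
doc.append(root)

# The prefix used in the path is resolved through the mapping passed in.
with_ns = doc.findall("ns:polist", namespaces={"ns": NS_URI})

# What matters is the URI, not the prefix: a different prefix bound to the
# same URI selects the same element.
other_prefix = doc.findall("p:polist", namespaces={"p": NS_URI})

# Without a prefix, the step names an element in no namespace and finds nothing.
no_prefix = doc.findall("polist")

print(len(with_ns), len(other_prefix), len(no_prefix))
```

The prefix is only a local abbreviation. Two queries that bind different prefixes to the same URI select the same nodes.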
Next, the query continues down the tree (because the next axis specifier is again /) to the <po> elements. For each <po> element, the query continues down to its <lineitems> element. For each <lineitems> element selected, the query finds all the <lineitem> child elements. At that point the expression ends, so the query processor returns a sequence of <lineitem> elements.
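The step-by-step walk described above can be sketched in Python with the standard xml.etree.ElementTree module, which supports a subset of XPath. The document is a hypothetical stand-in for PO.xml (the second id and the line item contents are invented), and a wrapper element stands in for the document node, which ElementTree does not model. The sketch evaluates one location step at a time and then compares the result with the whole path evaluated at once.

```python
import xml.etree.ElementTree as ET

NS = {"ns": "http://www.ipedo.com/XQueryExample"}

# Hypothetical stand-in for PO.xml; contents are invented for illustration.
PO_XML = """<ns:polist xmlns:ns="http://www.ipedo.com/XQueryExample">
  <po id="0001">
    <lineitems><lineitem>pen</lineitem><lineitem>paper</lineitem></lineitems>
  </po>
  <po id="0002">
    <lineitems><lineitem>ink</lineitem></lineitems>
  </po>
</ns:polist>
"""

# A wrapper element plays the role of the document node.
doc = ET.Element("document-node")
doc.append(ET.fromstring(PO_XML))

# Each step searches the children of every node selected by the previous step.
polists = doc.findall("ns:polist", NS)
pos = [po for polist in polists for po in polist.findall("po")]
groups = [group for po in pos for group in po.findall("lineitems")]
step_by_step = [item for group in groups for item in group.findall("lineitem")]

# The full path expression selects the same nodes in a single call.
one_shot = doc.findall("ns:polist/po/lineitems/lineitem", NS)

print([item.text for item in one_shot])
```

Both approaches return the <lineitem> elements in document order, which is why the two lists can be compared directly.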
You can make XPath queries more specific by adding a predicate to any location step. A predicate is a condition enclosed in square brackets following a location step. For example, to retrieve just the <lineitem> elements for the purchase order with the id 0001, you can write this expression:
# see XQuery212.ixq in samples.zip
document("data/PO.xml")
/ns:polist/po[@id="0001"]/lineitems/lineitem
The preceding query applies the predicate to the location step that selects the <po> elements. To be included in the query path, a <po> element must have an attribute named id (you refer to attributes by preceding their names with the @ symbol) that has the value 0001.
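A minimal sketch of the same attribute predicate, using Python's standard xml.etree.ElementTree, which supports the [@attribute='value'] form. The document is a hypothetical stand-in for PO.xml: the second id and the line item contents are invented. The sketch starts at the document element rather than the document node, so the path begins at the po step.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for PO.xml; contents are invented for illustration.
PO_XML = """<ns:polist xmlns:ns="http://www.ipedo.com/XQueryExample">
  <po id="0001">
    <lineitems><lineitem>pen</lineitem><lineitem>paper</lineitem></lineitems>
  </po>
  <po id="0002">
    <lineitems><lineitem>ink</lineitem></lineitems>
  </po>
</ns:polist>
"""

root = ET.fromstring(PO_XML)  # the <polist> document element

# The predicate filters the <po> step; later steps see only the matching <po>.
filtered = root.findall("po[@id='0001']/lineitems/lineitem")

# The same filter spelled out by hand: keep a <po> only if it has an id
# attribute whose value is "0001", then continue down the path.
by_hand = [
    item
    for po in root.findall("po")
    if po.get("id") == "0001"
    for item in po.findall("lineitems/lineitem")
]

print([item.text for item in filtered])
```

A <po> element with no id attribute at all would also be dropped, because po.get("id") returns None for it.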
You can take shortcuts when specifying path expressions. For example, if you want to retrieve all the <lineitem> elements in the document PO.xml, you can use the axis specifier //, which is shorthand for the descendant axis (formally, it abbreviates /descendant-or-self::node()/) and selects all descendant nodes of the nodes selected in the previous step in the path. For example, this query selects all the <lineitem> descendants of the document node in the document PO.xml:
# see XQuery213.ixq in samples.zip
document("data/PO.xml")//lineitem
In general, you should avoid the descendant axis and specify the node path as precisely as possible. Generalized descendant queries are expensive because they often select more data than you need, and they are relatively slow because, in the absence of appropriate indexes, the processor can't optimize the search.
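To show how a descendant search can select more than an explicit path, here is a small sketch using Python's standard xml.etree.ElementTree. The document is hypothetical. In particular, the <archive> element is invented purely for this comparison and is not claimed to exist in PO.xml. The sketch compares results only and makes no attempt to measure speed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the <archive> element is invented for illustration
# and is not part of the article's PO.xml.
PO_XML = """<ns:polist xmlns:ns="http://www.ipedo.com/XQueryExample">
  <po id="0001">
    <lineitems><lineitem>pen</lineitem><lineitem>paper</lineitem></lineitems>
  </po>
  <po id="0002">
    <lineitems><lineitem>ink</lineitem></lineitems>
  </po>
  <archive><lineitem>old stock</lineitem></archive>
</ns:polist>
"""

root = ET.fromstring(PO_XML)  # the <polist> document element

# Explicit path: only line items that sit under po/lineitems.
explicit = [item.text for item in root.findall("po/lineitems/lineitem")]

# Descendant search: every <lineitem> anywhere below the document element.
anywhere = [item.text for item in root.findall(".//lineitem")]

print(explicit)
print(anywhere)
```

The descendant search also returns the archived item, so a caller that wanted only purchase-order line items would have to filter the result afterward.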
Now that you've seen a high-level overview of the XQuery 1.0 and XPath 2.0 data model, and looked at XQuery and XPath expressions in greater detail, you're ready to move on. Part III of this Introduction to XQuery series will introduce one of the most versatile XQuery expression types, FLWR expressions, and will also look at element constructors and the different syntaxes for element and attribute construction.