Introduction to XQuery (Part 2 of 4)

You can find a quick introduction to the XQuery language, with examples of several kinds of XQuery expressions, in Part I of this series. In this part you’ll explore some basic XQuery concepts.

Data Model
XQuery’s data model describes the values that occur in the evaluation of a query; the inputs, outputs and intermediate values that occur during the processing of a query are all instances of the data model. The XQuery data model contains these important categories of values:

Simple Typed Values. A simple typed value conforms to a simple type as defined by the XML Schema specification, which includes strings, integers, dates, etc. You can find the complete list of simple types in the XML Schema specification, Part 2.

Nodes. The XQuery data model recognizes seven types of nodes: document, element, attribute, text, comment, processing-instruction and namespace nodes. The model treats XML documents and document fragments as trees of nodes. A document is a tree with the document node as the root of the tree. Similarly, a document fragment is a tree with an element at its root. Document nodes and namespace nodes have no parents. All other nodes can have zero or one parent. Only a document node or an element node can have children. Here’s a simple XML document and the corresponding data model instance.

                  Hello XQuery User   

The data model instance representing this document looks like Figure 1:

Figure 1: The data model representing a simple XML document containing a root element and a nested element that holds a text message.
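
The tree in Figure 1 can be approximated outside XQuery. The following is a minimal Python sketch that uses the standard xml.dom.minidom module, whose node kinds map loosely onto the document, element and text nodes described above. The element names greeting and message are hypothetical stand-ins; only the text content comes from the sample.

```python
from xml.dom import Node, minidom

# Hypothetical element names; only the text content comes from the sample.
SAMPLE = "<greeting><message>Hello XQuery User</message></greeting>"

KIND_NAMES = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
}


def outline(node, depth=0):
    """Return one indented line per node, in document order."""
    kind = KIND_NAMES.get(node.nodeType, "other")
    if node.nodeType == Node.ELEMENT_NODE:
        label = f"{kind} {node.tagName}"
    elif node.nodeType == Node.TEXT_NODE:
        label = f"{kind} {node.data!r}"
    else:
        label = kind
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


document_node = minidom.parseString(SAMPLE)
tree_lines = outline(document_node)
print("\n".join(tree_lines))
```

The document node sits at the root and has no parent, the two elements nest beneath it, and the text node is a leaf, which matches the parent and child rules listed for the data model.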

Sequences. A sequence is an ordered list of simple values or nodes. Sequences are flat, which means that a sequence cannot contain another sequence as its member. A sequence containing only one member is referred to as a singleton sequence. From a data model point of view, there is no distinction between a singleton sequence containing a simple value and the simple value itself.
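
To make the flatness rule concrete, here is a small Python sketch (not XQuery) that models a sequence constructor. The function name sequence is hypothetical, and Python lists and tuples stand in for XQuery sequences; the point is only that nesting disappears when a sequence is built.

```python
def sequence(*members):
    """Model an XQuery sequence constructor: nested sequences are flattened."""
    flat = []
    for member in members:
        if isinstance(member, (list, tuple)):
            # A sequence cannot contain another sequence, so splice it in.
            flat.extend(sequence(*member))
        else:
            flat.append(member)
    return flat


nested = sequence(1, (2, 3), (), ("a", ("b",)))
singleton_value = sequence("s1")
singleton_sequence = sequence(("s1",))

print(nested)
print(singleton_value == singleton_sequence)
```

In this model the empty sequence contributes nothing, and wrapping a single value in a sequence produces the same result as the value alone, mirroring the singleton rule.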

Query Expressions. Every construct in XQuery is an expression; expressions are the basic building blocks of the language. The latest specification defines several kinds of expression:

  • Constants and Variables
  • Operators
  • Path Expressions
  • FLWR (For-Let-Where-Return) expressions
  • Element/Attribute Constructors
  • Conditional expressions
  • Quantified Expressions
  • Built-in and User Defined Functions

Constants and Variables
Here’s an example query that uses constant literals and variables. The following query finds the customer whose customer number is 1003 from the sample document customers.xml.

   # see XQuery21.ixq in samples.zip
   let $custno := "1003"
   return document("data/customers.xml")
      //customer[custno=$custno]

Constants and variables are some of the most basic expressions in XQuery. Constants take the form of literal representations of values. For example, consider the following constants:

      "John Doe"      1      true      "2002-05-29"

Variable references always start with a dollar sign ($), followed by the variable’s name. In the preceding example, "1003" is a constant and $custno is a variable. You bind variables to values using FLWR expressions or Quantified expressions. The following example binds the results from executing a path expression (XPath expression) to a variable. It returns the list of all itemno elements in the items.xml document.

   # see XQuery22.ixq in samples.zip
   for $i in document("data/items.xml")//item/itemno
   return $i

Here’s another example. The following expression binds a sequence to the variable and returns the sequence. This expression also introduces the sequence constructor (the parentheses), used here to construct a sequence of two strings.

   # see XQuery23.ixq in samples.zip
   let $i := ("s1", "s2")
   return $i

You can also pass variables to a function as parameters. The next example binds the sequence of nodes returned by a path expression to the variable $items, and then invokes the built-in count() function with this $items variable as a parameter. The expression therefore returns the count of item elements in the items.xml document.

   # see XQuery24.ixq in samples.zip
   let $items := document("data/items.xml")//item
   return count($items)

The variable scoping rule in XQuery is quite straightforward. The inner binding of a variable always overrides any in-scope outer binding with the same name. In other words, the most localized scope definition takes precedence. For example:

   # see XQuery25.ixq in samples.zip
   for $s in (1 to 2)
   let $s := (3 to 4)
   return $s

The preceding query returns the sequence (3,4,3,4) because the definition of $s in the let clause overrides the $s definition in the for clause.

Operators
XQuery is a functional language, but like other programming languages it provides almost all the operators you typically encounter elsewhere. In addition, it defines some XQuery-specific operators, such as the except and intersect operators.

Arithmetic operators. The arithmetic operators in XQuery are the same as the arithmetic operators in other languages. The argument and the return types are basic numeric types. The following example shows how you might use arithmetic operators in XQuery:

   # see XQuery26.ixq in samples.zip
   # Calculate the unit price, tax and total amount for item number '0021':
   let $price := document("data/PO.xml")
      //item[itemno='0021']/price
   let $quantity := document("data/PO.xml")
      //po[@id='0001']/lineitems/lineitem/quantity
   let $taxrate := document("data/PO.xml")
      //item[itemno='0021']/taxrate
   return
        {$price div $quantity}
        {$price * $taxrate}
        {$price + $price * $taxrate}

Comparison operators. Comparison operations take two operands of the same type and return a Boolean value. XQuery supports comparing two operands of different types if one of the operands can be converted to the same type as the other. The following query computes an approval level for each item in a purchase order based on the item’s price.

   # see XQuery27.ixq in samples.zip
   # Find out the approval level for all items:
   for $item in document("data/PO.xml")//item
   return
        {$item/itemno}
        {if ($item/price >= 500) then 3 else
        if ($item/price >= 200) then 2 else
        if ($item/price

Boolean operators. XQuery supports the basic Boolean operations: and, or, and not. Here’s a query that returns an approval level for items based on the item’s price range. Note how the query uses the Boolean and operator to test the price ranges.

   # see XQuery28.ixq in samples.zip
   # Find out the approval level for all items using Boolean operators:
   for $item in document("data/PO.xml")//item
   return
        {$item/itemno}
        {if ($item/price >= 500 and $item/price = 200 and $item/price  50)
           then 1 else 0}
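
The range tests that the Boolean and operator combines can be sketched in Python. This is an illustration under assumptions rather than a translation of the sample file: the thresholds 500, 200 and 50 appear in the queries above, but the exact boundaries of each range, the function name approval_level and the sample prices are assumptions.

```python
def approval_level(price):
    """Map a price to an approval level using explicit range tests."""
    if price >= 500:
        return 3
    # Each range is the conjunction of a lower and an upper bound,
    # which is where the Boolean "and" comes in.
    if price >= 200 and price < 500:
        return 2
    if price > 50 and price < 200:
        return 1
    return 0


sample_prices = [600, 200, 120, 50]  # hypothetical prices
levels = [approval_level(price) for price in sample_prices]
print(levels)
```

Because each branch states both bounds, the tests do not depend on the order in which they are evaluated, unlike the cascading comparison form, where an earlier test implicitly supplies the upper bound for the next.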

Sequence operators. XQuery defines a number of operators that apply specifically to sequences: item-at, concatenate, except, union, and intersect. The following example shows how to use the except operator.

   # see XQuery29.ixq in samples.zip
   # Find all items except the 'Finishing good' type item
   let $allitem :=
      document("data/items.xml")//item/itemno
   let $orderitem :=
      document("data/items.xml")
      //item[ItemType='Finishing good']/itemno
   return
      {$allitem except $orderitem}
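
For readers who want to experiment without an XQuery processor, the effect of except can be approximated in Python with the standard xml.etree.ElementTree module. This is a rough sketch: the inline document, its item numbers and the helper node_except are hypothetical, and the helper compares nodes by identity without removing duplicates.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for items.xml.
ITEMS_XML = """<items>
  <item><itemno>0011</itemno><ItemType>Raw material</ItemType></item>
  <item><itemno>0021</itemno><ItemType>Finishing good</ItemType></item>
  <item><itemno>0031</itemno><ItemType>Raw material</ItemType></item>
</items>"""


def node_except(left, right):
    """Keep the nodes of `left` that are not, by identity, in `right`."""
    excluded = {id(node) for node in right}
    return [node for node in left if id(node) not in excluded]


root = ET.fromstring(ITEMS_XML)
all_items = root.findall(".//item/itemno")
finished = root.findall(".//item[ItemType='Finishing good']/itemno")

remaining = [node.text for node in node_except(all_items, finished)]
print(remaining)
```

The comparison is on node identity rather than on text, so two different itemno elements that happen to hold the same number would still count as distinct nodes.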

Node Operators. XQuery defines a few node-specific operators: node-after (>>), node-before (<<), and node-equal (==, !==). The following query demonstrates how to use the >> operator.

   # see XQuery210.ixq in samples.zip
   # Find all the items that occur after item '0031' in document order:
   let $theitem :=
      document("data/items.xml")
      //item[itemno='0031']/itemno
   for $allitem in
      document("data/items.xml")//item/itemno
   where $allitem >> $theitem
   return
      {$allitem}
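
Document order, which the >> operator compares, can be modelled in Python by numbering the nodes during a depth-first walk. The sketch below assumes a hypothetical items document, and node_after is an illustrative helper rather than part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for items.xml.
ITEMS_XML = """<items>
  <item><itemno>0011</itemno></item>
  <item><itemno>0031</itemno></item>
  <item><itemno>0041</itemno></item>
  <item><itemno>0051</itemno></item>
</items>"""

root = ET.fromstring(ITEMS_XML)

# Element.iter() walks the tree depth-first, which is document order.
position = {id(node): index for index, node in enumerate(root.iter())}


def node_after(a, b):
    """True when node `a` follows node `b` in document order."""
    return position[id(a)] > position[id(b)]


the_item = root.find(".//item[itemno='0031']/itemno")
later = [
    node.text
    for node in root.findall(".//item/itemno")
    if node_after(node, the_item)
]
print(later)
```

The reference node itself is excluded, because a node does not come after itself.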

Other Operators. XQuery defines a number of operators on QName, Date, Time, anyURI, base64Binary and hexBinary that are not covered in this article.

Path Expressions (XPath 2.0)
A path expression is an XPath 2.0 expression. XQuery is a superset of XPath 2.0 in the sense that any valid XPath 2.0 expression is also a valid XQuery 1.0 expression and will return the same results. Here’s a quick tutorial on XPath 2.0, intended to give you just enough of an introduction so you can follow the sample queries in this series.

A path expression identifies a set of nodes (one or more nodes) in an XML document. As described previously, the XQuery 1.0/XPath 2.0 data model views an XML document as a tree of nodes. A path expression traces a path through such a tree, identifying all the nodes to be retrieved by the expression. The result of the evaluation of a path expression is typically either a sequence of nodes, a single node, or simple values such as strings, numbers, etc.

All samples in this section are based on the sample document PO.xml. The tree view of this document has a document node, whose child is the document element <ns:polist>. The <ns:polist> element has two <po> child elements, and so on.

Here’s an example. Suppose you wanted to retrieve a list of all line items from all purchase orders. The query you would compose looks like this:

   # see XQuery211.ixq in samples.zip
   document("data/PO.xml")/ns:polist/po/lineitems/lineitem

Let’s dissect the expression. Path expressions contain one or more location steps, separated by an axis specifier such as the forward slash (/). To trace a path through a tree the expression starts at the document node, which is the root node for the document PO.xml. The document() function returns the document node. The axis specifier that follows the document() function is the forward slash (/), which is a shorthand notation for the child axis. In other words, after identifying the document node, you want to search its child nodes. So, next the query steps down to the document element, which is the <ns:polist> element. The ns: is a namespace prefix, a shorthand notation for the full namespace URI “http://www.ipedo.com/XQueryExample”. How the prefix-URI mapping is provided to a query processor is processor dependent; however, if you use the path expression as part of an XQuery query, you can provide the namespace mapping using a namespace clause.

Next, the query continues down the tree (because the next axis specifier is again /) to the <po> elements. For each <po> element, the query continues down to the <lineitems> element. For each <lineitems> element selected, the query finds all the <lineitem> child elements. At that point, the expression ends, so the query processor returns a sequence of <lineitem> elements.

You can make more specific XPath queries by adding a predicate to any location step. A predicate is a condition enclosed in square brackets following a location step. For example, to retrieve just the <lineitem> elements for the purchase order with the id 0001, you can write this expression:

   # see XQuery212.ixq in samples.zip
   document("data/PO.xml")
      /ns:polist/po[@id="0001"]/lineitems/lineitem

The preceding query applies the predicate to the location step that selects the <po> elements. To be included in the query path, a <po> element must have an attribute named id that has the value 0001 (you refer to attributes by preceding their names with the @ symbol).
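
The step-by-step evaluation described above can be imitated in Python to show what each location step contributes. This is a sketch under assumptions: the inline document is a hypothetical stand-in for PO.xml, and child_step is an illustrative helper, not how a real query processor is implemented. The namespace URI is the one given in the text.

```python
import xml.etree.ElementTree as ET

NS = "http://www.ipedo.com/XQueryExample"

# Hypothetical stand-in for PO.xml.
PO_XML = f"""<ns:polist xmlns:ns="{NS}">
  <po id="0001">
    <lineitems>
      <lineitem><itemno>0021</itemno></lineitem>
      <lineitem><itemno>0031</itemno></lineitem>
    </lineitems>
  </po>
  <po id="0002">
    <lineitems>
      <lineitem><itemno>0041</itemno></lineitem>
    </lineitems>
  </po>
</ns:polist>"""


def child_step(nodes, name, predicate=None):
    """One location step on the child axis, with an optional predicate."""
    selected = []
    for node in nodes:
        for child in node:
            if child.tag == name and (predicate is None or predicate(child)):
                selected.append(child)
    return selected


# ElementTree hands back the document element directly.
document_element = ET.fromstring(PO_XML)

orders = child_step([document_element], "po", lambda po: po.get("id") == "0001")
containers = child_step(orders, "lineitems")
line_items = child_step(containers, "lineitem")

selected_numbers = [item.findtext("itemno") for item in line_items]
print(selected_numbers)
```

Each step takes the nodes selected by the previous step as its input, so the predicate on the po step narrows everything that follows it.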

You can take shortcuts to specify path expressions. For example, if you wanted to retrieve all the <lineitem> elements in the document PO.xml, you can use the axis specifier //, which is a shorthand notation for the descendant axis, and selects all descendant nodes of the nodes selected in the previous step in the path. For example, this query selects all the <lineitem> descendants of the document node in the document “PO.xml”:

   # see XQuery213.ixq in samples.zip
   document("data/PO.xml")//lineitem

In general you should avoid using the descendant axis and try to specify the node path as specifically as possible. Not only are generalized descendant queries expensive, because they often select more data than you need, but they’re also relatively slow, because the processor can’t optimize the search in the absence of appropriate indexes.
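
The trade-off can be seen on a small scale in Python. In this sketch the inline document is a hypothetical, namespace-free stand-in for PO.xml, and ElementTree has no indexes, so its descendant search simply walks every element; a real XQuery processor may behave differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free stand-in for PO.xml.
PO_XML = """<polist>
  <po id="0001">
    <lineitems>
      <lineitem><itemno>0021</itemno></lineitem>
      <lineitem><itemno>0031</itemno></lineitem>
    </lineitems>
  </po>
  <po id="0002">
    <lineitems>
      <lineitem><itemno>0041</itemno></lineitem>
    </lineitems>
  </po>
</polist>"""

root = ET.fromstring(PO_XML)

# Descendant search: every element in the tree is a candidate.
by_descendant = list(root.iter("lineitem"))

# Explicit path: only the named children are followed at each step.
by_explicit_path = root.findall("po/lineitems/lineitem")

elements_walked = sum(1 for _ in root.iter())
print(len(by_descendant), len(by_explicit_path), elements_walked)
```

Both forms select the same three elements here, but the descendant search has to consider all eleven elements in the tree, including the itemno elements that the explicit path never reaches.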

Now that you’ve seen a high-level overview of the XQuery 1.0 and XPath 2.0 data model, and looked at XQuery and XPath expressions in greater detail, you’re ready to move on. Part III of this Introduction to XQuery series will introduce one of the most versatile XQuery expression types, FLWR expressions, and will also look at element constructors and the different syntaxes for element and attribute construction.
