Introduction to XQuery (Part 3 of 4)

In this article, you’ll explore one of the most interesting types of XQuery expressions, the FLWR expression, and learn more about element constructors.

XQuery provides FLWR expressions for iterating over groups of nodes and for binding variables to intermediate results. The letters in the name “FLWR”, pronounced “flower”, stem from the keywords for, let, where, and return, the four clauses in a FLWR expression. FLWR expressions are useful for processing and restructuring data from one or more documents.

For example, the following query returns the list of email addresses of all customers in the document customers.xml.

   # see XQuery31.ixq in samples.zip
   {
      for $c in document("data/customers.xml")//customer
      return
         $c/email
   }

The query result looks like this:

   [email protected]
   [email protected]
   [email protected]
   [email protected]

FLWR Syntax and Semantics
The FLWR expression can be explained with the help of the block diagram shown in Figure 1.

Figure 1: The parts and processing sequence of an XQuery FLWR expression.

A FLWR expression must contain at least one for or let clause. A for or let clause evaluates an expression and assigns (binds) the result to a variable. The difference between the two is that when the expression returns a sequence with “n” members (see the bottom of the first page of Part 2 of this series), a for clause creates “n” different bindings, one per member, whereas a let clause creates a single binding in which the variable is bound to the entire sequence. These variable bindings are then passed to an optional where clause that filters them based on some condition. In this respect, the where clause in a FLWR expression is similar to the WHERE clause in a SQL SELECT statement. The return clause constructs the result of the FLWR expression and is invoked once for every variable binding that survives the where filter.
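The processing sequence described above can be modeled outside XQuery. The following Python sketch is only an analogy, not how any XQuery engine is implemented; the price values and the names for_bindings, let_bindings, and surviving are made up for illustration.

```python
# Hypothetical sample data standing in for the result of a path expression.
prices = [5, 12, 30]

# "for $x in ..." creates one binding (tuple) per member of the sequence.
for_bindings = [{"x": member} for member in prices]

# "let $all := ..." creates a single binding to the entire sequence.
let_bindings = [{"all": prices}]

# "where $x > 10" filters the stream of bindings.
surviving = [b for b in for_bindings if b["x"] > 10]

# "return" is evaluated once for every binding that survives the filter.
results = ["<price>{}</price>".format(b["x"]) for b in surviving]

print(len(for_bindings), len(let_bindings))
print(results)
```

Three bindings come out of the for clause and one out of the let clause; only two of the for bindings pass the filter, so the return step runs twice.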

For example, suppose you wanted to list all the items in the file items.xml that have been ordered by customers in the PO.xml file. You could do that using the following query:

   # see XQuery32.ixq in samples.zip
   for $i in document("data/items.xml")//item
   let $p := document("data/PO.xml")//po
   where $i/itemno = $p//itemno
   return
      <ordered_item>
         {$i/description/text()}
      </ordered_item>

The for clause iterates over each item element in items.xml. The let clause binds $p to the po elements in PO.xml, and the where clause keeps only those items whose itemno matches an itemno in at least one of those po elements. The return clause lists each ordered item’s description as an ordered_item element. The result of the query looks like this:

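   <ordered_item>Scooter</ordered_item>
   <ordered_item>Digital Camera</ordered_item>
   <ordered_item>Ping Pong ball</ordered_item>
   <ordered_item>Fresh Roses</ordered_item>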
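The where clause in this query compares one itemno against a whole sequence of itemno values. Such a comparison with = succeeds if any pair of values matches. The Python sketch below models that behavior; the function name general_equals and the item numbers are hypothetical.

```python
def general_equals(left, right):
    """Return True if any value on the left equals any value on the right."""
    return any(l == r for l in left for r in right)

# Hypothetical item numbers collected from all purchase orders.
ordered_itemnos = ["1001", "1003", "1003"]

print(general_equals(["1003"], ordered_itemnos))  # an ordered item
print(general_equals(["2000"], ordered_itemnos))  # an item nobody ordered
print(general_equals([], ordered_itemnos))        # an item with no itemno at all
```

Only the first call finds a match. An empty sequence on either side never matches, which is why items without an itemno would be dropped by the where clause.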

Variable Binding and Scoping
Although for and let both bind variables, the manner in which they do so is quite different. A let clause binds a variable to the result of the expression as a whole, whereas a for clause binds the variable to each item of the sequence returned by the expression in turn. Consider the following query:

   # see XQuery33.ixq in samples.zip
   let $c := document("data/customers.xml")//customer/custno
   return
      {$c}

The query returns the custno elements of all customers in the customers.xml file, placing them together inside a single explicitly constructed element. This is because the let clause binds the variable $c to the entire sequence of custno elements in customers.xml, generating only one tuple for this binding. Therefore, the query engine invokes the return clause only once, generating the following output:

   9000
   1001
   1003
   2005

Now look at the query rewritten to use a for loop instead:

   # see XQuery34.ixq in samples.zip
   for $c in document("data/customers.xml")//customer/custno
   return
      {$c}

In this case, the query binds the $c variable to the first matching element (9000), then the next (1001), and so on. Therefore, the query engine invokes the return clause four times, generating the following result:

   9000

   1001

   1003

   2005
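The difference between the two queries comes down to how many times the return clause runs. The following Python sketch models it using the four customer numbers from the results above. The wrapper element name result is a placeholder chosen for this sketch, not the name used in the sample files.

```python
custnos = ["9000", "1001", "1003", "2005"]

def wrap(content):
    # "result" is a placeholder element name used only in this sketch.
    return "<result>" + content + "</result>"

# let: a single binding to the whole sequence, so return runs once.
let_result = [wrap(" ".join(custnos))]

# for: one binding per member, so return runs four times.
for_result = [wrap(c) for c in custnos]

print(len(let_result), let_result)
print(len(for_result), for_result)
```

The let version produces one wrapper holding all four values, while the for version produces four wrappers holding one value each.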

You must bind a variable before using it. A variable bound in a for or let clause remains in scope until the end of the FLWR expression in which it is bound. More local bindings take priority over less local ones. In other words, if a variable name is already bound, binding it again makes the name refer to the newly bound value until that inner binding goes out of scope. At that point, the name again refers to the prior binding. Here’s an example:

   # see XQuery35.ixq in samples.zip
   for $p in document("data/PO.xml")//po
   return
      {
         for $j in $p//item
         for $p in document("data/items.xml")//item
         where $p/itemno = $j/itemno
         return
            {$p/description/text()}
      }

The preceding query lists the descriptions of all items ordered in PO.xml. The outermost for clause binds $p to each po element in the document PO.xml. However, the innermost for clause uses $p again, binding it to each item element in items.xml. After the inner FLWR expression completes, the inner binding of $p goes out of scope, and $p once again refers to the outer for clause binding (a po element in PO.xml).

The query result looks like this:

   Scooter
   Digital Camera

   Ping Pong ball
   Fresh Roses

Using Joins in XQuery
In SQL, one of the most powerful ways to process data stored in a relational database management system (RDBMS) is a join, an operation that combines data from multiple tables. With FLWR expressions, you can use XQuery to perform similar operations on multiple documents.

In the last example, two for clauses were nested in the return clause of the outer FLWR expression. When a FLWR expression contains multiple for clauses, the tuples of variable bindings are drawn from the Cartesian product of the sequences returned by the expressions in all the for clauses. The order of the tuples is governed by the order of the sequences from which they were formed, working from left to right. For example, the preceding query returned only those items from items.xml whose itemno matched an itemno in PO.xml. Conventionally, this type of relationship is called an inner join.
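As a rough model of the paragraph above, the following Python sketch builds the Cartesian product of two sequences and then filters it, which is what two for clauses followed by a where clause amount to. The item numbers, purchase order IDs, and the "Unordered Widget" item are invented for illustration.

```python
from itertools import product

# Hypothetical data; item numbers and PO ids are invented for this sketch.
items = [
    {"itemno": "i1", "description": "Scooter"},
    {"itemno": "i2", "description": "Digital Camera"},
    {"itemno": "i3", "description": "Unordered Widget"},
]
po_lines = [
    {"po": "po-1", "itemno": "i1"},
    {"po": "po-2", "itemno": "i2"},
]

# Two for clauses: every pairing of a PO line with an item, ordered with the
# first (leftmost) sequence varying slowest.
tuples = list(product(po_lines, items))

# The where clause keeps only the pairings whose item numbers match.
joined = [
    (line["po"], item["description"])
    for line, item in tuples
    if line["itemno"] == item["itemno"]
]

print(len(tuples))
print(joined)
```

The item that no purchase order refers to is absent from the joined result, which is the defining property of an inner join.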

An outer join is a join that preserves information from one or more of the participating documents, including elements that have no matching element in the other documents. Consider the following query, which lists all customers, along with the purchase order IDs for those customers who have placed orders:

   # see XQuery36.ixq in samples.zip
   for $u in document("data/customers.xml")//customer
   return
      {$u//firstname/text()} {$u//lastname/text()}
      {
         for $p in document("data/PO.xml")//po
         where $u/custno = $p//custno
         return
            {$p/@id}
      }

The query returns the following result:

   Joe Anderson
   Andy Shaperd
   Amanda Johnson
   Bill Murphy

You can see that the query result lists all the customers, regardless of whether they have placed a purchase order. Conventionally, a join that selects all the items from the left side of the join is called a left outer join.
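The shape of this query, an outer loop whose return clause contains a nested, filtered loop, can be sketched in Python as follows. The pairing of customer numbers with names and the purchase order IDs are assumptions made for the sketch; they are not taken from the sample files.

```python
# Hypothetical data: the custno/name pairing and the PO ids are assumptions.
customers = [
    {"custno": "9000", "name": "Joe Anderson"},
    {"custno": "1001", "name": "Andy Shaperd"},
    {"custno": "1003", "name": "Amanda Johnson"},
    {"custno": "2005", "name": "Bill Murphy"},
]
orders = [
    {"id": "po-1", "custno": "9000"},
    {"id": "po-2", "custno": "1003"},
]

def left_outer_join(customers, orders):
    rows = []
    for u in customers:  # the outer for clause: every customer is visited
        # The nested FLWR expression: only this customer's orders survive.
        po_ids = [p["id"] for p in orders if p["custno"] == u["custno"]]
        rows.append((u["name"], po_ids))  # a row is emitted even with no orders
    return rows

rows = left_outer_join(customers, orders)
for name, po_ids in rows:
    print(name, po_ids)
```

Because the filter sits inside the nested loop rather than on the outer one, customers without orders still appear, each with an empty list of purchase order IDs.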

At this point, you should be able to see that XQuery provides very sophisticated mechanisms for processing and transforming data in XML documents through FLWR expressions, built-in functions, user-defined functions, and conditional operations.

Creating Element Constructors
Several of the preceding examples contain element constructors, but here's a more formal introduction. You use element constructors to create elements that appear in the output or intermediate results of an expression. The December 2001 draft of the XQuery specification proposes two different syntaxes for element constructors; you'll see both in this section. The rest of this article refers to these two syntaxes as the XML notation constructor syntax and the computed constructor syntax.

The XML Notation Constructor Syntax
Here's a very simple element constructor example:

   <greeting complexity="simple">Hello World!!</greeting>

When elements and attribute values contain constants, the XML notation syntax for element constructors is the same as the syntax for XML. Element constructors have to follow all the same rules for well-formedness as XML. But things start getting more interesting when you want to embed dynamic content in constructed elements. For example, to create an element containing the average price of items, you could write the following query:

   # see XQuery37.ixq in samples.zip
   namespace ns="http://www.ipedo.com/XQueryExample"
   <avg_price>
      {avg(document("data/items.xml")/ns:items/item/price)}
   </avg_price>

Notice that the nested expression is enclosed in curly braces ({ }). This element constructor creates an element named avg_price that has a child text node whose value (returned by the call to the built-in function avg()) is the average price of all items.

You can assign attribute values similarly. The following query returns a list of all items with their descriptions and prices listed as attributes of the constructed item elements.

   # see XQuery38.ixq in samples.zip
   namespace ns="http://www.ipedo.com/XQueryExample"
   <item_list>
   {
      for $i in document("data/items.xml")/ns:items/item
      return
         <item description={$i/description} price={$i/price}/>
   }
   </item_list>

The Computed Constructor Syntax
Computed constructor syntax uses the keywords element and attribute to construct elements and attributes. The next example creates the same greeting element constructed in the first example from the previous section, but uses computed constructor syntax:

   element greeting {
      attribute complexity { "simple" },
      "Hello World!!"
   }

Similarly, here's the computed constructor syntax equivalent of the item list example from the preceding section:

   # see XQuery39.ixq in samples.zip
   namespace ns="http://www.ipedo.com/XQueryExample"
   element item_list {
      { for $i in document("data/items.xml")/ns:items/item
         return {
            element item {
               attribute description {$i/description},
               attribute price {$i/price}
            }
         }
      }
   }

The primary purpose of computed constructor syntax is to let you construct elements and attributes whose names are not static (they are computed). For example, instead of hard-coding an attribute named description in the result, if you wanted to use an attribute named after the language used for the description, you could write:

   element item_list {
      { for $i in document("data/items.xml")/items/item
         return {
            element item {
               attribute {lang($i)} {$i/description},
               attribute price {$i/price}
            }
         }
      }
   }
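The idea of a computed name is not specific to XQuery. As a loose analogy, the following Python sketch uses the standard xml.etree.ElementTree module to build an item element whose attribute name is decided at run time. The function name build_item, the price, and the language code "en" are hypothetical.

```python
import xml.etree.ElementTree as ET

def build_item(description, price, lang_code):
    """Build an <item> whose description attribute is named after lang_code."""
    item = ET.Element("item")
    item.set(lang_code, description)  # attribute name computed at run time
    item.set("price", price)          # attribute name fixed in the code
    return item

# Hypothetical values; "en" stands in for whatever language applies to the item.
item = build_item("Scooter", "49.99", "en")
print(ET.tostring(item, encoding="unicode"))
```

As with the computed constructor, the attribute name is an ordinary value that can come from the data, whereas the XML notation syntax requires names to be written literally in the query.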

In the next and last part of the series, you'll look at some of the more powerful expressions, such as conditional and quantified expressions, and see how to include built-in functions and create user-defined functions in your expressions.
