In general, you should try to avoid creating new content until you have filtered the data set down to as small a sequence as possible. Indeed, there's a surprisingly common cadence to filtered searches and views that seems to recur whenever you query potentially large data sets:
- Retrieve the initial collection of resources and store that into a working variable.
- Filter the items of this collection to retrieve only the relevant ones.
- Sort the filtered data set.
- Page through the sorted and filtered data set to retrieve a workable subset.
- Transform the paged content into the appropriate output.
- Output the transformed content.
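The steps above can be sketched as a single query. This is only an illustration under stated assumptions: the inline sample records stand in for a real collection, and the element names (employee, name, division) and the variables $page and $page-size are hypothetical.

```xquery
(: 1. Retrieve: hypothetical inline data standing in for collection("/db/employees") :)
let $employees := (
  <employee><name>Ann</name><division>Materials</division></employee>,
  <employee><name>Bob</name><division>Sales</division></employee>,
  <employee><name>Cy</name><division>Materials</division></employee>,
  <employee><name>Di</name><division>Materials</division></employee>
)
(: 2. Filter: keep only the relevant records :)
let $filtered := $employees[division = 'Materials']
(: 3. Sort: order the filtered records by name :)
let $sorted := for $e in $filtered order by $e/name return $e
(: 4. Page: subsequence() positions are 1-based :)
let $page-size := 2
let $page := 1
let $paged := subsequence($sorted, ($page - 1) * $page-size + 1, $page-size)
(: 5 and 6. Transform the paged records and output the result :)
return
  <ul>{ for $e in $paged return <li>{ string($e/name) }</li> }</ul>
```

New elements are constructed only in the final step, after the sequence has been reduced to a single page of records.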
Retrieval can occur in several ways. The doc() function takes either an absolute or relative URL and attempts to parse the contents of the URL as an XML document. The collection() function, on the other hand, retrieves a collection of nodes from an external source, without the need for that collection to have a containing element. This distinction may seem fairly meaningless when retrieving an XML file, but as most XML databases are set up around the notion of collections (with URIs corresponding to the collection rather than a single element), the collection() function is actually quite useful when invoked from within an XML database.
In an eXist database, you could create a collection of individual employees (though the exact mechanism for doing so won't be covered here) and assign the collection to a given path, perhaps something like /db/employees. You could then reference all of the items from this collection by writing:
let $employees := collection("/db/employees")
or, if using the XML:DB (xmldb:) URI notation:
let $employees := collection("xmldb:exist:///db/employees")
Here, xmldb:exist:/// indicates that the query uses the xmldb: protocol and that eXist is the server being referenced. The triple slash notation is shorthand for the full protocol path, which most likely has the form:
let $employees := collection("xmldb:exist://localhost:8080/db/employees")
Internally, retrieved collections are treated as sequences, at least with respect to queries (updates, which are out of the scope of this article, do make a distinction). This means that regardless of whether you are querying the results of a collection, or of an XPath sequence pulled from a document, you work with the content the same way.
Filtering involves reducing the initial collection to only the records that are important to you. For instance, suppose you want to retrieve only those records where people are in a given division (for example, "Materials"). You can, of course, do this as part of the retrieval process:
let $employees := collection("/db/employees")[division = 'Materials']
On the other hand, splitting this into two distinct steps provides advantages from both design and performance standpoints:
let $employees := collection("/db/employees")
let $filtered-employees := for $employee in $employees[division = 'Materials'] return $employee
You can also use the XQuery where clause, which lets you separate the evaluation predicate:
let $employees := collection("/db/employees")
let $filtered-employees := for $employee in $employees
where $employee[division = 'Materials']
return $employee
Which is better? When the predicate (the expression in brackets) is relatively small and self-contained, using XPath on the sequence is usually faster. However, when the expression is complex, when it involves more than one variable, or when this process is accompanied by an order by clause, then using the where clause is usually both more efficient and easier to read.
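As a rough sketch of the second case, the query below combines a where clause that tests two variables with an order by clause. The sample records, the salary element, and the variables $division and $min-salary are assumptions made for illustration only.

```xquery
(: Hypothetical inline data standing in for collection("/db/employees") :)
let $employees := (
  <employee><name>Di</name><division>Materials</division><salary>61000</salary></employee>,
  <employee><name>Ann</name><division>Materials</division><salary>52000</salary></employee>,
  <employee><name>Bob</name><division>Sales</division><salary>70000</salary></employee>,
  <employee><name>Cy</name><division>Materials</division><salary>43000</salary></employee>
)
(: Illustrative filter parameters :)
let $division := 'Materials'
let $min-salary := 50000
return
  for $employee in $employees
  (: the where clause keeps the multi-part condition out of the path expression :)
  where $employee/division = $division
    and xs:decimal($employee/salary) >= $min-salary
  order by $employee/name
  return $employee
```

Written as a single bracketed predicate on $employees, the same condition would be harder to scan, and the sort would still need a separate FLWOR expression.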
In theory, you could get by with the primary FLWOR operators, but there are a fair number of situations where building such expressions could prove awkward. For this, you can use the if ... then ... else construction:
if ($condExpr) then $resultExprTrue else $resultExprFalse
Both the then and the else branches have an implicit return associated with them, meaning that you can use an if expression to create a fairly complex script. For instance, consider a situation where you want a table showing the list of employees in a given section, but you want a status message if there are no employees in that section. The if ... then ... else expression works remarkably well for this (see Listing 4).
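A minimal sketch of this kind of query follows. It is an illustration under assumptions rather than a definitive implementation: the sample records, the use of the division element to identify a section, and the variables $section and $members are all hypothetical.

```xquery
(: Hypothetical inline data standing in for collection("/db/employees") :)
let $employees := (
  <employee><name>Ann</name><division>Materials</division></employee>,
  <employee><name>Bob</name><division>Sales</division></employee>
)
(: Illustrative section name; no sample record belongs to it :)
let $section := 'Research'
let $members := $employees[division = $section]
return
  if (exists($members)) then
    (: at least one match: build a table with one row per employee :)
    <table>
      <tr><th>Name</th></tr>
      { for $m in $members order by $m/name return <tr><td>{ string($m/name) }</td></tr> }
    </table>
  else
    (: no matches: return a status message instead :)
    <p>No employees found in { $section }.</p>
```

Changing $section to 'Materials' or 'Sales' exercises the then branch instead of the else branch.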
There are times when you have a conditional expression but want output only when the condition is true (or only when it is false). In this case, you can use an empty sequence (denoted by "()") as the output of the other branch:
if ($cond) then $output else ()
If the condition proves to be false, the empty sequence is returned, which in turn becomes blank output.
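The following sketch shows the empty sequence used inside a larger constructor. The sample records and the manager element are assumptions made for illustration.

```xquery
(: Hypothetical inline data; only some employees carry a manager flag :)
let $employees := (
  <employee><name>Ann</name><manager>true</manager></employee>,
  <employee><name>Bob</name></employee>
)
return
  <ul>{
    for $e in $employees
    return
      <li>{
        string($e/name),
        (: the else branch contributes nothing to the list item :)
        if ($e/manager = 'true') then <em> (manager)</em> else ()
      }</li>
  }</ul>
```

Because the empty sequence simply disappears when it is combined with other items, the list item for an employee without the flag contains only the name.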
You can nest if ... then ... else expressions within the result blocks, though expressions can get fairly complex when you have multiple nested conditions. Suppose you want to create different header styles based upon the value of a variable $h (which can hold values from 1 to 6). You can use nested if expressions to create such a switch:
let $title := "This is a test."
let $result := if ($h = 1) then <h1>{$title}</h1>
else if ($h = 2) then <h2>{$title}</h2>
else if ($h = 3) then <h3>{$title}</h3>
else if ($h = 4) then <h4>{$title}</h4>
else if ($h = 5) then <h5>{$title}</h5>
else <h6>{$title}</h6>
return $result
On the other hand, you can also use the element constructor to create the element directly:
let $title := "This is a test."
let $result := element {concat("h", $h)} {$title}
return $result
The element constructor treats the first expression it encounters as the name of the newly created element, and the second contained expression as its content. As you can see, there are often a couple of different ways to accomplish the same task in XQuery.
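Since the element name is computed at run time, it can be worth guarding against values outside the expected range. The sketch below is one possible way to do that; the $level variable and the choice to fall back to level 6 are assumptions, not part of the original example.

```xquery
(: Illustrative value; in practice $h would be supplied by the surrounding query :)
let $h := 3
let $title := "This is a test."
(: fall back to the lowest heading level when $h is not between 1 and 6 :)
let $level := if ($h = (1 to 6)) then $h else 6
return element { concat("h", $level) } { $title }
```

The fallback mirrors the final else branch of the nested version, which also produces an h6 element for any value it does not recognize.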