DevX Skillbuilding for IBM DeveloperWorks
DevX Skillbuilding for IBM DeveloperWorks
DevX Skillbuilding for IBM DeveloperWorks
Get regular email alerts when we publish new features!
DevX Update for IBM developerWorks

More Newsletters

Introducing IBM DB2 Viper Workbench, featuring the XQuery Visual Builder

With the release of DB2 Viper, IBM has also announced the availability of the DB2 Developer Workbench. Formerly packaged with DB2 as the Development Center, the new Workbench, based on Eclipse, allows DBAs more flexibility in their development process and gives them a load of new toys to play with. This article walks through one of the new features: the XQuery Visual Builder. 


With the release of IBM's DB2 9, formerly known as Viper and currently in Release Candidate, IBM also plans to unveil the new DB2 9 Developer Workbench. Existing DB2 users will recognize the functionality as the current DB2 Development Center. But the new version, available as a free download, has been decoupled from the Viper administrative tools and re-visioned on an Eclipse platform.

The Workbench can be downloaded from the DB2 Download Center. If you're not quite ready or don't have the bandwidth for it, then use this walkthrough as your introduction to the new UI and key features, particularly the XQuery Visual Builder.

For developers who have come to know and love the free version of DB2, known as DB2 Express-C, good news! DB2 Express-C 9, which will be released with the fee-based versions, also includes pureXML capabilities.

XQuery Visual Builder
While the native XML capabilities of DB2 9 have been getting a lot of digital ink lately (IBM's marketing gurus have recently dubbed it "pureXML"), less attention has been paid to what the Workbench does with that XML. Since XML is stored in DB2 tables as neither a CLOB nor parsed relational fields, developers need something other than SQL to query it.

XQuery, or XML Query Language, has long been in development and is currently in Candidate Recommendation with the W3C. For more information than you will probably ever need, you can read the W3C Documentation, XQuery 1.0: An XML Query Language.

While supposedly it's syntactically similar to SQL, its differences number more than enough to warrant separate and prolonged study. Fortunately for those of us new to XQuery, the Workbench provides a visual designer, which, in the proper hands, can make short work of even complex queries.

That said, the visual designer itself is not without its quirks, as you'll see. If you're planning to use XQuery but are just beginning to learn the basics, this walkthrough will help familiarize you with key concepts. Use the Visual Builder to piece together your queries, then study the auto-generated syntax created by the Workbench. (Remember the days of using a visual designer to learn SQL? Same thing. Except that working with nested XML nodes is unlike any SQL diagram you've ever drawn.)

But once you've learned the basics, you'll want to take it further. Learn XQuery syntax. Play with it in the Command Editor. Mix it up with SQL, which I'll touch on after the walkthroughs. In short, this is just an intro. The Workbench tools can be tricky to grasp at first, but once you do, you'll wonder how you ever got work done without them. But to truly prepare for the next stage of XML storage represented by DB2 9, use the Workbench as not just a tool in and of itself, but as an aide to learning.

Back to Shakespeare: A New Workbench Project
If you read "XML and Beyond," then you're somewhat familiar with the XML storage capabilities of DB2 9. In fact, if you completed then walkthrough, then you should have a working copy of a sample table filled with some heavy XML documents consisting of a few of Willie Shakespeare's finest works. The following walkthrough builds on that example by introducing the XQuery Visual Builder and pulling data from those plays.

If you want to experiment with the Shakespeare database, then download DB2 9 and complete the walkthrough. Otherwise, you can follow along with similar queries against the SAMPLE database that comes with DB2. If nothing else, the walkthrough should at least give you some notion of what to expect in the DB2 9 Workbench.

This walkthrough assumes you have DB2 9 installed and have created a SHAKES database, with a PLAYS table that includes the following columns and contains at least three rows, each with an XML document containing the full text of a play in the PLAYTEXT column:

"PLAYID" INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY
"PLAYNAME" VARCHAR(50) NOT NULL
"PLAYTEXT" XML NOT NULL


    Figure 1.
    New Connection to SHAKES
  1. Open up the Workbench and start a new Data Development Project, File -> New -> Data Development Project or click the New Project dropdown from the tool menu. This kicks off one of the included project wizards, which not only walks you through setting up the parameters of the project but also initializes your Perspective, or workspace layout, specifically for working with data. Just one of the many things to love about Eclipse.
  2. Call the project "ShakesQuery" or whatever.
  3. If you see an existing connection for SHAKES then choose "Use an existing connection" and select it. Otherwise, choose "Create a new connection" and the wizard will guide you through the connection setup.
  4. If creating a new connection, you can uncheck "Use default naming convention" and give the connection a unique name, if you want. In Figure 1, I'm calling it "SHAKES_CONNECT". The important thing here is to specify "v9.1" as your database manager rather than leaving the default selection of v8.1. Also, specify "SHAKES" as your database. To test it, enter your User information in the lower left and click "Test Connection".
  5. On "Specify Schema Filter" accept the defaults.
  6. Same for "Specify routine parameters". By default this points to IBM's JDK.

Figure 2.
New Data Project and Connection

When you click "Finish", the wizard will create a new project for you under the Data Project Explorer. If you selected to create a new connection, it will also create that connection and enable it. You can explore the connection in the Database Explorer, drilling down all the way to the column if you want (see Figure 2).


Figure 3.
Entire Workspace

Look at Figure 3 for a larger view of the entire workspace. This is the "Data" perspective. Note the default arrangement of the various Views. The Eclipse framework gives you almost complete freedom of choice when it comes to UI layout. You can dock, tab, or float to your heart's content with any number of Views included in the Workbench.

The Play's the Thing: XQuerying XML Nodes
XML Queries require a representative schema to populate the Visual Builder. You can add XSDs to the "XML Schema Documents" folder to facilitate the process of associating them with the XML columns you want to query. You can also add a Document Type Definition (DTD) or schema to the workspace.

But just to show off, you're going to let the Workbench discover and generate the XML schema on its own.

  1. Right-click on the XML Queries folder and choose New XML Query.
  2. Call this "PersonaQuery". "PERSONA" is the XML node that contains the character names for all of the plays in this particular XML collection.
  3. The New XML Query wizard gives you the chance to add representative XML documents. Since this is your first time here, click Add.
    1. For document location, select Database. You'll be generating the XML schema on the fly based on sample data.

      Figure 4.
      Sample Column Document
    2. Navigate to the PLAYTEXT column of the PLAYS table and select it.
    3. When asked to choose a document, as in Figure 4, you'll see several interesting things. On the right you'll see a list of the XML documents found in that column, as well as their sizes. On the left, you'll find fully navigable schemas. If your rows have different schemas, totally allowed, then you can select a different row to serve as the reference for your query. Notice that the sample data displayed in the Document pane changes. At this point, the wizard wants to create your schema for you, so you can also specify a name for the resulting XML file or accept the default, as will be done here.
  4. In addition to having an XML schema handy, you have to associate it with the actual data you'll be querying. In this case, since the schema was generated from the data itself, the association has already been created. Note the green check in Figure 5.

Figure 5.
Associate Document with XML Column
In your new PersonaQuery.xqm view, you have a navigable node tree just like the one in Figure 4. You've also been given a fresh new visual design space on which to drag and drop XML nodes and an abundance of handy syntax, such as Constructors, Expressions, Functions, and Variables (see Figure 6).

Figure 6.
New PersonaQuery


Figure 7.
New Persona Node

The tricky thing about using such a complex example as a Shakespearean play is that you have a ton of nodes to work with. Though fairly visual and certainly helpful in crafting XQuery syntax, the visual designer is not especially intuitive. You'll find it quite easy to pull in the wrong node or put the right node in the wrong place or perform a mismatched operation. The visual designer is friendly enough to let you make all the mistakes you want. When you try to run them, you'll just get an error (and not a very helpful one, at that, in all likelihood).

Of course, the best way to learn is to bang your head against the wall a few times until you break through. But to save you some headache and keep this example fairly straightforward, I'll step you through a sample query.

  1. Expand the PERSONAE node and drag the first PERSONA node onto the designer (see Figure 7). One of the cool things about the builder is that you just need a representative node. SQL experts might feel like the search is going to be limited to just the node that you drag over, but that's not the case. Notice, however, that the sub-node called PGROUP will not be included in the query. You would have to add it separately. And by the way, the "X" in the icon for PERSONA stands for "XML", not "warning, warning, this is an error". So don't let it throw you off.
  2. To the right of the node here and throughout the builder, you'll see a "Step Into" icon (see Figure 8). You can use this to create conditions and result sets for that node. Since you're on the primary node, in this case you'll be defining the result set for the entire query. Click it to step into the For Logic (FLWOR) for the query.

Figure 8.
Step Into

Figure 9.
For, Let, Where, Order By, Return


Figure 10.
Expanded Function List


Figure 11.
matches() function


Figure 12.
matches() function Completed


Figure 13.
Completed FLWOR

The FLWOR, which stands for FOR, LET, WHERE, ORDER BY, RETURN, is XQuery's most basic syntax, the rough equivalent to SQL's SELECT-FROM-WHERE. This is where you define the node that will be iterated through in performing the query, a binding variable to serve as a sort of anchor for the query, your conditions, and the nodes to be returned. You can specify the variable name or accept the default, in this case "PERSONA0" (see Figure 9).

The magic for this query happens in the Where clause. The goal: find every play in which one of the characters is a King or somehow related to a King. Here, you have a couple of options. Though XQuery doesn't use the keyword LIKE, it does provide a ton of functions, including contains() and matches(). In this case, contains() won't quite work because of case sensitivity. But matches() gives you highly flexible pattern-matching, including an "ignore case" switch. To use it, expand the Functions category on the left and, if not already selected, pick String Functions from the dropdown (see Figure 10).

  1. Drag "fn:matches" onto the Where grid under Operand 1 (see Figure 11). Usually, you'll have a conditional and will go on to specific the operator and second operand. In fact, this is one of the gotchas of the visual designer—even though the matches() function doesn't take a second operand, you can still assign one. For other functions that do take an operand, you can assign the wrong one, or the wrong type, or the wrong operator, so be sure what the function you use is returning. In this case, it returns a Boolean, so it stands alone.
  2. Click the Step Into icon for matches(). You'll see its three arguments, each with their own Step Into icons.
    1. For "source-string", drag over a PERSONA node. This is the string that will be matched. (Note: if you drag it to the wrong argument or otherwise make a tiny goof, you'll likely have a hard time undoing the mistake. In some cases, you can delete the offending value but you'll still have visual detritus in the designer. To keep it absolutely clean, your best option is to completely back out, delete the function with right-click -> Delete, and start over.)
    2. For "pattern" you can type "king" directly into the field or click Step Into and type it there. Stepping into an XML node brings up a new FLWOR, while stepping into a native field type usually brings up a single value box.
    3. For "flags", enter "i" to "ignore case". The completed set up should look like Figure 12.
  3. Click the Step Out function to return to the FLWOR. For this example, you'll ignore the Let clause. But add one more thing: directly under the PLAY node you have a TITLE node (not to be confused with the one under PERSONAE). Drag it onto the Return grid under PERSONA, as in Figure 13.

At this point your XQuery is complete. From the Step Into dropdown, choose "Run..." You'll see a new request for document association, but since you've already done that you can accept the default and choose Finish.

If all goes well, in your Data Output view (see Figure 14), you'll see a Success status on the left and a parent node and ellipsis on the right with your results. Click the ellipsis.

The plays used in this example feature the following Kings and their relations. In fact, the match worked so well that the results also include one "DUKE of BUCKINGHAM".

<PERSONA>KING OF FRANCE</PERSONA>
<TITLE>All's Well That Ends Well</TITLE>
<PERSONA>FERDINAND, king of Navarre.</PERSONA>
<TITLE>Love's Labor's Lost</TITLE>
<PERSONA>KING EDWARD The Fourth</PERSONA>
<TITLE>The Tragedy of Richard the Third</TITLE>
<PERSONA>HENRY, Earl of Richmond, afterwards King Henry VII.</PERSONA>
<TITLE>The Tragedy of Richard the Third</TITLE>
<PERSONA>DUKE of BUCKINGHAM</PERSONA>
<TITLE>The Tragedy of Richard the Third</TITLE>
<PERSONA>ELIZABETH, Queen to King Edward IV. </PERSONA>
<TITLE>The Tragedy of Richard the Third</TITLE>
<PERSONA>MARGARET, Widow of King Henry VI. </PERSONA>
<TITLE>The Tragedy of Richard the Third</TITLE>
<PERSONA>DUCHESS of YORK, Mother to King Edward IV.</PERSONA>
<TITLE>The Tragedy of Richard the Third</TITLE>
<PERSONA>LADY ANNE, Widow of Edward Prince of Wales, son to King Henry VI;
afterwards married to Richard.</PERSONA>
<TITLE>The Tragedy of Richard the Third</TITLE>

Figure 14.
Data Output View

Now here's the best part: at the bottom of your XQM frame, you should see two tabs—Design and Source. Look at the source code that the designer generated for you. You should see something like:

values(XMLQUERY('
declare boundary-space strip;
for $PERSONA0 in
db2-fn:xmlcolumn("PLAYS.PLAYTEXT")/PLAY/PERSONAE/PERSONA
where matches ( $PERSONA0,"king","i" )
return
(
$PERSONA0,
$PERSONA0/../../TITLE
)
' RETURNING SEQUENCE))

Use this to learn from, or to generate XQuery code to be incorporated elsewhere. In fact, you can take it directly to the Command Editor and generate the same results with some minor modifications:

xquery
for $PERSONA0 in
db2-fn:xmlcolumn("PLAYS.PLAYTEXT")/PLAY/PERSONAE/PERSONA
where matches ( $PERSONA0,"king","i" )
return
(
$PERSONA0,
$PERSONA0/../../TITLE
)

For Extra Credit
Create another XML Query. Your goal this time: find every line of dialog, its speaker, and the play it can be found in, that uses the word "Hark" (in both upper and lower case).

Tip #1: since you created an XML Schema in the last example, you can use that same file as the representative example here (look under Local Workspace). Only this time, you'll need to manually associate it with your data.

Tip #2: the node you want is LINE, which can be found under PLAY/ACT/SCENE/SPEECH. You'll see a sister node SPEAKER and can grab TITLE from the same place as the previous example.


Figure 15.
SAMPLE.CUSTOMER Table

Mixing XQuery and SQL
In addition to adding the Powers of XQuery to your collection of mutant abilities, you'll still have many reasons to call upon your SQL skills. For that reason, it'll be handy to mix the two, which DB2 9 does ably and elegantly.

For a quick demo, open up your Control Center and connect to the SAMPLE database. If you need to create this database, run the db2sampl utility. Several of the tables in the DB2 9 version of SAMPLE include XML fields. Look at the CUSTOMER table (Figure 15)—it has two XML fields, INFO and HISTORY.

To view the XML data, double-click the table to open it. INFO and HISTORY will both appear to be blank. Click "Fetch XML" to populate them, then click on one of the INFO cells. An ellipsis will appear that takes you to an XML Document Viewer. HISTORY contains no information in the default SAMPLE database. Open the Command Editor and run the following commands:


Figure 16.
Results of SELECT Query

connect to SAMPLE;
select * from customer; 

You should see the results in Figure 16—a list of rows, with some XML.

Now run just the select statement. The results will instead appear in the Query Results tab of the Command Editor. But here's a slight gotcha: though you can click "Fetch XML" to populate the XML fields, you can't currently click the ellipses to view the XML data, as you could when you opened the table directly. Whether it's a bug or a "feature", it's not going to get in your way here, but is something you should know about.

Suppose you want to select your rows based on the city of the customer, such as all customers in Toronto. You can enter the following XQuery statement directly:

xquery
for $c in db2-fn:xmlcolumn('CUSTOMER.INFO')
/customerinfo/addr/city
return $c

Notice a few things:


Figure 17.
XQuery Results


Figure 18.
Results of XQuery Statement
  • The "xquery" keyword defines the type of query
  • The variable "$c" is binding to the iterations
  • Line breaks are ok—the "/customerinfo..." line would actually append the previous line, with no spaces, but has been broken here for clarity

You can also shorten this statement to:

xquery
db2-fn:xmlcolumn('CUSTOMER.INFO')/customerinfo/addr/city

This produces the results in Figure 17.

To pull the customers who live in Toronto, use something like:

xquery
db2-fn:xmlcolumn("CUSTOMER.INFO")/customerinfo/name
[./../addr/city = "Toronto"]

You should get the results you see in Figure 18.

So far, all of this can be done quicker and easier in the XQuery Visual Builder (in fact, I had to use it to build that last statement). But suppose you needed to mix up some SQL code with all of this, perhaps for a SQL stored procedure. The following statement returns the CID using the same conditional, but in the context of SQL:

select CID from CUSTOMER
where xmlexists('$c/customerinfo/addr[city="Toronto"]'
passing CUSTOMER.INFO as "c")

Note:

  • Use of the xmlexists() function to identify the XPath [http://en.wikipedia.org/wiki/Xpath] expression used to point down the "addr" node of the INFO field. Compare this to the statement you used previously.
  • Use of the "passing" keyword to set the field itself as the root node

One more example. This one's a bit more complicated: it pulls a list of employees whose last names match the name of a customer. It's completely irrelevant but shows how to link the relational fields of one table with XML nodes of another.

select CUSTOMER.CID,
EMPLOYEE.EMPNO,
EMPLOYEE.FIRSTNME,
EMPLOYEE.LASTNAME
from CUSTOMER, EMPLOYEE
where xmlexists('$c/customerinfo/name[matches(.,$l,"i")]'
passing INFO as "c", EMPLOYEE.LASTNAME as "l")

What to notice here:

  • Use of the xmlexists() function to again define the conditional of the XML data. But this example uses a slightly different version of the syntax and adds the matches() function you used in the XQuery earlier.
  • EMPLOYEE.LASTNAME getting passed as an alias to the XML conditional. This is how the two tables come together—look at the "$l" parameter in the matches() function defining the search pattern. Also notice use of "." for the source string since the pattern is being matched on the node itself. This points to some versatility that can be exploited further.

Figure 19.
Results of SQL and XQuery Combination

See Figure 19 for the results of this query, which is returned in table format.

For Extra Credit
Right now, this query doesn't return any XML data. For comparison, pull the CUSTOMER name that's being matched. Use the /customerinfo/name node. Also, try experimenting with the xmltable() function to return XML results in tabular format.

Where to Go from Here
This walkthrough covers a lot of ground. But even so, it barely scratches the surface of what the DB2 9 Workbench is capable of, even just in the area of XQueries. For more information on the Workbench, the Help files included with your install actually contain quite a bit of good information, unlike the Help files of other IDEs we know and love. You'll find abundant information about XQuery syntax and even some tutorials to help get you started on various Workbench features.

Also be sure to check out the following resources:

  • Download DB2 9
  • An Introduction to XQuery
  • XML 1.0: An XML Query Language
  • On Demand Webcast: Enabling XML and SOA Applications with DB2 9

    On the subject of combining XQuery with SQL, you have a couple of excellent resources:

  • Integration of SQL and XQuery in IBM DB2
  • Query DB2 XML Data with SQL
  •    
    Justin Whitney is a regular contributor to DevX.com and Jupitermedia. He currently lives in San Francisco, where he consults for leading high-tech firms and writes about emerging technologies.