An XQuery Servlet for RESTful Data Services

Find out how to expose XQuery data integration services by invoking them through a Java servlet using a REST interface.

Many web applications exchange data as XML, but that data is usually stored in and queried from relational databases, CRM, ERP, proprietary repositories, and a hodgepodge of other systems. Unfortunately, the languages most commonly used for creating or processing data on the web were designed neither for processing XML nor for integrating data among multiple heterogeneous sources. These are precisely the tasks for which the XQuery language was designed.

This article shows how to use XQuery for data integration, and how to expose an XQuery as a RESTful data service using a Java servlet. Listing 1 contains the source code for the servlet. The servlet deploys any XQuery and derives a REST interface to it from the query's name and its external variables.
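One way a servlet could discover a query's external variables is to scan the query prolog for "declare variable ... external" declarations. The following Java sketch does this with a regular expression. It is an illustration only, not the code from Listing 1, and the class and method names are hypothetical. A regular expression does not account for comments, string literals, or prefixed variable names, so a real servlet would more likely ask the XQuery engine for this information.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the article's Listing 1.
final class ExternalVariableSketch {

    // Matches prolog declarations such as:
    //   declare variable $userid as xs:string external;
    // Declarations with an initial value (":= ...") are not matched.
    private static final Pattern EXTERNAL = Pattern.compile(
            "declare\\s+variable\\s+\\$([\\w.-]+)[^;]*?\\bexternal\\s*;");

    /** Returns the names of the external variables, in declaration order. */
    static List<String> externalVariables(String queryText) {
        List<String> names = new ArrayList<>();
        Matcher m = EXTERNAL.matcher(queryText);
        while (m.find()) {
            names.add(m.group(1)); // the variable name without the leading '$'
        }
        return names;
    }
}
```

Each name returned this way could then serve as a parameter name in the query's REST interface.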

As an XML-oriented data integration language, XQuery can be used to access XML, relational, and flat file formats such as EDI to create complex XML and HTML results. To deploy a query, a developer saves the query into a designated deployment directory in a secure location accessible to the servlet. Subsequently, developers can invoke any query in this directory using its REST interface, which requires nothing more than an HTTP GET or POST operation using a URL that represents the query and its parameters.
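The mapping from a URL to a deployed query can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the servlet from Listing 1: the class name, the method names, and the ".xquery" file extension are all hypothetical. The inputs mirror what HttpServletRequest.getPathInfo() and HttpServletRequest.getParameterMap() return, so the logic can be shown without a servlet container.

```java
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the article's Listing 1.
final class QueryRequestSketch {

    /** Maps a path such as "/portfolio" to a query file in the deployment directory. */
    static Path resolveQuery(Path deployDir, String pathInfo) {
        if (pathInfo == null || pathInfo.length() < 2) {
            throw new IllegalArgumentException("no query name in URL");
        }
        String name = pathInfo.substring(1); // drop the leading '/'
        // Accept only simple names so a request cannot escape the directory.
        if (!name.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("illegal query name: " + name);
        }
        return deployDir.resolve(name + ".xquery"); // assumed file extension
    }

    /** Treats each request parameter as the value of the external variable of the same name. */
    static Map<String, String> bindVariables(Map<String, String[]> params) {
        Map<String, String> bindings = new LinkedHashMap<>();
        for (Map.Entry<String, String[]> e : params.entrySet()) {
            if (e.getValue().length > 0) {
                bindings.put(e.getKey(), e.getValue()[0]); // first value wins
            }
        }
        return bindings;
    }
}
```

Under these assumptions, a GET request for a path ending in /portfolio?userid=Minollo would select the file portfolio.xquery and bind the external variable $userid to "Minollo".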

Using XQuery for Data Integration
XML plays a central role in most data-intensive web applications, and XQuery was designed to make it easy to find data in XML and to process and transform XML to create any desired XML structure. XQuery simplifies programming with XML in the same way that SQL simplifies programming with relational data and Java simplifies programming with objects—each language was designed to work with data using a particular data model, and supports the operations that are commonly needed in the given paradigm.

Figure 1. Data Integration Without XQuery: The figure illustrates a typical servlet that gathers data from heterogeneous sources, and then processes the results into a usable form.
XQuery was also designed to simplify data integration. Many web applications need to combine data from various sources, including XML, relational databases, legacy formats, and Web services. Each of these data sources typically has its own API and data model, and sometimes its own query language. After writing the code to retrieve the data their applications need from each of these data sources, developers typically write yet more code to combine the data.

Consider a servlet that combines data from two databases and a Web service. In the Java world, this typically involves coding to three different APIs, then writing Java code or JSP to combine the results, as shown in Figure 1.

The process illustrated in Figure 1 is much easier in XQuery, which queries data in relational databases and other sources as though that data were stored as XML. An XQuery implementation designed for data integration can represent almost any kind of data as XML, either by providing an XML view of the data via middleware, or by physically converting it to XML. Such implementations can be optimized for each data source, freeing the programmer from the idiosyncrasies of each one. Consider the following query, which joins an XML document to a table in a relational database to create an XML result:

for $h in doc("holdings.xml")/holdings/entry
for $c in collection("companies")/companies
where $h/userid = "Minollo"
  and $c/ticker = $h/stockticker
return
  { $c/companyname }
  { $c/annualrevenues }

The first line of this query accesses an XML document on the file system using the doc() function. The second line addresses a relational table using the collection() function.

Author's Note: The examples in this article are based on DataDirect XQuery, which uses the collection() function to address relational tables. Unfortunately, at this time there is no standard way to address a relational table from XQuery.

A Java program equivalent to the above query would use JDBC and SQL to access the relational data and an XML API such as DOM, SAX, or StAX to process the XML source and create an XML result. The XQuery version is simpler because it treats both data sources the same way, provides direct support for querying and combining data as XML, and can directly create any desired XML structure. And the query is declarative—rather than specifying the steps needed to create the XML result, the query specifies the desired result and lets the implementation find the best way to implement the query.
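To make the contrast concrete, here is a rough Java sketch of the hand-coded approach. It is a simplification under stated assumptions, not code from the article: an in-memory Map stands in for rows that would normally be read from a JDBC ResultSet, the class and method names are hypothetical, and the wrapper element names "result" and "holding" are invented for the example.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration of joining XML to tabular data by hand.
final class ManualJoinSketch {

    /**
     * companiesByTicker maps a ticker to {companyname, annualrevenues}.
     * It stands in for the "companies" table (normally read via JDBC).
     */
    static Document joinHoldings(String holdingsXml,
                                 Map<String, String[]> companiesByTicker,
                                 String userid) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document holdings =
                    builder.parse(new InputSource(new StringReader(holdingsXml)));
            Document out = builder.newDocument();
            Element root = out.createElement("result"); // assumed wrapper name
            out.appendChild(root);

            NodeList entries = holdings.getElementsByTagName("entry");
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                if (!userid.equals(text(entry, "userid"))) {
                    continue; // the "where $h/userid = ..." filter
                }
                String[] row = companiesByTicker.get(text(entry, "stockticker"));
                if (row == null) {
                    continue; // no matching company: the join condition fails
                }
                Element holding = out.createElement("holding"); // assumed name
                holding.appendChild(textElement(out, "companyname", row[0]));
                holding.appendChild(textElement(out, "annualrevenues", row[1]));
                root.appendChild(holding);
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not build result", e);
        }
    }

    private static String text(Element parent, String child) {
        NodeList nodes = parent.getElementsByTagName(child);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    private static Element textElement(Document doc, String name, String value) {
        Element e = doc.createElement(name);
        e.setTextContent(value);
        return e;
    }
}
```

Even with the database access left out, the filter, the join, and the result construction are each spelled out step by step, whereas the query expresses them in a few lines.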

The declarative nature of XQuery makes it easier to optimize for a variety of data sources; a good implementation can both transparently generate efficient SQL for relational databases, and also retrieve only the required data from the XML file by telling the parser to ignore other data.

It's worth exploring how an XQuery is executed for several representative data sources—XML, SQL databases, non-XML file formats, and Web service calls. Note that the range of data sources supported by any given XQuery implementation and the strategies used to execute an XQuery against a given data source vary widely. The next section of this article briefly discusses the strategies used by DataDirect's XQuery implementation.

Editor's Note: The author, Jonathan Robie, is the XQuery Technology Lead for DataDirect, a vendor of XQuery products. We have selected this article for publication because we believe it to have objective technical merit.
