An XQuery Servlet for RESTful Data Services

Many web applications exchange data as XML, but that data is usually stored in and queried from relational databases, CRM, ERP, proprietary repositories, and a hodgepodge of other systems. Unfortunately, the languages most commonly used for creating or processing data on the web were designed neither for processing XML nor for integrating data among multiple heterogeneous sources. These are precisely the tasks for which the XQuery language was designed.

This paper shows how to use XQuery for data integration, and how to expose an XQuery as a RESTful data service using a Java servlet. Listing 1 contains the source code for the servlet. This servlet uses the name and external variables of any XQuery to provide a REST interface to the query and deploys the query.

As an XML-oriented data integration language, XQuery can be used to access XML, relational, and flat file formats such as EDI to create complex XML and HTML results. To deploy a query, a developer saves the query into a designated deployment directory in a secure location accessible to the servlet. Subsequently, developers can invoke any query in this directory using its REST interface, which requires nothing more than an HTTP GET or POST operation using a URL that represents the query and its parameters.

Using XQuery for Data Integration
XML plays a central role in most data-intensive web applications, and XQuery was designed to make it easy to find data in XML and to process and transform XML to create any desired XML structure. XQuery simplifies programming with XML in the same way that SQL simplifies programming with relational data and Java simplifies programming with objects: each language was designed to work with data using a particular data model, and supports the operations that are commonly needed in the given paradigm.

 
Figure 1. Data Integration Without XQuery: The figure illustrates a typical servlet that gathers data from heterogeneous sources, and then processes the results into a usable form.

In addition, the XQuery language was also designed to simplify data integration. Many web applications need to combine data from various sources, including XML, relational databases, legacy formats, and Web services. Each of these data sources typically has its own API and data model, and sometimes also has its own query language. After writing the code to retrieve the data their applications need from each of these data sources, developers then typically write yet more code to combine the data.

Consider a servlet that combines data from two databases and a Web service. In the Java world, this typically involves coding to three different APIs, then writing Java code or JSP to combine the results, as shown in Figure 1.

The process illustrated in Figure 1 is much easier in XQuery, which queries data in relational databases and other sources as though that data were stored XML. An XQuery implementation designed for data integration can represent almost any kind of data as XML, either by providing an XML view of the data via middleware, or by physically converting it to XML. Such implementations can be optimized for each data source, freeing the programmer from the idiosyncrasies of each data source. Consider the following query, which joins an XML document to a table in a relational database to create an XML result:

   for $h in doc("holdings.xml")/holdings/entry
   for $c in collection("companies")/companies
   where $h/userid = "Minollo"
      and $c/ticker = $h/stockticker
   return
      { $c/companyname }
      { $c/annualrevenues }

The first line of this query accesses an XML document on the file system using the doc() function. The second line addresses a relational table using the collection() function.

Author’s Note: The examples in this article are based on DataDirect XQuery, which uses the collection() function to address relational tables. Unfortunately, at this time there is no standard way to address a relational table from XQuery.

A Java program equivalent to the above query would use JDBC and SQL to access the relational data and an XML API such as DOM, SAX, or StAX to process the XML source and create an XML result. The XQuery version is simpler because it treats both data sources the same way, provides direct support for querying and combining data as XML, and can directly create any desired XML structure. The query is also declarative: rather than specifying the steps needed to create the XML result, it specifies the desired result and lets the implementation find the best way to produce it.

The declarative nature of XQuery makes it easier to optimize for a variety of data sources; a good implementation can both transparently generate efficient SQL for relational databases, and also retrieve only the required data from the XML file by telling the parser to ignore other data.

It’s worth exploring how an XQuery is executed for several representative data sources: XML, SQL databases, non-XML file formats, and Web service calls. Note that the range of data sources supported by any given XQuery implementation and the strategies used to execute an XQuery against a given data source vary widely. The next section of this article briefly discusses the strategies used by DataDirect’s XQuery implementation.

Editor’s Note: The author, Jonathan Robie, is the XQuery Technology Lead for DataDirect, a vendor of XQuery products. We have selected this article for publication because we believe it to have objective technical merit.

Efficient XQuery for XML
One of the most important factors in processing XML is to have an efficient representation of the input document that permits incremental processing. Optimization through query rewrites is also important. The syntax of XQuery may look procedural, but a straightforward procedural implementation of XQuery does not perform well, especially for large XML files or complex queries, so good implementations may rewrite a query significantly. Many of these rewrites are commonly used for many languages, including constant folding, elimination of common subexpressions, loop rewrites, and ordering rewrites.

In addition to these rewrites, two techniques known as document projection and streaming can dramatically improve speed and memory usage, especially for very large input documents (see Projecting XML Documents).

Document projection involves examining a query to determine which parts of a document the query needs, and using that information at parse time so that only those parts are constructed in memory. This saves both memory and time: building the input document accounts for a significant share of processing time, and so does searching through parts of the document that are never needed. Document streaming involves using each portion of an input document to compute the output for which it is responsible, then discarding that portion while processing the next. Streaming does not generally improve speed, but it dramatically improves memory usage, to the extent that memory usage for many queries stays nearly constant regardless of the size of the document. Implementations that use neither technique may have difficulty processing XML files much larger than about 30 MB. In contrast, implementations that use both may be able to handle more than 30 GB for typical queries, and even queries against small files will execute faster.
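As a rough illustration of the streaming idea (not of how DataDirect XQuery or any other engine is implemented), the following Java sketch reads a holdings document one parser event at a time with the JDK's StAX API and keeps only the values needed to answer "which tickers does this user hold?". The class name StreamingHoldings and the document layout (holdings/entry elements with userid and stockticker children, borrowed from the earlier query) are assumptions made for the example.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams over <holdings><entry>...</entry></holdings>
// and never builds a tree of the whole document.
class StreamingHoldings {

    /** Returns the stock tickers of every entry whose userid equals userId. */
    static List<String> tickersFor(Reader xml, String userId) {
        List<String> tickers = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(xml);
            String user = null;
            String ticker = null;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    switch (reader.getLocalName()) {
                        case "entry":        // a new entry starts: forget the previous one
                            user = null;
                            ticker = null;
                            break;
                        case "userid":
                            user = reader.getElementText();
                            break;
                        case "stockticker":
                            ticker = reader.getElementText();
                            break;
                        default:             // projection: everything else is skipped
                            break;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "entry".equals(reader.getLocalName())) {
                    // The entry is complete: emit its result, then let it go.
                    if (userId.equals(user) && ticker != null) {
                        tickers.add(ticker);
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        return tickers;
    }
}
```

Apart from the result list, the memory this sketch needs depends on the size of a single entry rather than on the size of the document, which is the property that streaming is meant to provide.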

Efficient XQuery for SQL Databases
Relational data can be queried efficiently with XQuery by converting the query to SQL, executing the SQL in the database, and returning the results as XML. The quality of the generated SQL can dramatically affect performance. Only the data actually required to compute query results should be returned from the database; rows and columns that are not needed should be filtered out by the database itself. To do this, the implementation must generate maximally selective SQL, taking into account all aspects of the original XQuery that might restrict the data that is actually required. Operations that have a straightforward SQL equivalent should almost always be performed in the database. This is particularly important for joins, sorting, and functions available in the SQL library. The SQL library is especially helpful when implementing the extensive XQuery function library, although some ingenuity is required to account for the differences between SQL functions and XQuery functions.

When creating hierarchical XML structures, some algorithms in the generated SQL generally perform better than others; for example, the sort-merge algorithm has been shown to have very good overall performance. When supporting databases from multiple vendors, it’s tempting to rely on a SQL subset portable among most databases; however, translating XQuery for optimum performance often requires the richer functionality found in modern relational databases, which differs among vendors. Because of this, an implementation can perform much better if it tailors the generated SQL to a particular database vendor. Hints can be provided to give the programmer control over the generated SQL. For more information on SQL generation for XQuery, see the article DataDirect XQuery 2.0 Performance: Generating SQL.

Result retrieval is also an important factor in overall performance. Obviously, high-quality drivers can significantly improve performance. Because some XML APIs and processing patterns require streaming, an implementation should support incremental retrieval so that the query processor can use the first part of the result when appropriate while later parts are still being computed.

XML Converters for non-XML Formats
Many data integration environments have to cope with data in non-XML formats. Some of these formats, such as comma-delimited files, have a simple structure. Others, including EDI, have complex structures, and there are thousands of EDI formats. Predefined XML converters can convert such data to XML as it is queried, and tools exist for creating custom XML converters; for instance, Stylus Studio supports designing XML converters in a graphical environment. Once generated, such XML is queried in the same manner as any other XML, and the same optimization strategies apply. In particular, an XML converter that produces its output as a stream lets the query processor apply document projection and streaming to the converted data.

XQuery for Web Service calls
Web services are a useful and common way to expose data from applications as XML. Because both SOAP requests and responses are expressed in XML, XQuery is very useful for generating or processing SOAP messages. If an XQuery implementation allows Web service calls within a query, then a single XQuery can formulate a request and process the result. For example, the following query shows how to create a Web service call to an Amazon Web service to obtain the description of a book identified by its ISBN:

   declare function local:amazon-listing($isbn)
   {
      All
      Ship
      ASIN
      { $isbn }
      Medium
   };

   let $loc :=
   let $payload := local:amazon-listing("0395518482")
   return ws:call($loc, $payload)

The last line of the preceding query issues the Web service request, specifying a location and a payload. The function at the beginning of the query creates the XML for the payload. Because Web service requests can have complex structures, and because data needed to formulate a Web service request may come from many sources, XQuery is very useful for creating payloads.

With the underlying basics in hand, it’s time to move on to the main topic: exposing an XQuery with a RESTful interface.

Exposing an XQuery as a RESTful Data Service
The main task of a servlet is to respond to an HTTP request by assembling a response to be returned to the client. You’ve seen how XQuery excels at data integration tasks, treating all data sources as XML and returning XML as the result of any query, making XQuery a natural choice for data integration in servlets that return their results as XML or HTML.

First, I’ll discuss how to expose an XQuery to the client using HTTP GET or HTTP POST operations, and then I’ll discuss the design of the sample Java servlet that deploys the queries.

Calling an XQuery with GET
First, consider how to expose an XQuery so it can be called using HTTP GET. The following code shows an XQuery containing an external variable, called user:

   declare variable $user as xs:string external;

   { $user }
   {
      for $h in collection('HOLDINGS')/holdings
      where $h/userid = $user
      return
         { xs:string($h/stockticker) }
         { xs:string($h/shares) }
   }

You’d access this query using a URL similar to the one shown below, which specifies the name of the query (in the URL parameter “q”) and a value for the external variable:

   http://tagsalad.org/xquery?q=portfolio&user=jonathan

Using the servlet described later in this section, a developer can write an XQuery, test it locally, and then deploy it by placing it in a server-side deployment directory, where the query is protected from the outside world. After deployment, HTTP clients can invoke the query and obtain results using the simple URL shown above.
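To make the URL convention concrete, here is a small Java sketch of how a request URL could be split into the query name (the q parameter) and the remaining parameters, which become external variables. Inside a servlet container the request object already exposes parsed parameters, so this standalone version exists only so the example runs with nothing but the JDK. The class name QueryUrl is hypothetical and is not part of the servlet in Listing 1.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: extracts name/value pairs from the query string of a URL.
class QueryUrl {

    /** Returns the decoded URL parameters in the order they appear. */
    static Map<String, String> parameters(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String rawQuery = URI.create(url).getRawQuery();
        if (rawQuery == null) {
            return params;               // no "?..." part at all
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }
}
```

For the URL shown above, the map holds q=portfolio and user=jonathan. The servlet would use the first entry to locate the deployed query and pass the second to the query as the external variable $user.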

Calling an XQuery with POST
GET requests are fine for simple queries, but when query parameters have complex structure or need to be given XML Schema datatypes, it is generally better to specify parameters in the content of an HTTP POST request, which is the approach generally used for SOAP web messages. The following example shows an XQuery and an XML message that contains the query parameter. If an HTTP request has content, the servlet attempts to parse it as XML, binding the result to the variable $content:

   portfolio.xquery -- a query with an external variable

   declare variable $content as document-node() external;

   let $user := string($content/parameters/user)
   return
      { $user }
      {
         for $h in collection("HOLDINGS")/HOLDINGS
         where $h/USERID = $user
         return
            { xs:string($h/STOCKTICKER) }
            { xs:string($h/SHARES) }
      }

To run the XQuery and obtain its results, post a message to a URL that specifies the name of the query (in the URL parameter “q”) and carry the parameters as the POST data. Here’s an example of the URL and the HTTP POST content:

   http://tagsalad.org/xquery?q=portfolio

   HTTP CONTENT:
   <parameters>
      <user>Jonathan</user>
   </parameters>
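From the client side, such a request could be assembled as in the following sketch, which uses the java.net.http API available since Java 11. The message layout (a parameters element containing a user element) follows the path $content/parameters/user used by the query above. The class name PortfolioPost and the Content-Type header value are assumptions; the article does not say which content type the servlet expects.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper for calling the portfolio query with POST.
class PortfolioPost {

    /** Builds the XML message that the query reads through $content. */
    static String body(String user) {
        // Escape markup characters so the user name cannot break the message.
        String escaped = user.replace("&", "&amp;")
                             .replace("<", "&lt;")
                             .replace(">", "&gt;");
        return "<parameters><user>" + escaped + "</user></parameters>";
    }

    /** Builds (but does not send) the POST request for the given user. */
    static HttpRequest build(String user) {
        return HttpRequest.newBuilder(URI.create("http://tagsalad.org/xquery?q=portfolio"))
            .header("Content-Type", "application/xml")   // assumed content type
            .POST(HttpRequest.BodyPublishers.ofString(body(user)))
            .build();
    }
}
```

Passing the request to HttpClient.newHttpClient().send(...) with HttpResponse.BodyHandlers.ofString() would return the query result as a string.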

To make such queries work, you need an intermediary to accept the web requests (GET or POST) and run the appropriate query. That’s what the XQuery RESTful servlet does.

Implementing the XQuery RESTful Servlet
The examples shown in the previous two sections illustrate the requirements for a RESTful servlet. Queries must be protected from the outside world, but easily deployed by copying them into a deployment directory that is accessible to the server. Queries can be parameterized using URI parameters and/or the content of an HTTP request. The result of a query is returned to the client as the result of the HTTP request.

This servlet is written in Java and implements the Servlet API. It uses XQJ, a Java API that serves as “the JDBC for XQuery,” to invoke XQueries. To improve performance, the servlet prepares each query and places it in a HashMap the first time a client invokes that query. You can see the complete servlet code in Listing 1. Here’s an outline of the structure of that program.

When the servlet is initialized, the init() method shown below creates an empty HashMap to hold prepared queries, sets indentation properties, and connects to the data sources used on the server.

   public void init() {
      xqueryMap = new HashMap();
      indentationProperty = new Properties();
      indentationProperty.setProperty("indent", "yes");
      try {
         dataSource = new DDXQDataSource(
            new FileInputStream(XQueryServlet.CONFIG_FILE));
         connection = dataSource.getConnection();
      }
      catch (Exception exception) {
         System.out.println("Could not initialize DataDirect " +
            "XQuery due to an Exception:");
         exception.printStackTrace();
      }
   }

When the servlet terminates, the destroy() method closes all open connections.

   public void destroy() {
      try {
         if (connection != null) {
            connection.close();
         }
      }
      catch (XQException anException) {
         // just making sure that a close took place.
         // no real work to perform on this Exception
      }
   }

HTTP requests from the client result in calls to doPost(), doGet(), or doPut(), but all these methods delegate to a method called doXQuery(), which actually executes the requested query and creates the result.

The doXQuery method first obtains a prepared query by calling findXQuery(), then creates XQuery external variables with the same names and values as parameters found in the HTTP URI. These variables have type xs:string, but the query can cast them to any desired type. The servlet parses any HTTP content in the request as XML and binds it to the variable $content, which can be used in a query. Finally, doXQuery executes the query and writes the result to the return buffer.
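The content-handling step can be pictured with the following sketch, which parses a request body into a DOM document using only JDK classes and then reads /parameters/user from it, as the portfolio query does. It is an illustration under assumptions, not the code from Listing 1: the class name RequestContent is hypothetical, refusing DOCTYPE declarations is a precaution added here rather than something the article describes, and the XQJ call that binds the document to $content is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper: turns HTTP request content into an XML document.
class RequestContent {

    /** Parses the request body; rejects content that is not well-formed XML. */
    static Document parse(InputStream body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Added precaution: the body comes from an untrusted client,
            // so DOCTYPE declarations (and with them external entities) are refused.
            factory.setFeature(
                "http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory.newDocumentBuilder().parse(body);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException(
                "Request content is not well-formed XML", e);
        }
    }

    /** Mirrors string($content/parameters/user) from the sample query. */
    static String user(Document content) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate("string(/parameters/user)", content);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A servlet would pass the request's input stream to parse() and report a client error when it throws, rather than running the query with missing content.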

The findXQuery() method takes the name of a requested query as a parameter and returns a prepared query. It first checks the HashMap to see if this query has already been prepared and is up to date; if so, it simply returns the existing prepared query. Otherwise, it looks for the query in the XQuery deployment directory (which is specified in the web.xml file shown in Listing 3), prepares the query, adds it to the HashMap, and returns the prepared query. Here’s the findXQuery method code:

   private XQPreparedExpression findXQuery(String shortName)
      throws Exception {

      //TODO: is the date needed?
      //Date now = new Date();
      String xqueryFileName = shortName + XQueryServlet.XQUERY_FILE;
      XQueryMapEntry entry = null;

      entry = (XQueryMapEntry)xqueryMap.get(shortName);
      if (entry != null) {
         File xqueryFile = new File(xqueryFileName);
         if (entry.getDate().before(new Date(
            xqueryFile.lastModified()))) {
            // The prepared query is stale - prepare again
            entry.setQuery(connection.prepareExpression(
               new FileReader(xqueryFileName)));
            return entry.getQuery();
         }
         else {
            // The prepared query exists and is up to date
            return entry.getQuery();
         }
      }
      else {
         // This query has not yet been prepared and
         // added to the map
         entry = new XQueryMapEntry();
         entry.setQuery(connection.prepareExpression(
            new FileReader(xqueryFileName)));
         xqueryMap.put(shortName, entry);
         return entry.getQuery();
      }
   }

Figure 2. Data Integration with an XQuery Servlet: The servlet uses the REST interface to select and parameterize XQuery queries. XQuery can query each data source and integrate results, eliminating the need for many APIs.
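The lookup-and-refresh pattern used by findXQuery() can be isolated from XQJ as in the sketch below. It is a variation on the idea, not the servlet's actual code: the class PreparedQueryCache, its constructor arguments, and the preparer function (standing in for the XQJ prepare call) are all hypothetical. It also makes three choices that the article does not describe: it uses a ConcurrentHashMap because a servlet instance serves many threads, it records the file's modification time with each entry so a refreshed query is not prepared again on the next request, and it accepts only simple query names so a request cannot reach files outside the deployment directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache of prepared queries, keyed by short query name.
class PreparedQueryCache<T> {

    // A prepared query plus the modification time of the file it came from.
    private static final class Entry<V> {
        final V prepared;
        final FileTime modified;
        Entry(V prepared, FileTime modified) {
            this.prepared = prepared;
            this.modified = modified;
        }
    }

    private final Path deployDir;
    private final String extension;
    private final Function<Path, T> preparer;   // stands in for the XQJ prepare step
    private final Map<String, Entry<T>> entries = new ConcurrentHashMap<>();

    PreparedQueryCache(Path deployDir, String extension, Function<Path, T> preparer) {
        this.deployDir = deployDir;
        this.extension = extension;
        this.preparer = preparer;
    }

    /** Returns the prepared query, preparing it again if the file has changed. */
    T find(String shortName) {
        // Only plain names are allowed, so "../x" cannot leave the directory.
        if (!shortName.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("Invalid query name: " + shortName);
        }
        Path file = deployDir.resolve(shortName + extension);
        FileTime modified = lastModified(file);
        Entry<T> entry = entries.compute(shortName, (name, cached) -> {
            if (cached != null && cached.modified.equals(modified)) {
                return cached;                                   // still up to date
            }
            return new Entry<T>(preparer.apply(file), modified); // new or stale
        });
        return entry.prepared;
    }

    private static FileTime lastModified(Path file) {
        try {
            return Files.getLastModifiedTime(file);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read query file " + file, e);
        }
    }
}
```

Comparing modification times for equality, rather than checking whether the file is newer, means that restoring an older copy of a query also triggers a fresh preparation.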

Figure 2 shows the revised architecture after implementing the XQuery servlet.

As you’ve seen, it’s not terribly difficult to create an XQuery servlet that implements the Java Servlet API, using XQJ to issue XQueries. But it’s a powerful idea, because developers can develop data services by writing queries in XQuery, testing them, and simply copying them to the deployment directory. The servlet makes deployed queries instantly available to users, providing an HTTP interface determined by the query name and its parameters. This short path from development to deployment makes the servlet a very productive way to create data services.
