

Apache's Xindice Organizes XML Data Without Schema : Page 3

A native XML database can make a lot of sense for organizations that want to store and access XML without all the unsightly schema mapping required to store XML in a traditional relational database system. Several commercial native XML databases exist; now, we take a first look at Apache's open source offering, Xindice.





Extracting the Data
Now that I have a method to insert data into the database, it is time to create one to get it back out. Xindice offers two ways to get data out. The first and simplest is to get an entire resource using its unique identifier. Because I know that I want an entire XML document, all I need is a method that accepts the unique identifier as a string and returns a string with the XML data. The method body follows:

    XMLResource document = (XMLResource) col.getResource(ID);
    if (document != null)
        return (String) document.getContent();
    else
        return "";

As you can see, the method is very simple. Because I know ahead of time that I will be getting an XML document, I can cast the result of getResource to an XMLResource. From there I simply check whether the result is null and return the appropriate string: the document content, or an empty string if nothing was found.

Of course, the above method is only useful if I know the unique identifier and I want the whole document returned. If I want to query an entire collection for a specific subset of data, I need to use XPath. Before jumping right into the method, let me show you an example of an XPath query. Assume I put the following three documents into the database, one per product:

    <?xml version="1.0"?>
    <product product_id="1" type="widget">
        <description>foo</description>
    </product>

    <?xml version="1.0"?>
    <product product_id="2" type="widget">
        <description>bar</description>
    </product>

    <?xml version="1.0"?>
    <product product_id="3" type="widget">
        <description>foobar</description>
    </product>

If I wanted to write an XPath query to select the second product, I could use the following query string:

    /product[@product_id="2"]

The query result would be an XPath node-set that contains one node for each result found. In this case, the result would be:

    <product product_id="2" type="widget"
             xmlns:src="http://xml.apache.org/xindice/NodeSource"
             src:col="/db/products" src:key="2">
        <description>bar</description>
    </product>

However, if I change the XPath query string to find all products of type widget, then my result contains more than one node. Below is the modified XPath query string and its result.

    /product[@type="widget"]

    <product product_id="1" type="widget"
             xmlns:src="http://xml.apache.org/xindice/NodeSource"
             src:col="/db/products" src:key="1">
        <description>foo</description>
    </product>
    <product product_id="2" type="widget"
             xmlns:src="http://xml.apache.org/xindice/NodeSource"
             src:col="/db/products" src:key="2">
        <description>bar</description>
    </product>
    <product product_id="3" type="widget"
             xmlns:src="http://xml.apache.org/xindice/NodeSource"
             src:col="/db/products" src:key="3">
        <description>foobar</description>
    </product>

The above was just a basic example of XPath. For a more thorough look at XPath try this article from Top XML.
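
To experiment with query strings like these without a running Xindice instance, here is a minimal sketch that uses the JDK's own javax.xml.xpath API rather than Xindice's XPathQueryService. The class name XPathSketch, the DOCS array, and the matchingDescriptions helper are hypothetical names chosen for illustration. The sketch assumes each product is stored as its own document, which the distinct src:key values in the results suggest, and it assumes /product[@product_id="2"] as the query that selects the second product.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical in-memory stand-in for the three documents in the collection.
    static final String[] DOCS = {
        "<product product_id=\"1\" type=\"widget\"><description>foo</description></product>",
        "<product product_id=\"2\" type=\"widget\"><description>bar</description></product>",
        "<product product_id=\"3\" type=\"widget\"><description>foobar</description></product>"
    };

    // Runs the query against every document and collects the description
    // text of each matching product element.
    static List<String> matchingDescriptions(String query) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> found = new ArrayList<>();
        try {
            for (String doc : DOCS) {
                NodeList nodes = (NodeList) xpath.evaluate(
                        query, new InputSource(new StringReader(doc)), XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++) {
                    // Relative query: the description child of the matched product.
                    found.add(xpath.evaluate("description", nodes.item(i)));
                }
            }
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath query: " + query, e);
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [bar]
        System.out.println(matchingDescriptions("/product[@product_id=\"2\"]"));
        // prints [foo, bar, foobar]
        System.out.println(matchingDescriptions("/product[@type=\"widget\"]"));
    }
}
```

Unlike Xindice, the JDK evaluator does not add the src:col and src:key attributes to the result nodes; those are specific to Xindice's query results.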

With the basics of XPath covered, let me jump right into writing my query method. For this method I am going to use a signature similar to the one I used to retrieve an entire document. It will accept an XPath query string as a parameter and return the result as an XML string.

    XPathQueryService service =
        (XPathQueryService) col.getService("XPathQueryService", "1.0");
    ResourceSet resultSet = service.query(xpath);
    ResourceIterator results = resultSet.getIterator();
    String allResults = "";
    while (results.hasMoreResources()) {
        Resource res = results.nextResource();
        allResults += "[" + res.getContent() + "]";
    }
    return allResults;

Because any given XPath query can return an arbitrary number of results I use the ResourceIterator class to loop through the results. I concatenate each result in a single result string that I return after I am finished looping. The above example queried an entire collection. However, it is also possible to use XPath queries on individual documents. To do that, one would use the queryResource method instead of the query method.

As with selecting data, there are two ways to update a document in Xindice. The easy way is to overwrite it. If you insert a document under the identifier of an existing document, Xindice simply overwrites the old one with no questions asked. For small XML documents this generally makes the most sense. With very large documents, however, it often makes more sense to update only what has changed. In that case you would use the second method of updating data, XUpdate.

Before explaining XUpdate, let me first implement a method to call the XUpdate service. My method will take two parameters: a unique identifier and an XUpdate string. The unique identifier represents the document I want to update, while the XUpdate string contains the rules as well as the data to update it with. Here is the method body:

    XUpdateQueryService service =
        (XUpdateQueryService) col.getService("XUpdateQueryService", "1.0");
    service.updateResource(ID, xupdate);

The method is about as straightforward as you get. Interestingly enough, you can also use XUpdate on an entire collection. As with XPath queries, simply use the collection-wide update method instead of the resource-specific updateResource method. With this kind of power, clearly all the work is done in the XUpdate string.
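
To give a rough idea of what such an XUpdate string might look like, here is a minimal sketch that builds one for replacing the text of a selected node. The xupdate:modifications and xupdate:update elements and the http://www.xmldb.org/xupdate namespace follow the XUpdate working draft; the class name XUpdateSketch and the helpers buildUpdate, escape, and isWellFormed are hypothetical. The sketch only checks that the string is well-formed XML using the JDK parser and does not call Xindice.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class XUpdateSketch {
    static final String XUPDATE_NS = "http://www.xmldb.org/xupdate";

    // Builds an XUpdate string that replaces the content of the node(s)
    // chosen by the XPath expression in 'select' with 'newText'.
    static String buildUpdate(String select, String newText) {
        return "<xupdate:modifications version=\"1.0\" xmlns:xupdate=\"" + XUPDATE_NS + "\">"
             + "<xupdate:update select=\"" + escape(select) + "\">"
             + escape(newText)
             + "</xupdate:update>"
             + "</xupdate:modifications>";
    }

    // Escapes the characters that would break XML attribute or text content.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Returns true if the string parses as well-formed, namespace-aware XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xupdate = buildUpdate("/product/description", "baz");
        System.out.println(xupdate);
        System.out.println("well-formed: " + isWellFormed(xupdate));
    }
}
```

A string built this way could then be passed as the xupdate argument of the method shown earlier, so only the description changes rather than the whole document.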
