

Apache's Xindice Organizes XML Data Without Schema

A native XML database can make a lot of sense for organizations that want to store and access XML without all the unsightly schema mapping required to store XML in a traditional relational database system. Several commercial native XML databases exist; now, we take a first look at Apache's open source offering, Xindice.





XML is well deserving of its popularity. Developers are finding myriad uses for it, including application configuration files and object persistence. While using XML in this capacity has many benefits, it can also become an organizational nightmare.

At first glance, a relational database management system (RDBMS) seems like a good way to organize all of your disparate XML data. However, mapping XML documents to relational models is not only difficult but often results in ugly schemas. For many, the answer lies in using a native XML database instead of a traditional RDBMS. This article describes what a native XML database is, introduces Apache Xindice, and shows how to use Xindice in a Java application.

As defined by the XML:DB initiative, a native XML database is simply a database for storing and accessing XML using XML. This differs from a relational database, in which XML data must be stored as tabular data and accessed using SQL. While it is possible to store XML data in a relational database, either as a CLOB or by mapping the XML data to a schema, each method falls down in its own way. Storing the data as a CLOB eliminates the need to map the structure of the XML document to a schema, but the database then has no understanding of the document's structure. This makes it impossible to query the data effectively or to update specific sections of the document. Mapping the XML document's structure to a relational schema overcomes these issues, but can heavily degrade performance.

A database that is aware of the structure of XML documents (a "native" XML database) can avoid this tradeoff. Such a database is not required to store the data physically as XML. What matters is that it understands the structure of the XML documents it contains, so that those documents can be queried and updated using appropriate XML technologies. The first of these is XPath, a W3C specification. An XPath query returns the set of nodes in a given XML document that match the query expression.
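To make the idea of an XPath query concrete, here is a minimal sketch that uses the JDK's built-in `javax.xml.xpath` package rather than Xindice itself. The sample address document, the class name `XPathSketch`, and the helper `select` are all invented for illustration; they are not part of Xindice or of the original article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {
    // Hypothetical sample document, invented for this illustration.
    static final String SAMPLE =
        "<addresses>"
      + "<address id=\"1\" state=\"CA\"><name>Ann</name></address>"
      + "<address id=\"2\" state=\"NY\"><name>Bob</name></address>"
      + "<address id=\"3\" state=\"CA\"><name>Cy</name></address>"
      + "</addresses>";

    // Evaluates an XPath expression against an XML string and returns
    // the text content of every matching node.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath expression or XML", e);
        }
    }

    public static void main(String[] args) {
        // Select the <name> of every <address> whose state attribute is "CA".
        System.out.println(select(SAMPLE, "/addresses/address[@state='CA']/name"));
    }
}
```

The predicate `[@state='CA']` filters on structure and attribute values, which is the kind of query a CLOB column cannot answer without parsing every document first.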

The second technology of importance is XUpdate, as defined by XML:DB. XUpdate makes it possible to update specific elements of an XML document without having to overwrite the entire document. It is extremely useful, particularly with very large XML documents.
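As a rough illustration of what an XUpdate request looks like, the sketch below builds a small `xupdate:modifications` document in Java and parses it with the JDK's DOM parser to confirm it is well formed. It does not apply the update; that would require an XUpdate-capable engine such as Xindice. The class name `XUpdateSketch`, its helper methods, and the address document they target are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class XUpdateSketch {
    // Namespace used by XUpdate commands.
    static final String XUPDATE_NS = "http://www.xmldb.org/xupdate";

    // Builds an XUpdate document that replaces the content of the nodes
    // selected by an XPath expression. Illustration only: the arguments
    // are not XML-escaped, so avoid '"', '<' and '&' in them.
    static String updateCommand(String selectXPath, String newText) {
        return "<xupdate:modifications version=\"1.0\" xmlns:xupdate=\"" + XUPDATE_NS + "\">"
             + "<xupdate:update select=\"" + selectXPath + "\">" + newText + "</xupdate:update>"
             + "</xupdate:modifications>";
    }

    // Parses the command and returns its first xupdate:update element,
    // which shows the command is well-formed, namespace-qualified XML.
    static Element firstUpdate(String xupdate) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xupdate)));
            return (Element) doc.getElementsByTagNameNS(XUPDATE_NS, "update").item(0);
        } catch (Exception e) {
            throw new IllegalArgumentException("Malformed XUpdate document", e);
        }
    }

    public static void main(String[] args) {
        // Change only the <name> of the address with id 2.
        String command = updateCommand("/addresses/address[@id='2']/name", "Robert");
        Element update = firstUpdate(command);
        System.out.println(update.getAttribute("select") + " -> " + update.getTextContent());
    }
}
```

Note that the command carries only an XPath selector and the new value, so its size stays small no matter how large the target document is.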
