Using XML Documents as a Database

Question:
Would it be practical to use an XML document, or more accurately documents, as a database, and if so, how would you recommend doing this? I would use XML, JavaScript, Internet Explorer 5, and HTML to accomplish this. There would be about 30 entities and 2000 individual records. I do not want to use a proprietary database, and this application will be intranet-based.

Answer:

XML actually makes for a pretty fair database when the number of records is relatively small or compact. You’re probably pushing the envelope with a recordset of that size, however: with 30 entities and 2,000 records, that’s some 60,000 nodes that need to be processed, which could prove to be a fairly major memory buster, especially as the DOM keeps all of that in memory at once.

One solution you may want to think about is to model your XML to a certain extent on the way traditional databases are built. Most DBMS applications let you specify indexed fields. These are the primary fields that are easily searchable, and they are actually kept in separate internal tables.

An XML solution would be similar, but would work on the idea of partition files. A partition file would contain a specific number of records in XML. The index file would then contain a smaller version of the records that contained just the relevant key fields. If a match took place on the key field, then the appropriate partition could be loaded into memory, the full record extracted, and the result placed into a queue with other records that satisfied this query.
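As a rough sketch of the index-then-partition lookup, here is how it might look in Python with the standard library's ElementTree. The element names (`records`, `record`, `index`, `entry`), the `partition` attribute, the file names, and the sample data are all assumptions made up for illustration, and the partitions are held in strings here rather than read from disk.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: partitions hold full records; the index holds only
# the key field plus the name of the partition that owns each record.
PARTITIONS = {
    "part-0001.xml": """<records>
  <record id="1"><name>Ada</name><dept>Research</dept><phone>555-0100</phone></record>
  <record id="2"><name>Grace</name><dept>Systems</dept><phone>555-0101</phone></record>
</records>""",
    "part-0002.xml": """<records>
  <record id="3"><name>Edsger</name><dept>Research</dept><phone>555-0102</phone></record>
</records>""",
}

INDEX = """<index>
  <entry id="1" partition="part-0001.xml"><name>Ada</name></entry>
  <entry id="2" partition="part-0001.xml"><name>Grace</name></entry>
  <entry id="3" partition="part-0002.xml"><name>Edsger</name></entry>
</index>"""


def load_partition(name):
    """Stand-in for reading and parsing one partition file from disk."""
    return ET.fromstring(PARTITIONS[name])


def find_by_key(field, value):
    """Match on the small index first, then load only the partitions needed."""
    index = ET.fromstring(INDEX)
    loaded = {}   # partitions parsed so far during this query
    results = []  # the queue of full records that satisfied the query
    for entry in index.findall("entry"):
        if entry.findtext(field) != value:
            continue
        part_name = entry.get("partition")
        if part_name not in loaded:
            loaded[part_name] = load_partition(part_name)
        record = loaded[part_name].find(f"record[@id='{entry.get('id')}']")
        if record is not None:
            results.append(record)
    return results


matches = find_by_key("name", "Edsger")
print([r.findtext("phone") for r in matches])
```

Only `part-0002.xml` is parsed for this query; the first partition is never touched, which is where the memory saving would come from.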

One advantage to this solution is that your key searches can take place very quickly without significantly impacting your available memory. You could also search non-key fields, but to do this you would have to load in each partition, perform the search on the partition, aggregate the results into a collection, close the partition and open the next.
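A non-key search along those lines might be sketched as follows, again in Python with ElementTree. The partition names, element names, and sample records are hypothetical, and in-memory strings stand in for the partition files.

```python
import xml.etree.ElementTree as ET

# Hypothetical partitions; "dept" is assumed NOT to be a key field,
# so it does not appear in any index.
PARTITIONS = {
    "part-0001.xml": """<records>
  <record id="1"><name>Ada</name><dept>Research</dept></record>
  <record id="2"><name>Grace</name><dept>Systems</dept></record>
</records>""",
    "part-0002.xml": """<records>
  <record id="3"><name>Edsger</name><dept>Research</dept></record>
</records>""",
}


def scan_partitions(field, value):
    """Search a non-key field by visiting one partition at a time."""
    results = []
    for name in sorted(PARTITIONS):
        root = ET.fromstring(PARTITIONS[name])  # "open" the partition
        for record in root.findall("record"):
            if record.findtext(field) == value:
                # Copy out plain values so the parsed tree can be released.
                results.append(
                    {"id": record.get("id"), "name": record.findtext("name")}
                )
        del root  # "close" the partition before opening the next one
    return results


print(scan_partitions("dept", "Research"))
```

Because the results are copied into plain dictionaries, only one parsed partition needs to be alive at any moment, at the cost of touching every partition.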

This would be a slower operation, though probably not significantly so.

When you add a new record, you load the newest partition, check the count of records in the partition, then either add the record to the partition or create a new partition, depending upon whether you’ve reached your record count. Deleting a record in a partition would involve setting a tag in the record that indicates the record’s status. Then periodically you would need to remove the deleted records, consolidate the partitions, and recreate the index file. Editing would work in a similar manner: if the edit changed a key field, you would also need to update the index file so that it matched the latest changes.
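The add, delete, and clean-up cycle could be sketched like this in Python. The per-partition limit of two records, the `status="deleted"` attribute, and the function names are assumptions chosen to keep the example small; partitions are kept as in-memory ElementTree elements rather than files.

```python
import xml.etree.ElementTree as ET

MAX_PER_PARTITION = 2  # hypothetical limit, tiny so the example stays readable


def add_record(partitions, record_id, name):
    """Append to the newest partition, or start a new one if it is full."""
    if not partitions or len(partitions[-1].findall("record")) >= MAX_PER_PARTITION:
        partitions.append(ET.Element("records"))
    rec = ET.SubElement(partitions[-1], "record", id=str(record_id))
    ET.SubElement(rec, "name").text = name
    return rec


def delete_record(partitions, record_id):
    """Flag the record as deleted instead of removing it right away."""
    for part in partitions:
        rec = part.find(f"record[@id='{record_id}']")
        if rec is not None:
            rec.set("status", "deleted")
            return True
    return False


def compact(partitions):
    """Drop flagged records, repack the rest, and rebuild the index."""
    live = [
        rec
        for part in partitions
        for rec in part.findall("record")
        if rec.get("status") != "deleted"
    ]
    new_parts = []
    index = ET.Element("index")
    for i, rec in enumerate(live):
        if i % MAX_PER_PARTITION == 0:
            new_parts.append(ET.Element("records"))
        new_parts[-1].append(rec)
        entry = ET.SubElement(
            index, "entry", id=rec.get("id"), partition=str(len(new_parts) - 1)
        )
        ET.SubElement(entry, "name").text = rec.findtext("name")
    return new_parts, index


partitions = []
for i, name in enumerate(["Ada", "Grace", "Edsger", "Barbara", "Donald"], start=1):
    add_record(partitions, i, name)
delete_record(partitions, 2)
delete_record(partitions, 3)
partitions, index = compact(partitions)
print(len(partitions), [e.get("id") for e in index.findall("entry")])
```

Five records fill three partitions; after two deletions and a compaction, the three surviving records fit in two, and the rebuilt index lists only the live records.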

This strategy is similar to the way that most traditional databases work, by the way. The advantage to doing it with XML is that you can make the whole process transparent. If you pass an XPath query to your “database,” it would first check whether it could satisfy the query with just the index file, and would attempt the larger, slower query across the partitions only if the index alone could not answer it.
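One way to picture that transparent dispatch is a single query function that decides for itself which path to take. This is a minimal sketch in Python, not a full XPath front end: it takes a field and a value, builds a simple ElementTree path predicate from them, and assumes the value contains no quote characters. The `KEY_FIELDS` set, the element names, and the data are hypothetical.

```python
import xml.etree.ElementTree as ET

KEY_FIELDS = {"name"}  # hypothetical: the fields that were copied into the index

INDEX = """<index>
  <entry id="1" partition="part-0001.xml"><name>Ada</name></entry>
  <entry id="2" partition="part-0002.xml"><name>Grace</name></entry>
</index>"""

PARTITIONS = {
    "part-0001.xml": """<records>
  <record id="1"><name>Ada</name><dept>Research</dept></record>
</records>""",
    "part-0002.xml": """<records>
  <record id="2"><name>Grace</name><dept>Systems</dept></record>
</records>""",
}


def query(field, value):
    """Answer from the index when possible; otherwise scan every partition.

    Returns the strategy used and the ids of the matching records.
    """
    if field in KEY_FIELDS:
        index = ET.fromstring(INDEX)
        hits = index.findall(f"entry[{field}='{value}']")
        return "index", [e.get("id") for e in hits]
    ids = []
    for name in sorted(PARTITIONS):
        root = ET.fromstring(PARTITIONS[name])
        ids.extend(r.get("id") for r in root.findall(f"record[{field}='{value}']"))
    return "scan", ids


print(query("name", "Grace"))
print(query("dept", "Research"))
```

The caller never needs to know which fields are indexed; the strategy is returned here only so the difference between the two paths is visible.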

For performance, I would recommend that you do the actual database work on the server, then return the result set to the client for display. Most client computer systems aren’t really designed for memory-intensive applications, and even with partitioning, a database of any significant size can just eat up memory.
