Beyond Tables: Dealing with the Convergence of Relational and XML Data
Developers are under increasing pressure to work with heterogeneous information. The advent of XML provided a data representation suitable for both regular (database tables) and irregular content, and was a significant step forward, but representation alone is not enough. We need a powerful standardized method for querying that data as well. XQuery fulfills that need. 

Few serious applications are considered enterprise-worthy without a core database engine backed by an extensive, normalized, and optimized relational database architecture. Traditionally, such database applications rely on SQL queries and statements to retrieve and update data in the back-end store. But that's about to change. The W3C XQuery language is accelerating towards "Recommendation" status; SQL developers should take note. According to a March 2005 Developer Survey conducted by my employer, DataDirect, XQuery is quickly becoming a required, core component both in emerging enterprise application architectures such as service-oriented architectures (SOA) and in more established enterprise architectures such as J2EE. XQuery is the best approach for integrating XML and relational data and will quickly become as ubiquitous as SQL is today.

Setting the Scene
The invention of XML and the Internet created opportunities for businesses to exchange information in a way previously possible only through narrowly defined data interchange formats such as EDI (Electronic Data Interchange), typically governed by organizations such as the Data Interchange Standards Association (DISA). XML is now considered the de facto standard for retrieving and exchanging data. However, the rapid growth of XML and the increasing proliferation of hierarchical messages present a fresh set of challenges to established enterprise applications and to developers who have historically built their business processes around relational databases.

Business-critical data is typically stored in relational database management systems (RDBMSs). By centralizing storage and distribution of data, relational databases consolidated data security, integrity, and control within a single system. Many database systems are well-established and reliable enough that their existence is entrenched; they're unlikely to disappear any time soon. In spite of this dominant position, the growth of XML is forcing modern business applications to function seamlessly with both relational and XML data.

Relational data and structured XML data have very different models. Before analyzing various data integration approaches, it is worth revisiting the organization of the data models.

The Relational Model
Relational data is organized according to a set of cardinalities and logically defined dependencies known as normalizations. A table is required to express a single defined set of data, with each table containing a set of records organized by rows, or tuples. Data in each of these tables is organized by columns, which may serve as keys. A key column uniquely identifies the data in the rows of a table.

You can establish simple relationships between data items by storing them within the same tuple, or more complex relationships by using separate tables (relations) joined on common keys.
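
As a rough illustration of keyed tables and relationships built on common keys, the following sketch uses Python's built-in sqlite3 module. The customer and purchase tables, their columns, and the sample rows are invented for this example and do not come from any system discussed in this article.

```python
import sqlite3

# Hypothetical schema: table names, column names, and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- key column: uniquely identifies a row
        name        TEXT NOT NULL
    );
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        item        TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO purchase VALUES (10, 1, 'keyboard'), (11, 1, 'monitor'), (12, 2, 'mouse');
""")

def purchases_by_customer(conn):
    """Join the two tables on the shared key and return (name, item) pairs."""
    rows = conn.execute("""
        SELECT c.name, p.item
        FROM customer AS c
        JOIN purchase AS p ON p.customer_id = c.customer_id
        ORDER BY c.name, p.item
    """)
    return rows.fetchall()

print(purchases_by_customer(conn))
```

The relationship between a customer and a purchase is not visible in either table alone; it exists only because both carry the same customer_id value and the query states the join explicitly.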

XML Model
XML data turns the relational data model almost completely on its head. Relationships between data are intrinsic, as opposed to the more explicit relationship expressions used in the relational model. XML documents use parent/child relationships and element/attribute relationships. Hierarchical data relationships are more obvious than relational relationships; they are based on the relative position of each node within the document and are easily discernible.
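
To make the positional nature of these relationships concrete, here is a small sketch using Python's standard xml.etree.ElementTree module. The order document and the describe helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are illustrative only.
ORDER_XML = """
<order id="10">
  <customer>Ada</customer>
  <items>
    <item sku="K-1">keyboard</item>
    <item sku="M-2">monitor</item>
  </items>
</order>
"""

def describe(element, depth=0):
    """Return one line per element, indented by its depth in the hierarchy."""
    attrs = " ".join(f'{k}="{v}"' for k, v in element.attrib.items())
    text = (element.text or "").strip()
    label = " ".join(part for part in (element.tag, attrs, text) if part)
    lines = ["  " * depth + label]
    for child in element:          # children are reached by position, not by key
        lines.extend(describe(child, depth + 1))
    return lines

root = ET.fromstring(ORDER_XML)
print("\n".join(describe(root)))
```

No join is needed to learn which items belong to the order: each item is related to it simply by being nested inside it.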

Specialized Optimizations
Relational tables and XML documents are both powerful ways to represent relationships between data, but each is optimized to provide a particular benefit. Relational tables, coupled with keyed columns, are optimized for efficient data retrieval with minimum fuss. XML documents are optimized to express the intrinsic relationships of data that together make up an XML document.

Bringing Together Tables and Documents
Why do enterprises need both XML and relational data technologies? The answer is that enterprise application developers must leverage existing investment in applications based on the relational model while quickly adapting them to the heterogeneous and message-driven nature of XML data.

XQuery: The Resolution
XQuery is the best approach for integrating XML and relational data. The W3C XQuery specification provides a native XML query language that integration platforms and components can use to solve this problem. XQuery levels the data integration playing field by providing a single interface that lets developers access multiple data sources under a unifying data model. Middleware products are set to deliver Java components that provide developers with extensive options for presenting and exchanging their relational data as XML and for processing relational and XML data together.
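
The idea of processing relational and XML data under one data model can be sketched in Python, with an important caveat: the standard library has no XQuery engine, so this example substitutes the limited XPath subset in xml.etree.ElementTree for XQuery. The table_as_xml and items_for helpers, the schema, and the order feed are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical relational source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
""")

# Hypothetical XML source, such as an incoming message feed.
ORDERS_XML = (
    '<orders>'
    '<order customer="1"><item>keyboard</item><item>monitor</item></order>'
    '<order customer="2"><item>mouse</item></order>'
    '</orders>'
)

def table_as_xml(conn, table, columns):
    """Publish relational rows as <row> elements under a root named after the table.

    The identifiers are interpolated into SQL, so this helper is only safe
    with trusted, hard-coded table and column names.
    """
    root = ET.Element(table)
    query = f"SELECT {', '.join(columns)} FROM {table}"
    for values in conn.execute(query):
        row = ET.SubElement(root, "row")
        for column, value in zip(columns, values):
            ET.SubElement(row, column).text = str(value)
    return root

def items_for(customer_name, customers, orders):
    """Find the customer in the relational view, then follow its key into the XML feed."""
    items = []
    for row in customers.iterfind(f"row[name='{customer_name}']"):
        key = row.findtext("customer_id")
        for item in orders.iterfind(f"order[@customer='{key}']/item"):
            items.append(item.text)
    return items

customers = table_as_xml(conn, "customer", ["customer_id", "name"])
orders = ET.fromstring(ORDERS_XML)
print(items_for("Ada", customers, orders))
```

Once the table is viewed as XML, both sources are navigated with the same path expressions. A real XQuery implementation would express the same join declaratively in a single query rather than in hand-written loops.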

RDBMSs that embed XQuery support as a means of exposing relational data as an XML data source will level the data integration landscape further, implicitly increasing data portability and accessibility through XQuery itself. RDBMSs without integrated support for XQuery will continue to delegate that responsibility to the middle tier to ensure their equal participation in increased data integration.

Before XQuery, developers' strategies for integrating relational and XML data were limited to:

  • Shredding the XML data, that is, decomposing it into individual table columns in a relational database. This process flattens the built-in data hierarchy and (potentially) loses the intrinsic internal data relationships. The original XML document itself is also lost, although it can, in some cases, be reproduced from the shredded data. If preserving the XML structure is unimportant, shredding is a reasonable approach for combining XML and relational data.
  • Storing the XML data as unstructured data in a relational database, using the CLOB (Character Large Object) data type. CLOB columns can store an XML document in its entirety, thus preserving both the document and its internal relationships. However, treating the XML document as nothing more than a text file severely compromises the ease with which it can be queried and searched.
  • Storing the XML data as a structured XML document in a relational database

The last alternative, structured XML document storage, enables a close relationship between XML data and relational data within a traditional relational database. This method preserves the document structure and maintains the hierarchical relationships within the document, but relies on direct support for structured XML as part of the database architecture. Successfully executing a concrete relational and XML data integration strategy requires a consistent, standards-based approach; however, the most widely used relational databases currently have wildly varying levels of support for the co-existence of structured XML and relational data, which makes portable data integration difficult.
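
The trade-off between the first two strategies can be sketched with Python's standard sqlite3 and xml.etree.ElementTree modules. This is a minimal illustration under stated assumptions: SQLite's TEXT type stands in for a CLOB column, and the order document, table names, and helper functions are invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document to be stored.
ORDER_XML = (
    '<order id="10"><customer>Ada</customer>'
    '<items><item sku="K-1">keyboard</item><item sku="M-2">monitor</item></items>'
    '</order>'
)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_item (order_id INTEGER, customer TEXT, sku TEXT, item TEXT);
    CREATE TABLE order_doc  (order_id INTEGER PRIMARY KEY, doc TEXT);  -- TEXT as a CLOB stand-in
""")

def shred(conn, xml_text):
    """Strategy 1: flatten the hierarchy into one row per <item>."""
    order = ET.fromstring(xml_text)
    order_id = int(order.get("id"))
    customer = order.findtext("customer")
    rows = [(order_id, customer, item.get("sku"), item.text)
            for item in order.iterfind("items/item")]
    conn.executemany("INSERT INTO order_item VALUES (?, ?, ?, ?)", rows)

def store_as_clob(conn, xml_text):
    """Strategy 2: keep the whole document as opaque text."""
    order_id = int(ET.fromstring(xml_text).get("id"))
    conn.execute("INSERT INTO order_doc VALUES (?, ?)", (order_id, xml_text))

shred(conn, ORDER_XML)
store_as_clob(conn, ORDER_XML)

# Shredded data answers ordinary SQL, but the nesting is gone.
skus = [r[0] for r in conn.execute(
    "SELECT sku FROM order_item WHERE customer = 'Ada' ORDER BY sku")]

# The CLOB preserves the document, but SQL can only pattern-match its text.
hits = conn.execute(
    "SELECT COUNT(*) FROM order_doc WHERE doc LIKE '%monitor%'").fetchone()[0]

print(skus, hits)
```

The LIKE search cannot tell whether "monitor" appears as an item, a customer name, or an attribute value, which is the querying weakness of the CLOB approach. The shredded table, for its part, no longer records that the items were nested inside a single order element.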

This lack of native database standardization and support for XML means that developers are likely to turn to middle-tier components to obtain a consistent integration end-point for relational and XML data.

XQuery and Middle-tier Components
The middle tier will likely emerge as the sweet spot for developers to establish an integration component end-point (iCE) as a means to integrate a set of distributed data sources, both relational and structured XML. This iCE component will typically exist in (but will not be restricted to) the middle tier between the application (client tier) and data source components, and will encapsulate a W3C XQuery implementation and runtime. Such middle-tier XQuery implementations will help application developers address their relational and XML data integration challenges.

XQuery offers the best integration technology because it leverages the structure of XML to allow applications to express queries across all kinds of XML data regardless of the data's location. Weaving together distributed relational data sources with XML data also provides a solid foundation on which to migrate and build applications towards SOA deployments.

Readers may be wondering whether such integration truly requires creating yet another query language. To answer this question, it's worth considering some history.

Editor's Note: The author, Jonathan Bruce, is a Technology Evangelist for DataDirect Technologies, a vendor of XML processing components and database drivers. We have selected this article for publication because we believe it to have objective technical merit and valid insights.
