RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Creating Hierarchical Data Structure Mashups

You can use SQL to connect or "mash-up" hierarchical structures, joining them at or below the root to create larger queryable hierarchical structures.

NSI SQL can naturally perform full multi-path hierarchical data query processing that far exceeds the single linear path processing performed by XPath and XQuery today (see the article "Navigationless Database XML: Hierarchical Data Processing"). That article covers the basics of full hierarchical processing and describes how SQL can inherently perform this advanced processing. This article describes new ANSI SQL hierarchical capabilities that can perform semantically correct advanced hierarchical data structure joining and mashups.

While the previous article concentrated on using the SQL LEFT OUTER JOIN operation to model and process hierarchical structures—both logical relational and physical XML structures—in individual views, this article demonstrates how you can join these heterogeneous hierarchically complex structure types together dynamically and seamlessly into a larger hierarchical structure. You'll also see how standard SQL hierarchical structure joining also naturally supports unrestricted data structure mashups that still follow standard hierarchical principles. A powerful and advanced new hierarchical data modeling capability makes this possible.

SQL Hierarchical Structure Views
When integrating relational and XML data, it is important to get the correct result without any data loss. Unfortunately this has not yet been achieved. Today's processing introduces inaccurate results that inhibit correct hierarchical XML processing in databases. However, by modeling relational flat data as hierarchical structures, it becomes possible to integrate relational data with XML without any loss of data accuracy. Figure 1 demonstrates this by defining a hierarchically modeled XML view named "XML," and a relationally modeled hierarchical view named "RDB."

Figure 1. Hierarchical Data Modeling: The figure shows a hierarchical data model in both SQL and XML.
The difference in the XML and relational view in Figure 1 is the View keyword. In the XML view definition (XVIEW), a middleware XML processing extension receives control, accesses the native XML data, and maps XML element names to SQL table names and XML element and attribute data names to SQL column names. Because XML is a contiguous physical structure, it does not require foreign data keys to define its structure, so the SQL ON clause linkage clause in the XML View uses table names rather than column names to make the node connections as shown in the XML view in Figure 1. This is a physical XML view definition, but it's treated as a SQL logical view after the data is retrieved, enabling seamless heterogeneous operation of relational and XML data.

In both SQL views in Figure 1, the LEFT OUTER JOIN operation preserves the left data argument even when there is no matching right argument, meaning that a parent node can exist without a child node, but a child node can not exist without a parent node. Notice in Figure 1 that you can define single paths by following a path of nodes (the RDB view) or define multiple paths by joining multiple nodes to the same parent node (the XML view).

As you can see from Figure 1, if a relational RDB hierarchical path occurrence has no existing R node then nodes D and B are never linked. On the other hand, if an R and D node exists than a partial path is possible—so variable length paths are possible in the relational view. In the XML view, if an X node does not exist then node M and L are never linked, but if X and M nodes exists than one path can exist while the other does not.

Heterogeneous Views
The SQL below hierarchically joins the two hierarchical SQL views from Figure 1 to create a unified heterogeneous logical view called "RDBXML." SQL view JOIN operations are not performed until the view is invoked in real time. That invocation transparently performs the relational/XML data integration seamlessly and automatically. 

The preceding code places the relational RDB view structure hierarchically over the XML view structure and connects them using a single LEFT OUTER JOIN operation. This is possible because entire hierarchical views are being joined together, not just joined one single node at a time as in Figure 1. The technique provides incredible power. Figure 2 shows the result structure and its expanded SQL.

Joining Hierarchical Structures Hierarchically
Figure 2. Expanded RDBXML View and Structure: The SQL shown treats the the logical and physical structures exactly the same.
The heterogeneous view in Figure 2 automatically generates the SQL LEFT OUTER JOIN sequence that defines its combined XML and RDB structures shown in Figure 2 when invoked. This causes their stored RDB and XML views to be expanded into a single unified LEFT OUTER JOIN sequence. Notice how the lower level XML view is nested under the upper relational RDB view in its expanded SQL view expansion in Figure 2. In particular, notice the automatic placement of the last ON clause pushed to the end of the multiple LEFT OUTER JOIN view expansion. This ON clause placement causes the lower view structure to be fully defined and materialized (built) before encountering the final ON clause which joins the fully formed XML lower structure to the already materialized upper relational view.

The above delayed structure joining happens whether the lower structure is logical or physical. In the case of a lower level physical structure, this is exactly what is required to naturally process fixed physical structures which naturally operate as fully materialized structures. This allows logical and physical structures to be processed exactly the same. This is performed automatically by ANSI SQL.

Figure 2 demonstrates an expansion of the XML and RDB views into a unified logical view using the LEFT OUTER JOIN command, preserving the combined semantics of both views. Typically, with both XML and relational processing, the integration solution is to flatten the hierarchical data to resemble relational data—losing valuable hierarchical semantics. As you can see from Figure 2, this solution does the opposite.

Elevating the relational data to a hierarchical level increases accuracy and opens up many new capabilities. For example, the elevation makes the relational/XML data integration seamless and increases the semantics used in the SQL query. This in turn makes the query more powerful because it can automatically handle multi-path queries. This increases the value of the user's data by taking advantage of additional semantic data that exists between the different pathways of the accessed structure, and greatly increases the number of different queries possible from the same multi-path structure.

Close Icon
Thanks for your registration, follow us on our social networks to keep up-to-date