Performing Hierarchical Restructuring Using ANSI SQL

Discover the flexibility and power of using ANSI SQL to reshape and transform hierarchical structures.

This is the third in a series of DevX articles on the new hierarchical XML processing capabilities made possible by navigationless hierarchical database processing. The first article covered the basics of processing full multipath hierarchical data queries. The second discussed combining hierarchical structures into larger hierarchical structures and demonstrated a new, powerful, and unlimited way to mash up hierarchical structures. This article delves further into the topic, demonstrating advanced multipath (nonlinear) hierarchical structure transformations. As in the previous articles, the processing follows hierarchical principles to derive correct hierarchical results.

Today, the terms "restructuring" and "reshaping" are used interchangeably for XML structure transformation processes. However, these two basic types of XML hierarchical structure transformation need to be distinguished, because they differ in meaning, results, and use.

  • Restructuring is controlled by existing relationships in the data, while reshaping is controlled by the semantics of the current data structure.
  • Restructuring is performed by using relationships in the data that the current structure does not already use, while reshaping uses the semantics of the current structure to mold the structure into some other shape.
  • Restructuring (using data relationships) can create a new structure and data with new semantics, while reshaping (using structure semantics) alters the structure without changing the data and its semantics.
Restructuring and reshaping both have their uses. You usually use restructuring to match a structure to its use in an application, and you usually use reshaping to map a structure to some desired format.

This article shows restructuring examples in SQL, which can perform the required operations without navigation or looping constructs. Specifying these SQL statements does require some thought, as you might expect for transformations. The results are correct because all operations are performed hierarchically by strictly following the hierarchical relationships in the SQL-specified data structure. SQL's hierarchical structure operation helps considerably in specifying the transformations and simplifies designing complex transformations without introducing errors. The techniques and principles used here for restructuring are equally valid for the navigational and procedural transformations used in other XML processors, but the full nonlinear transformations shown in this article may be too difficult to perform procedurally.

The SQL statements shown with the restructuring examples in this article can be executed using an ANSI SQL Transparent XML Hierarchical Processor prototype. The SQL examples can be modified. You can find the hierarchical processor prototype and directions for its use at http://www.adatinc.com/prototype.html.

Restructuring Using Data Relationships

Figure 1. The StoreView View: The figure shows the deconstruction of the StoreView view into the two Cust and Emp sub-view fragments.
The example in Figure 1 uses a StoreView view, which joins the circled Cust and Emp sub-views to create a complex hierarchical structure that can be a logical relational structure, a physical XML structure, or a heterogeneous combination of the two. The example transforms this contiguous structure by isolating and manipulating structure fragments. Note that logical and physical structures appear the same in SQL working sets. A fragment is a connected subset of nodes from a hierarchical structure (which can contain multiple pathways), starting at or below the root. SQL handles this naturally and automatically. In other words, data field selection using the SELECT list non-procedurally specifies which nodes are to be selected for output from the query; all other required data fields are automatically referenced as needed. The SV1 fragment is hierarchically LEFT JOINed over the SV2 fragment using this SQL:

SELECT SV1.EmpID, SV1.DpndID, SV2.CustID, SV2.InvID, SV2.AddrID
FROM StoreView SV1
LEFT JOIN StoreView SV2 ON SV1.EmpCustID = SV2.CustID
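The self-join can also be tried outside the prototype with a flat relational stand-in. The following is only a minimal sketch using Python's built-in sqlite3 module. The base tables, the Store root, the foreign-key columns, and the sample rows are all assumptions made for illustration, since the actual layout of StoreView is defined by Figure 1 and the prototype. Because SQLite returns flat rows rather than a hierarchical result, the sketch adds DISTINCT to drop the replicated rows that a hierarchical processor would not produce.

```python
import sqlite3

# Hypothetical flat stand-in for StoreView. Table layout, the Store root,
# and the sample rows are assumptions, not taken from the article.
SETUP = """
CREATE TABLE Store (StoreID TEXT);
CREATE TABLE Emp   (EmpID TEXT, StoreID TEXT, EmpCustID TEXT);
CREATE TABLE Dpnd  (DpndID TEXT, EmpID TEXT);
CREATE TABLE Cust  (CustID TEXT, StoreID TEXT);
CREATE TABLE Inv   (InvID TEXT, CustID TEXT);
CREATE TABLE Addr  (AddrID TEXT, CustID TEXT);

INSERT INTO Store VALUES ('S1');
INSERT INTO Emp   VALUES ('E1', 'S1', 'C1'), ('E2', 'S1', NULL);
INSERT INTO Dpnd  VALUES ('D1', 'E1');
INSERT INTO Cust  VALUES ('C1', 'S1'), ('C2', 'S1');
INSERT INTO Inv   VALUES ('I1', 'C1');
INSERT INTO Addr  VALUES ('A1', 'C1');

-- Flattened view: Emp and Cust are sibling branches under the assumed Store root.
CREATE VIEW StoreView AS
SELECT s.StoreID   AS StoreID,
       e.EmpID     AS EmpID,
       e.EmpCustID AS EmpCustID,
       d.DpndID    AS DpndID,
       c.CustID    AS CustID,
       i.InvID     AS InvID,
       a.AddrID    AS AddrID
FROM Store s
LEFT JOIN Emp  e ON e.StoreID = s.StoreID
LEFT JOIN Dpnd d ON d.EmpID   = e.EmpID
LEFT JOIN Cust c ON c.StoreID = s.StoreID
LEFT JOIN Inv  i ON i.CustID  = c.CustID
LEFT JOIN Addr a ON a.CustID  = c.CustID;
"""

# The article's query, with DISTINCT and ORDER BY added for the flat emulation.
QUERY = """
SELECT DISTINCT SV1.EmpID, SV1.DpndID, SV2.CustID, SV2.InvID, SV2.AddrID
FROM StoreView SV1
LEFT JOIN StoreView SV2 ON SV1.EmpCustID = SV2.CustID
ORDER BY SV1.EmpID
"""


def run_self_join():
    """Build the in-memory database and return the rows of the self-join."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


rows = run_self_join()
for row in rows:
    print(row)
```

In this sample data, employee E1 is also customer C1, so the Emp fragment (SV1) picks up that customer's invoice and address from the Cust fragment (SV2). Employee E2 has no matching customer, so the LEFT JOIN keeps the employee and fills the SV2 columns with NULL.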

Node promotion, a standard hierarchical operation, causes the structure fragments to become contiguous, letting you manipulate and join them easily into the structure being constructed using existing relationship values—all at a high hierarchical conceptual level new to SQL/XML processing. SQL's relational processing performs the node promotion automatically because that is the way data selection (relational projection) also works.
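Node promotion can be pictured as projecting a tree: unselected nodes are sliced out, and their selected descendants move up to the nearest selected ancestor. The sketch below illustrates the idea in plain Python on a hypothetical tree. The tuple representation, the `promote` function, and the assumed shape of StoreView (a Store root over Emp and Cust branches) are illustrative only and do not describe how the processor is implemented.

```python
# A node is a (name, children) tuple. The shape below is an assumed
# stand-in for StoreView, not a copy of the article's Figure 1.
store_view = (
    "Store",
    [
        ("Emp", [("Dpnd", [])]),
        ("Cust", [("Inv", []), ("Addr", [])]),
    ],
)


def promote(node, selected):
    """Project a tree onto the selected node names.

    Returns the list of subtrees that survive at this node's position:
    a selected node keeps its place, while an unselected node is sliced
    out and its surviving descendants are promoted in its place.
    """
    name, children = node
    kept = []
    for child in children:
        kept.extend(promote(child, selected))
    if name in selected:
        return [(name, kept)]
    return kept


# Emp and Cust are not selected, so Dpnd and Inv are promoted under Store.
promoted = promote(store_view, {"Store", "Dpnd", "Inv"})
print(promoted)

# Selecting only the Emp fragment leaves a contiguous Emp -> Dpnd structure.
emp_fragment = promote(store_view, {"Emp", "Dpnd"})
print(emp_fragment)
```

The result stays a connected hierarchy in both cases, which is the property that lets fragments be joined as if they had always been contiguous.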

Figure 2. StoreView Decomposition and Transformation: The figure shows the process of selecting specific items from two fragments (identified as SV1 and SV2) to create a transformed result structure.
Figure 1 also shows the high hierarchical conceptual level of SQL processing. The separate fragments SV1 and SV2 (dotted circles) are expressed in the SQL statement above using the correlation names "SV1" and "SV2" as prefixes. Note how the prefixes are defined in the FROM and JOIN clauses and used in the SELECT list; the other examples use them the same way. This applies SQL renaming/aliasing at the table and sub-view level, allowing fragments of the same structure to be isolated and referenced separately by their assigned prefixes. Figure 2 shows the selection process and the resulting structure.

Note that Figure 2 uses several symbols to describe the hierarchical processing. These visual conventions are used throughout this article:

  • A solid box indicates a selected node.
  • A dashed box indicates an unselected node that is sliced out of the query returned result.
  • A solid line connects nodes in the active structure.
  • A dashed line connects nodes not in the active structure.
  • A dashed arrow indicates the data points of the join relationship. When no dashed arrow is shown, the solid arrow indicates these join data points in addition to its other uses.
  • A solid arrow represents the data modeling structure node linkage between the structures being joined. This is used to complete the unified data structure of all the structures joined, and represents the structure semantics that control query processing.
The solid arrow that represents the data modeling structure linkage between the structures being joined usually also specifies the ON clause data relationship linkage points. For reasons explained later, the ON clause linkage points may differ from those indicated by the solid data modeling arrow, in which case a dashed arrow indicates the data relationship linkage points.

An ANSI SQL Transparent Hierarchical XML Processor prototype can fetch the restructuring example SQL statements in this article, and execute them in real time. You can download this hierarchical processor prototype along with directions for using it. Feel free to modify the SQL examples to test their effects on the processing.
