Store and Retrieve XML-based Business Objects with SQL Server and .NET (Part II)

his is Part II of an article series that describes how to pass data into and out of SQL Server not as raw data or as DataSets, but as objects that represent business entities. Getting data out of SQL Server and into an object, a process described in detail in Part I of the article, is fairly straightforward, as long as you can write a query that returns an XML document with the appropriate schema. This technique, however, wouldn’t be nearly as useful without the ability to pass objects into SQL Server as well. This article describes how to the reverse the extraction process?how to submit an Order object directly to SQL Server (the definition of the Order class that represents orders was explained in Part I of this article). In this discussion, the presentation tier populates the properties of an Order object, which the code serializes into XML using a middle tier component and then submits to a stored procedure. You don’t need all the members of the Order class to commit new orders to the database. Obviously, you don’t need to set the customer’s name or the product names in a new order?all you need are the IDs. Even if the presentation tier sets the names, the stored procedure ignores them and uses the IDs instead. You can design a new class to submit new orders to the database, but there’s no compelling reason to do so.

This technique arose from an application that had to record orders with many detail lines. The orders had to be inserted into the database within a transaction, so there were only two options:

  • Initiate a transaction on a Connection object and then submit each row to the database.
  • Create a stored procedure that accepts order values as arguments and performs the transaction right on the server.

It’s generally better to use the stored procedure option to perform the transaction against the database rather than initiate a transaction on a Connection object. When possible, you should isolate transactions to the data tier, because it’s the DBMS that actually performs the transactions. Of course, this isn’t possible when you work with multiple databases, but most small business applications don’t perform transactions against multiple databases.

Having decided to use a stored procedure, you now have the problem of figuring out how to pass a large (and/or unknown) number of arguments to a stored procedure. In this case, the order’s header consists of a fixed number of fields, but the order body may contain any number of detail lines. One approach is to create a delimited string containing the order’s fields, pass that to the stored procedure, and write T-SQL code to parse the string, but that’s inefficient at both the presentation tier (you have to build a long string) and at the stored procedure level (you have to write T-SQL code to parse the string and extract the individual fields). Using SQL Server’s support for XML you can create an XML document containing the data that represents the order, pass the generated document as an argument to a stored procedure, and then use the built-in XML parser to process the XML data within the stored procedure’s code.

With .NET, converting an instance of a class that represents an order into an XML document is almost trivial, thanks to the XmlSerializer class. Although certainly not in common use (yet) you can use this interesting technique? to pass an unknown number of arguments to a stored procedure?and it works regardless of whether you use business objects in your presentation tier.

Processing XML Data with T-SQL
To implement the technique, you must first write a stored procedure that accepts as argument an XML document resenting an order and commits the order to the database. The stored procedure isn’t difficult to write, if you know how to handle an XML document with T-SQL. There are basically two keywords you need to know, the sp_xml_prepareddocument built-in stored procedure and the OpenXML function. If you’re familiar with SQL Server’s XML extensions, you already know how to commit a serialized order to the database.

The sp_xml_prepareddocument built-in stored procedure loads an XML document into memory. After the document has been laid out in memory, you can use the OpenXML function to retrieve specific elements from the document using XPath syntax. You need to create a stored procedure that accepts a serialized Order object argument and inserts the order’s data into the proper tables. This example uses the Orders class developed in the Part I of this article to read data out of SQL Server and into an instance of the Orders class. The stored procedure’s name is SaveXMLOrder. I’ll explain how to parse and process the XML data with T-SQL as we go along. The stored procedure accepts a single argument, of type nText. Here’s a step-by-step breakdown of the stored procedure, starting with the declaration.

   CREATE PROCEDURE SaveXMLOrder   @Order nText   AS   ...

The sp_xml_prepareddocument stored procedure accepts two arguments: an Integer variable, where it will store the handle to the in-memory representation of the parsed document, and the string with the XML data. Declare the handler to the document and call the built-in stored procedure as shown here:

   ...   DECLARE @XMLHandler int   EXEC sp_xml_preparedocument @XMLHandler output, @Order   IF @@Error<>0   BEGIN      RETURN(--1)   END   ...

The next step is to read specific elements and/or attributes of the XML document and insert the appropriate rows into the Orders and Order Details tables. This must take place in the context of a transaction, of course. First you must extract the order’s header (the customer’s ID, the order’s date and freight) from the XML document. To do so, use a SELECT statement, as usual, with one difference: in the statement’s FROM clause, you call the OpenXML function. The OpenXML function accepts as arguments a handler to an XML document and an XPath expression that retrieves the desired nodes from the XML document (instead of one or more tables, as you would do with a regular SQL statement). XPath is to hierarchical (XML) data what SQL is to relational data. It’s a language for retrieving the nodes of an XML document, whereas SQL retrieves columns. The following SELECT statement returns the order’s header:

   ...   SELECT       CustomerID, CAST(OrderDate As datetime),    CAST(Freight AS money)   FROM       OpenXML(@XMLHandler,'/Orders', 1)   WITH (CustomerID nchar(5) '@CustomerID',    OrderDate varchar(10) '@OrderDate',    Freight varchar(12) '@Freight')   ...

The columns of the row(s) returned by the OpenXML function correspond to the attributes of the element(s) selected. Using the optional WITH clause you can specify the description of the result set to be generated and how the processor should map XML nodes to the result set’s columns. After defining the XML element mapped to a row, you must decide how the element’s attributes will be mapped to columns. You do that in the WITH part of the statement. You specify each column’s name and type, as well as the name of the attribute to be mapped to the specific column. When the attributes have the same names as the matching columns, you can omit the names of the attributes and write the WITH clause as follows:

   ...   WITH    (CustomerID nchar(5), OrderDate varchar(10), Freight varchar(12))   ...

Now that you’ve selected the XML nodes containing the header information, you can insert a row into the Orders table using the following INSERT INTO statement:

   ...   INSERT INTO Orders (CustomerID , OrderDate)       SELECT    CustomerID, CAST(OrderDate As datetime)      FROM       OpenXML(@XMLHandler,'/Orders', 1)                WITH    (CustomerID nchar(5) , OrderDate            varchar(10))           IF @@Error<>0              BEGIN                ROLLBACK TRANSACTION                RETURN(--2)            END   ...

If the argument passed to the stored procedure doesn’t have the correct structure, or if the new row can’t be inserted into the Orders table for any reason, the stored procedure aborts the transaction and returns the value ?2 to indicate that it failed to insert the header.

After adding the order’s header to the Orders table, the procedure extracts the ID assigned to the new order by SQL Server and stores it to a local variable, the @OrderID variable. Next, it adds the detail lines to the Order Details table. The process is the same: use an INSERT INTO statement that retrieves the columns of all the detail lines from the XML document with a single SELECT statement and insert them into the Orders table. This time we pass the Order/Detail XPath expression to retrieve the Detail elements as the argument to the OpenXML function. The INSERT INTO statement inserts all the rows returned by that expression into the Order Details table. Notice that the procedure sets the value of the OrderID column to @OrderID for all inserted rows. Here’s the INSERT statement that inserts the detail lines:

   ...   INSERT INTO [Order Details]       (OrderID, ProductID, UnitPrice, Quantity, Discount)   SELECT @OrderID,ProductID, ProductPrice,       ProductQTY, Discount    FROM       OpenXML(@XMLHandler,'/Orders/Detail')      WITH (ProductID int '@ProductID',         ProductPrice money '@Price',         ProductQTY smallint '@Quantity',         Discount real '@Discount')   ...

After inserting the detail lines you can commit the transaction and return the new order’s ID. If any operation fails, the transaction is rolled back and the stored procedure returns a negative value.

Listing 1 contains the complete listing of the SaveXMLOrder stored procedure. Notice that it returns a negative value to indicate a failure and that different conditions return different values. If all goes well and the database commits the transaction successfully, the stored procedure returns a positive numeric value?the ID of the new order.

The AddOrder Method
The AddOrder method comprises the middle tier method that commits the order to the database (see Listing 2). The code is straightforward: first it serializes the newOrder object passed to the method as argument into a string variable, the XMLOrder variable, and then calls the SaveXMLOrder stored procedure passing this string as argument.

   Public Shared Function AddOrder( _      ByVal newOrder As Order) As Integer      Dim serializer As XmlSerializer      serializer = New XmlSerializer(GetType(Order))      ' Serialize the order's details on a memory stream,       ' then read the stream into a string variable and      ' pass the variable as argument to the SaveOrder       ' stored procedure      Dim MS As MemoryStream      Try         MS = New MemoryStream         serializer.Serialize(MS, newOrder)      Catch exc As Exception         MS.Close()         Return (-1)      End Try      MS.Close()      Dim XMLOrder As String = _         System.Text.Encoding.UTF7.GetString(MS.ToArray)      Dim CMD As New SqlClient.SqlCommand      CMD.CommandText = "SaveXMLOrder"      CMD.CommandType = CommandType.StoredProcedure      CMD.Parameters.Add( _        New SqlClient.SqlParameter("@order", XMLOrder))      Dim CN As New SqlClient.SqlConnection      CN.ConnectionString = "your connection string"      ' for example: "data source=(local);initial " & _      '    "catalog=Northwind;uid=sa;pwd="      CMD.Connection = CN      CN.Open()      Dim newOrderID As Integer      Try         newOrderID = CInt(CMD.ExecuteScalar)      Catch ex As Exception         Return -1      Finally         CN.Close()      End Try      Return newOrderID   End Function

The AddOrder method is almost trivial; it serializes the newOrder object and passes the resulting XML document as argument to the SaveXMLOrder stored procedure, which is responsible for parsing the XML document and committing the entire order to the database in transactional mode. The code at the presentation tier and the middle tier need not be concerned with transactions.

The sample application’s New Order button creates a random new order and submits it to the database by calling the AddOrder method of the OrderClass class. The method returns the ID of the new order and the code at the presentation tier retrieves the new order from the database with the GetOrder method and displays its details on the ListView control. It actually calls the Show Order button’s code to retrieve the new order. Here’s the code that creates a random order and submits it to the database with the AddOrder method:

   Dim newOrder As New OrderClass.Orders   newOrder.OrderDate = Date.Today   newOrder.CustomerID = custID   newOrder.Freight =       10 + CDec((rnd.NextDouble * 15000) / 100)   Dim lines As Integer = Rnd.Next(2, 20)   Dim details(lines - 1) As OrderClass.OrdersProducts   Dim dtl As Integer   For dtl = 0 To lines - 1      details(dtl) = New OrderClass.OrdersProducts      details(dtl).ProductID = rnd.Next(1, 72)      details(dtl).Price = OrderClass.GetItemPrice _         (details(dtl).ProductID)      details(dtl).Quantity = rnd.Next(3, 50)      details(dtl).Discount = _         CDec(rnd.Next(0, 30) / 100)   Next   newOrder.Products = details   Dim ID As Integer = OrderClass.AddOrder(newOrder)

The code also keeps track of each order’s running total and updates the order’s header after displaying all the detail lines. I won’t explain the code in detail, as you will most likely use a different interface to display the orders. However, you won’t have to work with ADO.NET objects at the presentation tier. GetItemPrice is another method of the middle tier component. It returns the price of the product specified by its ID as argument.

Efficiency Considerations
XML serialization is an efficient process, except for the first time you initialize the XmlFormatter object. After that, the serialization of a custom object to XML is extremely fast, and the overhead is usually insignificant. The big advantage of this method is that you can commit the entire order within a stored procedure?executed right at the database. Submitting an order using the DataAdapter’s Update method is equivalent to executing a separate INSERT statement for each detail line. For a large order this may result in hundreds of trips to the database. You can avoid the unnecessary trips by preparing a stored procedure that acts on the nodes of an XML document passed to the stored procedure as argument.

The sample code includes a VB.NET application that creates random orders and submits them to the Northwind database using ADO.NET objects (see the IdentityTransactions project?included in the sample code). Figure 1 shows the application’s main form. The Add Order button creates a random order (an order with between 3 and 20 detail lines containing random product IDs, with random quantities). The Show New Orders button displays the new order in the Output window. Finally, the Commit Order button submits the new order to the database. The program also allows you to create multiple orders and submit them to the database at once. The code submits each order in the context of a separate transaction, of course.

Figure 1: The IdentityTransactions project demonstrates how to perform transactional updates with DataAdapters.

The IdentityTransactions project uses a DataSet with two DataTables, the Orders and Order Details DataTables, and populates them with random data. Then it initiates a transaction on a Connection object and calls first the Update method of the DataAdapter that corresponds to the Orders table to insert the order’s header, and then the Update method of the DataAdapter that corresponds to the Order Details table to insert the order’s details. I will not discuss the project’s code here, but you can open it with Visual Studio and read the comments throughout the code. Listing 3 shows the code of the Commit Order button, which submits the order created by the Add Order button.

I timed the two approaches, hoping that overhead of the XML serialization process wouldn’t introduce a substantial performance penalty, and was surprised to find that the XML-based approach is consistently and considerably, faster. Table 1 shows the average time (in milliseconds) to commit each detail line of an order. The first column is the number of detail lines in the order and the other two columns show the time it took to commit each detail line (it’s the time taken to commit the entire order divided by the number of detail lines in the order). To time the operation of committing an order to the database I created 500 random orders with the same number of detail lines and then took the average. The Orders and Order Details tables of the Northwind database grew quite large during the tests, so the measurements weren’t computed with a (practically) empty database.

Table 1: The average time (in milliseconds) required to commit orders to the Northwind database.

Number of detail lines in order Time per detail line (ms)
DataAdapter OpenXML

5

4.07

7.58

10

3.76

4.54

25

3.20

2.42

50

2.93

1.41

100

2.63

1.27

500

2.58

0.73

1000

2.74

0.49

1500

3.61

0.54

You won’t be able to create large orders with the Northwind database, because the Products table has only 72 products and the order’s ID along with the product ID form a unique key in the Order Details table (in other words, you can’t have an order with two detail lines that refer to the same product). I had to create random products and random customers to make even a moderately sized database. As you can see, the XML-based approach performs better as the number of detail lines in the order increases, and here’s why.

The first order involves substantial overhead: a temporary assembly is created (it’s required by the XmlSerializer class) and the application establishes a connection to the database. Committing the first order to the database takes a few seconds. After that, orders are created and committed to the database within a few dozen milliseconds, depending on the number of detail lines they contain. Using the DataAdapter’s Update method takes twice as long for orders with up to 100 detail lines, and up to seven times as long for very large orders. I’ve used a low-end server to time the operations (a Raid 5 system with a single 1 GHz processor and 1 GB of RAM), so the actual times are meaningless, only their ratio matters.

The XML-based approach performs better with larger orders and, for very large orders, it can commit each detail line in a few tenths of a millisecond. This is very impressive and in retrospect it shouldn’t surprise us. The stored procedure can insert hundreds of detail lines with a single INSERT statement, while any other technique that uses ADO.NET objects will execute an INSERT statement for every detail row. The advantage of performing the transaction from within a stored procedure more than offsets the overhead of XML serialization.

Changing Views of Data
To sum up, the XML-based approach is both faster and more elegant than any strictly ADO.NET approach based on DataSets and DataAdapters. A DataSet isn’t the most accurate representation of a business object (an order in our case). The Orders class, on the other hand, is a very accurate model of an order. It contains all the information about the order and you don’t have to worry about relating an order header to its associated order details. The order’s ID becomes irrelevant, as it should, because it’s a meaningless value generated by the database whose sole purpose is to establish a relation between the Orders and Order Details tables. At the presentation tier, it’s much more intuitive, convenient, and robust to work with an Order object than a DataSet with two related tables. The fact that it’s also faster is a pleasant bonus.

In the days of COM+ developers’ efforts in the middle tier centered around dismantling Recordsets into business objects and passing collections of objects between layers. One positive side effect of using business objects to pass information between layers is that developers have to design the business objects and implement methods that expose the required functionality. Designing functional business objects requires a thorough understanding of the business processes modeled in the code. Those efforts resulted in applications with a solid architecture and distinct layers of functionality. But architectural issues shouldn’t depend on the data access mechanism. Regardless of whether applications pass data from the data tier to the middle tier as Recordsets or DataSets, the presentation tiers code should manipulate objects that represent specific business entities as discussed in this article series.

Of course, not all applications will benefit from this approach, and no tool will address every requirement. The approach suggested here works nicely when it comes to submitting updates that involve parent/child tables, especially when you’re inserting (or updating) hundreds of rows.

As XML becomes the glue between all of the technologies that make up the .NET Framework, I expect to see major changes in many of the technologies we use today. It’s likely that with a future version of SQL Server will be able to define custom classes at the level of SQL Server and populate instances of these classes with T-SQL statements. Today, support for XML was added to SQL Server as an afterthought. If XML is the best method of exchanging data between databases, why shouldn’t it also be the main mechanism for moving data in and out of a database? XPath and XQuery (XML query languages) are already close to SQL and may well become alternatives for it. After all, SQL was originally conceived and designed more than three decades ago. In my opinion, our view of databases will soon change from a row/column organization (relational view) to collections of business objects (hierarchical view). Until this happens, you can take advantage of XML serialization to transform relational data into hierarchies of business objects in middle tier components and provide presentation tier developers with a better, more abstract view of the database. I’m not suggesting going back to hierarchical databases, but it’s very convenient to be able to determine the format in which the database will return data. That includes defining custom formats that can be easily mapped to business objects.

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

Overview

Recent Articles: