DO.NET has a strong and powerful disconnected model that lets programmers build a Web of in-memory objects and relate contents coming from different tables and even from different data sources. It’s important to think through your code strategy carefully when processing queries and updates on interrelated tables to preserve scalability and maintain high performance. Sometimes, it’s much more effective to split compound queries; sometimes it isn’t. Submitting changes in batch mode, exploiting the DataSet and its disconnected model, may be the perfect solution. However, when you need to move dozens of megabytes, the DataSet’s serialization mechanism increase the size of the data considerably. In ADO.NET, the magic word is “tradeoffs.”
All data-driven applications look alike. They are built around a few basic tasks?fetch data across different tables, join columns, and filter data based on some user-specific criteria. Data-driven applications must also format data in such a way that end-users can easily and pleasantly consume it. More often than not, data is made of cross-related tables in which parent/child relationships are in place to define the data model and to extract a specific subset of information like detail views and subtotals.
So can we conclude that all applications that deal with data are the same and can be built using a common pattern? Of course not. Other aspects, including scalability and performance, affect and influence the design.
All distributed applications must be scalable to some extent. If a Web application is made open to anonymous users?for example, a shopping site?providing a good, if not optimal, level of scalability is a sort of survival test. Scalability is related to performance, but it can’t be identified exclusively with that. Likewise, performance affects scalability but in no case are they synonyms.
|
In general, caching data does not seem to be an optimal choice for all Web applications because it consumes valuable resources on a per-user or per-application basis. As a result, having less memory available to the Web server might lead to performance degradation. On the other hand, if you’re developing an application that needs to be both fast and scalable, you should carefully consider caching data at some level to reduce the workload on middle-tier components and the bottleneck of database connections. The more tiered your system is, the more options you have to scale it out as long as you can move disconnected caches of data across the tiers.
In the first part of this article, I examined a few recommended approaches for connections and transactions management in ADO.NET applications. In this second installment, I’ll tackle SQL-level programming and review a few scenarios in which the quality and the structure of SQL statements can make a world of difference between applications.
Untangle the Database Ball of Wool
Even the simplest Web application can have a rather complex and interrelated structure of databases. While a database with dozens of related tables, views, constraints, and triggers is perhaps not what you deal with every day, I’ve never seen a real-world application with less than a handful of related tables.
The problem with a grid of databases is with both queries and updates. End users should be able to access and filter related rows, but also to update and delete rows in a way that does not violate the consistency and integrity of the database.
Some databases resort to server-side constraints to block undesired changes. The fact that most database administrators use this strategy to stay on the safe side is not a good reason to avoid code that could potentially corrupt the database. Ideally, client applications should submit changes that respect the database structure and its internal update rules (for example, relationships and constraints).
ADO.NET has a strong and powerful disconnected model that allows applications to work with sort of in-memory databases. The DataSet object governs the storage of multiple tables and it enables filtering, sorting, constraints, and cascading changes. This model, though, seems to fit more easily for updates than for queries. For compound queries, the hurdle is the way in which an SQL result set is adapted to fit into a DataSet and whether cached data is effective to the application. Let’s review a couple of solutions for compound queries.
Conducting Multiple Queries
In light of the ADO.NET capabilities, you can implement multiple queries that span across multiple tables in at least two ways. The old-fashioned way, so to speak, is typical of ADO and any former data-access technology. It is based on the old faithful SQL JOIN command and returns a single, monolithic data structure that is quite easy to filter, but hard to update. In addition, the final result set is populated with redundant data.
|
The alternative approach, touted by the ADO.NET disconnected model, exploits the capabilities of the DataSet object?that is, a multi-table cache of data. The idea is that you split the original query into two or more distinct queries and gather the results in separate tables held in memory within the same DataSet object. The queries can be run individually or grouped in a stored procedure or in a batch. Using split queries results in a much more compact data set that can be easily managed via the programming interface of the DataSet and DataRelation objects.
Once the result set is stored into a DataSet, it supports cascading changes and batch updates. Furthermore, no synchronization code is required to retrieve the children of a parent row because automatic relationships can be declared. On the downside, just relationships make applying filters to the records not particularly easy.
Suppose you need to obtain all orders issued to a given country. The query can be written like this:
SELECT c.customerid, c.companyname, c.city, o.orderid, o.orderdate, od.productid FROM customers c INNER JOIN orders o ON c.customerid=o.customerid INNER JOIN [order details] od ON o.orderid=od.orderid WHERE c.country = 'USA' ORDER BY o.customerid
The query spans three Northwind tables?Orders, Order Details, and Customers. It retrieves all orders issued to a customer resident in the US. Viewed through the Microsoft SQL Server Query Analyzer, the result set looks like Figure 1.
![]() |
Splitting Monolithic Queries Suppose that you are not allowed to use multiple queries in your application. This has nothing to do with a bad opinion you may have of the solution. It is just that your DBA strongly disagrees with any suggested change in the set of stored procedures. You would like to manage data using comfortable ADO.NET objects, but your DBA will only let you employ old-style stored procedures to get them. How can you load the result sets into ADO.NET disconnected objects?
Let’s consider the following more complex query. It retrieves, grouped by customer, all orders issued in a given year. The query won’t retrieve all the orders, though, but only those with at least 30 items. In addition, the name of the employee who processed the order and the name of the company that issued it are inserted into the result set.
Listing 1 demonstrates how to split the results of the query above. The main procedure?SplitData?gets the DataSet originated by the query and processes it. The input DataSet contains just one table. The method in Listing 1 makes a copy of the DataSet and then creates two new tables?Employees and Customers. The SplitData method fills these two tables up using data from the monolithic table. Finally, the method obtains the third table?Order Details?and removes processed columns from the original table. As a result, you run a traditional stored procedure, pay the price of redundant data being moved across the wire, but gain in flexibility because you can handle disconnected data in distinct and possibly related tables. What’s the big advantage of this approach? You can now make batch updates. Conducting Cross-Table Updates When related tables are involved, changes flow from the parent to the children in different ways, depending on the type of the change you’re attempting?update, delete, or insert. In general, there are two ways of conducting table updates?through a direct connection or batch update. In the case of a direct connection, you typically open a connection and execute the stored procedure or the SQL batch with all the logic and data you need to apply. In the case of a batch update, you typically yield to a middle-tier component, which gets disconnected data from you and takes care of submitting data to the database, possibly in an asynchronous way. The disconnected approach in a Web scenario requires you to serialize the data that you want to submit from the Web server layer down to an internal tier, which holds a physical connection to the database. The availability of an easy-to-serialize object such as the DataSet and the powerful batch update mechanism of data adapters give you the tools to build highly scalable applications. The batch update process passes all in-memory changes recorded in a DataSet table to the back-end system for permanent storage. In ADO.NET, you don’t submit your blocks of data being sent to the database in a single shot. Your ADO.NET batch update will execute individual statements on the target system, one for each change that needs to be submitted. Batch update opens one connection but still performs multiple statements in a predefined order. To run a batch update, you don’t have to do much more than call the Update method of the data adapter class. This simple mechanism, though, won’t work at all with inter-related, hierarchical data. When you deal with hierarchical data, you need client-side cascading changes and custom update logic to build effective applications. ADO.NET supports two types of constraints?unique and foreign-key constraints. A unique constraint is a restriction on a data column in which all values must be unique. A foreign-key constraint set on a DataTable restricts the action performed when you try to delete or update a value in a column. For example, deleting a value from the parent table can affect rows in the child table in various ways. Often you want to delete child rows in the child table. However, in some cases, you’ll have a cascading action set the child rows to null or default values. You can programmatically set the action that works best for using the programming interface of the ForeignKeyConstraint class. What really matters, though, is that ADO.NET provides you with the tools to configure the runtime environment to automatically cascade changes from parent to child rows. Such a mechanism allows you to work on a DataSet object, enter changes, and (if needed), have them propagated through the tree. When you have gone through all the changes, ADO.NET guarantees that all the tables in the DataSet have been properly updated. If a change cannot be cascaded, or simply contains invalid values that would violate any of the relationships, an exception would be thrown. After ADO.NET has updated your tables, you just take the resulting DataSet and apply the logic that best suits your updates. For example, depending on the relationships set between your tables, you may need to process child deleted rows before parent insertions and child updates after parent updates. You can easily extract the subset of rows to process using the Select method on the DataTable object. Whatever the right order is to enter changes to your application, you have the tools to get ready-to-process rows. Retrieving Server-Generated Values The UpdatedRowSource property looks for output parameters, the first row of the next query, both, or neither. By default, it looks for both. The following SQL batch inserts a new row into a database where custID is an identity column. The second statement retrieves the last identity value generated in the scope and returns it as the updated value of the custID column.
If you assign the above statements to the InsertCommand property of a data adapter, the UpdatedRowSource property will catch the newly generated identity value and pass it to the DataRow object being updated. If you set the UpdatedRowSource property to Both (the default) or FirstReturnedRecord, each inserted row processed through the batch update will have the identity column updated. And, more importantly, you don’t have to write any additional code. You only need to select a checkbox in the Visual Studio .NET wizard that configures a data adapter. A data adapter that is about to update a row during a batch update process fires the RowUpdating and RowUpdated events, respectively, before and after the update. You handle the RowUpdated event as follows:
The event handler gets a data structure of type SqlRowUpdatedEventArgs in which the Row property returns the row as updated by the underlying database. You’ll need to modify this code database. You must design effective queries and update strategies to make a scalable application. To write effective SQL code, you should know the SQL engine you’re using, and realize that they may not be equal. If you’re building a .NET application, you also need a good understanding of the ADO.NET model and its capabilities. ADO.NET is powerful but it doesn’t reduce database programming to mere point-and-click programming. ADO.NET offers a set of powerful tools but without practice and experience you may not do a lot with them. The command builder object or even the batch update process, for example, greatly simplifies the update process but you must know what they do and how they work.
Share the Post:
![]() ![]() Different Types of Data Models Explained with Examples
June 7, 2023
In the modern world, data is everything and everywhere. With so much access to technology, data has become a valuable resource for any business. Albeit a complex one. Data is ![]() ![]() Revolutionizing Search: A Glimpse Into Google’s Generative Experience
June 6, 2023
Google is revolutionizing the search experience as we know it with its latest generative experience. No longer will you be bound by the limitations of traditional keyword searching. Now, you ![]() ![]() 10 Productivity Hacks to Supercharge Your Business in 2023
June 5, 2023
Picture this: your team working seamlessly, completing tasks efficiently, and achieving goals with ease. Sounds like too good to be true? Not at all! With our productivity hacks, you can ![]() ![]() GM Creates Open Source uProtocol and Invites Automakers to Adopt It: Revolutionizing Automotive Software Development.
June 2, 2023
General Motors (GM) recently announced its entry into the Eclipse Foundation. The Eclipse Foundation is a prominent open-source software foundation. In addition, GMC announced its contribution of “uProtocol” to facilitate ![]() ![]() What is Metadata?
June 1, 2023
What is metadata? Well, It’s an odd concept to wrap your head around. Metadata is essentially the secondary layer of data that tracks details about the “regular” data. The regular ![]() ![]() What We Should Expect from Cell Phone Tech in the Near Future
May 31, 2023
The earliest cell phones included boxy designs full of buttons and antennas, and they only made calls. Needless to say, we’ve come a long way from those classic brick phones ![]() ![]() The Best Mechanical Keyboards For Programmers: Where To Find Them
May 29, 2023
When it comes to programming, a good mechanical keyboard can make all the difference. Naturally, you would want one of the best mechanical keyboards for programmers. But with so many ![]() ![]() The Digital Panopticon: Is Big Brother Always Watching Us Online?
May 26, 2023
In the age of digital transformation, the internet has become a ubiquitous part of our lives. From socializing, shopping, and learning to more sensitive activities such as banking and healthcare, ![]() ![]() Embracing Change: How AI Is Revolutionizing the Developer’s Role
May 25, 2023
The world of software development is changing drastically with the introduction of Artificial Intelligence and Machine Learning technologies. In the past, software developers were in charge of the entire development ![]() ![]() The Benefits of Using XDR Solutions
May 24, 2023
Cybercriminals constantly adapt their strategies, developing newer, more powerful, and intelligent ways to attack your network. Since security professionals must innovate as well, more conventional endpoint detection solutions have evolved ![]() ![]() How AI is Revolutionizing Fraud Detection
May 23, 2023
Artificial intelligence – commonly known as AI – means a form of technology with multiple uses. As a result, it has become extremely valuable to a number of businesses across ![]() ![]() Companies Leading AI Innovation in 2023
May 22, 2023
Artificial intelligence (AI) has been transforming industries and revolutionizing business operations. AI’s potential to enhance efficiency and productivity has become crucial to many businesses. As we move into 2023, several |