All but the simplest applications store data in a central location and access it over a network. In many scenarios, however, distributed applications cannot rely on a particular kind of network connection, in terms of either performance or reliability. In scenarios where users access their applications on mobile PCs, network connections may not be available at all. This introduces relatively complex data access scenarios with which modern applications need to cope.
The .NET Framework provides many powerful data access components, including ADO.NET (System.Data) and the SqlClient provider (System.Data.SqlClient). Many of the data features in .NET were architected specifically with distributed applications in mind.
The DataSet object is a good example. Unlike its predecessor, the Recordset object in COM-based ADO, the ADO.NET DataSet is inherently disconnected. A connection to a database is opened, data is retrieved into memory, and from that point on the connection is no longer needed and can be dropped. The connection is needed again only when data must be refreshed or stored back into the database. At that point, an existing connection may be reused, or a completely new connection may be created.
The point is that ADO.NET does not depend on the connection staying alive. You can query data into the memory of a laptop computer, completely disconnect the computer from the network, later reconnect it and re-establish a connection to the database, and the DataSet will save any changes to the database as if the connection had never been severed. In fact, ADO.NET doesn't try to keep a DataSet connected, and therefore never sees the difference between a partially and a fully connected scenario.
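To make this concrete, here is a minimal sketch of the disconnected pattern. The connection string, table, and class names are placeholders of my own, not from any particular application:

```csharp
using System.Data;
using System.Data.SqlClient;

class CustomerData
{
    // Placeholder connection string -- adjust for your environment.
    const string ConnectionString =
        "Server=(local);Database=Sales;Integrated Security=SSPI;";

    public DataSet LoadCustomers()
    {
        DataSet ds = new DataSet();
        // The adapter opens the connection, fills the DataSet,
        // and closes the connection again. Nothing stays open.
        using (SqlConnection conn = new SqlConnection(ConnectionString))
        {
            SqlDataAdapter adapter =
                new SqlDataAdapter("SELECT * FROM Customers", conn);
            adapter.Fill(ds, "Customers");
        }
        return ds; // Fully usable with no open connection.
    }

    public void SaveCustomers(DataSet ds)
    {
        // Later -- possibly on a brand-new connection -- pending changes
        // are written back as if the connection had never gone away.
        using (SqlConnection conn = new SqlConnection(ConnectionString))
        {
            SqlDataAdapter adapter =
                new SqlDataAdapter("SELECT * FROM Customers", conn);
            // The command builder generates the INSERT, UPDATE, and
            // DELETE commands the adapter needs for the round trip.
            SqlCommandBuilder builder = new SqlCommandBuilder(adapter);
            adapter.Update(ds, "Customers");
        }
    }
}
```

Note that the DataSet returned by LoadCustomers() can be modified for hours (or days) with no connection in sight; SaveCustomers() reconciles the changes whenever a connection next becomes available.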
ADO.NET is not the only component that is of importance to distributed systems. Often, it is also required that applications can take advantage of network connections other than LANs (Local Area Networks), in particular, the Internet. This enables users to run their applications in the office as well as on business trips (from hotel rooms, for instance) or in geographically separated offices. There are a number of ways the .NET Framework helps in these scenarios as well. XML Web services can be used to call objects across the Internet. The same is true for technologies such as .NET Remoting, or the upcoming Windows Communication Foundation (WCF, formerly known as Indigo).
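Exposing data over the Internet this way can be as simple as the following sketch of an ASMX Web service (the service name, namespace URI, and method are illustrative assumptions). Conveniently, a DataSet knows how to serialize itself to XML, so it can be returned from a Web method directly:

```csharp
using System.Data;
using System.Web.Services;

// Hypothetical service; the namespace URI is a placeholder.
[WebService(Namespace = "http://example.com/customers")]
public class CustomerService : WebService
{
    [WebMethod]
    public DataSet GetCustomers()
    {
        // A real implementation would fill this DataSet from the
        // database (for instance via a SqlDataAdapter). The DataSet
        // travels back to the caller as XML over HTTP/SOAP.
        DataSet ds = new DataSet("Customers");
        return ds;
    }
}
```

The client receives an ordinary DataSet and can work with it exactly as if it had been filled from a local database connection.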
This means that, fundamentally, the .NET Framework provides everything a developer needs to support partially connected scenarios. This statement applies at a relatively low level. Developers have the ability to connect to a database locally (meaning "on the local network") and take the data "on the road" without requiring a permanent connection. This also provides the fundamentals for offline data storage. Other technologies allow the developer to invoke objects (including data access infrastructure) over the Internet.
However, this does not mean that anything happens automatically. Whenever a direct connection is opened to SQL Server, that connection only works on the LAN; there is no automatic mechanism that maps the operation to a distributed call over the Internet. Similarly, no offline cache is automatically available in scenarios where no network connection exists at all. All the fundamentals needed to implement those scenarios are provided, but some extra planning and implementation work is required on the developer's part to make all of this happen. Luckily, the amount of effort required is relatively small, assuming proper architecture.
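One example of the kind of planning involved: rather than calling SqlClient directly from the business logic or user interface, the application can be coded against an abstraction, which can later be backed by a direct LAN connection, a Web service proxy, or an offline store. The interface below is purely hypothetical, a sketch of the idea rather than a prescribed design:

```csharp
using System.Data;

// The application codes only against this abstraction. Concrete
// implementations (all hypothetical) decide how the data moves:
//   - a SqlClient-based class for direct LAN access,
//   - a Web service proxy class for access over the Internet,
//   - a local-cache class for fully offline operation.
public interface IDataService
{
    DataSet GetData(string query);
    void SaveData(DataSet data);
}
```

Because the rest of the application never knows which implementation it is talking to, switching from "connected on the LAN" to "connected over the Internet" to "offline" becomes a configuration decision rather than a rewrite.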
So how can you architect the application infrastructure in a way that supports features such as data access over the Internet, offline data, or perhaps even access to different types of databases? And furthermore, is it even all that smart to access data remotely, rather than just accessing business objects remotely? As is so often the case, the answer is: it depends.
Practically all modern applications are built in tiers, where individual layers of the application are responsible for different tasks. Often, applications have at least three tiers: the database tier, the business object tier, and the user interface tier. Another valid, although older, pattern is the client/server approach, which in essence is a two-tier application: one tier handles all database-related tasks, and the second tier handles both the application logic and the user interface. Some applications may even have more than three distinct tiers. All these approaches are valid and have advantages and disadvantages (an exploration of which is beyond the scope of this article).
Most experts agree that tiers are the accepted way of architecting business applications these days. However, not everyone agrees on where those tiers should physically reside. Over the past 10 years, many applications have been architected with two tiers (database and business objects, a.k.a. the middle tier) located on a server or server farm, with computers in close proximity to each other, or at the very least on the same LAN. ASP.NET Web applications are a good example of such a setup. In fact, you could argue that in HTML-based Web applications, the UI tier (the third tier) lives on the server as well, since almost all UI tasks are processed and handled on the server and only HTML output and very limited client-side functionality (scripts) are sent to the browser.
A similar setup can be used for Smart Client applications. Data access and business logic can reside on the server, and only a small client application is deployed to individual workstations. The client application only handles UI tasks, and all processing is handled on the server, possibly using concepts from Service Oriented Architecture.
Setups similar to these are in extensive use and development today, and they are perfectly valid. However, there are other alternatives, the need for which can arise out of some of the shortcomings of the scenarios described above. Probably the biggest issue with these scenarios is that although they are relatively straightforward for developers and administrators to create and maintain, they are not necessarily great for users in all instances. It is certainly great to have Web applications that a user can access from an airport terminal if need be, but at the same time, users may want to use their applications while completely disconnected from the network. Imagine a salesman who just visited a customer and managed to secure a large sale. On his plane trip back to the home office, the salesman wants to set up the customer's account and enter the new order. These tasks probably require business logic as well as data access, storage, and manipulation, but with an application that requires connectivity, this would not be possible.
You can accommodate this scenario by deploying the business tier to the workstation, in addition to the user interface. The scenario also requires data to be available locally, so you need to at least create the illusion of a local data store (offline data).
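Creating that illusion can be as simple as persisting the DataSet to local storage while a connection exists and reloading it from disk when the network is gone. The following sketch (the file path and class name are assumptions) uses the DataSet's built-in XML support:

```csharp
using System.Data;
using System.IO;

class OfflineCache
{
    // Placeholder location for the local cache file.
    const string CacheFile = @"C:\AppData\customers.xml";

    public void Store(DataSet ds)
    {
        // WriteSchema embeds the table structure in the file, so the
        // cache can be reloaded without any other knowledge. Using
        // XmlWriteMode.DiffGram instead would also preserve row states
        // (pending inserts, updates, and deletes), at the cost of
        // loading the schema separately before reading the file back.
        ds.WriteXml(CacheFile, XmlWriteMode.WriteSchema);
    }

    public DataSet Load()
    {
        DataSet ds = new DataSet();
        if (File.Exists(CacheFile))
            ds.ReadXml(CacheFile); // Schema is embedded in the file.
        return ds;
    }
}
```

The salesman's application would call Store() before he leaves the office; on the plane, Load() hands back a DataSet that behaves exactly like one freshly queried from the database.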
All the scenarios listed above are equally valid and important, and in fact, I could have come up with additional scenarios, such as Pocket PC support and others. Clearly, the data access infrastructure needs to work in all these scenarios. Also, I do not consider it sufficient for data access to just work. Instead, I expect data access to work as efficiently as possible in all scenarios. For instance, you want to support data access over the Internet for Smart Client applications, yet at the same time, you would not want to sacrifice performance in LAN or Web server setups.
I expect data access infrastructure to be flexible and maintainable, and to remain serviceable well into the future without dependencies on code that may be outdated by then. Also, I expect the system to be easy to use, ideally even easier than using ADO.NET (or any other data access technology, for that matter) by itself.