ata is the blood in your system; it sits in its comfortable home of a database, and camps out in the tent of XML, but it deserves to be worked with in a reliable and consistent manner.
But why should only data-related operations be reliable? Shouldn't you want to write reliable code for your other operations? The introduction of System.Transactions in .NET 2.0 brings a paradigm shift of how you will write reliable transactional code on the Windows platform. This article dives deep in the depths of how System.Transactions works, and how you can use it to your advantage. You will also see how you can leverage existing System.Transactions integration within ADO.NET, and why you need to really understand what is under the magic carpet.
I have never met a programmer that intentionally wanted to write bad code. I have never met a client that was happy to receive an unreliable system. Face it-computer software that keeps an airplane in the sky, controls a CAT scan machine, and the software system that manages your and my payroll and retirement, needs to be reliable.
Reliability in computer software can be broken down into two main spheres: availability and predictability. Your software systems may be unavailable due to many reasons, one of which could be unpredictable software. But given the same set of inputs, why should you ever have an unpredictable set of outputs? You shouldn't!! But because the inputs to a concurrent system depend on the concurrent load experienced by the server, the inputs on a highly available and concurrent system can vary over time, and thus it becomes necessary for you to wrap ATOMIC operations in CONSISTENT, ISOLATED, DURABLE operations. Such operations are also referred to as transactions.
A Little History of Transactions
I am not the first person to emphasize the importance of transactions, and I am pretty sure I will not be the last. Transactions have been our good programming friends since the days of the mainframe. Historically, databases typically have had fantastic transactional support. You can use the BEGIN TRANSACTION statement to wrap a number of statements within one single transactional block. You can also choose to use advanced features such as savepoints, nested transactions, and various isolation levels in your transactions. But again, why should only databases enjoy the luxury of being reliable?
Transactions could be necessary in other operations as well, and recognizing that fact, architectures have been built over time to support transactions on non-database operations. Generally such transactions are managed by Transaction Managers (TM), and resources that need to be managed in a transactional manner are managed by Resource Managers (RM).
A number of RMs work in cooperation with a TM to wrap a number of operations within a single transaction. A common example of an RM is the piece of code that abstracts a database connection, typically the SqlConnection class. A common example of a transaction manager on the Microsoft Windows platform is the MSDTC (Microsoft Distributed Transaction Coordinator).
By definition, a transaction managed by a Transaction Coordinator involves one or more than one resource that needs consistency in a highly concurrent environment. In general, you can tie a number of such operations together within a transaction using various mechanisms, the most common one of which is commonly referred to as a two-phase commit process. These steps show how a two-phase commit process works.
- An RM checks to see if a transaction is currently running, and if it is, it enlists itself within the running transaction.
- As a part of phase 1, also known as the prepare phase, the RM does its part and gives a green signal to the TM that it is ready to go.
- The TM keeps track of green signals from all enlisted RMs, and when the last RM in a transaction sends a green signal as success to the prepare phase, the TM requests the RMs to go ahead with phase 2.
- The RMs receive the green signal and go ahead and commit their work in the commit phase.
- If any RMs responded with a red signal at the end of the prepare phase, the TM asks everyone to roll back their work.
There is a lot of to and fro dialog going on between the RMs and the TM. Even though the amount of data isn't much, it is indeed extremely chatty in nature. This can be a problem because on networks, chatty communication can seriously hurt performance due to network latencies. This is a serious disadvantage of MSDTC but is not the only downside of using MSDTC. Here are some additional downsides of using MSDTC.
- As mentioned, such network communication is rather chatty in nature, and leads to serious performance degradation due to network latencies.
- Such network communication can frequently be blocked by firewalls.
- Generally given the higher unpredictability (due to other RMs involved), such transactions frequently require RMs to lock their resources in a higher isolation level. For instance, a transaction being managed by MSDTC that involves a SqlConnection object will end up escalating the isolation level to Serializable, which is the "oink oink" version of a transaction.
- Serializable relies on exclusive locking, thus blocking all other concurrent running transactions. This is, in fact, the perfect recipe for a deadlock because you are aggressively locking resources-without really knowing what you are locking. After all, the programming model for using such transactions, as you will see shortly, is rather simplistic.
But does every single transaction have to be distributed? Certainly not!! Therefore, every single transaction shouldn't have to pay the heavy cost of a distributed transaction either.
In fact, when you wrap a transaction using the SqlTransaction object, your default isolation level is ReadCommitted, and SQL Server doesn't consult the MSDTC whether it can commit or not. But when you run a transaction using the BEGIN DISTRIBUTED TRANSACTION command, SQL Server indeed does leverage MSDTC behind the scenes to run the transaction. Similarly, for non-database entities that do not need a heavy-duty TM such as the MSDTC, such entities can be managed by an alternate TM called the Lightweight Transaction Manager (LTM). It is important to realize, however, that what transaction manager you pick is transparent to you as a consumer of the RM. However, as an author of the RM, you will have to make that decision.
In other words, when you use the SqlConnection class, you don't worry about which transaction manager you will use, you simply use the SqlConnection object. But if you were writing the implementation of SqlConnectionsomething Microsoft has already done for youyou would indeed sit back and think, "What TM do I need?"