ou know SQL Server 2005’s new snapshot isolation feature can help you reduce lock contention and deadlocks, but did you know that it also can help you fix intermittent errors in reports and detect lost updates? You know that snapshot isolation requires some effort from your DBA, but did you know it may introduce some breaking changes into applications? Its benefits are well known, but SQL Server developers need to understand the downstream ramifications of snapshot isolation. This article discusses a couple of common real-world scenarios that illustrate the often-unanticipated consequences of using this feature.
The scenarios use the following sample tables, which you should create in your database for reference:
CREATE TABLE Users(UserId INT NOT NULL, FirstName VARCHAR(8), LastName VARCHAR(8), LotsOfComments CHAR(5000), CONSTRAINT PK_Users PRIMARY KEY(UserId))goINSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(1, 'Jill', 'Larson', 'Top Notch Pro')INSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(2, 'Brian', 'Taylor', 'Top Notch Pro')INSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(3, 'Lester', 'Rash', 'Top Notch Pro')INSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(4, 'Sue', 'Miller', 'Top Notch Pro')INSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(5, 'Justin', 'Rowler', 'Top Notch Pro')INSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(6, 'Joe', 'Hanson', 'Top Notch Pro')INSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(7, 'Brenda', 'Tyler', 'Top Notch Pro')INSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(8, 'Drew', 'Wellner', 'Top Notch Pro')INSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(9, 'Sue', 'Hogan', 'Top Notch Pro')INSERT INTO Users(UserId, FirstName, LastName, LotsOfComments) VALUES(10, 'Xavier', 'Sanchez', 'Top Notch Pro')goCREATE TABLE Tickets(TicketId INT NOT NULL, AssignedTo INT NOT NULL, Priority VARCHAR(10), Status VARCHAR(10), LotsOfComments CHAR(5000), CONSTRAINT PK_Tickets PRIMARY KEY(TicketId), CONSTRAINT FK_Tickets_Users FOREIGN KEY(AssignedTo) REFERENCES Users(UserId))GOINSERT INTO Tickets(TicketId, AssignedTo, Priority, Status, LotsOfComments) VALUES(1, 1, 'High', 'Opened', 'Cannot login')INSERT INTO Tickets(TicketId, AssignedTo, Priority, Status, LotsOfComments) VALUES(2, 8, 'High', 'Opened', 'Reports hanging')INSERT INTO Tickets(TicketId, AssignedTo, Priority, Status, LotsOfComments) VALUES(3, 10, 'High', 'Opened', 'Cannot save changes')go
Scenario 1. Reports in OLTP Environment Have Inconsistencies
Clearly the database structure guarantees a ticket to be assigned to one and only one user at a time. However, sometimes a simple query like this:
SELECT FirstName, LastName, TicketId, Priority, Status,SUBSTRING(Tickets.LotsOfComments, 1, 25) Comments FROM Users LEFT OUTER JOIN Tickets ON Users.UserId = Tickets.AssignedTo
returns obviously incorrect results:
FirstName LastName TicketId Priority Status Comments--------- -------- ----------- ---------- ---------- -------------------------Jill Larson 1 High Opened Cannot login Brian Taylor NULL NULL NULL NULLLester Rash NULL NULL NULL NULLSue Miller NULL NULL NULL NULLJustin Rowler NULL NULL NULL NULLJoe Hanson NULL NULL NULL NULLBrenda Tyler NULL NULL NULL NULLDrew Wellner 2 High Opened Reports running very slowSue Hogan NULL NULL NULL NULLXavier Sanchez 1 High Opened Cannot login Xavier Sanchez 3 High Opened Cannot save changes
Why is ticket number 1 assigned to both Jill Larson and Xavier Sanchez? This intermittent problem manifests itself only when the data is selected and modified simultaneously. If you query only static data, you can never reproduce this problem. The following instructions show you how selecting data that is being modified can cause this elusive error. (Note that so far all queries run under the default isolation level, READ COMMITTED.)
In SQL Server Management Studio, open three tabs?that means three connections. Use the database in which you created sample tables. The query in each tab starts with SELECT @@SPID, so that you can identify these connections in the Activity Monitor. In the first tab, modify the Tickets table as follows:
SELECT @@SPIDGOBEGIN TRANSACTIONUPDATE Tickets SET LotsOfComments = 'Reports running very slowly' WHERE TicketId = 2
In the second tab, start your report as follows:
SELECT @@SPIDGOSELECT FirstName, LastName, TicketId, Priority, Status, SUBSTRING(Tickets.LotsOfComments, 1, 25) Comments FROM Users LEFT OUTER JOIN Tickets ON Users.UserId = Tickets.AssignedTo
Note that your report hangs. If you open Activity Monitor, you’ll see that the report is blocked by the uncommitted update in the first tab. While the report is still waiting, make another modification against the Tickets table in the third tab as follows:
UPDATE Tickets SET AssignedTo = 10 WHERE TicketId = 1
Note that this update completes right away. Go to the first tab and commit. Note that the report in the second tab completes right away, and the results are incorrect (precisely the ones I listed at the beginning of this section). To make sure the report returns correct results, you can use a higher isolation level for it. That will increase lock contention.
For instance, if you repeat the same scenario but run the report under the REPEATABLE READ isolation level, the update in the third tab will not complete right away. It will be blocked by the report. You can open Activity Monitor and see that for yourself. Although the report will return the correct results, the price tag for consistency prior to SQL Server 2005 may be quite steep?increased lock contention. If you are already using SQL Server 2005, you can utilize one of the two new isolation levels, either SNAPSHOT or READ COMMITTED SNAPSHOT. (If you need detailed information for configuring your database to enable snapshot isolation, refer to “Choosing Row Versioning-based Isolation Levels” in MSDN.)
No matter which snapshot isolation level you choose, the report will return right away. Note that the report will show the old value for the description of ticket number 2. Instead of “Reports running very slow,” you will see “Reports hanging.” The uncommitted changes from the first tab are invisible to the report. In this particular case, the problem with a report was very easy to notice. In many similar cases, you may get slightly incorrect totals that go unnoticed for quite a while.
Scenario 2. How Snapshot Isolation Prevents Lost Updates
In this scenario, a long-running transaction can detect lost updates under SNAPSHOT but not under READ COMMITTED SNAPSHOT. Also, in this scenario a query running under snapshot isolation exhibits intrinsic optimistic locking behavior. Consider a long batch job that archives old tickets into a history table called TicketsArchive with the same structure as the Tickets table. Assume that archiving is implemented as one large transaction that users need to be able to abort. The implementation of aborting is simple: if a row is inserted into the CancelArchiving table while the archiving transaction is running, the archiving transaction is supposed to detect it and abort.
To start the batch, begin by inserting old tickets into a history table (TicketsArchive) as follows:
DELETE FROM CancelArchivingBEGIN TRANSACTIONINSERT INTO TicketsArchive(TicketId, AssignedTo, Priority, Status,LotsOfComments) SELECT TicketId, AssignedTo, Priority, Status, LotsOfComments FROM Tickets WHERE TicketId < 10
The next step involves archiving various attachments, and it takes considerable time. Suppose that while the process is running, a different connection inserts a request to cancel out:
INSERT INTO CancelArchiving(ReasonForCancel) VALUES('Monthly Report For CEO Not Started Yet')
Other connections continue to modify the data in the range that is going to be deleted, as follows:
UPDATE Tickets SET AssignedTo = 4 WHERE TicketId = 1INSERT INTO Tickets(TicketId, AssignedTo, Priority, Status, LotsOfComments)
VALUES(5, 5, 'Medium', 'Opened', 'Aggregates In Reports Do Not Match')
Eventually, the archiving transaction gets to the point when it tries to check if there is a request to cancel it:
DECLARE @IsCancelled INTSET @IsCancelled = (SELECT COUNT(*) FROM CancelArchiving)PRINT @IsCancelled
Under READ COMMITTED SNAPSHOT, the request to cancel is detected, because it was committed before this select started. Under SNAPSHOT, however, the request goes unnoticed, as are any other changes committed after the transaction has started?and this can be a breaking change! This is an important difference that you need to consider when choosing which snapshot isolation level to use. Suppose, however, that the transaction has no requests to cancel and proceeds to delete:
DELETE FROM Tickets WHERE TicketId < 10
Again, the results under READ COMMITTED SNAPSHOT and those under SNAPSHOT are very different. Under READ COMMITTED SNAPSHOT, the delete succeeds because the changes (the insert and the update in the range that is going to be deleted) were committed before this delete?this particular statement?started. As a result, the updates to tickets 1 and 5 disappear without any warning. This is yet another manifestation of a well-known problem of lost updates. On the other hand, under snapshot isolation, the problem is detected and the delete errors out:
Msg 3960, Level 16, State 2, Line 1Snapshot isolation transaction aborted due to update conflict. You cannot usesnapshot isolation to access table 'dbo.Tickets' directly or indirectly indatabase 'Sandbox' to update, delete, or insert the row that has been modifiedor deleted by another transaction. Retry the transaction or change the isolationlevel for the update/delete statement.
As you have seen, snapshot isolation conveniently provides optimistic locking for you. This capability may come in very handy in some situations. However, this feature may also cause breaking changes. Besides, in this particular case, it might be better to use the SERIALIZEABLE isolation level for large transactions. Anyway, snapshot isolation exposed a serious problem with lost updates. It is entirely up to you to decide how to fix it. (Note: the debate about pessimistic vs. optimistic locking is beyond the scope of this article. If you choose to use pessimistic locking, you can use higher isolation levels.)
The SNAPSHOT or READ COMMITTED SNAPSHOT Isolation Level?
MSDN recommends using READ COMMITTED SNAPSHOT rather than SHAPSHOT under most circumstances. It might also be easier to use. Once enabled, it becomes your default isolation level. However, in some cases, READ COMMITTED SNAPSHOT is not your best choice. Because READ COMMITTED SNAPSHOT, once enabled in the database, becomes the default isolation level, you can end up with breaking changes in you applications. For example, suppose READ COMMITTED SNAPSHOT is not enabled in the database and you want to assign one more ticket to a person, but only if that user does not already have high priority tickets:
BEGIN TRANSACTIONUPDATE Tickets SET AssignedTo = 6 WHERE TicketId = 1AND NOT EXISTS(SELECT 1 FROM Tickets WHERE AssignedTo = 6 AND Priority='High')--- do not commit yet
Note that you have not explicitly specified an isolation level, so your transaction runs under the default READ COMMITTED level. If another connection issues a similar update:
UPDATE Tickets SET AssignedTo = 6 WHERE TicketId = 2AND NOT EXISTS(SELECT 1 FROM Tickets WHERE AssignedTo = 6 AND Priority='High')
it will hang in a lock waiting state. Once you commit your first transaction the second one will complete, but it will not assign ticket 2 to user 6, which is the correct behavior as designed.
Suppose your DBA has enabled READ COMMITTED SNAPSHOT. When you rerun your two queries, you will end up with two high-priority tickets assigned to the same person?not as designed, and you will not receive any warning messages whatsoever! (For more examples of breaking changes introduced by snapshot isolation, refer to Hugo Kornelis's series of articles on sqlblog.com and to my previous DevX article "Avoid These Common Business Rules Implementation Mistakes.")
As you have seen, if you rely on the default isolation level in your queries, you might prefer to use SNAPSHOT over READ COMMITTED SNAPSHOT, at least until you have explicitly specified READ COMMITTED whenever you assumed it. Also note that in some cases transactions behave differently under the SNAPSHOT isolation level than they do under the READ COMMITTED SNAPSHOT isolation level. In any case, explicitly specifying transaction isolation levels prevents your code from breaking after database configuration changes.