Dealing with Database Concurrency Conflicts in the Real World : Page 2
Database concurrency conflicts are somewhat of a plague in software development because they're hard to predict and handle. Unfortunately, they're also hard to prevent.
by Maurice de Beijer
Aug 22, 2006
Page 2 of 4
The Current State
Let's first take a look at what others generally say and do about solving database concurrency issues. Typically solving the problem is divided into two basic approaches:
Pessimistic concurrency control
Optimistic concurrency control
I will give a brief description of the two and the different options for handling them. Here I'll just clarify the problem. I will in no way offer a complete coverage of concurrency handling and all of its aspects.
Pessimistic Concurrency Control
Pessimistic concurrency control can prevent a collision because the user/process must take some action before the application can change the data. This action can be a number of things but typically involves locking the data in the database, thereby preventing another user from holding the same lock.
Simple to implement. Because database servers support and enforce locking mechanisms they can very easily implement pessimistic concurrency control. Because a user needs to place the lock before making any changes, the database server informs the user before it makes a change that there is a conflict.
Very secure. Because the database server implements the locking very reliably, you as a developer can be assured that nothing will be able to ignore the lock and change the data anyway.
Not very scalable. Locking data in a database requires an open connection to a database. This means every user must have at least one open connection to the database, which means more resources and licenses. When using older database servers, locking might also prevent other users from reading the data.
Prone to deadlocks. Suppose two users both want to change two pieces of dataA and B. User 1 first locks A and user 2 first locks B. Now both users want to lock the second part of the data but cannot, because another user has already locked the other piece they want. You have a deadlock if both users decide to wait until the data is available.
Locks can last a long time. If a user starts changing some data, that data remains locked until that user saves it. But if the user is distracted somehow, or goes to a meeting without saving the changes, the data remains locked; no one else can make any changes until the first changes have been committed.
You can also use pessimistic locking by using soft locks instead of real database locks. This means that you update a field to indicate that a user is busy with the data and has it "locked." This prevents the scalability problem because you don't need to keep the connection open. However this approach has several drawbacks because the database doesn't enforce the lock, so another piece of code can decide to ignore the lock. Additionally, your code must manually remove the lockand if that isn't done, the data remains locked forever.
Even though pessimistic locking has its place, it has a number of drawbacks and doesn't combine very well with the disconnected nature of data in a .NET application.
Optimistic Concurrency Control
When using optimistic concurrency control the user doesn't lock the data to prevent concurrency issues, but detects and resolves them when writing to the database. Developers typically use several different approaches with optimistic concurrency control:
No checking and last one in wins. This is not really a concurrency control mechanism, because the server simply ignores the problem. The last user to update the data overwrites the changes made by the first user. The SQL UPDATE command has only the primary key as the filter in the SQL WHERE clause. This kind of concurrency control is suitable only for single-user applications.
Comparing the fields the user changed. In this case, as part of the update the application compares the data the user wants to change to the data in the database to verify if it is the same as when the user read the data before it commits the change. If the originally read data and the data in the database aren't the same, the server doesn't commit the user's change and alerts the user. The SQL UPDATE command has the primary key and changed fields with their old values and the SQL WHERE clause.
Comparing all fields. Using this version of the optimistic concurrency approach, prior to committing an update, the application will check all fields instead of just the fields being changed. Even though this seems excessive, it is actually better than just checking the changed fields because of the nature of the standard ADO.NET classes. A "feature" of the ADO.NET data classes is that they will execute a SQL UPDATE command with all fields, not just the fields that the user changed. This means that even though no other user has updated the specific field(s) that the user wants to change, another user may have changed a field in another column. In this case the SQL UPDATE command has a SQL WHERE clause containing all fields in the table.
Comparing the row version. In this case, the data has an extra row version field, also known as a timestamp field, which the server changes every time it updates the data. Filtering is much simpler and faster than comparing all fields because the server only needs to filter on the primary key and the row version in the SQL UPDATE. This may appear to be the same behavior as comparing all fields; however, you should exercise care because these two do not result in the same behavior. If one user sends an update that updates all data with the current values the data's row version will still be updated. This means that a second user who wants to change something in the same row will have a conflict when the server checks the row version, but this second user would not have a conflict when comparing all fields.
Scalable. The application doesn't need to keep a connection open to the database, allowing this to scale to large numbers of users.
Simple to implement. This is especially easy to implement when using a row version field.
Little risk of deadlocks. Because the database server keeps no locks that prevent applications from doing their work there is also no risk of a deadlock. The only remaining risk is in the application logic itself.
Not very secure. Because the database cannot enforce restrictions on data use, the possibility remains that a rogue allocation decides to ignore a row version and update the data regardless.