Transactions and Isolation levels

Safety googles At work, we have two applications which connect to the same database. For all kinds of business reasons, we need to make sure that only one of the applications accesses certain data at the same time. To do this, we use a row in a table as a semaphore.

While working on the locking mechanism, we had a closer look at the Transaction Management and the Isolation Levels we were using. There is a lot of good documentation on Transaction and Isolation, but it tends to be over complete, elaborate and therefore hard to read. I'll try to share our insights with you in a slightly more digestable form (I hope).

What is a Transaction Every (proper) database uses Transactions. A Transaction is a unit of work which you can undo (roll back) or save (commit) to the database. Generally, database clients tend to "Auto Commit" which effectively means that each of your actions on the database are executed immediately and can not be rolled back.

When you click through your client's settings, you can probably find an option to turn Auto Commit off. Now, at your first database write action (insert or update), a Transaction is started. The database will keep track of each of your write actions, and once you're finished, you can decide to keep the changes by "committing" the Transaction. You can also decide to revert all the changes you made and do a "rollback", as long as the Transaction is not closed. Both Commit and Rollback close a transaction, so choose wisely.

What is an Isolation Level Most databases allow more than one user to connect to the database. In order for you to do your work, you may require a stable database without changes. To protect yourself from seeing other peoples updates, you can set an "Isolation Level". Generally, you set the isolation level at the time of starting a Transaction.

Isolation levels come in many different flavours, from "liberal" to "strict":

  * _READ UNCOMMITTED_ - Allows you to see the database as it is right now. Doing he same query on a table twice is likely to produce different results as other people are updating the table, even if they haven't committed their transactions yet.
  * _READ COMMITTED_ - The database will only show you changes to the table which are commited. Doing the same query twice can still produce different results, but you're sure that those changes will not be olled back by the authors.
  * _REPEATABLE READ_ - The database will make sure that when you execute the same query twice, you will see the same data twice. It may have to lock the rows in the table to do this, which will prevent other people from updating them until you complete your transaction.
  * _SERIALIZABLE_ - The database will not only make sure you see the same data, but it will also make sure that all updates in the system are done in such a way that it seems as if there was only 1 user connected to the database at a time. This is the most "expensive" isolation level, and tends to be relatively slow.

Transaction is not Isolation. Although a Transaction may suggest that your changes are not written to the database until after your commit, it will write all changes directly to the tables. That means that anybody can see what you are doing to the database, depending on their Isolation levels. Mind you: their isolation levels, not yours.

You can see a Transaction as your "undo buffer" if you will, and the "Isolation Level" as the filter through which you are looking at other people's actions to the database.

How about our semaphores? Back to our semaphore problem. Reading about Transactions and Isolation Levels, we knew that both systems must do the "fetch" (a read and an update) of the semaphore within a single transaction, and both systems must use the "serializable" isolation level to make sure that it reads the correct, current value without having a chance of incorrectly fetching the smaphore.

What's funny about the serializable isolation level however, is that it does not prevent you from overwriting other people's values without knowing. Our example:

  1. Application 1 reads the semaphore to be 0 and decides it can set it to 1.
  2. Application 2 reads the semaphore to be 0 and also decides it can set it to 1.
  3. Application 1 sets the semaphore to 1 and thinks it has the lock.
  4. Application 2 also sets the semaphore to 1 and thinks is has the lock.

This is clearly a problem. We learned that:

Regardless of your isolation level or transaction usage, the last commit always wins.

There is, however one database on the market which at least fails in this situation. It's Oracle. When you use the "Serializable" isolation level, Oracle will prevent you from overwriting a row which was changed without your knowing about it. This changes our scenario as follows:

  1. Application 1 reads the semaphore to be 0 and decides it can set it to 1.
  2. Application 2 reads the semaphore to be 0 and also decides it can set it to 1.
  3. Application 1 sets the semaphore to 1 and thinks it has the lock.
  4. Application 2 tries to set the semaphore to 1, but Oracle notices that the row has changed since the last read, and will fail to do the update. An error (ORA-08177) is thrown at the application, allowing you to re-read the lock and discover it has been taken.

Conclusion Generally, you will have 1 application connecting to 1 database. The application will use the same Transaction granularity and Isolation levels throughout, and you will not notive these interesting behaviours. As soon as you have multiple applications with different isolation levels connecting to the same databases, you are likely to run into interesting, sporadic irreproducable problems. Try to think about your Isolation strategies.

If I got you interested, I recommend reading this blogpost by Peter Veentjer about this very same thing. Peter is a bit more technically correct than I am in this post.