Tag Archives: transactions

A special kind of Hell…

So, I thought that ActiveRecord transaction support was finally reasonably mature and usable in Rails 4.

How wrong I was.

One of my biggest complaints with ActiveRecord has been that there was no way to set transaction isolation for a given transaction. This got addressed in Rails 4. Unfortunately nothing else appears to have been fixed.

The Problem

The problems remaining in ActiveRecord surrounding transactions:

  • No connector agnostic handling of transaction collisions/forced rollbacks.
  • No visibility of transaction isolation level inside a transaction
  • No automatic restart facilities for transactions.

No connector agnostic handling of transaction collisions/forced rollbacks.

My last post surmised why this is a problem pretty well – you need to understand your specific connector well enough to understand precisely how it’s going to tell ActiveRecord that the transaction failed. This is incredibly non-portable as a result – ActiveRecord is supposed to hide database details, not expose them!

No visibility of transaction isolation level inside a transaction

Transaction isolation is nice and fine, but the way it interacts with transaction nesting is a pain in the ass.

If a new deeper transaction is requested in a nested set with equal or lower transaction isolation to the open transaction, it should just work as normal without changing the isolation level. Whilst I’m sure there’s plenty of arguments to do with performance as to why this is suboptimal, but if you’re lowering isolation levels for ‘performance’, you probably shouldn’t be running those statements inside of a higher isolation transaction block.

I can’t even wrap this in since the isolation level information is discarded when the transaction is created – ActiveRecord makes no effort to remember what isolation level we’re running at.

This one is pretty important too since you can’t always see the implementation details of model methods at a glance and as such can’t tell if it’s usually a short transaction to ensure that it’s results are consistent.

No automatic restart facilities for transactions.

Not everybody realises that a transaction can fail because the database can’t guaranty it’s consistency, leading to the incorrect application of transactions by beginners.

Having an optional automatic restart facility would alleviate some of these problems as it would make it clear that it may be required in some circumstances to handle a transaction restart, and also allow users to get that behaviour with minimal effort.

It also facilities some improvements I have thought of…

A Proposal

I think we need to make the following changes to the transaction code:

  • Create an IsolationLevel class or comparison method so it’s possible to actually compare isolation levels against each other easily. As there is a clear heirachy of isolation with the standards based isolation levels, we should be able to compare them naturally in our code, and it should be possible to add any non-standard levels a driver offers.
  • Add an isolation_level attribute to the Connection.current_transaction objects so you can discover what isolation level is currently in effect
  • Add an optional restartable attribute to the transaction which defaults to false which indicates if the transaction can be auto-restarted. Add a optional max_retries attribute to specify how hard it should try.
  • Treat ALL transactions uniformly – at the moment the code differentiates a transaction started with an explicit isolation level parameter vs one without. You should be able to join transactions at the same isolation level that you desire at the very least, and there’s little harm in allowing transactions of lower isolation levels to join higher isolation level transactions.
  • Consider using the auto-restart facility, if enabled, to kick the isolation level on a transaction to a higher isolation level if required by a sub-transaction. This can be expensive if it’s happening a lot, but it’s better than some of the hoops required without it. In a lot of circumstances, however, the cost should be negligible as the query caches should help reduce the second execution costs.