samskivert: Concurrency Control and Recovery

Summary
Concurrency, recovery, transactions and ACID introduced. Serializability section: conflict and view serializability, transaction schedules and precedence graph. Recovery section: types of failure (transaction, system, media), undo and redo, steal/no-steal and force/no-force, logging, checkpointing, write-ahead logging. Best practices in concurrency control: two-phase locking, deadlock detection, isolation levels (read uncommitted, read committed, repeatable read, serializable), hierarchical locking, optimistic and multiversion concurrency control. Best practices in recovery: physiological logging, ARIES protocol (analysis, redo and undo). Extensions and limitations: two-phase commit (distributed dbs), systems not suited to ACID, lack of app knowledge exploitation.
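
To make the precedence graph idea concrete, here is a minimal sketch of the standard conflict-serializability test: draw an edge from one transaction to another whenever an earlier operation conflicts with a later one (same item, different transactions, at least one write), then check the graph for cycles. The schedule encoding as (transaction, operation, item) tuples and the function names are my own assumptions for illustration, not notation from the paper.

```python
def precedence_graph(schedule):
    """Build precedence edges from a schedule.

    schedule is a list of (txn, op, item) tuples, where op is "r" or "w".
    This encoding is an assumption made for this sketch.
    """
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Two operations conflict if they come from different
            # transactions, touch the same item, and one is a write.
            if ti != tj and item_i == item_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True  # back edge, so there is a cycle
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


def is_conflict_serializable(schedule):
    """A schedule is conflict serializable iff its precedence graph is acyclic."""
    return not has_cycle(precedence_graph(schedule))


# T2 writes x between T1's read and write of x: edges T1->T2 and T2->T1.
interleaved = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]
# T1 finishes with x before T2 touches it: only the edge T1->T2.
serial = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
```

Calling is_conflict_serializable on the two sample schedules distinguishes the interleaving that forms a cycle from the one that does not.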

Comments
This is a great overview of the basic ideas that underlie transaction support in RDBMSs. Understanding what exactly has to take place to implement correct transaction semantics in the face of arbitrary failure gives a pretty helpful perspective on when and why a database is going to write to disk. The ARIES protocol is a fine piece of work. The notion of repeatability is another interesting challenge. Say you do a query and then do the same query again later in that same transaction. If repeatability is desired, one must not see new rows inserted by any other transaction. Accomplishing this without write-locking the entire table requires complex coordination. Fortunately, I believe most databases don’t provide that isolation level as the default. It’s way less expensive to just not worry about it.
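
The repeated-query problem described above (usually called a phantom read) can be illustrated with a toy in-memory table. This is only a sketch under loud assumptions: ToyTable, snapshot and count_twice are hypothetical names invented here, and copying the whole row list stands in for what a real multiversion engine does with per-row versions.

```python
class ToyTable:
    """A hypothetical in-memory table, used only for illustration."""

    def __init__(self, rows):
        self.rows = list(rows)

    def insert(self, row):
        self.rows.append(row)

    def select(self, predicate):
        return [row for row in self.rows if predicate(row)]

    def snapshot(self):
        # Crude stand-in for a multiversion read view: copy the row list
        # as it exists when the reading transaction starts.
        return ToyTable(self.rows)


def count_twice(table, use_snapshot):
    """Run the same predicate query twice while another writer inserts a row."""
    view = table.snapshot() if use_snapshot else table

    def is_big(row):
        return row["amount"] > 100

    first = len(view.select(is_big))
    # Simulated concurrent transaction inserting and committing a matching row.
    table.insert({"id": 3, "amount": 500})
    second = len(view.select(is_big))
    return first, second


def sample_rows():
    return [{"id": 1, "amount": 50}, {"id": 2, "amount": 200}]
```

Reading the live table lets the second query see the newly inserted row, while reading through the snapshot returns the same count both times without locking the table against the writer.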

Source: PDF CiteSeerX

samskivert: Concurrency Control and Recovery – Franklin

30 October 2009