Making Snapshot Isolation Serializable
Snapshot Isolation (SI) is a multiversion concurrency control algorithm, first described in Berenson
et al. [1995]. SI is attractive because it provides an isolation level that avoids many of the com-
mon concurrency anomalies, and has been implemented by Oracle and Microsoft SQL Server
(with certain minor variations). SI does not guarantee serializability in all cases, but the TPC-C
benchmark application [TPC-C], for example, executes under SI without serialization anoma-
lies. All major database system products are delivered with default nonserializable isolation
levels, often ones that encounter serialization anomalies more commonly than SI, and we sus-
pect that numerous isolation errors occur each day at many large sites because of this, lead-
ing to corrupt data sometimes noted in data warehouse applications. The classical justification
for lower isolation levels is that applications can be run under such levels to improve efficiency
when they can be shown not to result in serious errors, but little or no guidance has been of-
fered to application programmers and DBAs by vendors as to how to avoid such errors. This
article develops a theory that characterizes when nonserializable executions of applications can
occur under SI. Near the end of the article, we apply this theory to demonstrate that the TPC-C
benchmark application has no serialization anomalies under SI, and then discuss how this demon-
stration can be generalized to other applications. We also present a discussion on how to modify
the program logic of applications that are nonserializable under SI so that serializability will be
guaranteed.
Categories and Subject Descriptors: H.2.4 [Database Management]: Systems—Transaction
processing
General Terms: Theory, Reliability
The research of all the authors was supported by National Science Foundation (NSF) Grant IRI
97-11374 (see http://www.cs.umb.edu/∼isotest/summary.html).
The work of Professor Shasha was also supported by NSF Grants IRI 97-11374, IIS-9988636, and
N2010-0115586.
Authors’ addresses: A. Fekete, School of Information Technologies, Madsen Building F09, Univer-
sity of Sydney, N.S.W. 2006, Australia; email: [email protected]; D. Liarokapis, E. O'Neil, and
P. O'Neil, UMass Boston, Boston, MA 02125-3393; email: {dimitris,eoneil,poneil}@cs.umb.edu; D.
Shasha, Courant Institute, 251 Mercer Street, New York, NY 10012; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is
granted without fee provided that copies are not made or distributed for profit or direct commercial
advantage and that copies show this notice on the first page or initial screen of a display along
with the full citation. Copyrights for components of this work owned by others than ACM must be
honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers,
to redistribute to lists, or to use any component of this work in other works requires prior specific
permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 1515
Broadway, New York, NY 10036 USA, fax: +1 (212) 869-0481, or [email protected].
© 2005 ACM 0362-5915/05/0600-0492 $5.00
ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005, Pages 492–528.
Additional Key Words and Phrases: Concurrency control, serializability, anomaly, consistency, weak
isolation, snapshot isolation, multiversion concurrency
1 We leave the term data item ambiguous in the usual way; all definitions and proofs go through for
any granularity of data substituted for this term. We note, however, that in the analysis of TPC-C,
we refer explicitly to field granularity, a field being a specific column value on a specific row of a
table.
H: R1(X0) R2(X0) W2(X2) C2 W1(X1) A1    (1.1)
the prior version had a special subscript representing unborn data, Xa.2 (Note
that the creating Transaction ID is used only for version numbers as a conve-
nience for recognizing later in the history what transactional output is being
read; since unborn and dead versions are never read at a later time, there is
no contradiction in using special version subscripts in these cases.) Snapshot
Isolation, or SI, is a multiversion concurrency control algorithm introduced in
Berenson et al. [1995]. In what follows, we define time to be measured by a
counter that advances whenever any transaction starts, commits, or aborts,
and we designate the time when a transaction Ti starts as start(Ti ) and the
time when Ti commits or aborts as complete(Ti ); when Ti is successful, we also
write complete(Ti ) as commit(Ti ).
2 When we speak of inserting or deleting a data item of field granularity, we assume that the insert
or delete deals simultaneously with all the fields (column values) contained in a given row that is
inserted or deleted.
by Ti to read a row3 that has changed since start(Ti ) will cause the system to
read an older version, current as of start(Ti ), in the rollback segment. Indexes
are also accessed in the appropriate snapshot version, so that predicate evalu-
ation retrieves row versions current as of the snapshot. The First Committer
Wins rule is enforced, not by a commit-time validation, but instead by checks
done at the time of updating. If Ti and Tk are concurrent (their transactional
lifetimes overlap), and Ti updates the data item X , then it will take a Write
lock on X ; if Tk subsequently attempts to update X while Ti is still active, Tk
will be prevented by the lock on X from making further progress. If Ti then
commits, Tk will abort; Tk will be able to continue only if Ti drops its lock on
X by aborting. If, on the other hand, Ti and Tk are concurrent, and Ti updates
X but then commits before Tk attempts to update X , there will be no delay
due to locking, but Tk will abort immediately when it attempts to update X
(the abort does not wait until Tk attempts to commit). For Oracle, we rename
the First Committer Wins rule to First Updater Wins; the ultimate effect is the
same—one of the two concurrent transactions updating a data item will abort.
Aborts of Tk that occur because it was beaten to an update of data item X are
reported as serialization errors, ORA-08177 (Oracle Release 9.2).
Snapshot Isolation is an attractive isolation level. Reading from a snapshot
means that a transaction never sees the partial results of other transactions:
T sees all the changes made by transactions that commit before start(T ), and
it sees no changes made by transactions that commit after start(T ). Also, the
First Committer Wins rule allows Snapshot Isolation to avoid the most common
type of lost update error, as shown in Example 1.1.
Example 1.1 (No Lost Update). If transaction T1 tries to modify a data item
X while a concurrent transaction T2 also tries to modify X , then Snapshot
Isolation’s First Committer Wins rule will cause one of the transactions to abort,
so the first update will not be lost. For example (we include values read and
written for data items in this history):
H1: R1(X0, 50) R2(X0, 50) W2(X2, 70) C2 W1(X1, 60) A1.
This history leaves X with the value 70 (version X2), since only T2, attempt-
ing to add an increment of 20 to X, was able to complete. T1 can now retry
and hopefully add its increment of 10 to X without interference. Note that
many database system products with locking-based concurrency default to the
READ COMMITTED isolation level, which takes long-term write-locks but no
long-term read locks (it only tests reads to make sure they do not read write-
locked data); in that case, the history above without versioned data items would
succeed in both its writes, causing a Lost Update.
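The First Committer Wins behavior in history H1 can be replayed in a toy multiversion model. This is a hypothetical illustration, not any vendor's implementation; for simplicity the conflict check happens at commit time, whereas the text notes that Oracle performs it at update time (First Updater Wins).

```python
# Minimal sketch of Snapshot Isolation with First Committer Wins,
# replaying history H1 of Example 1.1 (toy model, not a real DBMS).

class SIDatabase:
    def __init__(self, initial):
        # committed versions: item -> list of (commit_time, value)
        self.versions = {k: [(0, v)] for k, v in initial.items()}
        self.clock = 0

    def begin(self):
        self.clock += 1
        return {"start": self.clock, "writes": {}, "status": "active"}

    def read(self, tx, item):
        if item in tx["writes"]:              # a transaction reads its own writes
            return tx["writes"][item]
        # otherwise, the latest version committed before the transaction started
        return max((cv for cv in self.versions[item] if cv[0] <= tx["start"]),
                   key=lambda cv: cv[0])[1]

    def write(self, tx, item, value):
        tx["writes"][item] = value

    def commit(self, tx):
        self.clock += 1
        for item in tx["writes"]:
            # First Committer Wins: abort if a concurrent transaction
            # already committed a version of the same item
            if any(t > tx["start"] for t, _ in self.versions[item]):
                tx["status"] = "aborted"
                return False
        for item, value in tx["writes"].items():
            self.versions[item].append((self.clock, value))
        tx["status"] = "committed"
        return True

db = SIDatabase({"X": 50})
t1, t2 = db.begin(), db.begin()
assert db.read(t1, "X") == 50 and db.read(t2, "X") == 50
db.write(t2, "X", 70)               # T2 adds its increment of 20
assert db.commit(t2)                # C2 succeeds
db.write(t1, "X", 60)               # T1 adds 10 to its stale snapshot
assert not db.commit(t1)            # A1: First Committer Wins aborts T1
t3 = db.begin()                     # T1 retries as T3
db.write(t3, "X", db.read(t3, "X") + 10)
assert db.commit(t3)                # the retry lands X at 80
```

As in the text, the aborted increment is not lost silently: T1's retry reads the current value 70 and commits 80.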
Despite its attractions, Snapshot Isolation does not ensure that all executed
histories are serializable, as defined in classical transactional theory (e.g., in
Bernstein et al. [1987], Papadimitriou [1986], and Gray and Reuter [1993]).
Indeed, it is quite possible for a set of transactions, each of which in isolation
3 We refer specifically to rows having versions when discussing the Oracle implementation; however
we note that the concepts still apply to data items at field granularity: specific column values on
specific rows.
Example 1.2 (SI Write Skew). Suppose X and Y are two data items rep-
resenting checking account balances of a married couple at a bank, with the
constraint that X + Y > 0 (the bank permits either account to be overdrawn, as
long as the sum of the account balances remains positive). Assume that initially
X0 = 70 and Y0 = 80. Under Snapshot Isolation, transaction T1 reads X0 and
Y0, then subtracts 100 from X, assuming it is safe because the two data items
added up to 150. Transaction T2 concurrently reads X0 and Y0, then subtracts
100 from Y, assuming it is safe for the same reason. Each update is acting
consistently, but Snapshot Isolation will result in the following history:
H2: R1(X0,70) R2(X0,70) R1(Y0,80) R2(Y0,80) W1(X1,-30) C1 W2(Y2,-20) C2.
Here the final committed state (X1 and Y2) violates the constraint that X + Y >
0. This problem was not detected by First Committer Wins because two different
data items were updated, each under the assumption that the other remained
stable. Hence the name “Write Skew”.
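Replaying Example 1.2 in a toy SI model (a hypothetical sketch, not a vendor implementation) makes clear why First Committer Wins never fires here: the two transactions have disjoint write sets.

```python
# Write skew in a minimal Snapshot Isolation sketch: each commit succeeds
# because First Committer Wins only examines the items a transaction wrote.

versions = {"X": [(0, 70)], "Y": [(0, 80)]}   # item -> (commit_time, value)
clock = [0]

def begin():
    clock[0] += 1
    return {"start": clock[0], "writes": {}}

def read(tx, item):
    if item in tx["writes"]:
        return tx["writes"][item]
    return max((cv for cv in versions[item] if cv[0] <= tx["start"]),
               key=lambda cv: cv[0])[1]

def commit(tx):
    clock[0] += 1
    if any(t > tx["start"]
           for item in tx["writes"] for t, _ in versions[item]):
        return False                           # First Committer Wins
    for item, v in tx["writes"].items():
        versions[item].append((clock[0], v))
    return True

t1, t2 = begin(), begin()
# both snapshots show X + Y = 150, so each believes subtracting 100 is safe
assert read(t1, "X") + read(t1, "Y") == 150
assert read(t2, "X") + read(t2, "Y") == 150
t1["writes"]["X"] = read(t1, "X") - 100        # X1 = -30
t2["writes"]["Y"] = read(t2, "Y") - 100        # Y2 = -20
assert commit(t1) and commit(t2)               # disjoint write sets: no abort
final = begin()
assert read(final, "X") + read(final, "Y") == -50   # X + Y > 0 is broken
```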
queries, each retrieving the records satisfying some condition; with data item
locking, it is possible for one query to miss a row satisfying this condition which
is about to be inserted by a concurrent transaction, while the later query sees
this row after the inserting transaction commits. This is a nonrepeatable read,
and such a history is obviously not serializable. The classic paper [Eswaran
et al. 1976] identified this issue, although the solution proposed there has been
found to be unrealistic in practice, and instead most locking-based systems
use algorithms which lock index records (or associated information such as the
index key) as well as data records (see Section 7.8 of Gray and Reuter [1993]).
To allow us to reason properly about programs with set-oriented operations,
in this article, we use a formalism in which there are three types of operation
a transaction can perform: item read, item write, and “predicate read.” This
section explains our model and how it can represent SQL statements.
We define a predicate read as an operation which identifies a set of items in
the database, based on the state of the database. Formally, a predicate read is
represented in a history by the operation PRi (P ), or, when we need to also show
the return value, by PRi (P , list of data items), where P is some function which
takes a state (a mapping from item names to values) and returns a set of item
names. It is expected that the transaction will follow this by later performing
item read or item write operations on (some of) the items which were returned
in the predicate read.
Let us see how this applies to a SQL SELECT statement
SELECT T.col1 FROM T WHERE T.col2 = :x;
in a database where each field is regarded as an item. We represent the SELECT
statement with, first, a predicate read, which reflects the evaluation of the
WHERE clause. This predicate read determines which fields occur in those
rows of T that have the given value for their col2 field. The return value of the
predicate read will be a list of field names (i.e., a sequence of (rowid, columnid)
pairs). The actual retrieval of the target list (T.col1) takes place in successive
item reads, each of which reads the col1 value in one of the rows found by the
predicate read. Note that with Snapshot Isolation, the state of the database
against which Ti ’s predicate read is evaluated consists of the version of each
item which was most recently committed before the start of Ti , or the most
recent version produced by Ti if Ti has itself produced a version of that item.
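The predicate-read model can be sketched directly; the helper name `predicate_read` and the sample table contents are invented for illustration, with one (rowid, columnid) item per field as in the text.

```python
# Sketch of the paper's predicate-read model: a predicate read maps a
# database state (item names -> values) to a set of item names; the
# subsequent item reads then fetch the values.

def predicate_read(state, predicate):
    """Return the col1 item names of rows whose col2 field satisfies the
    predicate, modeling PRi(P, list of data items)."""
    rows = {rowid for (rowid, col) in state}
    return sorted((rowid, "col1") for rowid in rows
                  if predicate(state[(rowid, "col2")]))

# snapshot state of table T as of start(Ti): one (rowid, columnid) item per field
snapshot = {
    (1, "col1"): "a", (1, "col2"): 10,
    (2, "col1"): "b", (2, "col2"): 20,
    (3, "col1"): "c", (3, "col2"): 10,
}

# SELECT T.col1 FROM T WHERE T.col2 = :x with x = 10: first the predicate
# read over the snapshot, then an item read per returned field.
items = predicate_read(snapshot, lambda v: v == 10)
assert items == [(1, "col1"), (3, "col1")]
values = [snapshot[i] for i in items]          # the item reads Ri
assert values == ["a", "c"]
```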
It is important to avoid a common misconception with set-oriented opera-
tions: in our model there is no such thing as a Predicate Write operation to
conflict with a Predicate Read. A SQL operation
UPDATE T SET T.col1 = :y WHERE T.col2 = :x;
will be modeled as a Predicate Read which returns the relevant fields, followed
by a succession of item write operations which modify each returned col1 field
to the new value y. The concept of a Predicate Write has not been used since the
prototype System R demonstrated that estimating intersections of set-oriented
update data item collections with set-oriented read data item collections led to
unnecessary conflicts (as explained in Chamberlin et al. [1981], near the end
of Section 3). Instead, in our model, a predicate read can conflict only with an
item write which changes the state of the database in such a way as to change
the list of items returned by the predicate read. For example, the WHERE clause
conflicts with any write that produces a new version of some col2 field, if the
new version has value x and the old version did not, or if the new version has
a different value while the previous version was equal to x.
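The membership test just described can be written down directly; the helper name is ours, and the example predicate is the WHERE clause col2 = x.

```python
# The only predicate conflict in the model: an item write conflicts with a
# predicate read iff it changes the list of items the predicate returns,
# i.e. iff the written col2 field moves into or out of the predicate.

def pr_conflicts_with_write(x, old_value, new_value):
    """True iff overwriting old_value with new_value in some col2 field
    changes that row's membership in the predicate 'col2 = x'."""
    return (old_value == x) != (new_value == x)

# WHERE col2 = 10:
assert pr_conflicts_with_write(10, 20, 10)       # row enters the result
assert pr_conflicts_with_write(10, 10, 30)       # row leaves the result
assert not pr_conflicts_with_write(10, 20, 30)   # membership unchanged
assert not pr_conflicts_with_write(10, 10, 10)   # identity write: no change
```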
In Snapshot Isolation, the traditional phantom examples of Eswaran et al.
[1976] mentioned above (the nonrepeatable read anomaly) cannot arise, be-
cause repeated evaluation of a WHERE clause within an SI transaction
always returns the same set of items (as the evaluation is always based on
the state at the start of the transaction). However, a somewhat more com-
plex anomaly can occur involving predicate reads, as shown in the following
example.
Fig. 1. DSG(H1 ).
—We say there is a Tm →rw Tn dependency (a rw dependency or an anti-
dependency) if Tm →i-rw Tn or Tm →pr-rw Tn.
—We say there is a Tm → Tn dependency (an arbitrary dependency) if any of
the dependencies above hold: Tm →wr Tn, Tm →ww Tn, or Tm →rw Tn.
We now define DSG(H ), the Dependency Serialization Graph for a history H, a
counterpart to the Serialization Graph definition of Bernstein et al. [1987], but
where the edges of the DSG are labeled to indicate which dependencies occur.
Definition 2.2 (DSG(H)). A directed graph DSG(H) is defined on a multi-
version history H, with vertices (often called nodes) representing transactions
that commit, and each distinctly labeled edge from Tm to Tn corresponding to a
Tm →wr Tn, Tm →ww Tn, or Tm →rw Tn dependency.
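A simplified sketch of DSG(H) construction follows, covering item operations only (predicate reads omitted) and assuming the text's convention that a version is numbered by the transaction that wrote it, with 0 for the initial version.

```python
# Build the labeled edge set of DSG(H) for committed transactions,
# item operations only (predicate dependencies would be added separately).

def dsg(reads, version_order):
    """reads: list of (txn, item, version read) for committed transactions.
    version_order: item -> its committed versions in install order,
    starting with the initial version 0. Returns (src, dst, label) edges."""
    edges = set()
    for item, order in version_order.items():
        for a, b in zip(order, order[1:]):     # successive committed versions
            if a != 0:
                edges.add((a, b, "ww"))
    for txn, item, version in reads:
        if version not in (0, txn):
            edges.add((version, txn, "wr"))    # txn read version's output
        order = version_order[item]
        i = order.index(version)
        if i + 1 < len(order) and order[i + 1] != txn:
            # txn read a version that a later committed write replaced
            edges.add((txn, order[i + 1], "rw"))
    return edges

# The write-skew history of Example 1.2: T1 reads X0 and Y0 then writes X1;
# T2 reads X0 and Y0 then writes Y2; both commit.
edges = dsg(
    reads=[(1, "X", 0), (1, "Y", 0), (2, "X", 0), (2, "Y", 0)],
    version_order={"X": [0, 1], "Y": [0, 2]},
)
# DSG(H) is exactly a cycle of two rw anti-dependencies: T1 -> T2 -> T1
assert edges == {(1, 2, "rw"), (2, 1, "rw")}
```

This reproduces Figure 1's shape for the write-skew history: the cycle consists solely of anti-dependencies, the signature pattern of SI anomalies.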
Before we proceed with the proof, we make a few remarks. By Lemma 2.3,
both concurrent edges whose existence is asserted must be anti-dependencies:
Ti.1 →rw Ti.2 and Ti.2 →rw Ti.3. Example 2.2 in the next subsection shows that
Ti.1 and Ti.3 can be the same transaction; Example 2.3 shows that Ti.1 might be
a read-only transaction as long as Ti.2 and Ti.3 are update transactions.
the reads of T1 cannot occur after one write by T2 and before another, because
both writes take place at an instant of time.
For a picture of how to devise cycles in a scheduler-oriented SI history, a
much more difficult undertaking, we start by defining what we call an SI-RW
diagram for a scheduler-based history; in an SI-RW diagram, we represent all
reads of transaction Ti to occur at an instant of time at a vertex Ri (a pseudonym
for start(Ti )), and all writes at an instant of time at a vertex Wi (a pseudonym
for Ci ); vertex Ri appears to the left of vertex Wi (time increases from left to
right), with the two connected by a horizontal dotted line segment. See Figure 2
for an example. The SI-RW diagram is based on a structure known as an SC-
Graph defined in Transaction Chopping [Shasha et al. 1995; Shasha and Bonnet
2002], with two types of edges: Sibling edges and Conflict edges; in our SI-RW
diagrams, the only sibling edges are the dotted line segment connecting the Ri
and Wi vertices for any transaction Ti that has both Reads and Writes. The
time at which an SI transaction occurs can be arbitrarily defined at any point
between Ri and Wi , but for purposes of this analysis we will say the transaction
Ti occurs at time Wi (i.e., at commit time Ci ).
Now when we consider the three types of dependency, Tm →wr Tn, Tm →ww Tn,
and Tm →rw Tn, we note that the wr and ww dependencies always point in the
direction such that Tn occurs after Tm: if Tm →ww Tn, Wm must be executed
before Wn, while if Tm →wr Tn, Wm must be executed before Rn, and therefore
before Wn. But any cycle among transactions must somehow "go back in time"
at some point, which cannot occur in a history with only wr and ww depen-
dencies. A rw dependency Tm →rw Tn allows Wn to occur before Wm when the
two transactions are concurrent. See Figure 2, where Ti.1 is Tm and Ti.2 is Tn;
clearly Ti.1 →rw Ti.2, but Ti.2 occurs before Ti.1. Of course Figure 2 does not il-
lustrate a cycle, since Ti.1 and Ti.2 are concurrent. However, a sequence of two
successive Tm →rw Tn dependencies can complete such a cycle, as we see in
Figure 3.
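The condition that an SI cycle must contain two successive rw anti-dependencies between concurrent transactions can be checked mechanically; the representation of cycles and of concurrency below is our own simplification.

```python
# Check the Theorem 2.1 condition on a DSG(H) cycle: some transaction has
# an incoming and an outgoing rw anti-dependency, each between a pair of
# concurrent transactions.

def has_consecutive_concurrent_rw(cycle_edges, concurrent):
    """cycle_edges: list of (src, dst, label) tuples forming a cycle, in order.
    concurrent: set of frozensets naming pairs with overlapping lifetimes."""
    n = len(cycle_edges)
    for i in range(n):
        a, b, l1 = cycle_edges[i]
        c, d, l2 = cycle_edges[(i + 1) % n]        # wrap around the cycle
        if (l1 == l2 == "rw" and b == c
                and frozenset((a, b)) in concurrent
                and frozenset((c, d)) in concurrent):
            return True
    return False

# The write-skew cycle T1 ->rw T2 ->rw T1 with T1, T2 concurrent qualifies:
skew = [(1, 2, "rw"), (2, 1, "rw")]
assert has_consecutive_concurrent_rw(skew, {frozenset((1, 2))})

# A cycle of only wr and ww edges would have to "go back in time" and so
# cannot occur under SI; the check finds no qualifying pair of edges in it.
assert not has_consecutive_concurrent_rw(
    [(1, 2, "wr"), (2, 3, "ww"), (3, 1, "wr")], set())
```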
The cycle we see in Figure 3 is along the double-line rw dependency from
Ri.1 to Wi.2 , back in time along the sibling edge from Wi.2 to Ri.2 , then along the
double-line rw dependency from Ri.2 to Wi.3 , and finally along the wr dependency
from Wi.3 to Ri.1 . This exemplifies the cycle that was described in the proof of
Theorem 2.1. All dependency edges are directed, and any cycle must follow
the direction of the directed edges. The Sibling edges on the other hand are
undirected, and a cycle representing an anomaly can traverse a sibling edge
in either direction. We note that the sibling edge separation of the Ri and Wi
vertices provides an intuitive feel for how a cycle “goes back in time”, but doesn’t
add any power to the DSG(H) diagrams of Figure 1. After providing illustrative
v0, e1, v1, e2, . . . , en, vn,
beginning and ending with a vertex, such that each edge is immediately
preceded and followed by the two vertices that lie on it; each directed edge
must be incident from the vertex that precedes it and incident to the vertex
that follows it. We refer to the number of edges in the sequence, n, as the length
of the path. A simple path is a path whose edges are all distinct. A circuit
is a path where v0 = vn , and where the edges are all distinct. An edge with
identical initial and terminal vertices is called a loop. Note that a loop forms
a circuit of length 1. An elementary circuit is a circuit whose vertices are dis-
tinct except for the beginning vertex and the ending vertex. (Other mathe-
matics references give essentially the same definitions with slightly different
names.)
the W →wr W edge (which might arise, for example, if one execution of W com-
mitted before a second execution started), and the W →ww W edge (which might
arise if two successive executions of W performed a withdrawal from the same
account).
The inverse of lifting (which might be called a sinking) is not always well
defined. For example, there is no way we can sink the cycle in SDG(A) of the rw
and wr edges of Figure 8 to a cycle in DSG(H). This is clear because Tn →wr Tm
implies that start(Tm) comes after commit(Tn), while the edge Tm →rw Tn implies
that start(Tm) comes before commit(Tn) (concurrently or not). More generally,
Theorem 2.1 states that no cycle in DSG(H) under SI can fail to have two succes-
sive concurrent rw anti-dependencies, but the transactional cycle we attempted
to sink from Figure 8 does not have two vulnerable anti-dependencies. It turns
out that even cycles in an SDG(A) with two successive vulnerable static anti-
dependencies will sometimes not successfully sink to a cycle in DSG(H), and
the problem becomes complex, so from now on we shall concentrate on what we
can learn from studying lifting from DSG(H) to SDG(A).
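A dangerous structure in SDG(A), two consecutive vulnerable edges lying on a cycle, can be detected with a simple reachability check; the graph encoding below is our own simplification of the definition.

```python
# Detect a dangerous structure: vulnerable anti-dependency edges
# P1 -> P2 -> P3 such that the cycle can be closed from P3 back to P1
# through arbitrary static edges (P1 and P3 may be the same program).

def dangerous(edges, vulnerable):
    """edges: set of (src, dst) static dependency edges.
    vulnerable: the subset of edges that are vulnerable anti-dependencies."""
    def reachable(src, dst):
        seen, stack = set(), [src]
        while stack:
            v = stack.pop()
            if v == dst:
                return True
            if v in seen:
                continue
            seen.add(v)
            stack.extend(b for (a, b) in edges if a == v)
        return False
    for (a, b) in vulnerable:
        for (c, d) in vulnerable:
            if b == c and reachable(d, a):     # consecutive edges on a cycle
                return True
    return False

# The write-skew pattern: two programs with mutual vulnerable
# anti-dependencies form a dangerous structure.
assert dangerous({(1, 2), (2, 1)}, {(1, 2), (2, 1)})
# A cycle whose vulnerable edges are not consecutive is not dangerous.
assert not dangerous({(1, 2), (2, 3), (3, 1)}, {(1, 2)})
```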
4. ANALYZING AN APPLICATION
In this section, we show how to determine the static dependencies that de-
fine the graph SDG(A) for the TPC-C application. At the end of this section,
we consider the issues that might arise with other applications. We need the
Fig. 9. Abbreviations and names of tables and transactions in the TPC-C benchmark.
the tables and transactions, both the actual names and abbreviations (listed as
Tbl Abbrev and Tx Abbrev) that we use in our analysis.
Note that we list table abbreviations with concluding dots (.), something that
is not done in the TPC-C benchmark specification [TPC-C]. In what follows, we
also list tablename/columnname pairs in the form T.CN; for example, to specify
what Warehouse a Customer row is affiliated with (see definition below), we
would use C.WID, whereas in [TPC-C] it is represented as C_W_ID. We find
that the complexity of names in the TPC-C specification is distracting; the
abbreviations we use should be easily understood by readers wishing to tie our
discussion to the original.
Figure 10 provides a schema for how the tables of Figure 9 relate to one
another. If we assume that a database on which the TPC-C benchmark is run
has W Warehouse rows, then some of the other tables have cardinality that is
a multiple of W. There are 10 rows in the District table associated with each
Warehouse row, and the primary key for District is (D.DID, D.WID), specifying
each district row's parent Warehouse; there are 3000 Customer rows for each
District, and the primary key for Customer is (C.CID, C.DID, C.WID), specifying
each Customer row's parent District and grandparent Warehouse.
We now describe tables and programs, while discussing the program logic.
The reader may wish to refer back to Figure 9 to follow this short (and incom-
plete) discussion. Full Transaction Profiles for programs are given in [TPC-C].
Starting with the District table in Figure 11, we see that the only con-
flicts involve D.YTD and D.NEXT, since the other columns of D are only
read, not written. We note a potential data item conflict between SLEV and
NEWO on D.NEXT. This leads to an SLEV ⇒ NEWO static dependency and
a NEWO →wr SLEV dependency in our SDG(TPC-C) graph; see Figure 13. In
addition, we have ww conflicts of NEWO with itself and PAY with itself, indi-
cated by back arcs on Figure 13.
In the New-Order table of Figure 11, we see four programs listed as accessing
NO.PK, that is, NO.WID, NO.DID, and NO.OID. We need to consider them by
pairs, where each pair has a writing program member.
(1) DLVY1 and NEWO. DLVY1 has a predicate read for the smallest OID for
given WID and DID, and finds none. There is a DLVY1 ⇒ NEWO conflict
arising from a predicate conflict, because NEWO could insert a new row,
concurrently, that would change the result of DLVY1's predicate read. There
is no NEWO →wr DLVY1 conflict, however, because if DLVY successfully
reads a row output by NEWO, it is counted as DLVY2.
(2) DLVY2 and NEWO. DLVY2 has the same predicate read, but finds such
a row. It can read what NEWO inserts, so we have NEWO →wr DLVY2.
However, NEWO cannot make a difference to the predicate read of DLVY2 ,
because additional rows beyond the one retrieved will have higher OIDs, so
we have no rw conflict here.
(3) NEWO and NEWO. There is only a ww conflict.
(4) DLVY2 and DLVY2 . There are ww, rw and wr conflicts here, but none
vulnerable. Concurrent transactions of this program would access the same
smallest OID, and have a ww conflict.
Note that although table OL shows reads by SLEV and a write by DLVY2 , it
generates no field-level conflicts between SLEV and DLVY2 , since these accesses
are to different columns.
We leave the rest of the details of finding conflicts in the annotated table of
Figure 11 as an exercise for the reader. The result, after regrouping by program
name is given in Figure 12. Cases where all columns in a table are involved in
conflicts are indicated by wild-card notation. For example, the NEWO trans-
action is shown as writing O.*, that is, it writes all columns in the O. table.
Figure 12 includes some data-item conflicts involving the STOCK table that
were not shown in Figure 11. Note that although all programs use predicate
reads to access tables, some are not shown because they cannot provide conflicts:
they are accessing tables that never get inserted or deleted in this transaction
mix, and the columns involved in the predicate are never updated.
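The column-level derivation used in this section can be mechanized; the program read/write sets below are an abbreviated, hypothetical fragment of the analysis, not the complete Figure 12 data, and predicate conflicts would be added separately.

```python
# Derive static ww, wr, and rw edges from per-program read and write sets
# of table.column names, as in the D.NEXT conflict between SLEV and NEWO.

def static_edges(programs):
    """programs: name -> {'r': set of columns read, 'w': set written}."""
    edges = set()
    for p1, a in programs.items():
        for p2, b in programs.items():
            if a["w"] & b["w"] and p1 <= p2:   # shared written column
                edges.add((p1, p2, "ww"))
            if a["w"] & b["r"]:                # p2 reads a column p1 writes:
                edges.add((p1, p2, "wr"))      #   p2 may read p1's output,
                edges.add((p2, p1, "rw"))      #   or read the prior version
    return edges

# abbreviated fragment: both programs touch D.NEXT, only NEWO writes it
programs = {
    "NEWO": {"r": {"D.NEXT"}, "w": {"D.NEXT"}},
    "SLEV": {"r": {"D.NEXT"}, "w": set()},
}
edges = static_edges(programs)
assert ("NEWO", "NEWO", "ww") in edges      # NEWO's ww conflict with itself
assert ("NEWO", "SLEV", "wr") in edges
assert ("SLEV", "NEWO", "rw") in edges      # the SLEV => NEWO dependency
```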
programs from the application and running under Snapshot Isolation, will be
serializable. This is the case when the static dependency graph has no dan-
gerous structures. However, when the DBA produces the graph, he/she may
find in it some dangerous structures. In some situations, as happened with
TPC-C, a more precise analysis using application splitting could resolve mat-
ters and allow us to be sure that all executions are serializable. However, some
applications may allow nonserializable executions when SI is the concurrency
control mechanism. This means that some integrity constraints, which are not
declared explicitly in the schema and so are not enforced by the DBMS, could
be violated during execution. We think that the DBA would be very unwise
to carry the risk of this happening. In such cases, the solution we suggest is
to modify the application programs so as to avoid dangerous structures with-
out changing the functionality of the programs. This will normally require a
rather small set of changes. The DBA should identify every place in the static
dependency graph where a dangerous structure exists. Every such structure is
defined by one or two distinct vulnerable edges. The DBA can then choose
one of these vulnerable edges in each dangerous structure, and modify one or
both application programs corresponding to the vertices of the edge so that the
edge ceases to be vulnerable.4
4 The Oracle White Paper [Jacobs et al. 1995] provided a number of program modification sugges-
tions like those that follow. What is new in our development is an effective procedure to guarantee that all
serializability problems are addressed.
the same day can result in a non-serializable execution that breaks a constraint
requiring the total number of hours worked by any employee on any day to be
no more than eight. If we were to use a single value X in a Conflict table
to materialize any conflict of this sort, it would become impossible to assign
two tasks to two different employees concurrently, a problematic situation for
a large company. But a minor variation of the Conflict table approach would
factor the problem along natural parameters of the program. Simply create a
table TotalHours(eid, day, total), with eid and day a primary key. The program
to add new tasks for employees should then keep the row for each eid-day up to
date with the total number of hours assigned: whenever the WorkAssignments
table has a row added, the program making the assignment will also update
the TotalHours row, guaranteeing that no vulnerable conflict can arise. It is
even acceptable for the program that inserts new tasks to also insert the Total-
Hours row when none is found; the system-enforced primary key constraint will
then guarantee that there are no cases of write skew on the TotalHours table
itself!
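The TotalHours materialization can be sketched in SQL; sqlite3 is used here only to execute the statements, and the assignment program is a single-user simplification (the concurrency control that makes the conflict materialize is not itself modeled).

```python
# Sketch of the TotalHours materialization: the assignment program keeps
# a per-(eid, day) total up to date, so any two concurrent assignments for
# the same employee and day must write the same TotalHours row.

import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE WorkAssignments (eid INT, day TEXT, task TEXT, hours INT);
    CREATE TABLE TotalHours (eid INT, day TEXT, total INT,
                             PRIMARY KEY (eid, day));
""")

def assign(eid, day, task, hours):
    row = db.execute("SELECT total FROM TotalHours WHERE eid=? AND day=?",
                     (eid, day)).fetchone()
    total = (row[0] if row else 0) + hours
    if total > 8:
        raise ValueError("more than eight hours in one day")
    db.execute("INSERT INTO WorkAssignments VALUES (?,?,?,?)",
               (eid, day, task, hours))
    if row:
        # updating TotalHours gives concurrent assigners a ww conflict
        db.execute("UPDATE TotalHours SET total=? WHERE eid=? AND day=?",
                   (total, eid, day))
    else:
        # first assignment of the day: the primary key constraint stops
        # two concurrent inserters from both succeeding
        db.execute("INSERT INTO TotalHours VALUES (?,?,?)", (eid, day, total))

assign(1, "mon", "audit", 5)
assign(1, "mon", "filing", 3)
try:
    assign(1, "mon", "extra", 1)      # would exceed eight hours
    assert False, "constraint should have fired"
except ValueError:
    pass
assert db.execute("SELECT total FROM TotalHours").fetchone()[0] == 8
```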
5.2 Promotion
To remove vulnerability from a data-item anti-dependency, we can use a simpler
approach. The data item on which the conflict arises already exists in both
programs, so all we need to do is guarantee that when we read a data item we
are going to depend on, another transaction cannot simultaneously modify the
same data item. We have a way of ensuring this that we call Promotion, which
has the advantage that only one of the programs needs to be modified.
A data-item anti-dependency implies that, in every case where there is a
conflict, one can identify a data item that is both read by the transaction
generated by P1 and written by the transaction generated by P2.
Therefore, we can alter P1 so that in addition to reading this item, it
performs an identity write. That is, if the data item is column c of table X , P1
updates X to set X.c = X.c under the same conditions (WHERE clause) as used
in the read. We describe this as a Promotion of the read of X to a write. Again,
the First Committer Wins rule will prevent both transactions committing in
any concurrent execution of P1 and P2 . In fact, with the Oracle implementation
of Snapshot Isolation, it is enough to modify P1 so that the read of X is done
in a SELECT FOR UPDATE statement; Oracle treats SELECT FOR UPDATE
just like a write in all aspects of concurrency control, but has smaller overhead
because no logging has to take place to recover an update.
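Promotion applied to the write-skew example of Section 1 can be replayed in the same style of toy SI model (a hypothetical sketch, not a vendor implementation): the identity write on Y makes the two write sets overlap, so First Committer Wins aborts one transaction.

```python
# Promotion in a minimal Snapshot Isolation sketch: T1's read of Y is
# promoted to an identity write, curing the write skew of Example 1.2.

versions = {"X": [(0, 70)], "Y": [(0, 80)]}
clock = [0]

def begin():
    clock[0] += 1
    return {"start": clock[0], "writes": {}}

def read(tx, item):
    if item in tx["writes"]:
        return tx["writes"][item]
    return max((cv for cv in versions[item] if cv[0] <= tx["start"]),
               key=lambda cv: cv[0])[1]

def commit(tx):
    clock[0] += 1
    if any(t > tx["start"]
           for item in tx["writes"] for t, _ in versions[item]):
        return False                           # First Committer Wins
    for item, v in tx["writes"].items():
        versions[item].append((clock[0], v))
    return True

t1, t2 = begin(), begin()
t1["writes"]["Y"] = read(t1, "Y")          # promotion: identity write on Y
t1["writes"]["X"] = read(t1, "X") - 100
t2["writes"]["Y"] = read(t2, "Y") - 100
assert commit(t1)
assert not commit(t2)                      # write sets now overlap: T2 aborts
final = begin()
assert read(final, "X") + read(final, "Y") > 0   # constraint preserved
```

T2 can then retry against the new state, where its snapshot shows X + Y = 50 and the application logic would refuse the withdrawal.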
6. CONCLUSIONS
Snapshot Isolation is now an important concurrency mechanism offering good
performance in many commonly needed circumstances, but it can produce er-
rors in common situations. This may corrupt the database, leading to financial
loss or even injury.
This work provides a mechanism to avoid serializability violations through
analysis of the transaction programs, without requiring any modification of
the DBMS engine. In some cases, our results show that an application will
run serializably under Snapshot Isolation as it stands. When this is not the
case, we have shown how to identify specific pairs of programs where small
modifications of the text will lead to a semantically equivalent application
that executes serializably. Given the ability to detect static dependencies be-
tween the application programs, and an understanding of the static dependency
graph construction, programmers will have a practical way to ensure that their
applications execute correctly even when the underlying database management
system doesn't. A utility providing the user an intuitive interface to perform this kind
of analysis is the next step in our research.
REFERENCES
ADYA, A., LISKOV, B., AND O’NEIL, P. 2000. Generalized isolation level definitions. In Proceedings
of IEEE International Conference on Data Engineering, (Feb.) IEEE Computer Society Press, Los
Alamitos, Calif., 67–78.
ANDERSON, T., BREITBART, Y., KORTH, H., AND WOOL, A. 1998. Replication, consistency and practical-
ity: are these mutually exclusive? In Proceedings of the ACM SIGMOD International Conference
on Management of Data (June). ACM, New York, 484–495.
BERENSON, H., BERNSTEIN, P., GRAY, J., MELTON, J., O’NEIL, E., AND O’NEIL, P. 1995. A critique of
ANSI SQL isolation levels. In Proceedings of the ACM SIGMOD International Conference on
Management of Data (June) ACM, New York, 1–10.
BERGE, C. 1976. Graphs and Hypergraphs (2nd edition). North-Holland Mathematical Library,
Volume 6.
BERNSTEIN, P., HADZILACOS, V., AND GOODMAN, N. 1987. Concurrency Control and Recovery in
Database Systems. Addison-Wesley. (This text is now out of print but can be downloaded from
http://research.microsoft.com/pubs/ccontrol/default.htm)
BERNSTEIN, A., LEWIS, P., AND LU, S. 2000. Semantic conditions for correctness at different isola-
tion levels. In Proceedings of IEEE International Conference on Data Engineering (Feb.). IEEE
Computer Society Press, Los Alamitos, Calif., 57–66.
BREITBART, Y., KOMONDOOR, R., RASTOGI, R., SESHADRI, S., AND SILBERSCHATZ, A. 1999. Update prop-
agation protocols for replicated databases. In Proceedings of the ACM SIGMOD International
Conference on Management of Data (June). ACM, New York, 97–108.
CHAMBERLIN, D., ASTRAHAN, M., BLASGEN, M., GRAY, J., KING, W., LINDSAY, B., LORIE, R., MEHL, J., PRICE,
T., PUTZOLU, F., SELINGER, P., SCHKOLNICK, M., SLUTZ, D., TRAIGER, I., WADE, B., AND YOST, R. 1981.
A history and evaluation of System R. Commun. ACM 24, 10 (Oct.), 632–646. (Also in: M. Stone-
braker and J. Hellerstein, Readings in Database Systems, Third Edition, Morgan Kaufmann
1998.)
ELNIKETY, S., PEDONE, F., AND ZWAENEPOEL, W. 2004. Generalized snapshot isolation and a prefix-
consistent implementation. Tech. Rep. IC/2004/21, EPFL, Mar.
ESWARAN, K., GRAY, J., LORIE, R., AND TRAIGER, I. 1976. The notions of consistency and predicate
locks in a database system. Commun. ACM 19, 11 (Nov.), 624–633.
FEKETE, A. 1999. Serializability and snapshot isolation. In Proceedings of the Australian
Database Conference (Auckland, New Zealand, Jan.). 201–210.
FEKETE, A., O’NEIL, E., AND O’NEIL, P. 2004. A read-only transaction anomaly under snapshot
isolation. ACM SIGMOD Record 33, 3 (Sept.), 12–14.
GRAY, J. (ED.). 1993. The Benchmark Handbook (2nd edition). Morgan-Kaufmann, San Francisco,
Calif.
GRAY, J. AND REUTER, A. 1993. Transaction Processing: Concepts and Techniques. Morgan-
Kaufmann, San Francisco, Calif.
GRAY, J., HELLAND, P., O’NEIL, P., AND SHASHA, D. 1996. The dangers of replication and a solution.
In Proceedings of the ACM SIGMOD International Conference on Management of Data (June).
ACM, New York, 173–182.
JACOBS, K., BAMFORD, R., DOHERTY, G., HAAS, K., HOLT, M., PUTZOLU, F., AND QUIGLEY, B. 1995. Con-
currency Control: Transaction Isolation and Serializability in SQL92 and Oracle7. Oracle White
Paper, Part No. A33745 (July).
LIAROKAPIS, D. 2001. Testing Isolation Levels of Relational Database Management Systems, Ph.D.
dissertation, University of Massachusetts, Boston, Mass. (Dec.). (This thesis can be downloaded
from http://www.cs.umb.edu/∼dimitris/thesis/thesisDec20.pdf.)
PAPADIMITRIOU, C. 1986. The Theory of Database Concurrency Control. Computer Science Press.
SHASHA, D. AND BONNET, P. 2002. Database Tuning: Principles, Experiments, and Troubleshooting
Techniques. Morgan-Kaufmann, San Francisco, Calif.
SHASHA, D., LLIRBAT, F., SIMON, E., AND VALDURIEZ, P. 1995. Transaction chopping: Algorithms and
performance studies. ACM Trans. Datab. Syst. 20, 3 (Sept.), 325–363.
SCHENKEL, R. AND WEIKUM, G. 2000. Integrating snapshot isolation into transactional federa-
tions. In Proceedings of 5th IFCIS International Conference on Cooperative Information Systems
(CoopIS 2000) (Sept.), 90–101.
TPC-C BENCHMARK SPECIFICATION, available at http://www.tpc.org/tpcc/.