IsoPredict: Dynamic Predictive Analysis for Detecting Unserializable Behaviors in Weakly Isolated Data Store Applications
\minibox[frame]{This extended version of a PLDI 2024 paper adds an appendix with additional material.}

Chujun Geng (ORCID 0009-0000-6149-0208), Ohio State University, Columbus, USA; Spyros Blanas (ORCID 0009-0004-2703-7177), Ohio State University, Columbus, USA; Michael D. Bond (ORCID 0000-0002-8971-4944), Ohio State University, Columbus, USA; and Yang Wang (ORCID 0000-0002-9721-4923), Ohio State University, Columbus, USA

(2024; 2023-11-03; 2024-03-31)


Abstract.

Distributed data stores typically provide weak isolation levels, which are efficient but can lead to unserializable behaviors, which are hard for programmers to understand and often result in errors. This paper presents the first dynamic predictive analysis for data store applications under weak isolation levels, called IsoPredict. Given an observed serializable execution of a data store application, IsoPredict generates and solves SMT constraints to find an unserializable execution that is a feasible execution of the application. IsoPredict introduces novel techniques that handle divergent application behavior; solve mutually recursive sets of constraints; and balance coverage, precision, and performance. An evaluation on four transactional data store benchmarks shows that IsoPredict often predicts unserializable behaviors, 99% of which are feasible.

Keywords: weak isolation levels, dynamic predictive analysis, data stores, transactions

CCS Concepts: Software and its engineering $\rightarrow$ Software testing and debugging. Published in PACMPL, volume 8, number PLDI, article 161, June 2024. DOI: 10.1145/3656391.

1. Introduction

Distributed data stores are the foundation of today’s service infrastructure, due to their scalability, fault tolerance, and ease of use (Corbett et al., 2012; Elhemali et al., 2022; Snowflake, 2023; MySQL, 2023b). Many real-world data stores only support weak isolation levels, such as causal consistency (causal) (Ahamad et al., 1995), which is the strongest level that achieves availability under network partitions (Burckhardt, 2014; Gilbert and Lynch, 2002). Another weak isolation level is read committed (rc) (Berenson et al., 1995), which is commonly used by database applications to balance performance and correctness (Crooks et al., 2017; Pavlo, 2017; Cheng et al., 2023; Tang et al., 2022). Under weak isolation, an execution may be unserializable, producing an outcome that is impossible for any serial execution. Unserializable behaviors are poorly understood by most programmers, and often lead to errors and failures in real-world systems (Cheng et al., 2023; Tang et al., 2022; Warszawski and Bailis, 2017).

Prior work has introduced techniques to find unserializable behaviors in data store applications under weak isolation, but has scalability or accuracy limitations. Static analysis can find unserializable behaviors, but its precision scales poorly with program complexity, leading to many false positives (infeasible unserializable behaviors) (Brutschy et al., 2018; Nagar and Jagannathan, 2018; Rahmani et al., 2019). Dynamic analysis can avoid false positives by analyzing only the observed execution (Biswas et al., 2021; Brutschy et al., 2017), or it can extrapolate from an observed execution but report numerous false positives (Gan et al., 2020; Warszawski and Bailis, 2017). §8 discusses prior work in more detail.

Motivating example

Algorithm 1 shows code of a transactional data store application. The $\mathit{DataStore}$ provides a key–value interface. Our execution model requires every $\mathit{get}$ (read) or $\mathit{put}$ (write) operation to execute in a transaction, so an operation starts a new transaction if the current session (i.e., client) is not in a transaction. A $\mathit{commit}$ operation ends the session’s ongoing transaction.

Figure 1 shows two different executions of the application. In each execution, two sessions (i.e., clients) call deposit concurrently on the same empty account to deposit 50 and 60, respectively. Developers would expect that the ending balance will be 110, which is the only serializable outcome. However, under weak isolation levels causal and rc, the ending balance may be 110, 50, or 60.

Algorithm 1. A procedure in a data store application that deposits money in an account.

procedure deposit($\mathit{account}$, $\mathit{amount}$)
    $\mathit{balance}\leftarrow\mathit{DataStore}.\mathit{get}(\mathit{account})$ $\triangleright$ Read balance; implicitly starts transaction if not in one
    $\mathit{DataStore}.\mathit{put}(\mathit{account},\mathit{balance}+\mathit{amount})$ $\triangleright$ Update balance
    $\mathit{DataStore}.\mathit{commit}()$ $\triangleright$ Commits transaction

Figure 1. Different executions of two sessions (clients) concurrently calling deposit on the same account. (a) The execution in which $t_{1}$'s write is read by $t_{2}$ is causal, rc, and serializable. (b) The execution in which $t_{1}$ and $t_{2}$ both read from the initial state is causal and rc but not serializable.
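The possible ending balances can be reproduced with a small sketch. This is a toy model written for illustration only, not the paper's data store: the hypothetical helper `deposit` just returns the value that a deposit transaction would write, given the balance its read observed.

```python
# Toy model (an assumption for illustration): a deposit transaction is reduced
# to "read a balance, write balance + amount".
def deposit(read_value, amount):
    """Return the balance that deposit would write if its get() saw read_value."""
    return read_value + amount

INITIAL = 0  # the account starts empty

# Execution (a): t2 reads the balance written by t1.
t1_write = deposit(INITIAL, 50)
t2_write = deposit(t1_write, 60)
serial_outcome = t2_write

# Execution (b): both transactions read the initial state, so one update is
# lost; the final balance is whichever write ends up ordered last.
t1_write_b = deposit(INITIAL, 50)
t2_write_b = deposit(INITIAL, 60)
weak_outcomes = {t1_write_b, t2_write_b}

print(serial_outcome, sorted(weak_outcomes))
```

In this sketch the serializable execution yields 110, while the execution in which both reads see the initial state can end with 50 or 60.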

Contributions

This paper introduces IsoPredict, the first predictive analysis for transactional data store applications, and shows that the approach is effective at finding unserializable behaviors. Given a serializable execution such as Figure 1(a) as input, IsoPredict finds an unserializable execution such as Figure 1(b). IsoPredict uses dynamic predictive analysis, which analyzes an observed execution of a program and detects alternative feasible, unserializable executions of the program.

Predictive analysis is powerful because, in essence, it explores many executions at once. To predict an unserializable execution from an observed serializable execution, IsoPredict generates SMT constraints that encode execution feasibility, unserializability, and weak isolation level (causal or rc), and uses an off-the-shelf SMT solver to solve them. We introduce analysis variants that trade coverage for performance, and precision for coverage. To account for the possibility of predicting infeasible executions, IsoPredict can optionally validate a predicted unserializable execution. An evaluation on transactional data store benchmarks shows that IsoPredict is effective at predicting unserializable executions from observed executions under causal and rc. More than 99% of predictions are validated as feasible executions.

While prior work introduces predictive analysis for shared-memory programs (Said et al., 2011; Kini et al., 2017; Roemer et al., 2020; Huang et al., 2014; Sinha et al., 2012; Tunç et al., 2023), to our knowledge IsoPredict is the first predictive analysis approach for transactional data store applications, which present unique challenges (§8). Compared to prior work MonkeyDB (Biswas et al., 2021), IsoPredict is comparably effective at finding unserializable executions of the evaluated programs (§7.3). However, IsoPredict and MonkeyDB use completely different approaches to find erroneous executions. While MonkeyDB uses random exploration to produce an erroneous execution, IsoPredict uses predictive analysis to evaluate an equivalence class of many executions at once. Furthermore, MonkeyDB requires applications to run on its specialized data store, while IsoPredict’s predictive analysis approach is in principle suitable for analyzing executions from any data store, although demonstrating so is outside the scope of this paper.

2. Background

This section introduces this paper’s formalisms for weakly isolated executions of transactional data store applications, which are closely based on the axiomatic framework of Biswas and Enea (Biswas and Enea, 2019). We use this framework because it supports a variety of isolation levels, is well suited to encoding as constraints, and has been employed by recent work (Biswas et al., 2021; Bouajjani et al., 2023).

2.1. Weakly Isolated Execution Histories

A transactional data store is modeled as a distributed store of key–value pairs. A data store application performs read (get) and write (put) operations on keys, all executed in transactions. Non-transactional applications can be handled by treating each read and write operation as a separate transaction. An execution consists of events in committed transactions (aborted transactions are not part of an execution). Each event is read( $k$ ), write( $k$ ), or commit, where $k$ is a key. Other operations, such as insertion into and deletion from a set, can be modeled in terms of reads and writes. Multiple clients may open connections, or sessions, to the data store. If a session is not in a transaction, its next event implicitly starts a new transaction, ensuring every event is in a transaction. The commit event ends the current transaction. Within a session, transactions are ordered by the strict partial order session order ( $\mathit{so}$ ):

\[
\mathit{so}(t_{1},t_{2})\coloneq\textnormal{$t_{1}$ precedes $t_{2}$ in the same session}
\]

An important property of an execution is which write each read reads from. The strict partial order $\mathit{wr}_{k}$ (write–read on key $k$ ) orders transactions if one reads from the other:

\[
\mathit{wr}_{k}(t_{1},t_{2})\coloneq\textnormal{$t_{2}$ reads the write of $t_{1}$ on $k$}
\]

If a read reads from a write in the same transaction, the read is not included as an event in the transaction (and thus this write–read ordering is not included in $\mathit{wr}_{k}$ ). If a transaction writes $k$ multiple times, only the last write is included as an event in the transaction. Thus a read( $k$ ) event always reads from a write( $k$ ) in another transaction, which is the transaction’s last write to $k$ . If a transaction $t$ reads $k$ from the data store’s initial state, then $\mathit{wr}_{k}(t_{0},t)$ , where $t_{0}$ is a special transaction representing the initial state. The union of $\mathit{wr}_{k}$ over all keys is $\mathit{wr}$ , i.e., $\mathit{wr}\coloneq\bigcup_{k\textnormal{ is a key}}\mathit{wr}_{k}$ . The transitive closure of $\mathit{so}$ and $\mathit{wr}$ is happens-before order, i.e., $\mathit{hb}\coloneq(\mathit{so}\cup\mathit{wr})^{+}$ . An execution history of a data store application is the set of all committed transactions ( $\mathit{T}$ ), session order ( $\mathit{so}$ ), and write–read order ( $\mathit{wr}$ ), i.e., $History\coloneq\langle T,so,wr\rangle$ . Every history includes the special transaction $t_{0}$ mentioned above that represents the initial state. $t_{0}$ implicitly writes the initial value to every key, and $t_{0}$ is $\mathit{so}$ -ordered before all other transactions.

Example

Figures 1(a) and 2(a) each show an execution history as a graph. Transactions are boxes containing read and write events implicitly concluded by a commit event. $t_{1}$ and $t_{2}$ execute in different sessions, and $t_{0}$ is the initial state transaction. The $\mathit{wr}_{k}$ edges indicate each read’s writer.
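As a minimal sketch of these definitions, a history can be represented with plain sets of transaction pairs, and $\mathit{hb}$ computed as a transitive closure. The representation and the names `transitive_closure`, `happens_before`, `WR_CHAIN`, and `WR_BOTH_INITIAL` are assumptions made for illustration; the two histories encode the deposit example, once with $t_{2}$ reading from $t_{1}$ and once with both transactions reading from $t_{0}$.

```python
from itertools import product

def transitive_closure(edges, nodes):
    """Warshall-style closure of a binary relation given as a set of pairs."""
    closure = set(edges)
    # product() varies its first component slowest, so `mid` is the outer loop.
    for mid, src, dst in product(nodes, repeat=3):
        if (src, mid) in closure and (mid, dst) in closure:
            closure.add((src, dst))
    return closure

def happens_before(txns, so, wr_by_key):
    """hb = (so ∪ wr)+, where wr is the union of wr_k over all keys."""
    wr = set().union(*wr_by_key.values())
    return transitive_closure(so | wr, txns)

TXNS = ["t0", "t1", "t2"]
SO = {("t0", "t1"), ("t0", "t2")}          # t0 is so-ordered before all others
WR_CHAIN = {"acct": {("t0", "t1"), ("t1", "t2")}}         # t2 reads from t1
WR_BOTH_INITIAL = {"acct": {("t0", "t1"), ("t0", "t2")}}  # both read from t0

hb_chain = happens_before(TXNS, SO, WR_CHAIN)
hb_both = happens_before(TXNS, SO, WR_BOTH_INITIAL)
print(sorted(hb_chain), sorted(hb_both))
```

In the second history $t_{1}$ and $t_{2}$ are unordered by $\mathit{hb}$, which is what leaves room for the weakly isolated behavior.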

2.2. Serializability

An execution history $\langle T,so,wr\rangle$ is serializable if and only if it could have been produced by a serial execution of the transactions in $T$ . (In a serial execution, transactions execute one at a time, and every read to $k$ reads from the most recent write to $k$ .) Equivalently, an execution is serializable if and only if there exists a commit order, $\mathit{co}$ , with the following constraints: (1) $\mathit{co}$ must be consistent with happens-before ( $\mathit{hb}$ ) order. (2) a transaction that writes to $k$ cannot be $\mathit{co}$ -ordered between two transactions ordered by $\mathit{wr}_{k}$ . The second constraint’s ordering is called arbitration order and represented by the strict partial order $\mathit{ww}$ , which is defined as follows:

\[
\mathit{ww}(t_{1},t_{2})\coloneq\exists k,\ \textnormal{$t_{1}$ and $t_{2}$ write to $k$}\land\exists t_{3}\in\mathit{T},\ \mathit{wr}_{k}(t_{2},t_{3})\land\mathit{co}(t_{1},t_{3})
\tag{1}
\]

Note the circular dependency between $\mathit{ww}$ and $\mathit{co}$ : Commit ordering may imply additional arbitration ordering, which in turn may imply additional commit ordering. This property leads to challenges in encoding SMT constraints that §4 explains and addresses. Thus a history is serializable if and only if there exists a $\mathit{co}$ that is consistent with $\mathit{hb}$ and $\mathit{ww}$ :

\[
\langle T,\mathit{so},\mathit{wr}\rangle\textnormal{ is }\textsc{serializable}\iff\exists\mathit{co},\ \mathit{hb}\cup\mathit{ww}\subseteq\mathit{co}
\]

Equivalently, the history is serializable if and only if there exists $\mathit{co}$ such that $(\mathit{hb}\cup\mathit{ww}\cup\mathit{co})^{+}$ is acyclic. An execution is unserializable if and only if it is not serializable.

Example

Figure 2(a)’s history is serializable because there exists a commit order ( $t_{0}<_{\mathit{co}}t_{1}<_{\mathit{co}}t_{2}$ ), shown in Figure 2(b), that is consistent with the serializable axioms. Note that the arbitration rule (Equation 1) never applies in Figure 2(a), and so Figure 2(b) shows no $\mathit{ww}$ edges.

The history in Figure 3(a) is unserializable because there does not exist a commit order that satisfies the serializable axioms. For example, as Figure 3(b) shows, if $\mathit{co}(t_{1},t_{2})$ , then $\mathit{ww}(t_{1},t_{0})$ by Equation 1, which implies $\mathit{co}(t_{1},t_{0})$ and thus $\mathit{co}$ is cyclic. Alternatively, if $\mathit{co}(t_{2},t_{1})$ , then $\mathit{ww}(t_{2},t_{0})$ and thus $\mathit{co}(t_{2},t_{0})$ , and again $\mathit{co}$ is cyclic.

Figure 2. A causal, serializable history corresponding to Figure 1(a). (a) Execution history. (b) A $\mathit{co}$ (dashed arrows) consistent with the serializable axioms.

Figure 3. A causal, unserializable history corresponding to Figure 1(b). (a) Execution history. (b) A $\mathit{co}$ (dashed arrows) inconsistent with the serializable axioms (contradiction shown in red).
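For histories this small, the definition of serializable can be checked directly by enumerating every candidate commit order. The following brute-force sketch is only an illustration of the axioms, not IsoPredict's analysis. The function name, the set-based history representation, and the requirement $t_{1}\neq t_{2}$ in the arbitration rule are assumptions made here.

```python
from itertools import permutations

def is_serializable(txns, so, wr_by_key, writers):
    """Search for a total order co such that hb ∪ ww ⊆ co."""
    wr = set().union(*wr_by_key.values())
    for order in permutations(txns):
        pos = {t: i for i, t in enumerate(order)}
        # A total order that contains so ∪ wr also contains hb = (so ∪ wr)+.
        if any(pos[a] > pos[b] for a, b in so | wr):
            continue
        # Arbitration: if t1 and t2 write k, wr_k(t2, t3), and co(t1, t3),
        # then ww(t1, t2), so co must also order t1 before t2.
        if all(pos[t1] < pos[t2]
               for k, edges in wr_by_key.items()
               for (t2, t3) in edges
               for t1 in writers[k]
               if t1 != t2 and pos[t1] < pos[t3]):
            return True
    return False

TXNS = ["t0", "t1", "t2"]
SO = {("t0", "t1"), ("t0", "t2")}
WRITERS = {"acct": {"t0", "t1", "t2"}}

# History in which t2 reads from t1.
chain_ok = is_serializable(
    TXNS, SO, {"acct": {("t0", "t1"), ("t1", "t2")}}, WRITERS)
# History in which t1 and t2 both read from the initial state t0.
both_initial_ok = is_serializable(
    TXNS, SO, {"acct": {("t0", "t1"), ("t0", "t2")}}, WRITERS)
print(chain_ok, both_initial_ok)
```

The enumeration is factorial in the number of transactions, so it is useful only for checking the definitions on toy examples.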

2.3. Causal Consistency

Causal consistency (causal) is a weak isolation level that preserves the order of operations that are causally related (Ahamad et al., 1995). causal is of theoretical and practical interest because it is the strongest isolation level achievable when a data store requires availability under network partitions (Burckhardt, 2014; Gilbert and Lynch, 2002; Mahajan et al., 2011).

Similar to serializable, causal is defined in terms of whether there exists a commit order that is consistent with happens-before ( $\mathit{hb}$ ) and an arbitration order, which we call $\mathit{ww}_{\mathit{causal}}$ to distinguish it from the arbitration order for serializable ( $\mathit{ww}$ ). Two transactions $t_{1}$ and $t_{2}$ are ordered by $\mathit{ww}_{\mathit{causal}}$ if they write the same key and if there is a third transaction $t_{3}$ that happens-after $t_{1}$ ( $\mathit{hb}(t_{1},t_{3})$ ) and reads from $t_{2}$ ’s write to the same key ( $\mathit{wr}(t_{2},t_{3})$ ). More formally,

\[
\mathit{ww}_{\mathit{causal}}(t_{1},t_{2})\coloneq\exists k,\ \textnormal{$t_{1}$ and $t_{2}$ write to $k$}\land\exists t_{3}\in\mathit{T},\ \mathit{wr}_{k}(t_{2},t_{3})\land\mathit{hb}(t_{1},t_{3})
\tag{2}
\]

A history is causal if and only if there exists a commit order consistent with $\mathit{hb}$ and $\mathit{ww}_{\mathit{causal}}$ :

\[
\langle T,\mathit{so},\mathit{wr}\rangle\textnormal{ is }\textsc{causal}\iff\exists\mathit{co},\ \mathit{hb}\cup\mathit{ww}_{\mathit{causal}}\subseteq\mathit{co}
\tag{3}
\]

Equivalently, a history is causal if and only if $(\mathit{hb}\cup\mathit{ww}_{\mathit{causal}})^{+}$ is acyclic.\footnote{Unlike serializable, causal can be defined in terms of whether $(\mathit{hb}\cup\mathit{ww}_{\mathit{causal}})^{+}$ is acyclic, which implies that a total commit order must exist. In contrast, serializable’s arbitration order ( $\mathit{ww}$ ) is dependent on the commit order, so serializable must be defined in terms of whether $(\mathit{hb}\cup\mathit{ww}\cup\mathit{co})^{+}$ is acyclic.}

Example

The history in Figure 2(a) is causal because there exists a commit order $t_{0}<_{\mathit{co}}t_{1}<_{\mathit{co}}t_{2}$ that is consistent with the causal axioms. (Or, since the history is serializable, which is strictly stronger than causal, the history must be causal.) The history in Figure 3(a) is causal because there exists a commit order, $t_{0}<_{\mathit{co}}t_{1}<_{\mathit{co}}t_{2}$ (or $t_{0}<_{\mathit{co}}t_{2}<_{\mathit{co}}t_{1}$ ), that is consistent with the causal axioms.
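Because $\mathit{ww}_{\mathit{causal}}$ depends only on $\mathit{hb}$, the causal condition can be checked without searching over commit orders. The sketch below computes $\mathit{ww}_{\mathit{causal}}$ and tests $(\mathit{hb}\cup\mathit{ww}_{\mathit{causal}})^{+}$ for cycles. The history representation, the function names, and the second (non-causal) history are hypothetical and constructed for illustration; the requirement $t_{1}\neq t_{2}$ is an assumption.

```python
from itertools import product

def transitive_closure(edges, nodes):
    closure = set(edges)
    for mid, src, dst in product(nodes, repeat=3):  # mid is the outer loop
        if (src, mid) in closure and (mid, dst) in closure:
            closure.add((src, dst))
    return closure

def is_causal(txns, so, wr_by_key, writers):
    """True iff (hb ∪ ww_causal)+ is acyclic."""
    wr = set().union(*wr_by_key.values())
    hb = transitive_closure(so | wr, txns)
    # ww_causal(t1, t2): both write k, some t3 reads k from t2, and hb(t1, t3).
    ww_causal = {
        (t1, t2)
        for k, edges in wr_by_key.items()
        for (t2, t3) in edges
        for t1 in writers[k]
        if t1 != t2 and (t1, t3) in hb
    }
    full = transitive_closure(hb | ww_causal, txns)
    return all((t, t) not in full for t in txns)

# Deposit history in which t1 and t2 both read from the initial state.
causal_both = is_causal(
    ["t0", "t1", "t2"],
    {("t0", "t1"), ("t0", "t2")},
    {"acct": {("t0", "t1"), ("t0", "t2")}},
    {"acct": {"t0", "t1", "t2"}},
)

# Hypothetical violation: t2 reads x from t1 and overwrites it; t3 follows t2
# in the same session but still reads x from t1, a causally stale write.
causal_stale = is_causal(
    ["t0", "t1", "t2", "t3"],
    {("t0", "t1"), ("t0", "t2"), ("t0", "t3"), ("t2", "t3")},
    {"x": {("t1", "t2"), ("t1", "t3")}},
    {"x": {"t0", "t1", "t2"}},
)
print(causal_both, causal_stale)
```

In the second history, $\mathit{hb}(t_{2},t_{3})$ and $\mathit{wr}_{x}(t_{1},t_{3})$ give $\mathit{ww}_{\mathit{causal}}(t_{2},t_{1})$, which together with $\mathit{hb}(t_{1},t_{2})$ forms a cycle.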

2.4. Read Committed

Read committed (rc) is a popular weak isolation level because of the balance between performance and consistency it provides (Berenson et al., 1995). Whereas causal requires transactions ordered by happens-before ( $\mathit{hb}$ ) to be viewed by other transactions in the same order, rc’s arbitration order, $\mathit{ww}_{\mathit{rc}}$ , only applies to write transactions that are read by multiple read events from the same transaction. More formally, rc is defined based on whether there exists a commit order that is consistent with $\mathit{hb}$ and $\mathit{ww}_{\mathit{rc}}$ , which is defined as follows:

\[
\mathit{ww}_{\mathit{rc}}(t_{1},t_{2})\coloneq\exists k,\ \textnormal{$t_{1}$ and $t_{2}$ write to $k$}\land\exists\alpha,\beta,\ \mathit{po}(\beta,\alpha)\land\overline{\mathit{wr}}_{k}(t_{2},\alpha)\land\exists k^{\prime},\ \overline{\mathit{wr}}_{k^{\prime}}(t_{1},\beta)
\tag{4}
\]

where $\mathit{po}$ is program order, a strict partial order that orders events within a transaction; and $\overline{\mathit{wr}}_{k}(t,e)$ is true if and only if $e$ is a read event that reads from a write in transaction $t$ (and thus $e\neq t$ ). Thus $\alpha$ and $\beta$ must be events in the same transaction such that $\alpha$ is a read( $k$ ) event that reads from write( $k$ ) in $t_{2}$ , and $\beta$ is a read event that reads from any write in $t_{1}$ . An execution history is rc if and only if there exists a commit order that is consistent with $\mathit{hb}$ and $\mathit{ww}_{\mathit{rc}}$ :

\[
\langle T,\mathit{so},\mathit{wr}\rangle\textnormal{ is }\textsc{rc}\iff\exists\mathit{co},\ \mathit{hb}\cup\mathit{ww}_{\mathit{rc}}\subseteq\mathit{co}
\tag{5}
\]

Example

The execution histories in Figures 2(a) and 3(a) are rc because there exist commit orders (in fact, the same commit orders used to establish causal) satisfying the above condition. Or, the histories are rc because they are causal, which is strictly stronger than rc.
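Equation 4 refers to individual read events, so a sketch of the rc check needs the reads of each transaction in program order. As with causal, $\mathit{ww}_{\mathit{rc}}$ does not depend on $\mathit{co}$, so a suitable commit order exists exactly when $(\mathit{hb}\cup\mathit{ww}_{\mathit{rc}})^{+}$ is acyclic, since any acyclic relation extends to a total order. The event representation, the names below, and the second history are assumptions made for illustration.

```python
from itertools import product

def transitive_closure(edges, nodes):
    closure = set(edges)
    for mid, src, dst in product(nodes, repeat=3):  # mid is the outer loop
        if (src, mid) in closure and (mid, dst) in closure:
            closure.add((src, dst))
    return closure

def is_rc(txns, so, reads, writers):
    """reads[t] lists (key, writer) pairs for t's read events in program order."""
    wr = {(w, t) for t, events in reads.items() for (_, w) in events}
    hb = transitive_closure(so | wr, txns)
    ww_rc = set()
    for events in reads.values():
        for idx, (k, t2) in enumerate(events):   # alpha: reads k from t2
            for (_, t1) in events[:idx]:         # beta: earlier read, from t1
                if t1 != t2 and t1 in writers[k]:
                    ww_rc.add((t1, t2))
    full = transitive_closure(hb | ww_rc, txns)
    return all((t, t) not in full for t in txns)

# Deposit history with both reads from t0: one read per transaction, so
# ww_rc is empty and the history is rc.
rc_both = is_rc(
    ["t0", "t1", "t2"],
    {("t0", "t1"), ("t0", "t2")},
    {"t1": [("acct", "t0")], "t2": [("acct", "t0")]},
    {"acct": {"t0", "t1", "t2"}},
)

# Hypothetical violation: t2 reads x from t1 and overwrites it; t3 first reads
# x from t2 and afterwards reads x from the older writer t1.
rc_backwards = is_rc(
    ["t0", "t1", "t2", "t3"],
    {("t0", "t1"), ("t0", "t2"), ("t0", "t3")},
    {"t2": [("x", "t1")], "t3": [("x", "t2"), ("x", "t1")]},
    {"x": {"t0", "t1", "t2"}},
)
print(rc_both, rc_backwards)
```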

3. IsoPredict Overview

IsoPredict consists of two main components, as shown in Figure 4: predictive analysis and validation.

Figure 4. IsoPredict’s components and workflow.

The predictive analysis component takes as input an observed execution history that is recorded at the client application’s backend data store, generates SMT constraints, and uses an SMT solver to find a predicted unserializable execution if one exists. §4 describes IsoPredict’s predictive analysis.

The validation component tries to execute the predicted execution history to determine if it is feasible, and it generates and solves constraints to determine if the resulting execution is unserializable. If so, IsoPredict outputs the validated history alongside a visualization of the validated unserializable execution. §5 describes IsoPredict’s validation component.

Validation is optional; developers may choose to skip it for two reasons. First, it may be overkill—in our experiments, over 99% of predicted unserializable executions are successfully validated. Second, validation may be impractical if the application cannot be replayed easily. Validation is, however, useful to our evaluation to measure how many predicted executions are feasible.

4. Predictive Analysis

IsoPredict’s predictive analysis component takes as input an observed execution history of a data store application. The observed history $\mathit{History}=\langle\mathit{T},\mathit{so},\mathit{wr_{obs}}\rangle$ consists of a set of transactions $\mathit{T}$ , session order $\mathit{so}$ between transactions, and observed write–read ordering $\mathit{wr_{obs}}$ . The goal of IsoPredict is to find a feasible, unserializable execution that is valid under a weak isolation model $M$ (i.e., causal or rc). To find such an execution, IsoPredict encodes and solves the following necessary and sufficient constraints for a predicted execution history, $\mathit{History}^{\prime}=\langle T^{\prime},\mathit{so},\mathit{wr}\rangle$ :

(1) $\mathit{History}^{\prime}$ must be a feasible execution prefix\footnote{We allow $T^{\prime}$ to be a subset of $T$ to exclude transactions that may diverge from the observed execution (§4.5). An execution prefix is sufficient: If $\mathit{History}^{\prime}$ exists, a full execution history exists that has $\mathit{History}^{\prime}$ as a prefix and meets the criteria above.} of the program that produced $\mathit{History}$ (§4.1).
(2) $\mathit{History}^{\prime}$ must be unserializable (§4.2).
(3) $\mathit{History}^{\prime}$ must be valid under $M$ (§4.3).

As an example, Figure 2(a) shows a serializable execution history that contains two deposit transactions (Algorithm 1) running concurrently. IsoPredict generates and solves the constraints sketched above, in order to predict the causal and rc but unserializable execution from Figure 3(a).
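The three requirements can be illustrated on the deposit example by brute force. The sketch below is not IsoPredict's SMT encoding. It enumerates every choice of writer for every read, keeps session order fixed, and reports the choices that are causal but unserializable. It ignores divergence (§4.5) and any feasibility condition beyond preserving $\mathit{so}$, and all names are hypothetical.

```python
from itertools import permutations, product

def closure(edges, nodes):
    c = set(edges)
    for mid, src, dst in product(nodes, repeat=3):  # mid is the outer loop
        if (src, mid) in c and (mid, dst) in c:
            c.add((src, dst))
    return c

def is_causal(txns, so, wr, writers):
    # wr is a set of (writer, reader, key) triples.
    hb = closure(so | {(w, r) for (w, r, _) in wr}, txns)
    ww = {(t1, t2) for (t2, t3, k) in wr for t1 in writers[k]
          if t1 != t2 and (t1, t3) in hb}
    full = closure(hb | ww, txns)
    return all((t, t) not in full for t in txns)

def is_serializable(txns, so, wr, writers):
    base = so | {(w, r) for (w, r, _) in wr}
    for order in permutations(txns):
        pos = {t: i for i, t in enumerate(order)}
        if any(pos[a] > pos[b] for a, b in base):
            continue
        if all(pos[t1] < pos[t2] for (t2, t3, k) in wr for t1 in writers[k]
               if t1 != t2 and pos[t1] < pos[t3]):
            return True
    return False

def predict(txns, so, reads, writers):
    """Yield write-read choices that are causal but not serializable.

    reads is a list of (reader, key); each read may pick any other
    transaction that writes the same key as its writer.
    """
    options = [[w for w in sorted(writers[k]) if w != r] for (r, k) in reads]
    for choice in product(*options):
        wr = {(w, r, k) for w, (r, k) in zip(choice, reads)}
        if is_causal(txns, so, wr, writers) and \
                not is_serializable(txns, so, wr, writers):
            yield wr

predictions = list(predict(
    ["t0", "t1", "t2"],
    {("t0", "t1"), ("t0", "t2")},
    [("t1", "acct"), ("t2", "acct")],
    {"acct": {"t0", "t1", "t2"}},
))
print(predictions)
```

For the two deposit transactions, the only surviving choice is the one in which both reads take their value from $t_{0}$. The enumeration grows exponentially with the number of reads, which is one reason to delegate the search to a solver.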

4.1. Encoding of Feasible Execution

This section describes the constraints that IsoPredict generates to ensure that $\mathit{History}^{\prime}=\langle T^{\prime},\mathit{so},\mathit{wr}\rangle$ is a feasible execution of the application that produced $\mathit{History}=\langle T,\mathit{so},\mathit{wr_{obs}}\rangle$ .

Session order

The predicted execution must preserve the observed execution’s session order ( $\mathit{so}$ ). IsoPredict generates constraints over a Boolean SMT function $\phi_{\mathit{so}}(t_{1},t_{2})$ that takes two transactions as input; a transaction is an SMT data type representing the set of all executed transactions $T$ . The analysis generates the following constraints to preserve the observed execution’s $\mathit{so}$ :

\[
\forall t_{1},t_{2}\in T,\ t_{1}\neq t_{2},\quad
\begin{cases}
\boxed{\phi_{\mathit{so}}(t_{1},t_{2})} & \textnormal{if }\mathit{so}(t_{1},t_{2})\\
\boxed{\neg\phi_{\mathit{so}}(t_{1},t_{2})} & \textnormal{otherwise}
\end{cases}
\]

For clarity, SMT constraints generated by IsoPredict are boxed throughout the paper. The way to understand the above is that, for every $t_{1},t_{2}\in T$ such that $t_{1}\neq t_{2}$ ,\footnote{Although the partial and total orders throughout the paper are irreflexive, the analysis never needs to generate irreflexivity constraints (e.g., $\forall{t},\boxed{\neg\phi_{r}(t,t)}$ for relation $r$ ) because it never generates any constraints that use $\phi_{r}(t,t)$ .} the analysis generates a constraint, either $\phi_{\mathit{so}}(t_{1},t_{2})$ or $\neg\phi_{\mathit{so}}(t_{1},t_{2})$ , depending on whether the transactions are ordered by $\mathit{so}$ .

Write–read order

Each read in the predicted execution can potentially read from any transaction that writes the same key.\footnote{Recall that a read to $k$ can only read from another transaction’s last write to $k$ (§2.1).} To help reason about multiple reads in a transaction to the same key that have different writer transactions (and to help exclude potentially divergent events; §4.5), we introduce the notion of an event’s position: In each session, events are numbered with monotonically increasing integers. To ensure each read has exactly one writer transaction in the predicted execution, IsoPredict introduces an SMT function $\phi_{\mathit{choice}}(s,i)$ that takes as input a session and the position of a read event in the session, and returns the writer transaction that the read reads from. Like transactions, sessions are a finite SMT data type representing the set of all sessions. (Note that $\phi_{\mathit{choice}}(s,i)$ is left undefined if $i$ is not the position of a read event in $s$ .) IsoPredict generates the following constraints to ensure that $\phi_{\mathit{choice}}(s,i)$ is equal to some transaction that writes the same key:

\[
\forall k\textnormal{ is a key},\ \forall t_{2}\textnormal{ reads }k,\ \forall i\in\mathit{rdpos_{k}}(t_{2}),\quad\boxed{\bigvee_{t_{1}\neq t_{2}\textnormal{ writes }k}\phi_{\mathit{choice}}(s_{2},i)=t_{1}}
\]

where $s_{2}$ is $t_{2}$ ’s session, and $\mathit{rdpos_{k}}(t)$ is the set of positions of reads to $k$ in transaction $t$ .

IsoPredict encodes $\mathit{wr}_{k}$ by generating constraints on Boolean SMT functions $\phi_{\mathit{wr}_{k}}(t_{1},t_{2})$ :

\[
\forall k\textnormal{ is a key},\ \forall t_{1}\textnormal{ writes }k,\ \forall t_{2}\textnormal{ reads }k,\ t_{1}\neq t_{2},\quad\boxed{\phi_{\mathit{wr}_{k}}(t_{1},t_{2})=\bigvee_{i\in\mathit{rdpos_{k}}(t_{2})}\bigl(\phi_{\mathit{choice}}(s_{2},i)=t_{1}\bigr)}
\]

where $s_{2}$ is $t_{2}$ ’s session.

To encode $\mathit{wr}(t_{1},t_{2})$ , the analysis generates constraints on a Boolean SMT function $\phi_{\mathit{wr}}(t_{1},t_{2})$ that represents the union of all $\phi_{\mathit{wr}_{k}}(t_{1},t_{2})$ :

\[
\forall t_{1},t_{2}\in\mathit{T},\ t_{1}\neq t_{2},\quad\boxed{\phi_{\mathit{wr}}(t_{1},t_{2})=\bigvee_{k\textnormal{ is a key}}\phi_{\mathit{wr}_{k}}(t_{1},t_{2})}
\]

4.2. Encoding Unserializability

This section describes how the analysis encodes constraints for the predicted execution to be unserializable. The constraints must ensure that all possible commit orders are cyclic. §4.2.1 presents an approach that encodes the needed constraints exactly, resulting in long solving times. §4.2.2 presents an alternative approach that encodes a sufficient condition for unserializability, which has lower solving time than the first approach, but still has high coverage in our experiments.

4.2.1. Constraints that encode an exact condition

To encode that no acyclic $\mathit{co}$ exists for the predicted execution history, IsoPredict generates the following constraint:

\[
\boxed{\forall\phi_{\mathit{co}},\ \neg\mathit{IsSerializable}(\phi_{\mathit{co}})}
\]

where $\mathit{IsSerializable}$ is defined as shown below. Note that in the constraint above, $\phi_{\mathit{co}}(t)$ , which takes a transaction $t$ as input and evaluates to an integer indicating $t$ ’s position in the $\mathit{co}$ total order, is not an SMT function—it is a bound variable of the quantifier. Function $\mathit{IsSerializable}$ is defined as follows:

\[
\begin{aligned}
\mathit{IsSerializable}(\phi_{\mathit{co}})\coloneq{}&\mathit{Distinct}(\phi_{\mathit{co}}(t_{1}),\dots,\phi_{\mathit{co}}(t_{n}))\;\land\\
&\bigwedge_{\forall t_{1},t_{2}\in T,\,t_{1}\neq t_{2}}\bigl(\phi_{\mathit{wr}}(t_{1},t_{2})\lor\phi_{\mathit{so}}(t_{1},t_{2})\lor\mathit{Arbitration}(t_{1},t_{2})\bigr)\Rightarrow\phi_{\mathit{co}}(t_{1})<\phi_{\mathit{co}}(t_{2})
\end{aligned}
\]

where $t_{1},\dots,t_{n}$ are all transactions in $\mathit{T}$ , and $\mathit{Distinct}(v_{1},\dots,v_{k})$ is a built-in SMT function that requires all input values to be distinct from each other. By mapping $\phi_{\mathit{co}}(t)$ to a unique integer for each $t$ , the first line of the equation above ensures that $\mathit{co}$ is a total order.

The second line of the equation ensures that $\mathit{co}$ is consistent with $\mathit{wr}$ , $\mathit{so}$ , and $\mathit{ww}$ , respectively. For simplicity and to reduce the size of the constraints, arbitration constraints are factored out into the $\mathit{Arbitration}$ function, which is defined as follows:

\[
\boxed{\mathit{Arbitration}(t_{1},t_{2})\coloneq\bigvee_{\begin{subarray}{c}\forall k,\ t_{1}\textnormal{ and }t_{2}\textnormal{ write }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\ t_{3}\textnormal{ reads }k\end{subarray}}\phi_{\mathit{wr}_{k}}(t_{2},t_{3})\land\bigl(\phi_{\mathit{co}}(t_{1})<\phi_{\mathit{co}}(t_{3})\bigr)}
\]

which is a straightforward encoding of the serializable arbitration constraints in Equation 1.

By using this approach we are pushing all the heavy lifting to the SMT solver. However, SMT solvers are known to be inefficient at solving constraints with universal quantifiers (Leino and Pit-Claudel, 2016)—an issue confirmed by our performance results (§7.2).

4.2.2. Constraints encoding a sufficient but unnecessary condition

Alternatively, the analysis can encode a sufficient, but unnecessary, condition for predicting an unserializable execution. We introduce a partial order, $\mathit{pco}$ , that is a subset of every commit order for every valid predicted execution. If there exists a predicted execution for which $\mathit{pco}$ is cyclic, then there cannot exist an acyclic $\mathit{co}$ for the predicted execution, meaning it is unserializable. In theory, this approach has the potential for missing unserializable executions that §4.2.1’s approach finds. But in our experiments, the $\mathit{pco}$ -based approach predicts all unserializable executions that §4.2.1’s approach finds (§7.2).

We define $\mathit{pco}$ to include all orders that must be in $\mathit{co}$ : session ( $\mathit{so}$ ), write–read ( $\mathit{wr}$ ), and arbitration ( $\mathit{ww}$ ) orders. We also introduce an anti-dependency order ( $\mathit{rw}$ ) that must be in every $\mathit{co}$ , which allows adding more edges to $\mathit{pco}$ and thus finding more unserializable executions. A challenge with encoding $\mathit{pco}$ is that the arbitration and anti-dependency orders are both defined in terms of commit order, creating a circular dependency that leads to erroneous self-justifying edges in $\mathit{pco}$ . We break both circular dependencies by introducing the notion of rank in the generated constraints. Next we describe anti-dependency order ( $\mathit{rw}$ ), the circular dependency problem and our rank-based solution to it, and finally the constraints that the analysis generates.

Adding anti-dependency order ( $\mathit{rw}$ ) to $\mathit{pco}$

To make $\mathit{pco}$ as large as possible while still being consistent with every valid $\mathit{co}$ , we add an anti-dependency ( $\mathit{rw}$ ) order to $\mathit{pco}$ . $\mathit{rw}$ must be part of any valid $\mathit{co}$ , as we prove in Appendix A. Intuitively, for any write–read relation $\mathit{wr}_{k}(t_{1},t_{2})$ , anti-dependency prevents future transactions that also write $k$ from being ordered between $t_{1}$ and $t_{2}$ in the commit order. More formally, we define $\mathit{rw}(t_{1},t_{2})$ as follows:

\[
\mathit{rw}(t_{1},t_{2})\coloneq\exists k,\ t_{2}\textnormal{ writes }k\land\exists t_{w},\ \mathit{wr}_{k}(t_{w},t_{1})\land\mathit{pco}(t_{w},t_{2})
\]

Figure 5 shows an example in which $\mathit{pco}$ is cyclic only if $\mathit{rw}$ is included.

Figure 5. Including anti-dependency ordering ( $\mathit{rw}$ ; dashed arrows) in $\mathit{pco}$ makes $\mathit{pco}$ cyclic.

Figure 6. An example of circular dependency: $\mathit{ww}(t_{1},t_{2})$ depends on $\mathit{pco}(t_{1},t_{3})$ , which in turn depends on $\mathit{ww}(t_{1},t_{2})$ .

The partial order $\mathit{pco}$ can now be defined as the union of all orders that must be part of $\mathit{co}$ :

\[
\mathit{pco}=(\mathit{so}\cup\mathit{wr}\cup\mathit{ww}\cup\mathit{rw})^{+}
\]

Adapting Equation 1 to use $\mathit{pco}$ instead of $\mathit{co}$ , we define arbitration order, $\mathit{ww}$ , as follows:

\[
\mathit{ww}(t_{1},t_{2})\coloneq\exists k,\ \textnormal{$t_{1}$ and $t_{2}$ write to $k$}\land\exists t_{3}\in\mathit{T},\ \mathit{wr}_{k}(t_{2},t_{3})\land\mathit{pco}(t_{1},t_{3})
\]

Circular dependency and rank

In the definitions above, note the circular dependencies between $\mathit{pco}$ and $\mathit{ww}$ and between $\mathit{pco}$ and $\mathit{rw}$ , which seem to permit “self-justifying” edges. As an example, consider Figure 6. According to the definitions, $\mathit{pco}(t_{1},t_{3})\Rightarrow\mathit{ww}(t_{1},t_{2})$ , and $\mathit{ww}(t_{1},t_{2})\Rightarrow\mathit{pco}(t_{1},t_{3})$ , allowing us to wrongly conclude $\mathit{ww}(t_{1},t_{2})$ and $\mathit{pco}(t_{1},t_{3})$ . To avoid such self-justifying edges, $\mathit{pco}$ , $\mathit{ww}$ , and $\mathit{rw}$ in fact must be defined as the minimal relations that satisfy the above definitions.

How can we encode this “minimal relation” property in the SMT constraints? If IsoPredict simply encodes the above definitions as SMT constraints, the constraint solver will find self-justifying edges, resulting in spurious cycles and reporting executions that are not actually unserializable. For example, for Figure 6, the SMT solver would choose both $\mathit{ww}(t_{1},t_{2})$ and $\mathit{pco}(t_{1},t_{3})$ to be true, finding a cycle and wrongly reporting a predicted execution that is actually serializable.

We address this problem by introducing the notion of rank, which orders $\mathit{pco}$ edges that depend on each other. IsoPredict relies on an integer SMT function $\mathit{rank}(t_{1},t_{2})$ to enforce the following rule:

For any relations $r$ and $r^{\prime}$ , if $r(t_{1},t_{2})$ depends on $r^{\prime}(t^{\prime}_{1},t^{\prime}_{2})$ , then $\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t^{\prime}_{1},t^{\prime}_{2})$ .

Note that the rule does not require $t_{1}\neq t^{\prime}_{1}$ or $t_{2}\neq t^{\prime}_{2}$ . For Figure 6, rank constraints disallow $\mathit{ww}(t_{1},t_{2})$ and $\mathit{pco}(t_{1},t_{3})$ , which would require both $\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t_{1},t_{3})$ and $\mathit{rank}(t_{1},t_{3})>\mathit{rank}(t_{1},t_{2})$ .
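The "minimal relation" requirement can be illustrated outside of an SMT solver. The following is a minimal sketch in pure Python (not the Z3Py encoding that IsoPredict uses) that computes $\mathit{pco}$ as a least fixpoint, starting from $\mathit{so}$ and $\mathit{wr}$ and adding only edges that are justified by edges already derived. The round in which an edge is first derived is one valid rank assignment, because every edge is derived strictly later than the edges it depends on. The function names, the tuple representation of $\mathit{wr}$, and the two example histories (our reading of the Figure 6 caption, and a standard write-skew history) are assumptions made for illustration.

```python
def minimal_pco(txns, so, wr, writes):
    """Least fixpoint of pco = (so | wr | ww | rw)+.

    so:     set of (t1, t2) session-order edges
    wr:     set of (key, writer, reader) write-read edges
    writes: dict mapping each transaction to the set of keys it writes
    Returns a dict mapping each pco edge to the round in which it was first
    derived; that round number plays the role of rank.
    """
    rank = {edge: 0 for edge in so}
    for _, writer, reader in wr:
        rank.setdefault((writer, reader), 0)
    round_no = 0
    while True:
        round_no += 1
        known = set(rank)
        new = set()
        # ww(t1, t2): t1 and t2 write k, some t3 reads k from t2, and pco(t1, t3).
        for k, t2, t3 in wr:
            for t1 in txns:
                if t1 not in (t2, t3) and k in writes.get(t1, ()) and (t1, t3) in known:
                    new.add((t1, t2))
        # rw(t1, t2): t1 reads k from tw, t2 writes k, and pco(tw, t2).
        for k, tw, t1 in wr:
            for t2 in txns:
                if t2 not in (tw, t1) and k in writes.get(t2, ()) and (tw, t2) in known:
                    new.add((t1, t2))
        # Transitivity: pco(a, b) and pco(b, d) give pco(a, d) for a != d.
        for a, b in known:
            for c, d in known:
                if b == c and a != d:
                    new.add((a, d))
        new -= known
        if not new:
            return rank
        for edge in new:
            rank[edge] = round_no


def is_cyclic(pco):
    """pco is cyclic if some pair of transactions is ordered both ways."""
    return any((b, a) in pco for (a, b) in pco)


# Assumed shape of Figure 6: t1 and t2 write k, and t3 reads k from t2.
fig6 = minimal_pco({"t1", "t2", "t3"}, set(), {("k", "t2", "t3")},
                   {"t1": {"k"}, "t2": {"k"}})

# Write skew: t1 reads x and writes y, t2 reads y and writes x; both read from t0.
skew = minimal_pco({"t0", "t1", "t2"}, set(),
                   {("x", "t0", "t1"), ("y", "t0", "t2")},
                   {"t0": {"x", "y"}, "t1": {"y"}, "t2": {"x"}})

print(sorted(fig6), is_cyclic(fig6))
print(is_cyclic(skew), skew[("t1", "t2")])
```

In the first history, neither $\mathit{ww}(t_{1},t_{2})$ nor $\mathit{pco}(t_{1},t_{3})$ is ever derived, because each would need the other to exist first. In the second history, the two $\mathit{rw}$ edges are derived in round 1 from round-0 $\mathit{wr}$ edges and form a genuine cycle.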

Generated constraints

IsoPredict generates arbitration and anti-dependency constraints on Boolean SMT functions $\phi_{\mathit{ww}}(t_{1},t_{2})$ and $\phi_{\mathit{rw}}(t_{1},t_{2})$ :

\displaystyle\forall t_{1},t_{2}\in T,t_{1}\neq t_{2},

\displaystyle\boxed{\phi_{\mathit{ww}}(t_{1},t_{2})=\bigvee_{\begin{subarray}{c}\forall k,\;t_{1}\textnormal{ and }t_{2}\textnormal{ write }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\;t_{3}\textnormal{ reads }k\end{subarray}}\big(\phi_{\mathit{wr}_{k}}(t_{2},t_{3})\land\phi_{\mathit{pco}}(t_{1},t_{3})\land\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t_{1},t_{3})\big)}

\displaystyle\boxed{\phi_{\mathit{rw}}(t_{1},t_{2})=\bigvee_{\begin{subarray}{c}\forall k,\;t_{1}\textnormal{ reads }k\>\land\>t_{2}\textnormal{ writes }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\;t_{3}\textnormal{ writes }k\end{subarray}}\big(\phi_{\mathit{wr}_{k}}(t_{3},t_{1})\land\phi_{\mathit{pco}}(t_{3},t_{2})\land\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t_{3},t_{2})\big)}

The following constraints ensure that $\mathit{pco}$ is a partial order implied by $\mathit{so}$ , $\mathit{wr}$ , $\mathit{ww}$ , and $\mathit{rw}$ :

\displaystyle\forall t_{1},t_{2}\in T,t_{1}\neq t_{2},

\displaystyle\phi_{\mathit{pco}}(t_{1},t_{2})=\phi_{\mathit{so}}(t_{1},t_{2})\lor\phi_{\mathit{wr}}(t_{1},t_{2})\lor\phi_{\mathit{ww}}(t_{1},t_{2})\lor\phi_{\mathit{rw}}(t_{1},t_{2})\lor\bigvee_{t\in\mathit{T}\setminus\{t_{1},t_{2}\}}\big(\phi_{\mathit{pco}}(t_{1},t)\land\phi_{\mathit{pco}}(t,t_{2})\land\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t_{1},t)\land\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t,t_{2})\big)

To ensure that $\mathit{pco}$ is cyclic, the analysis generates the following constraint:

\displaystyle\boxed{\bigvee_{\forall t_{1},t_{2}\in\mathit{T},\,t_{1}\neq t_{2}}\phi_{\mathit{pco}}(t_{1},t_{2})\land\phi_{\mathit{pco}}(t_{2},t_{1})}

If the solver finds a satisfying solution, a predicted unserializable execution exists. If the solver reports no satisfying solution, a predicted unserializable execution may or may not exist. In our experiments, a predicted unserializable execution never exists in this case.

We have not been able to come up with an execution for which our $\mathit{pco}$ -based approach misses a predicted unserializable execution. We believe that such an execution should exist because otherwise it would imply a polynomial-time algorithm for deciding if an execution history is serializable—a problem that is NP-hard (Biswas and Enea, 2019).

4.3. Encoding Weak Isolation

This section describes the constraints that IsoPredict generates to ensure that the execution conforms to the target weak isolation model (causal or rc).

Regardless of the model, IsoPredict encodes $\mathit{hb}$ as the transitive closure of $\mathit{so}$ and $\mathit{wr}$ (§2.1), by generating constraints on a Boolean SMT function $\phi_{\mathit{hb}}(t_{1},t_{2})$ :

\displaystyle\forall t_{1},t_{2}\in T,t_{1}\neq t_{2},\quad\boxed{\phi_{\mathit{hb}}(t_{1},t_{2})=\phi_{\mathit{so}}(t_{1},t_{2})\lor\phi_{\mathit{wr}}(t_{1},t_{2})\lor\bigvee_{\forall t\in T\setminus\{t_{1},t_{2}\}}\phi_{\mathit{hb}}(t_{1},t)\land\phi_{\mathit{hb}}(t,t_{2})}

4.3.1. Causal consistency (causal)

To ensure that the predicted execution is causal, IsoPredict generates constraints that ensure that the transitive closure of causal arbitration order ( $\mathit{ww}_{\mathit{causal}}$ ) and happens-before ( $\mathit{hb}$ ) is acyclic (§2.3). IsoPredict encodes the causal axiom (Equation 2) by generating constraints on a Boolean SMT function $\phi_{\mathit{ww}_{\mathit{causal}}}(t_{1},t_{2})$ representing $\mathit{ww}_{\mathit{causal}}$ :

\displaystyle\forall t_{1},t_{2}\in T,t_{1}\neq t_{2},\quad\boxed{\phi_{\mathit{ww}_{\mathit{causal}}}(t_{1},t_{2})=\bigvee_{\begin{subarray}{c}\forall k,\;t_{1}\textnormal{ and }t_{2}\textnormal{ write }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\;t_{3}\textnormal{ reads }k\end{subarray}}\phi_{\mathit{wr}_{k}}(t_{2},t_{3})\land\phi_{\mathit{hb}}(t_{1},t_{3})}

To ensure the execution is causal, there must exist a strict total order that is consistent with $(\mathit{hb}\cup\mathit{ww}_{\mathit{causal}})^{+}$ (Equation 3). IsoPredict generates the constraints on an integer SMT function $\phi_{\mathit{co_{causal}}}(t)$ :

\displaystyle\forall t,t_{1},t_{2}\in T,t_{1}\neq t_{2},\quad\boxed{\phi_{\mathit{hb}}(t_{1},t_{2})\lor\phi_{\mathit{ww}_{\mathit{causal}}}(t_{1},t_{2})\;\Rightarrow\;\phi_{\mathit{co_{causal}}}(t_{1})<\phi_{\mathit{co_{causal}}}(t_{2})}
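For a fixed history (that is, once every read's writer is known), the causal check above reduces to ordinary graph operations. The following is a minimal sketch in pure Python, not the Z3Py constraints that IsoPredict generates: it computes $\mathit{hb}$ as a transitive closure, derives $\mathit{ww}_{\mathit{causal}}$ from it, and then uses a topological sort to assign each transaction an integer that plays the role of $\phi_{\mathit{co_{causal}}}$. The function names and the example histories are assumptions made for illustration.

```python
def transitive_closure(edges):
    """Smallest transitively closed relation containing the given edges."""
    closure = set(edges)
    while True:
        extra = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not extra:
            return closure
        closure |= extra


def causal_commit_order(txns, so, wr, writes):
    """Return {txn: integer position} if the history is causal, else None.

    so: set of (t1, t2); wr: set of (key, writer, reader);
    writes: dict mapping each transaction to the keys it writes.
    """
    hb = transitive_closure(set(so) | {(w, r) for _, w, r in wr})
    # ww_causal(t1, t2): t1 and t2 write k, t3 reads k from t2, and hb(t1, t3).
    ww_causal = {(t1, t2)
                 for k, t2, t3 in wr for t1 in txns
                 if t1 not in (t2, t3) and k in writes.get(t1, ()) and (t1, t3) in hb}
    edges = hb | ww_causal
    # Kahn's algorithm: a total order exists iff hb | ww_causal is acyclic.
    indegree = {t: 0 for t in txns}
    for _, b in edges:
        indegree[b] += 1
    ready = sorted(t for t in txns if indegree[t] == 0)
    order = {}
    while ready:
        t = ready.pop(0)
        order[t] = len(order)
        for a, b in sorted(edges):
            if a == t:
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    return order if len(order) == len(txns) else None


# Write skew is causal: no ww_causal edges arise, so hb alone is acyclic.
skew_order = causal_commit_order(
    {"t0", "t1", "t2"}, set(),
    {("x", "t0", "t1"), ("y", "t0", "t2")},
    {"t0": {"x", "y"}, "t1": {"y"}, "t2": {"x"}})

# Not causal: t3 sees t2's write to y but reads x from t1, which t2 overwrote.
stale_order = causal_commit_order(
    {"t1", "t2", "t3"}, {("t1", "t2")},
    {("y", "t2", "t3"), ("x", "t1", "t3")},
    {"t1": {"x"}, "t2": {"x", "y"}})
```

The write-skew history illustrates why weak isolation matters here: it passes the causal check even though it is unserializable.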

4.3.2. Read committed (rc)

Similar to causal, IsoPredict generates constraints so that the transitive closure of rc arbitration order ( $\mathit{ww}_{\mathit{rc}}$ ) and happens-before ( $\mathit{hb}$ ) is acyclic (§2.4). IsoPredict encodes the rc axiom (Equation 4) with the help of a Boolean SMT function $\phi_{\mathit{ww}_{\mathit{rc}}}(t_{1},t_{2})$ that represents $\mathit{ww}_{\mathit{rc}}$ :

\displaystyle\forall t_{1},t_{2}\in T,t_{1}\neq t_{2},\;\;\;\boxed{\phi_{\mathit{ww}_{\mathit{rc}}}(t_{1},t_{2})=\bigvee_{\begin{subarray}{c}\forall k,\;t_{1}\textnormal{ and }t_{2}\textnormal{ write }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\;t_{3}\textnormal{ reads }k\\ \forall i\in\mathit{rdpos}_{\ast}(t_{3}),\forall j\in\mathit{rdpos_{k}}(t_{3}),\;i<j\end{subarray}}\phi_{\mathit{choice}}(s_{3},i)=t_{1}\land\phi_{\mathit{choice}}(s_{3},j)=t_{2}}

where $\mathit{rdpos}_{\ast}(t)$ is the set of positions of read events in transaction $t$, $\mathit{rdpos_{k}}(t)$ is the set of positions of reads of $k$ in transaction $t$, and $s_{3}$ is $t_{3}$’s session. To ensure there exists a strict total order that is consistent with $(\mathit{hb}\cup\mathit{ww}_{\mathit{rc}})^{+}$ (Equation 5), IsoPredict generates constraints on an integer SMT function $\phi_{\mathit{co_{rc}}}(t)$ :

\displaystyle\forall t,t_{1},t_{2}\in T,t_{1}\neq t_{2},\quad\boxed{\phi_{\mathit{hb}}(t_{1},t_{2})\lor\phi_{\mathit{ww}_{\mathit{rc}}}(t_{1},t_{2})\;\Rightarrow\;\phi_{\mathit{co_{rc}}}(t_{1})<\phi_{\mathit{co_{rc}}}(t_{2})}
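Unlike $\mathit{ww}_{\mathit{causal}}$, the rc arbitration order depends only on the order of reads inside a single transaction. The sketch below, in pure Python rather than Z3Py, derives $\mathit{ww}_{\mathit{rc}}$ for a fixed choice of writers: if $t_{3}$ first reads anything from $t_{1}$, later reads $k$ from $t_{2}$, and $t_{1}$ also writes $k$, then $t_{1}$ must be arbitrated before $t_{2}$. The data layout (a list of `(position, key, writer)` triples per reading transaction) is an assumption made for illustration.

```python
def ww_rc(reads, writes):
    """Derive rc arbitration edges from per-transaction read sequences.

    reads:  dict mapping a reading transaction to a list of
            (position, key, writer) triples
    writes: dict mapping each transaction to the set of keys it writes
    """
    edges = set()
    for t3, events in reads.items():
        for j, k, t2 in events:          # a later read of k, served by t2
            for i, _, t1 in events:      # any earlier read, served by t1
                if i < j and len({t1, t2, t3}) == 3 and k in writes.get(t1, ()):
                    edges.add((t1, t2))
    return edges


writes = {"t0": {"x"}, "t1": {"x"}}
# t3 reads x from the initial state t0 and then from t1: t0 before t1.
fresh_then_newer = ww_rc({"t3": [(0, "x", "t0"), (1, "x", "t1")]}, writes)
# t3 reads x from t1 and then from t0: t1 before t0.
newer_then_stale = ww_rc({"t3": [(0, "x", "t1"), (1, "x", "t0")]}, writes)
```

If $\mathit{hb}$ already orders $t_{0}$ before $t_{1}$, the first read sequence is compatible with it, whereas the second produces an edge that closes a cycle with $\mathit{hb}$ and is therefore rejected by the ordering constraint above.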

4.4. Prediction Examples

This section shows causal, unserializable behaviors predicted by IsoPredict on programs evaluated in §7. The actual executions consist of dozens of transactions and thousands of events; the figures show only the transactions and events relevant to predicting unserializable behavior.

Figure 7(a) shows an observed execution of the Wikipedia benchmark, and Figure 7(b) shows the causal, unserializable execution predicted by IsoPredict. In contrast, Figure 7(c) shows a different observed execution of Wikipedia, from which no causal, unserializable execution can be predicted. Figure 7(d) illustrates that changing $t_{3}$’s read of $x$ to read from $t_{0}$ would lead to a non-causal execution (which IsoPredict therefore does not report).

(a) An observed execution of Wikipedia for which a predicted causal, unserializable execution exists.

(b) A causal, unserializable prediction; the $\mathit{pco}$ cycle (including $\mathit{rw}$ edges) shows it is unserializable.

(c) An observed execution of Wikipedia for which no predicted causal, unserializable execution exists.

(d) The non-causal execution that results if we try to change (c) so that $t_{3}$ reads from $t_{0}$.

Figure 7. Comparison of (relevant subsets of) executions from Wikipedia. Blue edges highlight the differences between observed and predicted executions.

Figure 8(a) shows an observed execution of the Smallbank benchmark, and Figure 8(b) shows the IsoPredict-predicted execution. As Figure 8(b) shows, a causal, unserializable predicted execution exists in which both reads read from the initial state ( $t_{0}$ ), as demonstrated by the $\mathit{pco}$ cycle $t_{1}<_{\mathit{co}}t_{3}<_{\mathit{co}}t_{2}<_{\mathit{co}}t_{4}<_{\mathit{co}}t_{1}$ .

(a) An observed execution for which a causal, unserializable predicted execution exists.

(b) A causal, unserializable predicted execution, as shown by the $\mathit{pco}$ cycles including $\mathit{rw}$ edges.

Figure 8. Observed and predicted executions of Smallbank. For simplicity, each history shows a subset of the executed transactions, and each transaction shows a subset of the executed events.

4.5. Handling Divergence in the Predicted Execution

Reading from a different write in the predicted execution than in the observed execution may lead to different application behavior. Specifically, code in the data store application that is control dependent on a read from a different writer transaction may generate different events. For example, consider the observed execution shown in Figures 9(a) and 9(b), which executes transactions shown in Algorithms 1 and 2. Figure 9(c) shows an unserializable predicted history that IsoPredict would find using the constraints presented so far. However, the predicted execution is infeasible: $t_{2}$ aborts if it reads from $t_{0}$, making it impossible for $t_{3}$ to read from $t_{2}$, as Figure 9(d) shows. IsoPredict (mostly) avoids making spurious predictions by excluding (much of the) potentially divergent behavior.

(a) One session deposits into an account twice, while another session withdraws once.

(b) The execution history for (a), which is serializable. The write–read edge from $t_{1}$ to $t_{2}$ (shown in blue) is not present in the predicted execution in (c).

(c) A predicted execution history that is unserializable. The write–read edge from $t_{0}$ to $t_{2}$ (shown in blue) was not present in the observed execution in (b).

(d) The validating execution based on the predicted execution in (c). It diverges because $t_{2}$ aborts, and the resulting execution is serializable.

(e) The execution history consisting of the events from the predicted execution in (c) that are within the strict prediction boundary; it is serializable.

(f) The execution history consisting of the events from the predicted execution in (c) that are within the relaxed prediction boundary; it is unserializable.

Figure 9. Motivation for a prediction boundary ((a)–(d)) and illustration of the two kinds of prediction boundaries ((e)–(f)). The target weak isolation model is causal. Dashed arrows represent $\mathit{pco}$ edges that are not part of the history.

Algorithm 2 A procedure in a data store application that withdraws money from an account.

procedure withdraw($\mathit{account}$, $\mathit{amount}$)
    $\mathit{balance}\leftarrow\mathit{DataStore}.\mathit{get}(\mathit{account})$ $\triangleright$ Read balance; implicitly starts transaction
    if $\mathit{balance}<\mathit{amount}$ then
        $\mathit{DataStore}.\mathit{rollback}()$ $\triangleright$ Abort transaction
    else
        $\mathit{DataStore}.\mathit{put}(\mathit{account},\mathit{balance}-\mathit{amount})$ $\triangleright$ Update balance
        $\mathit{DataStore}.\mathit{commit}()$ $\triangleright$ Commit transaction

Divergent behavior

To account for divergent behavior, we make a distinction between the predicted execution, which is generated by IsoPredict based on the observed execution, and what we call the validating execution, which is the execution that actually occurs if one tries to produce the predicted execution using the data store application. Divergent behaviors are behaviors that differ between the predicted and validating executions. We categorize divergent behaviors into two categories:

• The validating execution reads or writes different keys, or omits or adds events relative to the predicted execution, leading to a different execution history with different properties.

• A transaction that commits in the predicted execution aborts in the validating execution (e.g., an application might have logic that aborts if a consistency check fails), as Figures 9(c) and 9(d) show.

The problem with divergent behavior is that an unserializable predicted execution can lead to a serializable validating execution. (The validating execution will always be a feasible execution conforming to the weak isolation model because validation ensures these properties; §5.)
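The abort-related kind of divergence can be reproduced with a direct Python rendering of Algorithm 2. This is a minimal sketch: the `store_get` callable stands in for $\mathit{DataStore}.\mathit{get}$, and the account name and amounts are hypothetical values chosen for illustration, not the values used in Figure 9. The only thing that differs between the two calls is which write the read observes.

```python
def withdraw(store_get, account, amount):
    """Algorithm 2: roll back if the balance read is insufficient."""
    balance = store_get(account)       # the read whose writer may change
    if balance < amount:
        return ("rollback", None)      # control-dependent on the read
    return ("commit", balance - amount)


# Observed execution: the read sees an earlier deposit (hypothetical balance 50).
observed = withdraw(lambda acct: 50, "alice", 50)

# Predicted execution: the same read is redirected to the initial state (balance 0).
predicted = withdraw(lambda acct: 0, "alice", 50)
```

Because the second call rolls back, it never performs the write that later transactions in the predicted history were supposed to read, which is exactly why such a prediction cannot be validated.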

Prediction boundary

IsoPredict accounts for divergence by generating prediction boundary constraints that exclude events that may be impacted by divergence—specifically, events that happen-after (i.e., inverse of $\mathit{hb}$ ) any read event that reads from different writers in the predicted and observed executions. IsoPredict supports a prediction boundary that is strict or relaxed, as shown in Table 1. The strict boundary excludes events that happen-after events that read from a different writer in the predicted execution than in the observed execution. The strict boundary prevents false predictions except when a transaction in the predicted execution aborts in the validating execution. Alternatively, the relaxed boundary excludes events that happen-after transactions that read from a different writer, risking more false predictions but increasing the chances of finding an unserializable predicted execution.

Table 1. Comparison of strict and relaxed prediction boundaries.

Prediction boundary	Excluded events	Divergent behaviors that can cause false predictions
Strict	Events that happen-after any read event with a different writer	Abort-related only
Relaxed	Events that happen-after any transaction containing a read event with a different writer	Any

Figures 9(e) and 9(f) show strict and relaxed boundaries, respectively, applied to the prediction in Figure 9(c). The strict boundary excludes all events that happen-after $t_{2}$’s read (since it has a different writer than in Figure 9(b)); the resulting execution history is serializable. The relaxed boundary excludes all transactions that happen-after $t_{2}$’s read; the resulting execution history is unserializable. Although the relaxed boundary allows a false prediction in this example, in our evaluation the relaxed boundary results in few false predictions.
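The difference between the two boundaries can be sketched for a single session as follows. This pure-Python sketch is a simplification under stated assumptions: it only locates the boundary position within one session given known observed and predicted writers, whereas IsoPredict lets the solver choose the boundary and also excludes events in other sessions. The event layout, transaction names, and function names are hypothetical.

```python
import math


def session_boundary(events, observed, predicted, relaxed=False):
    """Position of the boundary event in one session.

    events:    list of (txn, kind) pairs in session order
    observed:  dict mapping read positions to their observed writer
    predicted: dict mapping read positions to their predicted writer
    """
    for pos, (txn, kind) in enumerate(events):
        if kind == "read" and observed[pos] != predicted[pos]:
            if not relaxed:
                return pos                     # strict: stop at the read itself
            # Relaxed: extend to the last event of the enclosing transaction.
            while pos + 1 < len(events) and events[pos + 1][0] == txn:
                pos += 1
            return pos
    return math.inf                            # no divergent read in this session


def included(events, boundary):
    """Events on or before the boundary are kept; later ones are excluded."""
    return [e for pos, e in enumerate(events) if pos <= boundary]


session = [("ta", "read"), ("ta", "write"), ("ta", "commit"),
           ("tb", "read"), ("tb", "commit")]
observed = {0: "t1", 3: "ta"}
predicted = {0: "t0", 3: "ta"}                 # ta's read switches to t0

strict = session_boundary(session, observed, predicted)
relaxed = session_boundary(session, observed, predicted, relaxed=True)
```

Under the strict boundary, the write of `ta` is excluded, so no other transaction can read from it; under the relaxed boundary, the write stays in the history, which admits more predictions at the cost of trusting behavior that depends on the changed read.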

Generating prediction boundary constraints

Here we present IsoPredict’s constraints for excluding events using the prediction boundary. We show constraints for the strict prediction boundary, but the constraints for the relaxed prediction boundary are similar except they also constrain every session’s boundary to be the last event of a transaction.

The prediction boundary is delimited by a boundary event in each session, which is either (1) a read event, which reads from a different write in the predicted execution than in the observed execution, or (2) the last event in the session (which will always be a commit event). IsoPredict generates the following constraints on an integer SMT function $\phi_{\mathit{boundary}}(s)$ to ensure that the boundary event for each session is either a read event or the last event (represented by position $\infty$ ):

\displaystyle\forall s\textnormal{ is a session},\quad\boxed{\Big{(}\bigvee_{\begin{subarray}{c}t\textnormal{ is a transaction in }s\\ i\in\mathit{rdpos_{k}}(t)\end{subarray}}\phi_{\mathit{boundary}}(s)=i\Big{)}\lor\phi_{\mathit{boundary}}(s)=\infty}

Recall that $\mathit{rdpos_{k}}(t)$ is the set of positions of reads of $k$ in transaction $t$.

To ensure that each read that happens-before the prediction boundary reads from the same write as in the observed execution, IsoPredict generates the following constraints, where $\phi_{\mathit{obs}}(s,i)$ is an integer SMT function that represents the last write of each read in the observed execution history (and is thus the analogue of $\phi_{\mathit{choice}}$ for the observed execution):

\displaystyle\forall t_{1},t_{2}\in T,\;t_{1}\neq t_{2},\;\forall i\in\mathit{rdpos_{k}}(t_{2})\textnormal{ such that }t_{2}\textnormal{'s read at pos $i$ reads from $t_{1}$ in }\mathit{wr_{obs}},\quad\boxed{\phi_{\mathit{obs}}(s_{2},i)=t_{1}}

\displaystyle\forall k\textnormal{ is a key},\forall t_{1}\textnormal{ writes }k,\forall t_{2}\textnormal{ reads }k,\forall i\in\mathit{rdpos_{k}}(t_{2}),\;\;\boxed{i<\phi_{\mathit{boundary}}(s_{2})\;\Rightarrow\,\phi_{\mathit{choice}}(s_{2},i)=\phi_{\mathit{obs}}(s_{2},i)}

where $s_{1}$ is $t_{1}$ ’s session and $s_{2}$ is $t_{2}$ ’s session.

A read to $k$ on or before the prediction boundary must read from a write to $k$ that is before the prediction boundary. IsoPredict ensures this property by generating the following constraints:

\displaystyle\forall k\textnormal{ is a key},\forall t_{1}\textnormal{ writes }k,\forall t_{2}\textnormal{ reads }k,\forall i\in\mathit{rdpos_{k}}(t_{2}),\quad\boxed{\phi_{\mathit{choice}}(s_{2},i)=t_{1}\land i\leq\phi_{\mathit{boundary}}(s_{2})\implies\mathit{wrpos}_{k}(t_{1})<\phi_{\mathit{boundary}}(s_{1})}

where $s_{1}$ is $t_{1}$’s session, $s_{2}$ is $t_{2}$’s session, and $\mathit{wrpos}_{k}(t)$ is the position of $t$’s last write to key $k$.
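The two read-related boundary constraints can be read as a checker over a candidate assignment. The sketch below is pure Python under assumed data structures (dictionaries keyed by session and position); IsoPredict instead emits these conditions as SMT constraints and lets the solver pick $\phi_{\mathit{choice}}$ and $\phi_{\mathit{boundary}}$. Placing the initial transaction $t_{0}$ in its own session `s0` with an infinite boundary is also an assumption of this example.

```python
import math


def respects_boundary(reads, choice, obs, boundary, wrpos, session_of):
    """Check a candidate (choice, boundary) against the two read constraints.

    reads:      list of (session, position, key) for every read event
    choice/obs: dicts (session, position) -> predicted / observed writer
    boundary:   dict session -> boundary position (math.inf if none)
    wrpos:      dict (txn, key) -> position of txn's last write to key
    session_of: dict txn -> session
    """
    for s, i, k in reads:
        # Reads strictly before the boundary keep their observed writer.
        if i < boundary[s] and choice[(s, i)] != obs[(s, i)]:
            return False
        # Reads on or before the boundary read only writes that precede
        # the boundary of the writer's session.
        if i <= boundary[s]:
            w = choice[(s, i)]
            if not wrpos[(w, k)] < boundary[session_of[w]]:
                return False
    return True


reads = [("sB", 0, "x")]
obs = {("sB", 0): "t1"}
wrpos = {("t0", "x"): 0, ("t1", "x"): 0}
session_of = {"t0": "s0", "t1": "sA"}
inf = math.inf

# The read switches to t0 and is itself the boundary event of session sB.
ok = respects_boundary(reads, {("sB", 0): "t0"}, obs,
                       {"s0": inf, "sA": inf, "sB": 0}, wrpos, session_of)
# The same switch without a boundary in sB violates the first constraint.
bad_switch = respects_boundary(reads, {("sB", 0): "t0"}, obs,
                               {"s0": inf, "sA": inf, "sB": inf}, wrpos, session_of)
# Reading from a write that lies on or after sA's boundary violates the second.
bad_write = respects_boundary(reads, {("sB", 0): "t1"}, obs,
                              {"s0": inf, "sA": 0, "sB": inf}, wrpos, session_of)
```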

To exclude events after the prediction boundary, IsoPredict generates modified constraints for all arbitration and anti-dependency rules, as detailed in Appendix B.

5. Validation

Even with the prediction boundary, IsoPredict’s predictive analysis may report unserializable predicted executions for which the corresponding validating execution is serializable. To rule out such predictions, IsoPredict can attempt to validate predicted executions by executing the data store application based on the predicted execution history and checking whether the resulting validating execution is unserializable.

Validating execution

Validation produces the validating execution using a query engine that takes the predicted execution as input. At each read( $k$ ) event, the query engine checks that (1) the corresponding read in the predicted execution also reads $k$; (2) the writer transaction $t$ from the predicted execution also wrote to $k$ in the validating execution; and (3) reading from $t$ in the validating execution will satisfy the weak isolation model (causal or rc). If any of these conditions is violated, we categorize the execution as having diverged, and the query engine chooses a different, weak isolation model–conforming writer for the read to read from. Note that it is always possible to keep executing while preserving causal or rc (Bouajjani et al., 2023). Furthermore, the validating execution may still be unserializable, as our evaluation shows.
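The per-read decision described above can be sketched as a small function. This is an illustrative Python sketch, not the MonkeyDB-based query engine: `conforms` is a hypothetical callable standing in for the isolation-level check, and `candidates` is a hypothetical list of fallback writers in the engine's preferred order.

```python
def choose_writer(key, predicted, wrote, conforms, candidates):
    """Pick the writer for a read of `key` during validation.

    predicted:  (key, writer) of the corresponding read in the predicted
                execution, or None if there is no corresponding read
    wrote:      dict txn -> keys written so far in the validating execution
    conforms:   callable txn -> bool; True if reading from txn keeps the
                execution within the weak isolation model (assumed helper)
    candidates: fallback writers to try, in order
    Returns (writer, diverged).
    """
    if predicted is not None:
        pkey, pwriter = predicted
        if (pkey == key                              # condition (1)
                and key in wrote.get(pwriter, ())    # condition (2)
                and conforms(pwriter)):              # condition (3)
            return pwriter, False
    # Divergence: fall back to some other conforming writer of the key.
    for w in candidates:
        if key in wrote.get(w, ()) and conforms(w):
            return w, True
    raise RuntimeError("no conforming writer found")


wrote = {"t0": {"x"}, "t1": {"x"}}
always = lambda txn: True

followed = choose_writer("x", ("x", "t1"), wrote, always, ["t0", "t1"])
other_key = choose_writer("x", ("y", "t1"), wrote, always, ["t0", "t1"])
never_wrote = choose_writer("x", ("x", "t2"), wrote, always, ["t0", "t1"])
non_conforming = choose_writer("x", ("x", "t1"), wrote,
                               lambda txn: txn != "t1", ["t0", "t1"])
```

The final `raise` is only a guard for the sketch; as noted above, a conforming writer always exists under causal and rc.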

Recall that the predicted execution history contains events only up to the prediction boundary. To avoid serendipitously introducing unserializable behaviors that were not part of the predicted execution (which could make it tough to measure the effectiveness of IsoPredict’s predictive analysis), validation executes each transaction in full that is on the boundary or that happens-before any transaction on the boundary—and then it terminates the execution. This approach is sufficient: If this execution prefix is unserializable, then so is the full execution.

Note that validation must directly control what transaction each read reads from, i.e., the write–read relation ( $\mathit{wr}$ ). Our evaluation extends MonkeyDB (Biswas et al., 2021) to allow explicit control of $\mathit{wr}$ (§6). In settings where MonkeyDB cannot be used, such as production systems, there are other ways to control $\mathit{wr}$ . One is using resource locks (e.g., sp_getapplock in SQL Server) to force specific transaction orders that produce the desired $\mathit{wr}$ relation.

Checking serializability

Validation generates constraints to check whether the validating execution history is serializable (which can be encoded more efficiently than unserializable, since serializable implies a total commit order exists). If the solver returns “satisfiable,” IsoPredict reports no prediction. Otherwise (the solver returns “unsatisfiable”), IsoPredict reports the validating execution, which is known to be a feasible, unserializable, weak isolation model–conforming execution.
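For intuition, serializability of a small, fixed history can also be decided by brute force. The sketch below is pure Python and swaps the SMT encoding for an exhaustive search over commit orders, so it is exponential and only suitable for tiny examples. It assumes the usual reading of the serializability axiom: a total order must contain $\mathit{so}$ and $\mathit{wr}$, and no other writer of $k$ may fall between a read of $k$ and the write it reads from.

```python
from itertools import permutations


def is_serializable(txns, so, wr, writes):
    """True iff some total commit order explains every read.

    so: set of (t1, t2); wr: set of (key, writer, reader);
    writes: dict mapping each transaction to the keys it writes.
    """
    for order in permutations(sorted(txns)):
        pos = {t: i for i, t in enumerate(order)}
        if any(pos[a] > pos[b] for a, b in so):
            continue                       # violates session order
        if any(pos[w] > pos[r] for _, w, r in wr):
            continue                       # a read precedes its writer
        # Each read must see the latest preceding write to its key.
        if all(not (pos[w] < pos[t] < pos[r])
               for k, w, r in wr for t in txns
               if t not in (w, r) and k in writes.get(t, ())):
            return True
    return False


# Write skew: both transactions read the initial state and overwrite
# what the other one read.
write_skew = is_serializable(
    ["t0", "t1", "t2"], set(),
    {("x", "t0", "t1"), ("y", "t0", "t2")},
    {"t0": {"x", "y"}, "t1": {"y"}, "t2": {"x"}})

# A chain in which t2 reads what t1 wrote.
chain = is_serializable(
    ["t0", "t1", "t2"], set(),
    {("x", "t0", "t1"), ("y", "t1", "t2")},
    {"t0": {"x", "y"}, "t1": {"y"}, "t2": {"x"}})
```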

6. Implementation

This section describes the implementation of IsoPredict, which is publicly available (Geng et al., 2024b).

Predictive analysis

We implemented IsoPredict’s predictive analysis (§4) as a Python program that uses Z3Py, the Python binding of the Z3 SMT solver (de Moura and Bjørner, 2008). Observed and predicted execution histories are in the form of traces containing read and write events and transaction and session identifiers, including the transaction that each read reads from. If Z3 finds a predicted unserializable execution, IsoPredict either reports the predicted execution history in both textual and graphical forms or passes the predicted history to the validation component, depending on how it is configured.

To generate observed execution traces, we extended the implementation of MonkeyDB, a transactional key–value data store (Biswas et al., 2021). MonkeyDB handles relational queries by translating them to key–value queries. MonkeyDB executes transactions serially, and we configured it to choose the latest writer at each read, so observed executions are always serializable.

Validation

IsoPredict’s validation component replays the client application on a customized query engine that we also built on top of MonkeyDB. The query engine executes transactions one at a time, in an order dictated by the predicted execution, to ensure that read events always occur after their writers. At each read, the query engine chooses a last writer that satisfies the weak isolation model and, if possible, matches the predicted execution (§5). Validation uses Z3Py to generate and solve SMT constraints to determine if the validating execution history is unserializable, reporting the validating execution to the user in both textual and graphical forms if so.

The customized query engine handles transaction aborts by rewinding the predicted execution trace to the beginning of the current transaction. In our experiments, every transaction that aborted during the observed execution also aborts during the validating execution, except in a few cases in which a transaction that aborted in the observed execution and immediately precedes a committed transaction on the prediction boundary actually commits in the validating execution. As for other divergent behavior, the resulting validating execution may or may not be unserializable.

7. Evaluation

This section evaluates how effectively and efficiently IsoPredict predicts unserializable executions under causal and rc, and it compares empirically against prior work MonkeyDB (Biswas et al., 2021).

7.1. Methodology

Prediction strategies

Table 2. IsoPredict prediction strategies.

Pred. strategy	Encoding precision	Pred. boundary	Divergence $\Rightarrow$ false predictions?
Exact-Strict	Exact encoding	Strict	Only because of aborts
Approx-Strict	Approximate encoding	Strict	Only because of aborts
Approx-Relaxed	Approximate encoding	Relaxed	Yes

Table 2 shows the combinations of unserializability constraints and prediction boundaries that we evaluated, which we call prediction strategies. The Exact-Strict prediction strategy uses the precise encoding of unserializability (§4.2.1), while Approx-Strict and Approx-Relaxed encode the sufficient condition for unserializability (§4.2.2). Exact-Strict and Approx-Strict encode the strict prediction boundary, while Approx-Relaxed encodes the relaxed prediction boundary.

Benchmarks

We evaluated IsoPredict and MonkeyDB using transactional workloads from OLTP-Bench, a database testing framework that generates various workloads for benchmarking relational databases (Difallah et al., 2013). Table 3 shows quantitative characteristics of the evaluated benchmarks.

Our experiments used versions of the OLTP-Bench programs that the MonkeyDB authors ported to use simplified SQL queries recognized by MonkeyDB (Biswas et al., 2021). In these versions, each benchmark runs a nondeterministic number of transactions based on a specified time limit. For the purposes of our evaluation, we modified the benchmarks to be more deterministic for two reasons. First, determinism provides a more stable comparison among IsoPredict’s prediction strategies. Second, determinism helps with validation, since the validating execution can run the benchmark with the same random number generator (RNG) seed that the observed execution used. (To use validation in a production setting, one should record and replay the application (Galanis et al., 2008; Li et al., 2023).) We modified the benchmarks to be more deterministic by (1) fixing the number of sessions and transactions per session and (2) adding an RNG seed as a parameter to each benchmark. Although these modifications increase determinism, the benchmarks still execute nondeterministically because the interleaving of transactions is timing dependent. This source of nondeterminism does not hinder validation, which executes transactions in an order consistent with the predicted execution’s $\mathit{hb}$ relation.
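The two modifications can be pictured with a small workload generator. This Python sketch is not the OLTP-Bench code; the function name and the procedure names are hypothetical. It only shows how a fixed session count, a fixed number of transactions per session, and an explicit seed make the sequence of attempted transactions reproducible, while leaving their interleaving unspecified.

```python
import random


def generate_workload(seed, sessions=3, txns_per_session=4,
                      procedures=("deposit", "withdraw")):
    """Return one list of procedure names per session, determined by the seed."""
    rng = random.Random(seed)          # private RNG, so runs do not interfere
    return [[rng.choice(procedures) for _ in range(txns_per_session)]
            for _ in range(sessions)]


observed_run = generate_workload(seed=7)
validating_run = generate_workload(seed=7)     # same seed, same transactions
large_run = generate_workload(seed=7, txns_per_session=8)
```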

Table 3. Average number of events and committed transactions across 10 trials of each OLTP-Bench program.

	Small workload				Large workload
	KV accesses		Committed txns		KV accesses		Committed txns
Program	Reads	Writes	Total	(Read-only)	Reads	Writes	Total	(Read-only)
Smallbank	669.7	14.7	11.0	(3.5)	1271.3	30.5	20.3	(6.6)
Voter	763.0	6.0	12.0	(11.0)	919.0	6.0	24.0	(23.0)
TPC-C	3297.3	763.0	11.9	(0.9)	7025.6	1502.4	23.8	(1.7)
Wikipedia	1067.7	55.1	9.9	(8.8)	2677.1	111.1	22.8	(20.6)

Algorithm 3 Code executed by each of Voter’s transactions.

procedure Vote($\mathit{id}$)
    $\mathit{votes}\leftarrow\mathit{DataStore}.\mathit{get}(\mathit{id})$
    if $\mathit{votes}<1$ then
        $\mathit{DataStore}.\mathit{put}(\mathit{id},1)$
    $\mathit{DataStore}.\mathit{commit}()$

We configured each benchmark with both small and large workloads, in which three sessions each execute four or eight transactions, resulting in 12 or 24 attempted transactions, respectively. The number of committed transactions is somewhat fewer because all programs except Voter occasionally abort a transaction based on application-specific logic.

Platform

All experiments ran on an Intel Xeon server at 2.3 GHz with 16 cores, hyperthreading enabled, and 187 GB of RAM, running Linux.

7.2. IsoPredict’s Effectiveness and Performance

Tables 4 and 5 show IsoPredict’s effectiveness and performance at predicting unserializable executions under causal and rc, respectively. For each benchmark and each of IsoPredict’s three prediction strategies, we ran IsoPredict on 10 executions, each of which used one of 10 RNG seeds, which we kept consistent across prediction strategies and isolation levels.

Table 4. IsoPredict effectiveness and performance under causal. “T/O” means the solver did not finish within 24 hours. “Unk” means the solver returned “unknown” without reaching the timeout.

	Prediction	Prediction			Validation		Constraint gen.		Solving time
Program	strategy	Unk	Unsat	Sat	Validated	(Diverged)	# Literals	Time	Sat	Unsat
Smallbank	Exact-Strict	0	6	4	4	(0)	$140$ K	$8.8$ s	$13.9$ s	$11.3$ s
	Approx-Strict	0	6	4	4	(1)	$366$ K	$22.9$ s	$1.0$ s	$3.2$ s
	Approx-Relaxed	0	0	10	9	(1)	$366$ K	$22.9$ s	$0.6$ s	–
Voter	Exact-Strict	0	10	0	0	(0)	$687$ K	$61.7$ s	–	$64.5$ s
	Approx-Strict	0	10	0	0	(0)	$1,526$ K	$131.7$ s	–	$10.4$ s
	Approx-Relaxed	0	10	0	0	(0)	$1,526$ K	$132.1$ s	–	$10.0$ s
TPC-C	Exact-Strict	0	1	9	9	(0)	$3,493$ K	$220.4$ s	$230.4$ s	$752.3$ s
	Approx-Strict	0	1	9	9	(0)	$6,508$ K	$425.8$ s	$35.1$ s	$105.2$ s
	Approx-Relaxed	0	0	10	10	(0)	$6,508$ K	$425.5$ s	$22.7$ s	–
Wikipedia	Exact-Strict	1	9	0	0	(0)	$180$ K	$13.9$ s	–	$24.0$ s
	Approx-Strict	0	10	0	0	(0)	$529$ K	$36.3$ s	–	$1.3$ s
	Approx-Relaxed	0	8	2	2	(1)	$529$ K	$36.3$ s	$2.5$ s	$1.0$ s

((a)) Small workload

	Prediction	Prediction			Validation		Constraint gen.		Solving time
Program	strategy	T/O	Unsat	Sat	Validated	(Diverged)	# Literals	Time	Sat	Unsat
Smallbank	Exact-Strict	4	1	5	5	(1)	$1,073$ K	$55.6$ s	$8,618.9$ s	$2,366.2$ s
	Approx-Strict	1	0	9	9	(0)	$2,175$ K	$121.0$ s	$332.5$ s	–
	Approx-Relaxed	0	0	10	10	(0)	$1,073$ K	$118.8$ s	$19.3$ s	–
Voter	Exact-Strict	9	1	0	0	(0)	$2,623$ K	$235.1$ s	–	$5,708.7$ s
	Approx-Strict	0	10	0	0	(0)	$5,623$ K	$490.5$ s	–	$47.2$ s
	Approx-Relaxed	0	10	0	0	(0)	$5,623$ K	$496.1$ s	–	$47.1$ s
TPC-C	Exact-Strict	4	3	3	3	(0)	$36,434$ K	$1,914.6$ s	$30,413.1$ s	$24,281.2$ s
	Approx-Strict	2	0	8	8	(0)	$60,834$ K	$3,416.1$ s	$1,210.3$ s	–
	Approx-Relaxed	0	0	10	10	(2)	$60,834$ K	$3,332.3$ s	$186.2$ s	–
Wikipedia	Exact-Strict	8	1	1	1	(0)	$1,773$ K	$111.9$ s	$910.2$ s	$1,876.8$ s
	Approx-Strict	0	9	1	1	(0)	$4,316$ K	$263.7$ s	$15.6$ s	$30.1$ s
	Approx-Relaxed	0	8	2	2	(2)	$4,316$ K	$258.3$ s	$20.3$ s	$25.3$ s

((b)) Large workload

Table 5. IsoPredict effectiveness and performance under rc. “T/O” means the solver did not finish within 24 hours. “Unk” means the solver returned “unknown” without reaching the timeout.

	Prediction	Prediction			Validation		Constraint gen.		Solving time
Program	strategy	Unk	Unsat	Sat	Validated	(Diverged)	# Literals	Time	Sat	Unsat
Smallbank	Exact-Strict	0	0	10	10	(0)	$144$ K	$10.0$ s	$2.3$ s	–
	Approx-Strict	0	0	10	10	(0)	$370$ K	$24.3$ s	$0.8$ s	–
	Approx-Relaxed	0	0	10	10	(0)	$370$ K	$24.4$ s	$0.6$ s	–
Voter	Exact-Strict	0	0	10	10	(2)	$688$ K	$62.5$ s	$12.9$ s	–
	Approx-Strict	0	0	10	10	(7)	$1,527$ K	$133.0$ s	$12.3$ s	–
	Approx-Relaxed	0	0	10	10	(10)	$1,527$ K	$132.7$ s	$12.7$ s	–
TPC-C	Exact-Strict	0	0	10	10	(0)	$3,855$ K	$359.0$ s	$52.0$ s	–
	Approx-Strict	0	0	10	10	(0)	$6,869$ K	$569.2$ s	$27.2$ s	–
	Approx-Relaxed	0	0	10	10	(3)	$6,869$ K	$588.9$ s	$22.8$ s	–
Wikipedia	Exact-Strict	2	1	7	7	(2)	$184$ K	$15.4$ s	$3.9$ s	$8.0$ s
	Approx-Strict	0	3	7	7	(1)	$533$ K	$38.3$ s	$2.1$ s	$0.5$ s
	Approx-Relaxed	0	3	7	7	(7)	$533$ K	$38.1$ s	$1.7$ s	$0.5$ s

((a)) Small workload

	Prediction	Prediction			Validation		Constraint gen.		Solving time
Program	strategy	T/O	Unsat	Sat	Validated	(Diverged)	# Literals	Time	Sat	Unsat
Smallbank	Exact-Strict	0	0	10	9	(1)	$1,085$ K	$60.3$ s	$624.6$ s	–
	Approx-Strict	0	0	10	10	(1)	$2,187$ K	$124.5$ s	$19.0$ s	–
	Approx-Relaxed	0	0	10	10	(1)	$2,187$ K	$128.7$ s	$51.7$ s	–
Voter	Exact-Strict	0	0	10	10	(2)	$2,625$ K	$255.7$ s	$212.0$ s	–
	Approx-Strict	0	0	10	10	(6)	$5,625$ K	$491.7$ s	$76.2$ s	–
	Approx-Relaxed	0	0	10	10	(10)	$5,625$ K	$495.2$ s	$75.4$ s	–
TPC-C	Exact-Strict	0	0	10	10	(2)	$38,062$ K	$2,571.0$ s	$898.6$ s	–
	Approx-Strict	0	0	10	10	(2)	$62,462$ K	$3,981.9$ s	$279.7$ s	–
	Approx-Relaxed	0	0	10	10	(4)	$62,462$ K	$4,040.4$ s	$201.0$ s	–
Wikipedia	Exact-Strict	0	0	10	10	(1)	$1,807$ K	$124.6$ s	$81.4$ s	–
	Approx-Strict	0	0	10	10	(1)	$4,350$ K	$272.8$ s	$29.2$ s	–
	Approx-Relaxed	0	0	10	9	(10)	$4,350$ K	$272.9$ s	$16.9$ s	–

(b) Large workload

Predictive analysis

The Sat column under Prediction reports the number of unserializable executions (out of 10) that IsoPredict found. The Approx-Relaxed prediction strategy generally predicts more than the other strategies because it uses the relaxed boundary. Although Exact-Strict can theoretically predict more executions than Approx-Strict, this never happened in our experiments.

IsoPredict consistently predicts more unserializable executions under rc than under causal, which makes sense because rc is strictly weaker than causal. Voter has the biggest difference: there were no successful predictions under causal. This is because every observed execution of Voter has only one writing (i.e., non-read-only) transaction (see Algorithm 3), which is not sufficient to predict an unserializable execution under causal. [Footnote 5: More specifically, the initial state transaction $t_{0}$ and the writing transaction $t_{w}$ constitute the only pair of conflicting writes. If a transaction $t_{r}$ reads from the initial state, then a commit order with $t_{r}$ preceding $t_{w}$ is acyclic. On the other hand, if $t_{r}$ reads from another transaction, then a commit order with $t_{r}$ following $t_{w}$ is acyclic.] Similarly, IsoPredict has low prediction rates for Wikipedia, which has few writing transactions. In contrast, under rc, a transaction may legally read both the initial state and the writing transaction, which is why IsoPredict has higher prediction rates for Voter and Wikipedia under rc than under causal. §4.4 and Appendix C present some observed and predicted executions from the evaluated benchmarks.
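The single-writer argument can be checked by brute force on a toy history. The sketch below is not IsoPredict's SMT encoding: it simply enumerates every commit order and replays the reads. The keys `x` and `y`, the helper `is_serializable`, and the assumption that a transaction performs its reads before its own writes are hypothetical and chosen only for illustration.

```python
from itertools import permutations

def is_serializable(writes, reads, init="t0"):
    """Brute-force check: does some commit order explain every read?

    writes: transaction -> set of keys it writes
    reads:  transaction -> {key: transaction the value was read from}
    The initial-state transaction `init` is always ordered first.
    """
    others = [t for t in writes if t != init]
    for order in permutations(others):
        last_writer = {key: init for key in writes[init]}
        consistent = True
        for t in order:
            # Assumption: a transaction's reads precede its own writes.
            for key, source in reads.get(t, {}).items():
                if last_writer.get(key) != source:
                    consistent = False
                    break
            if not consistent:
                break
            for key in writes[t]:
                last_writer[key] = t
        if consistent:
            return True
    return False

# Hypothetical history: initial state t0, one writing transaction tw, one reader tr.
writes = {"t0": {"x", "y"}, "tw": {"x", "y"}, "tr": set()}

reads_initial = {"tr": {"x": "t0", "y": "t0"}}  # explained by ordering tr before tw
reads_writer = {"tr": {"x": "tw", "y": "tw"}}   # explained by ordering tr after tw
reads_mixed = {"tr": {"x": "t0", "y": "tw"}}    # reads x from t0 first, then y from tw

print(is_serializable(writes, reads_initial))  # True
print(is_serializable(writes, reads_writer))   # True
print(is_serializable(writes, reads_mixed))    # False
```

Only the mixed read, which the text describes as legal under rc, leaves no consistent commit order; a reader that sees a single source can always be placed before or after the one writing transaction.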

Validation

We configured IsoPredict to validate every predicted unserializable execution. The Validated column reports the number of validating executions that were unserializable. Across all experiments, all but three predicted executions were successfully validated as unserializable.

The Diverged column shows that, in many cases, the validating execution diverged, i.e., it could not match the predicted execution history (§5). Unsurprisingly, the relaxed boundary experienced significantly more divergence than the strict boundary. However, divergence rarely resulted in failed validation: Among the 81 divergent executions across Tables 4 and 5, only three failed validation (i.e., produced serializable executions). One validation failure was caused by divergent behavior unrelated to aborts (§5), and the other two failures were caused by previously aborted transactions being committed (an implementation issue discussed in §6).

Performance

The four rightmost columns of each table report the performance of IsoPredict’s predictive analysis, which consists of two components: (1) the time the analysis takes to generate SMT constraints (Constraint gen.) and (2) SMT solving time (Solving time). Each table also reports the size of the generated constraints (# Literals), which correlates with constraint generation time. [Footnote 6: The Approx-Strict and Approx-Relaxed prediction strategies generate different constraints, but they have the same size.] SMT solving is significantly faster for successful prediction (Sat) than for failed prediction (Unsat), so each table reports the two average solving times separately. [Footnote 7: It makes sense that successful prediction, which finds a single satisfying solution, is faster than failed prediction, which requires the solver to prove that no satisfying solution exists.]

Compared to the other prediction strategies, Exact-Strict, which generates a single quantified constraint, spends less time generating constraints but more time solving constraints because its constraints are inherently harder to solve. Approx-Relaxed and Approx-Strict have performance similar to each other, which makes sense since they share the same approximation techniques.

Generating constraints can take a long time, often longer than constraint-solving time. We investigated this issue by using the perf (perf, 2024) and py-spy (Frederickson, 2024) performance analysis tools on the slowest instance of constraint generation: the large workload of TPC-C under rc using the Approx-Relaxed strategy (Table 5). To the best of our understanding, 97% of the time is spent in Python code (IsoPredict and Z3Py), and 3% is spent in C code (Z3). Of the time spent in Python, 81% is spent in Z3Py functions, with most time spent in the following Z3Py API functions and their callees: __call__(), And(), and Or(). The __call__() function is part of Z3Py’s implementation of SMT functions, which act as callable objects in Python. The And() and Or() functions create conjunction and disjunction clauses, respectively. Z3Py functions call into Z3 code written in C; an unknown fraction of the time spent in Z3Py is due to making cross-language calls from Z3Py to Z3.
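Composing the two reported percentages gives a rough breakdown of total constraint generation time. This is plain arithmetic on the figures above, not an additional measurement:

$$
\underbrace{0.97 \times 0.81}_{\text{Z3Py functions}} \approx 0.79, \qquad
\underbrace{0.97 \times 0.19}_{\text{other Python code}} \approx 0.18, \qquad
\underbrace{0.03}_{\text{C code (Z3)}}
$$

That is, roughly four fifths of constraint generation time is attributed to Z3Py functions and their callees.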

7.3. Comparison with MonkeyDB

MonkeyDB is a transactional key–value data store that aims to produce unusual executions that are legal under a target isolation level (Biswas et al., 2021). MonkeyDB handles each read to a key by returning a randomly chosen value among the set of legal values under the target isolation level.
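The read-handling idea can be sketched in a few lines. This is not MonkeyDB's implementation: the function `monkey_read` and the `legal_values` table are hypothetical, and the sketch assumes that the set of values legal under the target isolation level has already been computed for each key, which is the hard part in practice.

```python
import random

def monkey_read(key, legal_values, rng):
    """Return one randomly chosen value among those legal for `key`."""
    candidates = legal_values[key]
    if not candidates:
        raise KeyError(f"no legal value for key {key!r}")
    return rng.choice(candidates)

# Hypothetical state: the values the isolation level would allow each read to return.
legal_values = {"balance": [100, 80, 50], "votes": [0]}

rng = random.Random(42)  # a fixed seed makes a run repeatable
observed = [monkey_read("balance", legal_values, rng) for _ in range(5)]
print(observed)
```

Seeding the generator is what makes it possible to repeat a run, while varying the seed exercises different legal outcomes.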

MonkeyDB and IsoPredict both aim to find erroneous executions under weak isolation, but they use completely different approaches. MonkeyDB relies on a customized query engine that produces a single execution, while IsoPredict uses predictive analysis to analyze an equivalence class of many executions at once. They also differ in how they define and expose unserializable behavior: IsoPredict tries to find an unserializable execution, while MonkeyDB uses programmer-crafted assertions to detect unserializable behaviors.

Tables 6 and 7 compare the effectiveness of MonkeyDB and IsoPredict at finding unserializable executions. To account for MonkeyDB’s randomized approach, we ran it 100 times for each configuration: 10 times for each of the 10 RNG seeds used as benchmark input (§7.1). The percentage of these executions with an assertion failure is reported in the Fail column.

Table 6. Comparison between MonkeyDB (Biswas et al., 2021) and IsoPredict (Approx-Relaxed strategy) under causal. The numbers report how often a benchmark assertion failed (Fail) or the history was unserializable (Unser).

	MonkeyDB		IsoPredict
Program	Fail	Unser	Unser
Smallbank	70%	98%	90%
Voter	70%	80%	0%
TPC-C	98%	100%	100%
Wikipedia	0%	11%	20%

(a) Small workload

	MonkeyDB		IsoPredict
Program	Fail	Unser	Unser
Smallbank	84%	100%	100%
Voter	56%	80%	0%
TPC-C	100%	100%	100%
Wikipedia	0%	19%	20%

(b) Large workload

Table 7. Comparison between MonkeyDB (Biswas et al., 2021), IsoPredict (Approx-Strict strategy), and regular execution using MySQL under rc. Each number is the percentage of runs in which a benchmark assertion failed (Fail) or the history was unserializable (Unser).

	MonkeyDB		IsoPredict	MySQL
Program	Fail	Unser	Unser	Fail
Smallbank	100%	100%	100%	0%
Voter	89%	100%	100%	0%
TPC-C	100%	100%	100%	50%
Wikipedia	54%	54%	70%	0%

(a) Small workload

	MonkeyDB		IsoPredict	MySQL
Program	Fail	Unser	Unser	Fail
Smallbank	100%	100%	100%	0%
Voter	95%	100%	100%	0%
TPC-C	100%	100%	100%	70%
Wikipedia	89%	89%	100%	0%

(b) Large workload

To compare MonkeyDB and IsoPredict directly, we computed whether each execution produced by MonkeyDB was unserializable, by generating and solving constraints corresponding to the definition of serializable. An assertion failure is a sufficient but not necessary condition for an unserializable execution; hence, for MonkeyDB, the number of executions failing assertions (Fail) never exceeds the number of unserializable executions (Unser).
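The relationship between the two MonkeyDB columns can be checked mechanically. The sketch below transcribes the Fail and Unser percentages from Tables 6 and 7 and confirms that Fail never exceeds Unser; the dictionary layout and variable names are illustrative only.

```python
# (isolation level, workload, program) -> (Fail %, Unser %) for MonkeyDB,
# transcribed from Tables 6 and 7.
monkeydb = {
    ("causal", "small", "Smallbank"): (70, 98),
    ("causal", "small", "Voter"): (70, 80),
    ("causal", "small", "TPC-C"): (98, 100),
    ("causal", "small", "Wikipedia"): (0, 11),
    ("causal", "large", "Smallbank"): (84, 100),
    ("causal", "large", "Voter"): (56, 80),
    ("causal", "large", "TPC-C"): (100, 100),
    ("causal", "large", "Wikipedia"): (0, 19),
    ("rc", "small", "Smallbank"): (100, 100),
    ("rc", "small", "Voter"): (89, 100),
    ("rc", "small", "TPC-C"): (100, 100),
    ("rc", "small", "Wikipedia"): (54, 54),
    ("rc", "large", "Smallbank"): (100, 100),
    ("rc", "large", "Voter"): (95, 100),
    ("rc", "large", "TPC-C"): (100, 100),
    ("rc", "large", "Wikipedia"): (89, 89),
}

# Configurations that would contradict "assertion failure implies unserializable".
violations = [cfg for cfg, (fail, unser) in monkeydb.items() if fail > unser]

# Configurations where some unserializable runs did not trip an assertion.
gaps = {cfg: unser - fail for cfg, (fail, unser) in monkeydb.items() if unser > fail}

print(violations)  # []
print(len(gaps))   # 9
```

The nine configurations with a gap are those in which some unserializable executions went unnoticed by the assertions.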

The IsoPredict column shows the percentage of executions that led to unserializable predictions that were successfully validated (i.e., same results as the Validation columns in Tables 4 and 5). The tables use the best-performing prediction strategy for each isolation level.

Quantitatively, MonkeyDB and IsoPredict are comparable, finding erroneous executions at similar rates, except for two cases. In one case—Voter under causal—MonkeyDB produces unserializable executions, but IsoPredict never predicts any. Voter issues only one write transaction under serializable (Algorithm 3), from which it is impossible to predict an unserializable execution under causal, because IsoPredict cannot predict events that did not happen in the observed execution. In contrast, since MonkeyDB chooses values on the fly, its choices of reads can lead Voter to perform additional writes, leading to unserializable behavior. In another case—Wikipedia under causal—IsoPredict is able to predict several unserializable executions while MonkeyDB never has assertion failures, since its assertions are not sensitive enough to detect unserializable behaviors.

Qualitatively, the approaches differ in two significant ways. First, IsoPredict does not require programmers to write assertions. Second and more significantly, IsoPredict predicts unserializable executions from observed executions, which in theory could be produced by any data store. In contrast, MonkeyDB’s approach requires its specialized query engine.

Comparison with regular execution

Both MonkeyDB and IsoPredict routinely produce unserializable executions for the evaluated programs, but a natural question is whether executing these programs normally on a real-world data store yields unserializable executions. To evaluate this question, we executed the programs using MySQL (MySQL, 2023a) in rc mode (MySQL does not support causal). As with the MonkeyDB runs, we executed each program 100 times (10 times for each of the 10 RNG seeds used as input to the program) and evaluated the assertions used by MonkeyDB.

Table 7’s MySQL columns show the percentage of runs in which an assertion failed, a sufficient condition for an unserializable history. The results show that Smallbank, Voter, and Wikipedia never experienced an assertion failure under regular execution. [Footnote 8: It is an open question whether MySQL in rc mode can actually produce unserializable executions for these programs. Data store implementations may preclude behaviors that are theoretically possible under the target isolation level.] TPC-C experienced an assertion failure half of the time on the small workload and 70% of the time on the large workload. In contrast, MonkeyDB and IsoPredict often produce assertion-failing, unserializable executions.

Differences between our MonkeyDB results and the MonkeyDB paper’s results

In our experiments, MonkeyDB triggered fewer assertion failures than reported in the MonkeyDB paper (Biswas et al., 2021). These differences exist because we found and fixed a few bugs in the ported benchmarks and their assertions, which eliminated a few spurious failures. We confirmed all of the bugs and fixes with the MonkeyDB authors (Biswas et al., 2023). To be clear, the differences do not impact the MonkeyDB paper’s takeaway: MonkeyDB often produces unserializable, erroneous executions for the evaluated programs.

8. Related Work

The closest existing approaches to IsoPredict are arguably MonkeyDB (Biswas et al., 2021), IsoDiff (Gan et al., 2020), 2AD (Warszawski and Bailis, 2017), and Sinha et al.’s predictive analysis (Sinha et al., 2012). As §7.3 explained, MonkeyDB produces a single execution, which may or may not be unserializable, while IsoPredict predicts unserializable executions from an observed execution. IsoPredict can in theory work with any data store that can generate execution traces, while MonkeyDB requires its specialized query engine.

IsoDiff and 2AD detect unserializable behaviors based on an observed execution (Gan et al., 2020; Warszawski and Bailis, 2017). They build an abstract graph that does not take into account potential dependencies between read values. As the 2AD paper acknowledges, “2AD’s abstract histories are value-agnostic and do not account for control flow within a program; in effect, 2AD’s abstract history construction process assumes that each variable read and written can assume arbitrary values. However, there are often dependencies (e.g., $y=x+1$ ) between the values that variables assume” (Warszawski and Bailis, 2017). As a result, 2AD incurs high false positive rates even after using programmer-guided refinement: 37 reported “witnesses” on average per application, but only 22 bugs across 12 applications, or 2 bugs on average per application (Warszawski and Bailis, 2017).

In contrast, IsoPredict accounts for dependencies among read values through its axiomatic encoding of constraints, which permits encoding of potential dependencies using the prediction boundary. IsoPredict may still report false positives, but for narrower reasons: divergent aborts or (only when using the relaxed boundary) intra-transaction dependencies.

Sinha et al.’s analysis predicts atomicity violations in shared-memory multithreaded programs by encoding the conditions for unserializability as SMT constraints (Sinha et al., 2012). A key difference from IsoPredict is that Sinha et al.’s work deals with execution histories of shared-memory programs, in which all pairs of conflicting accesses are fully ordered, while IsoPredict deals with execution histories of distributed data store applications, in which conflicting accesses are unordered in general. As a result, Sinha et al.’s work only needs to encode graph cyclicity, while IsoPredict must encode that every potential commit order is cyclic. Addressing this unique challenge led us to develop IsoPredict’s approximate encoding (§4.2.2). Other differences include the different prediction spaces: Sinha et al.’s analysis predicts different orderings of critical sections on the same lock, while IsoPredict predicts different write–read orders.
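To illustrate why the fully ordered setting is simpler, the sketch below checks cyclicity of one fixed dependency graph with a depth-first search. It is not Sinha et al.'s SMT encoding, and the function name `has_cycle` and the transaction names are hypothetical. When conflicting accesses are unordered, a single such check is not enough, because the check has to range over every potential commit order.

```python
def has_cycle(edges):
    """Detect a cycle in a directed graph given as {node: [successors]}."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for succ in edges.get(node, ()):
            state = color.get(succ, WHITE)
            if state == GRAY:
                return True  # back edge: succ is still on the DFS stack
            if state == WHITE and visit(succ):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(edges))

# Hypothetical dependency graphs over transactions t1..t3.
cyclic = {"t1": ["t2"], "t2": ["t3"], "t3": ["t1"]}
acyclic = {"t1": ["t2"], "t2": ["t3"]}

print(has_cycle(cyclic))   # True
print(has_cycle(acyclic))  # False
```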

Dynamic analysis

Non-predictive dynamic analysis can check if an observed execution satisfies an isolation level. ECRacer checks whether an observed execution is serializable, using a relaxed definition of serializability that accounts for commutative operations (Brutschy et al., 2017). In contrast, IsoPredict finds new executions that violate serializability.

Prior work uses run-time testing and constraint solving to check if a data store provides a stated weak isolation level (Biswas and Enea, 2019; Kingsbury and Alvaro, 2020; Tan et al., 2020; Zhang et al., 2023; Zennou et al., 2022). In contrast, IsoPredict assumes the data store provides the target weak isolation level and predicts feasible unserializable executions.

Model checking explores multiple executions, avoiding exhaustively exploring all possible executions by using techniques such as dynamic partial order reduction (DPOR) (Abdulla et al., 2023; Bouajjani et al., 2023; Ghafoor et al., 2016). Conschecker uses a DPOR-based stateless model checking algorithm to verify distributed shared-memory programs under causal consistency (Abdulla et al., 2023). Bouajjani et al.’s work adapts DPOR-based algorithms to transactional database applications to check them under a range of isolation levels (Bouajjani et al., 2023).

Static analysis

Static analysis can find unserializable behavior, but precision and performance scale poorly with program size. C⁴ and Nagar and Jagannathan’s analysis detect serializability violations under causal consistency, eventual consistency, and snapshot isolation (Brutschy et al., 2018; Nagar and Jagannathan, 2018). Clotho uses static analysis, model checking, and test generation to detect unserializable executions; it avoids false positives by verifying the feasibility of unserializable behaviors (Rahmani et al., 2019). In contrast, IsoPredict detects unserializable behaviors with high precision by basing its predictions on a single observed execution.

Isolation levels

IsoPredict generates constraints based on isolation levels encoded in Biswas and Enea’s axiomatic framework (Biswas and Enea, 2019). Other prior work besides Biswas and Enea’s has introduced axiomatic encodings of weak isolation levels (Bouajjani et al., 2017; Perrin et al., 2016; Cerone et al., 2015; Kaki et al., 2018).

Adya et al. define various isolation levels with dependency graphs where each level allows certain types of cycles (Adya et al., 2000). Their approach encompasses “classical” database isolation levels such as read committed and snapshot isolation, but not isolation levels typically used in distributed data stores such as causal consistency (Alglave et al., 2014; Bouajjani et al., 2017; Burckhardt, 2014; Hamza, 2015; Perrin et al., 2016) and eventual consistency (Burckhardt, 2014).

IsoPredict currently supports only causal and rc, by encoding axioms from Biswas and Enea’s framework (Biswas and Enea, 2019). We expect extending IsoPredict to more isolation levels from their framework (read atomic, a.k.a. repeated reads, and snapshot isolation) to be straightforward. We do not know how difficult it would be to encode other isolation levels (e.g., eventual consistency and monotonic atomic view) into Biswas and Enea’s framework or into IsoPredict.

9. Conclusion

IsoPredict is the first predictive analysis for detecting unserializable behaviors of applications backed by weakly isolated data stores. IsoPredict’s design introduces novel approaches to address challenges involving constraint complexity, constraint encoding, and divergent behaviors. An evaluation shows that, based on observed executions of data store applications, IsoPredict effectively, precisely, and efficiently predicts feasible, unserializable behaviors.

Data-Availability Statement

An artifact reproducing this paper’s results is publicly available (Geng et al., 2024a).

Acknowledgements.

We thank the MonkeyDB authors (Biswas et al., 2021) for making their implementation publicly available and answering our questions about it; Vincent Beardsley and Noah Charlton for helpful discussions; and the anonymous reviewers for valuable feedback. This material is based in part upon work supported by the National Science Foundation under Grant Numbers NSF CCF-2118745, CSR-2106117, and OAC-2112606, and by Oracle America, Inc.

References

Abdulla et al. (2023) Parosh Abdulla, Mohamed Faouzi Atig, S. Krishna, Ashutosh Gupta, and Omkar Tuppe. 2023. Optimal Stateless Model Checking for Causal Consistency. In Tools and Algorithms for the Construction and Analysis of Systems, Sriram Sankaranarayanan and Natasha Sharygina (Eds.). Springer Nature Switzerland, Cham, 105–125.
Adya et al. (2000) A. Adya, B. Liskov, and P. O’Neil. 2000. Generalized isolation level definitions. In Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073). IEEE Computer Society, Los Alamitos, CA, USA, 67–78. https://doi.org/10.1109/ICDE.2000.839388
Ahamad et al. (1995) Mustaque Ahamad, Gil Neiger, James E. Burns, Prince Kohli, and P.W. Hutto. 1995. Causal Memory: Definitions, Implementation and Programming. Distributed Computing 9, 1 (1995), 37–49. https://doi.org/10.1007/BF01784241
Alglave et al. (2014) Jade Alglave, Luc Maranget, and Michael Tautschnig. 2014. Herding Cats: Modelling, Simulation, Testing, and Data Mining for Weak Memory. ACM Trans. Program. Lang. Syst. 36, 2, Article 7 (Jul 2014), 74 pages. https://doi.org/10.1145/2627752
Berenson et al. (1995) Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O’Neil, and Patrick O’Neil. 1995. A Critique of ANSI SQL Isolation Levels. In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data (San Jose, California, USA) (SIGMOD ’95). ACM, New York, NY, USA, 1–10. https://doi.org/10.1145/223784.223785
Biswas and Enea (2019) Ranadeep Biswas and Constantin Enea. 2019. On the Complexity of Checking Transactional Consistency. Proc. ACM Program. Lang. 3, OOPSLA, Article 165 (Oct 2019), 28 pages. https://doi.org/10.1145/3360591
Biswas et al. (2021) Ranadeep Biswas, Diptanshu Kakwani, Jyothi Vedurada, Constantin Enea, and Akash Lal. 2021. MonkeyDB: Effectively Testing Correctness under Weak Isolation Levels. Proc. ACM Program. Lang. 5, OOPSLA, Article 132 (Oct 2021), 27 pages. https://doi.org/10.1145/3485546
Biswas et al. (2023) Ranadeep Biswas, Diptanshu Kakwani, Jyothi Vedurada, Constantin Enea, and Akash Lal. 2023. Personal communication.
Bouajjani et al. (2017) Ahmed Bouajjani, Constantin Enea, Rachid Guerraoui, and Jad Hamza. 2017. On verifying causal consistency. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (Paris, France) (POPL ’17). Association for Computing Machinery, New York, NY, USA, 626–638. https://doi.org/10.1145/3009837.3009888
Bouajjani et al. (2023) Ahmed Bouajjani, Constantin Enea, and Enrique Román-Calvo. 2023. Dynamic Partial Order Reduction for Checking Correctness against Transaction Isolation Levels. Proc. ACM Program. Lang. 7, PLDI, Article 129 (Jun 2023), 26 pages. https://doi.org/10.1145/3591243
Brutschy et al. (2017) Lucas Brutschy, Dimitar Dimitrov, Peter Müller, and Martin Vechev. 2017. Serializability for Eventual Consistency: Criterion, Analysis, and Applications. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (Paris, France) (POPL ’17). Association for Computing Machinery, New York, NY, USA, 458–472. https://doi.org/10.1145/3009837.3009895
Brutschy et al. (2018) Lucas Brutschy, Dimitar Dimitrov, Peter Müller, and Martin Vechev. 2018. Static Serializability Analysis for Causal Consistency. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (Philadelphia, PA, USA) (PLDI 2018). Association for Computing Machinery, New York, NY, USA, 90–104. https://doi.org/10.1145/3192366.3192415
Burckhardt (2014) Sebastian Burckhardt. 2014. Principles of Eventual Consistency. Found. Trends Program. Lang. 1, 1–2 (Oct 2014), 1–150. https://doi.org/10.1561/2500000011
Cerone et al. (2015) Andrea Cerone, Giovanni Bernardi, and Alexey Gotsman. 2015. A Framework for Transactional Consistency Models with Atomic Visibility. In 26th International Conference on Concurrency Theory (CONCUR 2015) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 42), Luca Aceto and David de Frutos Escrig (Eds.). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 58–71. https://doi.org/10.4230/LIPIcs.CONCUR.2015.58
Cheng et al. (2023) Chaoyi Cheng, Mingzhe Han, Nuo Xu, Spyros Blanas, Michael D. Bond, and Yang Wang. 2023. Developer’s Responsibility or Database’s Responsibility? Rethinking Concurrency Control in Databases. In 13th Conference on Innovative Data Systems Research, CIDR 2023, Amsterdam, The Netherlands, January 8-11, 2023. www.cidrdb.org. https://www.cidrdb.org/cidr2023/papers/p30-cheng.pdf
Corbett et al. (2012) James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford. 2012. Spanner: Google’s Globally-Distributed Database. In 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12). USENIX Association, Hollywood, CA, 261–264. https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett
Crooks et al. (2017) Natacha Crooks, Youer Pu, Lorenzo Alvisi, and Allen Clement. 2017. Seeing is Believing: A Client-Centric Specification of Database Isolation. In Proceedings of the ACM Symposium on Principles of Distributed Computing (Washington, DC, USA) (PODC ’17). ACM, New York, NY, USA, 73–82. https://doi.org/10.1145/3087801.3087802
de Moura and Bjørner (2008) Leonardo de Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Tools and Algorithms for the Construction and Analysis of Systems, C. R. Ramakrishnan and Jakob Rehof (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 337–340.
Difallah et al. (2013) Djellel Eddine Difallah, Andrew Pavlo, Carlo Curino, and Philippe Cudre-Mauroux. 2013. OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases. Proc. VLDB Endow. 7, 4 (Dec 2013), 277–288. https://doi.org/10.14778/2732240.2732246
Elhemali et al. (2022) Mostafa Elhemali, Niall Gallagher, Nick Gordon, Joseph Idziorek, Richard Krog, Colin Lazier, Erben Mo, Akhilesh Mritunjai, Somasundaram Perianayagam, Tim Rath, Swami Sivasubramanian, James Christopher Sorenson III, Sroaj Sosothikul, Doug Terry, and Akshat Vig. 2022. Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service. In 2022 USENIX Annual Technical Conference (USENIX ATC 22). USENIX Association, Carlsbad, CA, 1037–1048. https://www.usenix.org/conference/atc22/presentation/elhemali
Frederickson (2024) Ben Frederickson. 2024. https://github.com/benfred/py-spy
Galanis et al. (2008) Leonidas Galanis, Supiti Buranawatanachoke, Romain Colle, Benoît Dageville, Karl Dias, Jonathan Klein, Stratos Papadomanolakis, Leng Leng Tan, Venkateshwaran Venkataramani, Yujun Wang, and Graham Wood. 2008. Oracle Database Replay. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data (Vancouver, Canada) (SIGMOD ’08). Association for Computing Machinery, New York, NY, USA, 1159–1170. https://doi.org/10.1145/1376616.1376732
Gan et al. (2020) Yifan Gan, Xueyuan Ren, Drew Ripberger, Spyros Blanas, and Yang Wang. 2020. IsoDiff: Debugging Anomalies Caused by Weak Isolation. Proc. VLDB Endow. 13, 12 (Jul 2020), 2773–2786. https://doi.org/10.14778/3407790.3407860
Geng et al. (2024a) Chujun Geng, Spyros Blanas, Michael D. Bond, and Yang Wang. 2024a. IsoPredict artifact. https://doi.org/10.5281/zenodo.10802748
Geng et al. (2024b) Chujun Geng, Spyros Blanas, Michael D. Bond, and Yang Wang. 2024b. IsoPredict implementation. https://github.com/PLaSSticity/IsoPredict-implementation
Ghafoor et al. (2016) M. Ghafoor, M. Mahmood, and J. Siddiqui. 2016. Effective Partial Order Reduction in Model Checking Database Applications. In 2016 IEEE International Conference on Software Testing, Verification and Validation (ICST). IEEE Computer Society, Los Alamitos, CA, USA, 146–156. https://doi.org/10.1109/ICST.2016.25
Gilbert and Lynch (2002) Seth Gilbert and Nancy Lynch. 2002. Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33 (June 2002), 51–59. Issue 2. https://doi.org/10.1145/564585.564601
Hamza (2015) Jad Hamza. 2015. Algorithmic Verification of Concurrent and Distributed Data Structures. Ph.D. Dissertation. Université Paris Diderot.
Huang et al. (2014) Jeff Huang, Patrick O’Neil Meredith, and Grigore Rosu. 2014. Maximal sound predictive race detection with control flow abstraction. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (Edinburgh, United Kingdom) (PLDI ’14). Association for Computing Machinery, New York, NY, USA, 337–348. https://doi.org/10.1145/2594291.2594315
Kaki et al. (2018) Gowtham Kaki, Kapil Earanky, KC Sivaramakrishnan, and Suresh Jagannathan. 2018. Safe replication through bounded concurrency verification. Proc. ACM Program. Lang. 2, OOPSLA, Article 164 (Oct 2018), 27 pages. https://doi.org/10.1145/3276534
Kingsbury and Alvaro (2020) Kyle Kingsbury and Peter Alvaro. 2020. Elle: Inferring Isolation Anomalies from Experimental Observations. Proc. VLDB Endow. 14, 3 (Nov 2020), 268–280. https://doi.org/10.14778/3430915.3430918
Kini et al. (2017) Dileep Kini, Umang Mathur, and Mahesh Viswanathan. 2017. Dynamic race prediction in linear time. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (Barcelona, Spain) (PLDI 2017). Association for Computing Machinery, New York, NY, USA, 157–170. https://doi.org/10.1145/3062341.3062374
Leino and Pit-Claudel (2016) K. R. M. Leino and Clément Pit-Claudel. 2016. Trigger Selection Strategies to Stabilize Program Verifiers. In Computer Aided Verification, Swarat Chaudhuri and Azadeh Farzan (Eds.). Springer International Publishing, Cham, 361–381.
Li et al. (2023) Qian Li, Peter Kraft, Michael Cafarella, Çağatay Demiralp, Goetz Graefe, Christos Kozyrakis, Michael Stonebraker, Lalith Suresh, Xiangyao Yu, and Matei Zaharia. 2023. R3: Record-Replay-Retroaction for Database-Backed Applications. Proc. VLDB Endow. 16, 11 (Jul 2023), 3085–3097. https://doi.org/10.14778/3611479.3611510
Mahajan et al. (2011) P. Mahajan, L. Alvisi, and M. Dahlin. 2011. Consistency, Availability, Convergence. Technical Report TR-11-22. Computer Science Department, University of Texas at Austin.
MySQL (2023a) MySQL 2023a. http://www.mysql.com
MySQL (2023b) MySQL 2023b. MySQL Cluster. https://www.mysql.com/products/cluster/
Nagar and Jagannathan (2018) Kartik Nagar and Suresh Jagannathan. 2018. Automated Detection of Serializability Violations under Weak Consistency. arXiv:1806.08416 [cs.PL]
Pavlo (2017) Andrew Pavlo. 2017. What Are We Doing With Our Lives? Nobody Cares About Our Concurrency Control Research. In Proceedings of the 2017 ACM International Conference on Management of Data (Chicago, Illinois, USA) (SIGMOD ’17). Association for Computing Machinery, New York, NY, USA, 3. https://doi.org/10.1145/3035918.3056096
perf (2024) perf 2024. https://perf.wiki.kernel.org/index.php/Main_Page
Perrin et al. (2016) Matthieu Perrin, Achour Mostefaoui, and Claude Jard. 2016. Causal Consistency: Beyond Memory. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (Barcelona, Spain) (PPoPP ’16). Association for Computing Machinery, New York, NY, USA, Article 26, 12 pages. https://doi.org/10.1145/2851141.2851170
Rahmani et al. (2019) Kia Rahmani, Kartik Nagar, Benjamin Delaware, and Suresh Jagannathan. 2019. CLOTHO: Directed Test Generation for Weakly Consistent Database Systems. Proc. ACM Program. Lang. 3, OOPSLA, Article 117 (Oct 2019), 28 pages. https://doi.org/10.1145/3360543
Roemer et al. (2020) Jake Roemer, Kaan Genç, and Michael D. Bond. 2020. SmartTrack: efficient predictive race detection. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (London, UK) (PLDI 2020). Association for Computing Machinery, New York, NY, USA, 747–762. https://doi.org/10.1145/3385412.3385993
Said et al. (2011) Mahmoud Said, Chao Wang, Zijiang Yang, and Karem Sakallah. 2011. Generating Data Race Witnesses by an SMT-Based Analysis. In NASA Formal Methods, Mihaela Bobaru, Klaus Havelund, Gerard J. Holzmann, and Rajeev Joshi (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 313–327. https://doi.org/10.1007/978-3-642-20398-5_23
Sinha et al. (2012) Arnab Sinha, Sharad Malik, Chao Wang, and Aarti Gupta. 2012. Predicting Serializability Violations: SMT-Based Search vs. DPOR-Based Search. In Hardware and Software: Verification and Testing, Kerstin Eder, João Lourenço, and Onn Shehory (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 95–114. https://doi.org/10.1007/978-3-642-34188-5_11
Snowflake (2023) Snowflake 2023. Snowflake transactions. https://docs.snowflake.com/en/sql-reference/transactions
Tan et al. (2020) Cheng Tan, Changgeng Zhao, Shuai Mu, and Michael Walfish. 2020. COBRA: making transactional key-value stores verifiably serializable. In Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation (OSDI’20). USENIX Association, USA, Article 4, 18 pages. https://www.usenix.org/conference/osdi20/presentation/tan
Tang et al. (2022) Chuzhe Tang, Zhaoguo Wang, Xiaodong Zhang, Qianmian Yu, Binyu Zang, Haibing Guan, and Haibo Chen. 2022. Ad Hoc Transactions in Web Applications: The Good, the Bad, and the Ugly. In Proceedings of the 2022 International Conference on Management of Data (Philadelphia, PA, USA) (SIGMOD ’22). Association for Computing Machinery, New York, NY, USA, 4–18. https://doi.org/10.1145/3514221.3526120
Tunç et al. (2023) Hünkar Can Tunç, Umang Mathur, Andreas Pavlogiannis, and Mahesh Viswanathan. 2023. Sound Dynamic Deadlock Prediction in Linear Time. Proc. ACM Program. Lang. 7, PLDI, Article 177 (Jun 2023), 26 pages. https://doi.org/10.1145/3591291
Warszawski and Bailis (2017) Todd Warszawski and Peter Bailis. 2017. ACIDRain: Concurrency-Related Attacks on Database-Backed Web Applications. In Proceedings of the 2017 ACM International Conference on Management of Data (Chicago, Illinois, USA) (SIGMOD ’17). ACM, New York, NY, USA, 5–20. https://doi.org/10.1145/3035918.3064037
Zennou et al. (2022) Rachid Zennou, Ranadeep Biswas, Ahmed Bouajjani, Constantin Enea, and Mohammed Erradi. 2022. Checking Causal Consistency of Distributed Databases. Computing 104, 10 (Oct 2022), 2181–2201. https://doi.org/10.1007/s00607-021-00911-3
Zhang et al. (2023) Jian Zhang, Ye Ji, Shuai Mu, and Cheng Tan. 2023. Viper: A Fast Snapshot Isolation Checker. In Proceedings of the Eighteenth European Conference on Computer Systems (Rome, Italy) (EuroSys ’23). Association for Computing Machinery, New York, NY, USA, 654–671. https://doi.org/10.1145/3552326.3567492

Appendix A Proof that Anti-Dependency Implies Commit Order

Here we prove the following claim from §4.2.2: Anti-dependency order must imply commit order, i.e., $\mathit{rw}\subseteq\mathit{co}$ for every valid $\mathit{co}$. The proof proceeds by contradiction, showing that a commit order that violates anti-dependency order also violates the arbitration rule:

Proof.

Suppose there exist $t_{1},t_{2}$ such that $\mathit{rw}(t_{1},t_{2})$ but $\neg\mathit{co}(t_{1},t_{2})$. By the definition of anti-dependency, there exist a key $k$ and a transaction $t_{w}$ such that $t_{2}$ writes $k$, $\mathit{wr}_{k}(t_{w},t_{1})$, and $\mathit{co}(t_{w},t_{2})$. Because $\neg\mathit{co}(t_{1},t_{2})$ and $\mathit{co}$ is a total order, $\mathit{co}(t_{2},t_{1})$ holds. Since $t_{2}$ writes $k$ and is ordered before $t_{1}$, which reads $k$ from $t_{w}$, the arbitration rule (Equation 1) gives $\mathit{co}(t_{2},t_{w})$. However, $\mathit{co}(t_{2},t_{w})$ contradicts $\mathit{co}(t_{w},t_{2})$ since $\mathit{co}$ is a total order. ∎
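As a sanity check of the argument, the claim can be tested by brute force on a tiny history. The following Python sketch is illustrative only and is not part of IsoPredict: the function names and the history encoding (a list of `(key, writer, reader)` triples for $\mathit{wr}_{k}$ and a map from each key to the set of transactions that write it) are assumptions made for this example. It enumerates every total order that satisfies the arbitration rule and checks that each anti-dependency edge derived from that order agrees with it.

```python
from itertools import permutations


def satisfies_arbitration(pos, wr, writers):
    """Arbitration rule: if tr reads k from tw, then any other writer t of k
    that is ordered before tr must also be ordered before tw."""
    for k, tw, tr in wr:
        for t in writers[k] - {tw, tr}:
            if pos[t] < pos[tr] and not pos[t] < pos[tw]:
                return False
    return True


def anti_dependencies(pos, wr, writers):
    """rw(t1, t2): t1 reads k from tw, t2 writes k, and tw is ordered before t2."""
    rw = set()
    for k, tw, t1 in wr:
        for t2 in writers[k] - {tw, t1}:
            if pos[tw] < pos[t2]:
                rw.add((t1, t2))
    return rw


def rw_implies_co(txns, wr, writers):
    """Check rw ⊆ co for every total order co that satisfies arbitration."""
    for order in permutations(txns):
        pos = {t: i for i, t in enumerate(order)}
        if not satisfies_arbitration(pos, wr, writers):
            continue
        if any(pos[a] > pos[b] for a, b in anti_dependencies(pos, wr, writers)):
            return False
    return True


# Hypothetical history: t1 reads k from tw, and t2 also writes k.
txns = ["tw", "t1", "t2"]
wr = [("k", "tw", "t1")]
writers = {"k": {"tw", "t2"}}
print(rw_implies_co(txns, wr, writers))
```

The only order in which an anti-dependency edge would point backwards is $t_{w}, t_{2}, t_{1}$, and that order is exactly the one the arbitration rule rejects, which mirrors the contradiction in the proof.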

Appendix B IsoPredict’s Full Constraints using the Prediction Boundary

This section shows the constraints generated by IsoPredict’s predictive analysis using the strict prediction boundary. For completeness we show all constraints generated by IsoPredict, including those that are unchanged compared with §4.

B.1. Encoding of Feasible Execution

$$\forall t_{1},t_{2}\in T,\ t_{1}\neq t_{2},\quad\begin{cases}\boxed{\phi_{\mathit{so}}(t_{1},t_{2})}&\textnormal{if }\mathit{so}(t_{1},t_{2})\\ \boxed{\neg\phi_{\mathit{so}}(t_{1},t_{2})}&\textnormal{otherwise}\end{cases}$$

$$\forall t_{1},t_{2}\in T,\ t_{1}\neq t_{2},\ \forall i\in\mathit{rdpos_{k}}(t_{2})=i,\ t_{2}\textnormal{'s read at pos $i$ reads from $t_{1}$ in }\mathit{wr_{obs}},\quad\boxed{\phi_{\mathit{obs}}(s_{2},i)=t_{1}}$$

$$\forall k\textnormal{ is a key},\ \forall t_{1}\textnormal{ writes }k,\ \forall t_{2}\textnormal{ reads }k,\ \forall i\in\mathit{rdpos_{k}}(t_{2}),\quad\boxed{i<\phi_{\mathit{boundary}}(s_{2})\implies\phi_{\mathit{choice}}(s_{2},i)=\phi_{\mathit{obs}}(s_{2},i)}$$

where $s_{1}$ is $t_{1}$’s session and $s_{2}$ is $t_{2}$’s session.

$$\begin{gathered}\forall k\textnormal{ is a key},\ \forall t_{1}\textnormal{ writes }k,\ \forall t_{2}\neq t_{1}\textnormal{ reads }k,\ \forall i\in\mathit{rdpos_{k}}(t_{2}),\\ \boxed{\phi_{\mathit{choice}}(s_{2},i)=t_{1}\land i\leq\phi_{\mathit{boundary}}(s_{2})\implies\mathit{wrpos}_{k}(t_{1})<\phi_{\mathit{boundary}}(s_{1})}\end{gathered}$$

where $s_{1}$ is $t_{1}$’s session and $s_{2}$ is $t_{2}$’s session, and $\mathit{wrpos}_{k}(t)$ is the position of $t$’s last write to key $k$.

$$\forall s\textnormal{ is a session},\quad\boxed{\Big(\bigvee_{\begin{subarray}{c}t\textnormal{ is a transaction in }s\\ i\in\mathit{rdpos_{k}}(t)\end{subarray}}\phi_{\mathit{boundary}}(s)=i\Big)\lor\phi_{\mathit{boundary}}(s)=\infty}$$

Recall that $\mathit{rdpos_{k}}(t)$ is the set of positions of reads to $k$ in the transaction $t$.

$$\begin{gathered}\forall k\textnormal{ is a key},\ \forall t_{1}\textnormal{ writes }k,\ \forall t_{2}\textnormal{ reads }k,\ t_{1}\neq t_{2},\\ \boxed{\phi_{\mathit{wr}_{k}}(t_{1},t_{2})=\bigvee_{i\in\mathit{rdpos_{k}}(t_{2})}\phi_{\mathit{choice}}(s_{2},i)=t_{1}\land i\leq\phi_{\mathit{boundary}}(s_{2})}\end{gathered}$$

where $s_{2}$ is $t_{2}$’s session.

$$\forall t_{1},t_{2}\in\mathit{T},\ t_{1}\neq t_{2},\quad\boxed{\phi_{\mathit{wr}}(t_{1},t_{2})=\bigvee_{k\textnormal{ is a key}}\phi_{\mathit{wr}_{k}}(t_{1},t_{2})}$$
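To make the role of the boundary concrete, the following Python sketch evaluates two of the constraints above on fixed values instead of solver variables: reads strictly before the boundary must keep their observed writer, and only reads at or before the boundary contribute to $\mathit{wr}_{k}$ and $\mathit{wr}$. It is a minimal illustration under stated assumptions, not IsoPredict's implementation: the dictionaries `choice`, `obs`, `boundary`, `rdpos`, and `session_of` are hypothetical stand-ins for $\phi_{\mathit{choice}}$, $\phi_{\mathit{obs}}$, $\phi_{\mathit{boundary}}$, $\mathit{rdpos_{k}}$, and the transaction-to-session map, and the constraint on the writer's own boundary is omitted.

```python
import math


def reads_match_before_boundary(choice, obs, boundary):
    """Every read strictly before its session's boundary keeps its observed writer."""
    return all(choice[(s, i)] == obs[(s, i)] for (s, i) in obs if i < boundary[s])


def derive_wr(choice, boundary, rdpos, session_of):
    """Build wr_k and wr from the chosen writers of reads at or before the boundary."""
    wr_k = set()
    for (t2, k), positions in rdpos.items():
        s2 = session_of[t2]
        for i in positions:
            t1 = choice[(s2, i)]
            if t1 != t2 and i <= boundary[s2]:
                wr_k.add((k, t1, t2))
    wr = {(t1, t2) for (_, t1, t2) in wr_k}
    return wr_k, wr


# Hypothetical execution: t2 (session s2) reads x at positions 3 and 7.
# Both reads observed t1; the prediction redirects the read at position 3 to t0.
session_of = {"t0": "s0", "t1": "s1", "t2": "s2"}
rdpos = {("t2", "x"): [3, 7]}
obs = {("s2", 3): "t1", ("s2", 7): "t1"}
choice = {("s2", 3): "t0", ("s2", 7): "t1"}
boundary = {"s0": math.inf, "s1": math.inf, "s2": 3}

feasible = reads_match_before_boundary(choice, obs, boundary)
wr_k, wr = derive_wr(choice, boundary, rdpos, session_of)
print(feasible, wr_k, wr)
```

In this example the read at position 7 lies past the boundary of `s2`, so it is excluded from $\mathit{wr}_{k}$ even though it still has an assigned writer.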

B.2. Encoding of Unserializability

B.2.1. Precise encoding

$$\boxed{\forall\phi_{\mathit{co}},\ \neg\mathit{IsSerializable}(\phi_{\mathit{co}})}$$

where $\mathit{IsSerializable}$ is defined as follows:

$$\begin{aligned}\mathit{IsSerializable}(\phi_{\mathit{co}})\coloneq\;&\mathit{Distinct}(\phi_{\mathit{co}}(t_{1}),\dots,\phi_{\mathit{co}}(t_{n}))\;\land\\ &\bigwedge_{\forall t_{1},t_{2}\in T,\,t_{1}\neq t_{2}}(\phi_{\mathit{so}}(t_{1},t_{2})\lor\phi_{\mathit{wr}}(t_{1},t_{2})\lor\mathit{Arbitration}(t_{1},t_{2}))\Rightarrow\phi_{\mathit{co}}(t_{1})<\phi_{\mathit{co}}(t_{2})\end{aligned}$$

where $t_{1},\dots,t_{n}$ are all transactions in $\mathit{T}$, and $\mathit{Distinct}(v_{1},\dots,v_{k})$ is a built-in SMT function that requires all input values to be distinct from each other.

$$\mathit{Arbitration}(t_{1},t_{2})\coloneq\bigvee_{\begin{subarray}{c}\forall k,\ t_{1}\textnormal{ and }t_{2}\textnormal{ write }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\ t_{3}\textnormal{ reads }k\end{subarray}}\phi_{\mathit{wr}_{k}}(t_{2},t_{3})\land\phi_{\mathit{co}}(t_{1})<\phi_{\mathit{co}}(t_{3})\land\mathit{wrpos}_{k}(t_{1})<\phi_{\mathit{boundary}}(s_{1})$$
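For a fixed, fully known history, the meaning of $\mathit{IsSerializable}$ can be illustrated without a solver by enumerating candidate commit orders. The Python sketch below is a minimal brute-force illustration and not how IsoPredict solves the constraint: it assumes every boundary is $\infty$ (so the $\mathit{wrpos}_{k}$ term is always true), and the function name and history encoding are hypothetical.

```python
from itertools import permutations


def is_serializable(txns, so, wr_k, writers):
    """Return True if some total order co contains so, wr, and the
    arbitration-induced edges (all boundaries assumed to be infinite)."""
    wr = {(a, b) for (_, a, b) in wr_k}
    for order in permutations(txns):
        co = {t: i for i, t in enumerate(order)}
        if not all(co[a] < co[b] for a, b in set(so) | wr):
            continue
        # Arbitration(t1, t2): t3 reads k from t2 and co(t1) < co(t3),
        # so co(t1) < co(t2) must hold as well.
        if all(
            co[t1] < co[t2]
            for (k, t2, t3) in wr_k
            for t1 in writers[k] - {t2, t3}
            if co[t1] < co[t3]
        ):
            return True
    return False


# Hypothetical lost-update history: t1 and t2 both read x from t0, then write x.
txns = ["t0", "t1", "t2"]
writers = {"x": {"t0", "t1", "t2"}}
lost_update = {("x", "t0", "t1"), ("x", "t0", "t2")}
# Serial variant: t2 reads x from t1 instead.
serial = {("x", "t0", "t1"), ("x", "t1", "t2")}

print(is_serializable(txns, set(), lost_update, writers))
print(is_serializable(txns, set(), serial, writers))
```

The enumeration is factorial in the number of transactions, so it is only practical for tiny examples like these.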

B.2.2. Approximate encoding

$$\forall t_{1},t_{2}\in T,\ t_{1}\neq t_{2},$$

$$\phi_{\mathit{ww}}(t_{1},t_{2})=\bigvee_{\begin{subarray}{c}\forall k,\ t_{1}\textnormal{ and }t_{2}\textnormal{ write }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\ t_{3}\textnormal{ reads }k\end{subarray}}\Big(\phi_{\mathit{wr}_{k}}(t_{2},t_{3})\land\phi_{\mathit{pco}}(t_{1},t_{3})\land\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t_{1},t_{3})\land\mathit{wrpos}_{k}(t_{1})<\phi_{\mathit{boundary}}(s_{1})\Big)$$

$$\phi_{\mathit{rw}}(t_{1},t_{2})=\bigvee_{\begin{subarray}{c}\forall k,\ t_{1}\textnormal{ reads }k\land t_{2}\textnormal{ writes }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\ t_{3}\textnormal{ writes }k\end{subarray}}\Big(\phi_{\mathit{wr}_{k}}(t_{3},t_{1})\land\phi_{\mathit{pco}}(t_{3},t_{2})\land\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t_{3},t_{2})\land\mathit{wrpos}_{k}(t_{2})<\phi_{\mathit{boundary}}(s_{2})\Big)$$

$$\begin{aligned}\phi_{\mathit{pco}}(t_{1},t_{2})=\;&\phi_{\mathit{so}}(t_{1},t_{2})\lor\phi_{\mathit{wr}}(t_{1},t_{2})\lor\phi_{\mathit{ww}}(t_{1},t_{2})\lor\phi_{\mathit{rw}}(t_{1},t_{2})\;\lor\\ &\bigvee_{t\in\mathit{T}\setminus\{t_{1},t_{2}\}}\phi_{\mathit{pco}}(t_{1},t)\land\phi_{\mathit{pco}}(t,t_{2})\land\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t_{1},t)\land\mathit{rank}(t_{1},t_{2})>\mathit{rank}(t,t_{2})\end{aligned}$$

$$\boxed{\bigvee_{\forall t_{1},t_{2}\in\mathit{T},\,t_{1}\neq t_{2}}\phi_{\mathit{pco}}(t_{1},t_{2})\land\phi_{\mathit{pco}}(t_{2},t_{1})}$$
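On a concrete history, the mutually recursive definitions of $\mathit{ww}$, $\mathit{rw}$, and $\mathit{pco}$ can be read as a least fixpoint: start from $\mathit{so}\cup\mathit{wr}$ and keep adding edges until nothing changes. The Python sketch below illustrates that reading under simplifying assumptions (all boundaries are $\infty$, and the function name and history encoding are hypothetical). In this setting the iteration order takes the place of the $\mathit{rank}$ terms, which, as we read them, keep an edge in the SMT encoding from justifying itself circularly.

```python
def pco_has_cycle(txns, so, wr_k, writers):
    """Compute pco as a least fixpoint and report whether it has a 2-cycle,
    i.e. transactions t1 != t2 with pco(t1, t2) and pco(t2, t1)."""
    pco = set(so) | {(a, b) for (_, a, b) in wr_k}
    while True:
        new = set()
        for k, tw, tr in wr_k:  # tr reads k from tw
            for t in writers[k] - {tw, tr}:
                if (t, tr) in pco:
                    new.add((t, tw))  # ww(t, tw)
                if (tw, t) in pco:
                    new.add((tr, t))  # rw(tr, t)
        for a, b in pco:  # transitivity through an intermediate transaction
            for c, d in pco:
                if b == c and a != d:
                    new.add((a, d))
        if new <= pco:
            break
        pco |= new
    return any((b, a) in pco for (a, b) in pco)


# Hypothetical lost-update history: t1 and t2 both read x from t0, then write x.
txns = ["t0", "t1", "t2"]
writers = {"x": {"t0", "t1", "t2"}}
lost_update = {("x", "t0", "t1"), ("x", "t0", "t2")}
serial = {("x", "t0", "t1"), ("x", "t1", "t2")}

print(pco_has_cycle(txns, set(), lost_update, writers))
print(pco_has_cycle(txns, set(), serial, writers))
```

In the lost-update history, each reader acquires an $\mathit{rw}$ edge to the other writer, which produces the cycle between `t1` and `t2`.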

B.3. Encoding of Weak Isolation

B.3.1. Causal consistency

$$\forall t_{1},t_{2}\in\mathit{T},\ t_{1}\neq t_{2},$$

$$\boxed{\phi_{\mathit{hb}}(t_{1},t_{2})=\phi_{\mathit{so}}(t_{1},t_{2})\lor\phi_{\mathit{wr}}(t_{1},t_{2})\lor\bigvee_{\forall t\in T\setminus\{t_{1},t_{2}\}}\phi_{\mathit{hb}}(t_{1},t)\land\phi_{\mathit{hb}}(t,t_{2})}$$

$$\phi_{\mathit{ww}_{\mathit{causal}}}(t_{1},t_{2})=\bigvee_{\begin{subarray}{c}\forall k,\ t_{1}\textnormal{ and }t_{2}\textnormal{ write }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\ t_{3}\textnormal{ reads }k\end{subarray}}\phi_{\mathit{wr}_{k}}(t_{2},t_{3})\land\phi_{\mathit{hb}}(t_{1},t_{3})\land\mathit{wrpos}_{k}(t_{1})<\phi_{\mathit{boundary}}(s_{1})$$

$$\boxed{\phi_{\mathit{hb}}(t_{1},t_{2})\lor\phi_{\mathit{ww}_{\mathit{causal}}}(t_{1},t_{2})\;\Rightarrow\;\phi_{\mathit{co_{causal}}}(t_{1})<\phi_{\mathit{co_{causal}}}(t_{2})}$$
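For a concrete history, an order $\phi_{\mathit{co_{causal}}}$ satisfying the last constraint exists exactly when the graph $\mathit{hb}\cup\mathit{ww}_{\mathit{causal}}$ is acyclic, since any acyclic graph can be topologically sorted. The Python sketch below checks this directly. It is an illustration under assumptions (all boundaries are $\infty$, and the helper names and history encoding are hypothetical), not IsoPredict's implementation.

```python
from graphlib import CycleError, TopologicalSorter


def transitive_closure(edges):
    """Smallest transitive relation containing the given edges."""
    closure = set(edges)
    while True:
        extra = {(a, d) for (a, b) in closure for (c, d) in closure if b == c} - closure
        if not extra:
            return closure
        closure |= extra


def is_acyclic(nodes, edges):
    """True if the directed graph admits a topological order."""
    preds = {n: set() for n in nodes}
    for a, b in edges:
        preds[b].add(a)
    try:
        list(TopologicalSorter(preds).static_order())
        return True
    except CycleError:
        return False


def is_causal(txns, so, wr_k, writers):
    """hb = (so ∪ wr)+; ww_causal(t1, t2) if t3 reads k from t2 and hb(t1, t3)."""
    hb = transitive_closure(set(so) | {(a, b) for (_, a, b) in wr_k})
    ww = {
        (t1, t2)
        for (k, t2, t3) in wr_k
        for t1 in writers[k] - {t2, t3}
        if (t1, t3) in hb
    }
    return is_acyclic(txns, hb | ww)


txns = ["t0", "t1", "t2"]
# Lost update: t1 and t2 both read x from t0 and then write x.
lost_update = is_causal(
    txns, set(), {("x", "t0", "t1"), ("x", "t0", "t2")}, {"x": {"t0", "t1", "t2"}}
)
# Stale read: so(t0, t1), both write x, t2 reads y from t1 but x from t0.
stale_read = is_causal(
    txns, {("t0", "t1")}, {("y", "t1", "t2"), ("x", "t0", "t2")},
    {"x": {"t0", "t1"}, "y": {"t1"}},
)
print(lost_update, stale_read)
```

The lost-update history passes the check even though it is unserializable, which is the kind of gap between causal and serializable that the predictive analysis targets.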

B.3.2. Read committed

$$\forall t_{1},t_{2}\in T,\ t_{1}\neq t_{2},$$

$$\phi_{\mathit{ww}_{\mathit{rc}}}(t_{1},t_{2})=\bigvee_{\begin{subarray}{c}\forall k,\ t_{1}\textnormal{ and }t_{2}\textnormal{ write }k\\ \forall t_{3}\in\mathit{T}\setminus\{t_{1},t_{2}\},\ t_{3}\textnormal{ reads }k\\ \forall i\in\mathit{rdpos}_{\ast}(t_{3}),\ \forall j\in\mathit{rdpos_{k}}(t_{3}),\ i<j\end{subarray}}\phi_{\mathit{choice}}(s_{3},i)=t_{1}\land\phi_{\mathit{choice}}(s_{3},j)=t_{2}\land j\leq\phi_{\mathit{boundary}}(s_{3})$$

where $\mathit{rdpos}_{\ast}(t)$ is the set of positions of read events in transaction $t$, $\mathit{rdpos_{k}}(t)$ is the set of positions of reads to $k$ in transaction $t$, and $s_{3}$ is $t_{3}$’s session.

$$\boxed{\phi_{\mathit{hb}}(t_{1},t_{2})\lor\phi_{\mathit{ww}_{\mathit{rc}}}(t_{1},t_{2})\;\Rightarrow\;\phi_{\mathit{co_{rc}}}(t_{1})<\phi_{\mathit{co_{rc}}}(t_{2})}$$
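The $\mathit{ww}_{\mathit{rc}}$ rule can be read operationally: if a transaction first reads any key from $t_{1}$ and later reads $k$ from $t_{2}$, and $t_{1}$ also writes $k$, then $t_{1}$ must be ordered before $t_{2}$. The Python sketch below applies that reading to a concrete history and then checks that $\mathit{hb}\cup\mathit{ww}_{\mathit{rc}}$ is acyclic. It is a simplified illustration: reads are given per transaction in program order rather than by session position, all boundaries are assumed to be $\infty$, $\mathit{hb}$ is supplied precomputed, and all names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def ww_rc_edges(reads, writers):
    """reads maps each transaction t3 to its reads, in program order,
    as (key, writer) pairs. Returns the induced ww_rc edges (t1, t2)."""
    edges = set()
    for t3, events in reads.items():
        for j, (k, t2) in enumerate(events):
            for _, t1 in events[:j]:  # every earlier read of t3, on any key
                if t1 != t2 and t3 not in (t1, t2) and t1 in writers[k]:
                    edges.add((t1, t2))
    return edges


def is_read_committed(txns, hb, reads, writers):
    """True if hb ∪ ww_rc can be extended to a total order (is acyclic)."""
    preds = {t: set() for t in txns}
    for a, b in set(hb) | ww_rc_edges(reads, writers):
        preds[b].add(a)
    try:
        list(TopologicalSorter(preds).static_order())
        return True
    except CycleError:
        return False


txns = ["t1", "t2", "t3"]
writers = {"x": {"t1", "t2"}}
hb = {("t1", "t3"), ("t2", "t3")}
# t3 reads x from t1 and then from t2: consistent with ordering t1 before t2.
monotonic = {"t3": [("x", "t1"), ("x", "t2")]}
# t3 reads x from t1, then t2, then t1 again: requires both orders at once.
flip_flop = {"t3": [("x", "t1"), ("x", "t2"), ("x", "t1")]}

print(is_read_committed(txns, hb, monotonic, writers))
print(is_read_committed(txns, hb, flip_flop, writers))
```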

Appendix C Patterns of Observed and Predicted Executions

Figure 10 shows several observed executions and their unserializable predictions from our experiments. The actual executions consist of dozens of transactions and thousands of events, but the figures show only the transactions and events relevant to predicting unserializable behavior.

(a) An observed execution of Smallbank.

(b) A predicted execution based on (a).

(c) An observed execution of Smallbank.

(d) A predicted execution based on (c).

(e) An observed execution of TPC-C.

(f) A predicted execution based on (e).

(g) An observed execution of TPC-C.

(h) A predicted execution based on (g).

Figure 10. Observed executions that resulted in causal (and thus rc), unserializable predicted executions.
