0% found this document useful (0 votes)

81 views14 pages

Nightingale 06

This document summarizes a research paper that proposes a new model called "external synchrony" for local file I/O. External synchrony aims to provide the reliability of synchronous I/O (where file operations are durable before returning) while approaching the performance of asynchronous I/O. It does so by buffering any external output that depends on uncommitted file modifications, and only allowing that output once the modifications are committed to disk. The researchers implemented this model in a Linux file system called xsyncfs. Their results found xsyncfs performance was within 7% of asynchronous I/O and up to two orders of magnitude faster than synchronous I/O.

Uploaded by

sushmsn

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

81 views14 pages

Nightingale 06

Uploaded by

sushmsn

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 14

Rethink the Sync

Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn

Department of Electrical Engineering and Computer Science
University of Michigan

Abstract In contrast, an asynchronous file system does not block

We introduce external synchrony, a new model for local the calling application, so modifications are typically
file I/O that provides the reliability and simplicity of syn- committed to disk long after the call completes. This
chronous I/O, yet also closely approximates the perfor- is fast, but not safe. Users view output that depends
mance of asynchronous I/O. An external observer cannot on uncommitted modifications. If the system crashes or
distinguish the output of a computer with an externally loses power before those modifications commit, the out-
synchronous file system from the output of a computer put observed by the user was invalid. Asynchronous I/O
with a synchronous file system. No application modifi- also complicates applications that require durability or
cation is required to use an externally synchronous file ordering guarantees. Programmers must insert explicit
system: in fact, application developers can program to synchronization operations such as fsync to enforce the
the simpler synchronous I/O abstraction and still receive guarantees required by their applications. They must
excellent performance. We have implemented an exter- sometimes implement complex group commit strategies
nally synchronous file system for Linux, called xsyncfs. to achieve reasonable performance. Despite the poor
Xsyncfs provides the same durability and ordering guar- guarantees provided to users and programmers, most lo-
antees as those provided by a synchronously mounted cal file systems provide an asynchronous I/O abstraction
ext3 file system. Yet, even for I/O-intensive benchmarks, by default because synchronous I/O is simply too slow.
xsyncfs performance is within 7% of ext3 mounted asyn- The tension between durability and performance leads to
chronously. Compared to ext3 mounted synchronously, surprising behavior. For instance, on most desktop op-
xsyncfs is up to two orders of magnitude faster. erating systems, even executing an explicit synchroniza-
tion command such as fsync does not protect against
1 Introduction data loss in the event of a power failure [13]. This be-
havior is not a bug, but rather a conscious design deci-
File systems serve two opposing masters: durability and sion to sacrifice durability for performance [27]. For ex-
performance. The tension between these goals has led ample, on fsync, the Linux 2.4 kernel commits data to
to two distinct models of file I/O: synchronous and asyn- the volatile hard drive cache rather than to the disk plat-
chronous. ter. If a power failure occurs, the data in the drive cache
A synchronous file system (e.g., one mounted with the is lost. Because of this behavior, applications that re-
sync option on a Linux system) guarantees durability by quire stronger durability guarantees, such as the MySQL
blocking the calling application until modifications are database, recommend disabling the drive cache [15].
committed to disk. Synchronous I/O provides a clean ab- While MacOS X and the Linux 2.6 kernel provide mech-
straction to users. Any file system operation they observe anisms to explicitly flush the drive cache, these mecha-
to complete is durable — data will not be lost due to a nisms are not enabled by default due to the severe per-
subsequent OS crash or power failure. Synchronous I/O formance degradation they can cause.
also guarantees the ordering of modifications; if one op-
We show that a new model of local file I/O, which
eration causally precedes another, the effects of the sec-
we term external synchrony, resolves the tension be-
ond operation are never visible unless the effects of first
tween durability and performance. External synchrony
operation are also visible. Unfortunately, synchronous
provides the reliability and simplicity of synchronous
I/O can be very slow because applications frequently
I/O, while closely approaching the performance of asyn-
block waiting for mechanical disk operations. In fact, our
chronous I/O. In external synchrony, we view the ab-
results show that blocking due to synchronous I/O can
straction of synchronous I/O as a set of guarantees that
degrade the performance of disk-intensive benchmarks
are provided to the clients of the file system. In con-
by two orders of magnitude.

USENIX Association OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation 1
trast to asynchronous I/O, which improves performance — this is typically less than the perception threshold of a
by substantially weakening these guarantees, externally human user.
synchronous I/O provides the same guarantees, but it Xsyncfs uses output-triggered commits to balance
changes the clients to which the guarantees are provided. throughput and latency. Output-triggered commits track
Synchronous I/O reflects the application-centric view of the causal relationship between external output and file
modern operating systems. The return of a synchronous system modifications to decide when to commit data.
file system call guarantees durability to the application Until some external output is produced that depends
since the calling process is blocked until modifications upon modified data, xsyncfs may delay committing data
commit. In contrast, externally synchronous I/O takes a to optimize for throughput. However, once some output
user-centric view in which it guarantees durability not to is buffered that depends upon an uncommitted modifica-
the application, but to any external entity that observes tion, an immediate commit of that modification is trig-
application output. An externally synchronous system gered to minimize latency for any external observer.
returns control to the application before committing data. Our results to date are very positive. For I/O inten-
However, it subsequently buffers all output that causally sive benchmarks such as Postmark and an Andrew-style
depends on the uncommitted modification. Buffered out- build, the performance of xsyncfs is within 7% of the de-
put is only externalized (sent to the screen, network, or fault asynchronous implementation of ext3. Compared to
other external device) after the modification commits. current implementations of synchronous I/O in the Linux
From the viewpoint of an external observer such as a kernel, external synchrony offers better performance and
user or an application running on another computer, the better reliability. Xsyncfs is up to an order of magni-
guarantees provided by externally synchronous I/O are tude faster than the default version of ext3 mounted syn-
identical to the guarantees provided by a traditional file chronously, which allows data to be lost on power fail-
system mounted synchronously. An external observer ure because committed data may reside in the volatile
never sees output that depends on uncommitted modi- hard drive cache. Xsyncfs is up to two orders of mag-
fications. Since external synchrony commits modifica- nitude faster than a version of ext3 that guards against
tions to disk in the order they are generated by appli- losing data on power failure. Xsyncfs sometimes even
cations, an external observer will not see a modification improves the performance of applications that do their
unless all other modifications that causally precede that own custom synchronization. Running on top of xsyncfs,
modification are also visible. However, because exter- the MySQL database executes a modified version of the
nally synchronous I/O rarely blocks applications, its per- TPC-C benchmark up to three times faster than when it
formance approaches that of asynchronous I/O. runs on top of ext3 mounted asynchronously.
Our externally synchronous Linux file system, xsyncfs,
uses mechanisms developed as part of the Speculator
2 Design overview
project [17]. When a process performs a synchronous
I/O operation, xsyncfs validates the operation, adds the 2.1 Principles
modifications to a file system transaction, and returns The design of external synchrony is based on two princi-
control to the calling process without waiting for the ples. First, we define externally synchronous I/O by its
transaction to commit. However, xsyncfs also taints the externally observable behavior rather than by its imple-
calling process with a commit dependency that specifies mentation. Second, we note that application state is an
that the process is not allowed to externalize any output internal property of the computer system. Since applica-
until the transaction commits. If the process writes to tion state is not directly observable by external entities,
the network, screen, or other external device, its output the operating system need not treat changes to applica-
is buffered by the operating system. The buffered output tion state as an external output.
is released only after all disk transactions on which the
output depends commit. If a process with commit depen- Synchronous I/O is usually defined by its implementa-
dencies interacts with another process on the same com- tion: an I/O is considered synchronous if the calling ap-
puter through IPC such as pipes, the file cache, or shared plication is blocked until after the I/O completes [26].
memory, the other process inherits those dependencies In contrast, we define externally synchronous I/O by its
so that it also cannot externalize output until the trans- observable behavior: we say that an I/O is externally syn-
action commits. The performance of xsyncfs is gener- chronous if the external output produced by the computer
ally quite good since applications can perform computa- system cannot be distinguished from output that could
tion and initiate further I/O operations while waiting for have been produced if the I/O had been synchronous.
a transaction to commit. In most cases, output is delayed The next step is to precisely define what is considered ex-
by no more than the time to commit a single transaction ternal output. Traditionally, the operating system takes

2 OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation USENIX Association
time time

user user
buffer
block block
app app

OS OS

disk disk
commit commit commit

(a) Synchronous I/O (b) Externally synchronous I/O

This figure shows the behavior of a sample application that makes two file system modifications, then displays output to an external
device. The diagram on the left shows how the application executes when its file I/O is synchronous; the diagram on the right shows
how it executes when its file I/O is externally synchronous.

Figure 1. Example of externally synchronous file I/O

an application-centric view of the computer system, in We say that external output of an externally synchronous
which it considers applications to be external entities ob- system is equivalent to the output of a synchronous one if
serving its behavior. This view divides the computer sys- (a) the values of the external outputs are the same, and (b)
tem into two partitions: the kernel, which is considered the outputs occur in the same causal order, as defined by
internal state, and the user level, which is considered ex- Lamport’s happens before relation [9]. We consider disk
ternal state. Using this view, the return from a system commits external output because they change the stable
call is considered an externally visible event. image of the file system. If the system crashes and re-
However, users, not applications, are the true observers boots, the change to the stable image is visible. Since the
of the computer system. Application state is only visible operating system cannot control when crashes occur, it
through output sent to external devices such as the screen must treat disk commits as external output. Thus, in Fig-
and network. By regarding application state as internal ure 1(a), there are three external outputs: the two com-
to the computer system, the operating system can take mits and the message displayed on the screen.
a user-centric view in which only output sent to an ex- An externally synchronous file I/O returns the same re-
ternal device is considered externally visible. This view sult to applications that would have been returned by
divides the computer system into three partitions, the ker- a synchronous I/O. The file system does all processing
nel and applications, both of which are considered inter- that would be done for a synchronous I/O, including val-
nal state, and the external interfaces, which are consid- idation and changing the volatile (in-memory) state of
ered externally visible. Using this view, changes to ap- the file system, except that it does not actually commit
plication state, such as the return from a system call, are the modification to disk before returning. Because the
not considered externally visible events. results that an application sees from an externally syn-
The operating system can implement user-centric guar- chronous I/O are equivalent to the results it would have
antees because it controls access to external devices. Ap- seen if the I/O had been synchronous, the external output
plications can only generate external events with the co- it produces is the same in both cases.
operation of the operating system. Applications must in- An operating system that supports external synchrony
voke this cooperation either directly by making a system must ensure that external output occurs in the same
call or indirectly by mapping an externally visible device. causal order that would have occurred had I/O been per-
formed synchronously. Specifically, if an external out-
2.2 Correctness
put causally follows an externally synchronous file I/O,
Figure 1 illustrates these principles by showing an exam- then that output cannot be observed before the file I/O
ple single-threaded application that makes two file sys- has been committed to disk. In the example, this means
tem modifications and writes some output to the screen. that the second file modification made by the application
In the diagram on the left, the file modifications made by cannot commit before the first, and that the screen output
the application are synchronous. Thus, the application cannot be seen before both modifications commit.
blocks until each modification commits.

USENIX Association OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation 3
2.3 Improving performance 2.4 Deciding when to commit
The externally synchronous system in Figure 1(b) makes An externally synchronous file system uses the causal re-
two optimizations to improve performance. First, the two lationship between external output and file modifications
modifications are group committed as a single file system to trigger commits. There is a well-known tradeoff be-
transaction. Because the commit is atomic, the effects of tween throughput and latency for group commit strate-
the second modification are never seen unless the effects gies. Delaying a group commit in the hope that more
of the first are also visible. Grouping multiple modifica- modifications will occur in the near future can improve
tions into one transaction has many benefits: the commit throughput by amortizing more modifications across a
of all modifications is done with a single sequential disk single commit. However, delaying a commit also in-
write, writes to the same disk block are coalesced in the creases latency — in our system, commit latency is es-
log, and no blocks are written to disk at all if data writes pecially important because output cannot be externalized
are closely followed by deletion. For example, ext3 em- until the commit occurs.
ploys value logging — when a transaction commits, only
Latency is unimportant if no external entity is observ-
the latest version of each block is written to the journal. ing the result. Specifically, until some output is gener-
If a temporary file is created and deleted within a single ated that causally depends on a file system transaction,
transaction, none of its blocks are written to disk. In con-
committing the transaction does not change the observ-
trast, a synchronous file system cannot group multiple able behavior of the system. Thus, the operating sys-
modifications for a single-threaded application because
tem can improve throughput by delaying a commit until
the application does not begin the second modification
some output that depends on the transaction is buffered
until after the first commits. (or until some application that depends on the transac-
The second optimization is buffering screen output. The tion blocks due to an ioctl or similar system call). We
operating system must delay the externalization of the call this strategy output-triggered commits since the at-
output until after the commit of the file modifications tempt to generate output that is causally dependent upon
to obey the causal ordering constraint of externally syn- modifications to be written to disk triggers the commit of
chronous I/O. One way to enforce this ordering would be those modifications.
to block the application when it initiates external output. Output-triggered commits enable an externally syn-
However, the asynchronous nature of the output enables chronous file system to maximize throughput when out-
a better solution. The operating system instead buffers
put is not being displayed (for example, when it is piped
the output and allows the process that generated the out- to a file). However, when a user could be actively observ-
put to continue execution. After the modifications are
ing the results of a transaction, commit latency is small.
committed to disk, the operating system releases the out-
put to the device for which it was destined. 2.5 Limitations
This design requires that the operating system track the
causal relationship between file system modifications One potential limitation of external synchrony is that
and external output. When a process writes to the file it complicates application-specific recovery from catas-
system, it inherits a commit dependency on the uncom- trophic media failure because the application continues
mitted data that it wrote. When a process with commit execution before such errors are detected. Although the
dependencies modifies another kernel object (process, kernel validates each modification before writing it to the
pipe, file, UNIX socket, etc.) by executing a system call, file cache, the physical write of the data to disk may sub-
the operating system marks the modified objects with the sequently fail. While smaller errors such as a bad disk
same commit dependencies. Similarly, if a process ob- block are currently handled by the disk or device driver,
serves the state of another kernel object with commit de- a catastrophic media failure is rarely masked at these lev-
pendencies, the process inherits those dependencies. If els. Theoretically, a file system mounted synchronously
a process with commit dependencies executes a system could propagate such failures to the application. How-
call for which the operating system cannot track the flow ever, a recent survey of common file systems [20] found
of causality (e.g., an ioctl), the process is blocked until that write errors are either not detected by the file sys-
its file systems modifications have been committed. Any tem (ext3, jbd, and NTFS) or induce a kernel panic
external output inherits the commit dependencies of the (ReiserFS). An externally synchronous file system could
process that generated it — the operating system buffers propagate failures to applications by using Speculator to
the output until the last dependency is resolved by com- checkpoint a process before it modifies the file system. If
mitting modifications to disk. a catastrophic failure occurs, the process would be rolled
back and notified of the failure. We rejected this solution
because it would both greatly increase the complexity

4 OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation USENIX Association
of external synchrony and severely penalize its perfor- uses these data structures to track causal dependencies.
mance. Further, it is unclear that catastrophic failures are For example, if a speculative process writes to a pipe,
best handled by applications — it seems best to handle Speculator creates an entry in the pipe’s undo log that
them in the operating system, either by inducing a kernel refers to the speculations on which the writing process
panic or (preferably) by writing data elsewhere. depends. If another process reads from the pipe, Spec-
ulator creates an undo log entry for the reading process
Another limitation of external synchrony is that the user
may have some temporal expectations about when modi- that refers to all speculations on which the pipe depends.
fications are committed to disk. As defined so far, an ex- Speculator ensures that speculative state is never visible
ternally synchronous file system could indefinitely delay to an external observer. If a speculative process exe-
committing data written by an application with no exter- cutes a system call that would normally externalize out-
nal output. If the system crashes, a substantial amount put, Speculator buffers its output until the outcome of the
of work could be lost. Xsyncfs therefore commits data speculation is decided. If a speculative process performs
every 5 seconds, even if no output is produced. The 5 a system call that Speculator is unable to handle by ei-
second commit interval is the same value used by ext3 ther transferring causal dependencies or buffering output,
mounted asynchronously. Speculator blocks it until it becomes non-speculative.
A final limitation of external synchrony is that modifica- 3.1.2 From speculation to synchronization
tions to data in two different file systems cannot be easily
committed with a single disk transaction. Potentially, we Speculator ties dependency tracking and output buffer-
could share a common journal among all local file sys- ing to other features, such as checkpoint and rollback,
tems, or we could implement a two-phase commit strat- that are not needed to support external synchrony. Worse
egy. However, a simpler solution is to block a process yet, these unneeded features come at a substantial per-
with commit dependencies for one file system before it formance cost. This led us to factor out the functionality
modifies data in a second. Speculator would map each in Speculator common to both speculative execution and
dependency to a specific file system. When a process external synchrony. We modified the Speculator inter-
writes to a file system, Speculator would verify that the face to allow each file system to specify the additional
process depends only on the file system it is modifying; Speculator features that it requires. This allows a single
if it depends on another file system, Speculator would computer to run both a speculative distributed file system
block it until its previous modifications commit. and an externally synchronous local file system.
Both speculative execution and external synchrony en-
3 Implementation force restrictions on when external output may be ob-
served. Speculative execution allows output to be ob-
3.1 External synchrony served based on correctness; output is externalized af-
ter all speculations on which that output depends have
We next provide a brief overview of Speculator [17] and proven to be correct. In contrast, external synchrony al-
how it supports externally synchronous file systems. lows output to be observed based on durability; output is
3.1.1 Speculator background externalized after all file system operations on which that
output depends have been committed to disk.
Speculator improves the performance of distributed file
In external synchrony, a commit dependency represents
systems by hiding the performance cost of remote opera-
the causal relationship between kernel state and an un-
tions. Rather than block during a remote operation, a file
committed file system modification. Any kernel object
system predicts the operation’s result, then uses Specula-
that has one or more associated commit dependencies is
tor to checkpoint the state of the calling process and spec-
referred to as uncommitted. Any external output from a
ulatively continue its execution based on the predicted
process that is uncommitted is buffered within the kernel
result. If the prediction is correct, the checkpoint is dis-
until the modifications on which the output depends have
carded; if it is incorrect, the calling process is restored to
been committed. In other words, uncommitted output is
the checkpoint, and the operation is retried.
never visible to an external observer.
Speculator adds two new data structures to the kernel.
When a process writes to an externally synchronous file
A speculation object tracks all process and kernel state
system, Speculator marks the process as uncommitted. It
that depends on the success or failure of a speculative
also creates a commit dependency between the process
operation. Each speculative object in the kernel has an
and the uncommitted file system transaction that con-
undo log that contains the state needed to undo specu-
tains the modification. When the file system commits the
lative modifications to that object. As processes interact
transaction to disk, the commit dependency is removed.
with kernel objects by executing system calls, Speculator

USENIX Association OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation 5
Once all commit dependencies for buffered output have
been removed, Speculator releases that output to the ex- Process 1234
ternal device to which it was written. When the last com-
Commit Committing
mit dependency for a process is discarded, Speculator Output a Dep 1 FS Op 1
Transaction
marks the process as committed.
Output b Commit FS Op 2 Active
Speculator propagates commit dependencies among ker- Output c Transaction
Dep 2 FS Op 3
nel objects and processes using the same mechanisms it
uses to propagate speculative dependencies. However, Undo Log FS Journal
since external synchrony does not require checkpoint and (a) Data structures with a committing and active transaction
rollback, the propagation of dependencies is consider-
Process 1234
ably easier to implement. For instance, before a pro-
cess inherits a new speculative dependency, Speculator
Output b Commit FS Op 2 Active
must checkpoint its state with a copy-on-write fork. In Dep 2 Transaction
Output c FS Op 3
contrast, when a process inherits a commit dependency,
no checkpoint is needed since the process will never be
rolled back. To support external synchrony, Speculator
maintains the same many-to-many relationship between Undo Log FS Journal
commit dependencies and undo logs as it does for specu- (b) Data structures after the first transaction commits
lations and undo logs. Since commit dependencies are Figure 2. The external synchrony data structures
never rolled back, undo logs need not contain data to
undo the effects of an operation. Therefore, undo logs Figure 2(a), process 1234 has completed three file sys-
in an externally synchronous system only track the re- tem operations, sending output to the screen after each
lationship between commit dependencies and kernel ob- one. Since the output after the first operation triggered
jects and reveal which buffered output can be safely re- a transaction commit, the two following operations were
leased. This simplicity enables Speculator to support placed in a new active transaction. The output is buffered
more forms of interaction among uncommitted processes in the undo log; the commit dependencies maintain the
than it supports for speculative processes. For example, relationship between buffered output and uncommitted
checkpointing multi-threaded processes for speculative data. In Figure 2(b), the first transaction has been com-
execution is a thorny problem [17, 21]. However, as dis- mitted to disk. Therefore, the output that depended upon
cussed in Section 3.5, tracking their commit dependen- the committed transaction has been released to the screen
cies is substantially simpler. and the commit dependency has been discarded.
3.2 File system support for external synchrony Xsyncfs uses journaled mode rather than the default or-
dered mode. This change guarantees ordering; specif-
We modified ext3, a journaling Linux file system, to cre- ically, the property that if an operation A causally pre-
ate xsyncfs. In its default ordered mode, ext3 writes only cedes another operation B, the effects of B should never
metadata modifications to its journal. In its journaled be visible unless the effects of A are also visible. This
mode, ext3 writes both data and metadata modifications. guarantee requires that B never be committed to disk be-
Modifications from many different file system operations fore A. Otherwise, a system crash or power failure may
may be grouped into a single compound journal transac- occur between the two commits — in this case, after the
tion that is committed atomically. Ext3 writes modifica- system is restarted, B will be visible when A is not. Since
tions to the active transaction — at most one transaction journaled mode adds all modifications for A to the jour-
may be active at any given time. A commit of the active nal before the operation completes, those modifications
transaction is triggered when journal space is exhausted, must already be in the journal when B begins (since B
an application performs an explicit synchronization op- causally follows A). Thus, either B is part of the same
eration such as fsync, or the oldest modification in the transaction as A (in which case the ordering property
transaction is more than 5 seconds old. After the transac- holds since A and B are committed atomically), or the
tion starts to commit, the next modification triggers the transaction containing A is already committed before the
creation of a new active transaction. Only one transac- transaction containing B starts to commit.
tion may be committing at any given time, so the next In contrast, the default mode in ext3 does not provide or-
transaction must wait for the commit of the prior trans- dering since data modifications are not journaled. The
action to finish before it commits. kernel may write the dirty blocks of A and B to disk
Figure 2 shows how the external synchrony data struc- in any order as long as the data reaches disk before the
tures change when a process interacts with xsyncfs. In metadata in the associated journal transaction commits.

6 OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation USENIX Association
Thus, the data modifications for B may be visible after a the completion would trigger a commit if it is a visible
crash without the modifications for A being visible. event). Finally, if the user were to observe the contents of
the file using a different application, e.g., tail, xsyncfs
Xsyncfs informs Speculator when a new journal transac-
tion is created — this allows Speculator to track state that would correctly optimize for latency because Specula-
depends on the uncommitted transaction. Xsyncfs also tor would track the causal relationship through the kernel
data structures from tail to the transaction and provide
informs Speculator when a new modification is added to
the transaction and when the transaction commits. callbacks to xsyncfs. When tail attempts to output data
to the screen, Speculator callbacks will cause xsyncfs to
As described in Section 1, the default behavior of ext3 commit the active transaction.
does not guarantee that modifications are durable after a
power failure. In the Linux 2.4 kernel, durability can be 3.4 Rethinking sync
ensured only by disabling the drive cache. The Linux
2.6.11 kernel provides the option of using write bar- Asynchronous file systems provide explicit synchroniza-
riers to flush the drive cache before and after writing tion operations such as sync and fdatasync for appli-
each transaction commit record. Since Speculator runs cations with durability or ordering constraints. In a syn-
on a 2.4 kernel, we ported write barriers to our kernel chronous file system, such synchronization operations
and modified xsyncfs to use write barriers to guarantee are redundant since ordering and durability are already
that all committed modifications are preserved, even on guaranteed for all file system operations. However, in an
power failure. externally synchronous file system, some extra support
is needed to minimize latency. For instance, a user who
3.3 Output-triggered commits types “sync” in a terminal would prefer that the com-
mand complete as soon as possible.
Xsyncfs uses the causal relationship between disk I/O
and external output to balance the competing concerns When xsyncfs receives a synchronization call such as
sync from the VFS layer, it creates a commit depen-
of throughput and latency. Currently, ext3 commits its
dency between the calling process and the active trans-
journal every 5 seconds, which typically groups the com-
mit of many file system operations. This strategy opti- action. Since this does not require a disk write, the
return from the synchronization call is almost instanta-
mizes for throughput, a logical behavior when writes are
asynchronous. However, latency is an important consid- neous. If a visible event occurs, such as the completion
eration in xsyncfs since users must wait to view output of the sync process, Speculator will issue a callback that
causes xsyncfs to commit the active transaction.
until the transactions on which that output depends com-
mit. If xsyncfs were to use the default ext3 commit strat- External synchrony simplifies the file system abstrac-
egy, disk throughput would be high, but the user might be tion. Since xsyncfs requires no application modifica-
forced to wait up to 5 seconds to see output. This behav- tion, programmers can write the same code that they
ior is clearly unacceptable for interactive applications. would write if they were using a unmodified file sys-
We therefore modified Speculator to support output- tem mounted synchronously. They do not need explicit
synchronization calls to provide ordering and durability
triggered commits. Speculator provides callbacks to
since xsyncfs provides these guarantees by default for all
xsyncfs when it buffers output or blocks a process that
performed a system call for which it cannot track the file system operations. Further, since xsyncfs does not
incur the large performance penalty usually associated
propagation of causal dependencies (e.g., an ioctl).
Xsyncfs uses the ext3 strategy of committing every 5 sec- with synchronous I/O, programmers do not need com-
onds unless it receives a callback that indicates that Spec- plicated group commit strategies to achieve acceptable
performance. Group commit is provided transparently
ulator blocked or buffered output from a process that de-
pends on the active transaction. The receipt of a callback by xsyncfs.
triggers a commit of the active transaction. Of course, a hand-tuned strategy might offer better per-
Output-triggered commits adapt the behavior of the file formance than the default policies provided by xsyncfs.
However, as described in Section 3.3, there are some
system according to the observable behavior of the sys-
tem. For instance, if a user directs output from a running instances in which xsyncfs can optimize performance
when an application solution cannot. Since xsyncfs uses
application to the screen, latency is reduced by commit-
output-triggered commits, it knows when no external
ting transactions frequently. If the user instead redirects
the output to a file, xsyncfs optimizes for throughput by output has been generated that depends on the current
transaction; in these instances, xsyncfs uses group com-
committing every 5 seconds. Optimizing for throughput
is correct in this instance since the only event the user mit to optimize throughput. In contrast, an application-
can observe is the completion of the application (and specific commit strategy cannot determine the visibility

USENIX Association OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation 7
of its actions beyond the scope of the currently executing dependencies for multi-threaded applications — any de-
process; it must therefore conservatively commit modifi- pendency inherited by one thread is inherited by all.
cations before producing external messages.
For example, consider a client that issues two sequential 4 Evaluation
transactions to a database server on the same computer
and then produces output. Xsyncfs can safely group Our evaluation answers the following questions:
the commit of both transactions. However, the database
server (which does not use output-triggered commits) • How does the durability of xsyncfs compare to cur-
must commit each transaction separately since it cannot rent file systems?
know whether or not the client will produce output after • How does the performance of xsyncfs compare to
it is informed of the commit of the first transaction. current file systems?
3.5 Shared memory • How does xsyncfs affect the performance of appli-
cations that synchronize explicitly?
Speculator does not propagate speculative dependencies
when processes interact through shared memory due to • How much do output-triggered commits improve
the complexity of checkpointing at arbitrary states in a the performance of xsyncfs?
process’ execution. Since commit dependencies do not
require checkpoints, we enhanced Speculator to propa- 4.1 Methodology
gate them among processes that share memory.
All computers used in our evaluation have a 3.02 GHZ
Speculator can track causal dependencies because pro-
Pentium 4 processor with 1 GB of RAM. Each computer
cesses can only interact through the operating system. has a single Western Digital WD-XL40 hard drive, which
Usually, this interaction involves an explicit system call is a 7200 RPM 120 GB ATA 100 drive with a 2 MB on-
(e.g., write) that Speculator can intercept. However,
disk cache. The computers run Red Hat Enterprise Linux
when processes interact through shared memory regions, version 3 (kernel version 2.4.21). We use a 400 MB jour-
only the sharing and unsharing of regions is visible to the
nal size for both ext3 and xsyncfs. For each benchmark,
operating system. Thus, Speculator cannot readily inter- we measured ext3 executing in both journaled and or-
cept individual reads and writes to shared memory. dered mode. Since journaled mode executed faster in
We considered marking a shared memory page inaccessi- every benchmark, we report only journaled mode results
ble when a process with write permission inherits a com- in this evaluation. Finally, we measured the performance
mit dependency that a process with read permission does of ext3 both using write barriers and with the drive cache
not have. This would trigger a page fault whenever a pro- disabled. In all cases write barriers were faster than dis-
cess reads or writes the shared page. If a process reads abling the drive cache since the drive cache improves
the page after another writes it, any commit dependen- read times and reduces the frequency of writes to the disk
cies would be transferred from the writer to the reader. platter. Thus, we report only results using write barriers.
Once these processes have the same commit dependen-
cies, the page can be restored to its normal protections. 4.2 Durability
We felt this mechanism would perform poorly because of
the time needed to protect and unprotect pages, as well Our first benchmark empirically confirms that without
as the extra page faults that would be incurred. write barriers, ext3 does not guarantee durability. This
result holds in both journaled and ordered mode, whether
Instead, we decided to use an approach that imposes ext3 is mounted synchronously or asynchronously, and
less overhead but might transfer dependencies when not even if fsync commands are issued by the application
strictly necessary. We make a conservative assumption after every write. Even worse, our results show that, de-
that processes with write permission for a shared mem- spite the use of journaling in ext3, a loss of power can
ory region are continually writing to that region, while corrupt data and metadata stored in the file system.
processes with read permission are continually reading
it. When a process with write permission for a shared We confirmed these results by running an experiment in
region inherits a new commit dependency, any process which a test computer continuously writes data to its lo-
with read permission for that region atomically inherits cal file system. After each write completes, the test com-
the same dependency. puter sends a UDP message that is logged by a remote
computer. During the experiment, we cut power to the
Speculator uses the same mechanism to track commit test computer. After it reboots, we compare the state of
dependencies transfered through memory-mapped files. its file system to the log on the remote computer.
Similarly, Speculator is conservative when propagating

8 OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation USENIX Association
File system configuration Data durable on write Data durable on fsync
Asynchronous No Not on power failure
Synchronous Not on power failure Not on power failure
Synchronous with write barriers Yes Yes
External synchrony Yes Yes
This figure describes whether each file system provides durability to the user when an application executes a write or fsync system
call. A “Yes” indicates that the file system provides durability if an OS crash or power failure occurs.

Figure 3. When is data safe?

Our goal was to determine when each file system guaran- ext3 provides both durability and ordering when write
tees durability and ordering. We say a file system fails to barriers are combined with some form of synchronous
provide durability if the remote computer logs a message operation (either mounting the file system synchronously
for a write operation, but the test computer is missing or calling fsync after each modification). If write barri-
the data written by that operation. In this case, dura- ers are not available, the equivalent behavior could also
bility is not provided because an external observer (the be achieved by disabling the hard drive cache.
remote computer) saw output that depended on data that The last row of Figure 3 shows results for xsyncfs. As
was subsequently lost. We say a file system fails to pro- expected, xsyncfs provides both durability and ordering.
vide ordering if the state of the file after reboot violates
the temporal ordering of writes. Specifically, for each 4.3 The PostMark benchmark
block in the file, ordering is violated if the file does not
also contain all previously-written blocks. We next ran the PostMark benchmark, which was de-
signed to replicate the small file workloads seen in elec-
For each configuration shown in Figure 3, we ran four tronic mail, netnews, and web-based commerce [8]. We
trials of this experiment: two in journaled mode and used PostMark version 1.5, running in a configuration
two in ordered mode. As expected, our results confirm that creates 10,000 files, performs 10,000 transactions
that ext3 provides durability only when write barriers are consisting of file reads, writes, creates, and deletes, and
used. Without write barriers, synchronous operations en- then removes all files. The PostMark benchmark has a
sure only that modifications are written to the hard drive single thread of control that executes file system oper-
cache. If power fails before the modifications are written ations as quickly as possible. PostMark is a good test
to the disk platter, those modifications are lost. of file system throughput since it does not generate any
Some of our experiments exposed a dangerous behav- output or perform any substantial computation.
ior in ext3: unless write barriers are used, power failures Each bar in Figure 4 shows the time to complete the Post-
can corrupt both data and metadata stored on disk. In Mark benchmark. The y-axis is logarithmic because of
one experiment, a block in the file being modified was the substantial slowdown of synchronous I/O. The first
silently overwritten with garbage data. In another, a sub- bar shows results when ext3 is mounted asynchronously.
stantial amount of metadata in the file system, including As expected, this offers the best performance since the
the superblock, was overwritten with garbage. In the lat- file system buffers data in memory up to 5 seconds before
ter case, the test machine failed to reboot until the file writing it to disk. The second bar shows results using
system was manually repaired. In both cases, corruption xsyncfs. Despite the I/O intensive nature of PostMark,
is caused by the commit block for a transaction being the performance of xsyncfs is within 7% of the perfor-
written to the disk platter before all data blocks in that mance of ext3 mounted asynchronously. After examin-
transaction are written to disk. Although the operating ing the performance of xsyncfs, we determined that the
system wrote the blocks to the drive cache in the correct overhead of tracking causal dependencies in the kernel
order, the hard drive reorders the blocks when writing accounts for most of the difference.
them to the disk platter. After this happens, the trans-
The third bar shows performance when ext3 is mounted
action is committed during recovery even though several
synchronously. In this configuration the writing process
data blocks do not contain valid data. Effectively, this
is blocked until its modifications are committed to the
overwrites disk blocks with uninitialized data.
drive cache. Ext3 in synchronous mode is over an order
Our results also confirm that ext3 without write barriers of magnitude slower than xsyncfs, even though xsyncfs
writes data to disk out of order. Journaled mode alone is provides stronger durability guarantees. Throughput is
insufficient to provide ordering since the order of writing limited by the size of the drive cache; once the cache fills,
transactions to the disk platter may differ from the order subsequent writes block until some data in the cache is
of writing transactions to the drive cache. In contrast, written to the disk platter.

USENIX Association OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation 9
10000

ext3-async
xsyncfs Untar
ext3-sync Configure
1000
1000 ext3-barrier Make
Remove

Time (seconds)
Time (seconds)

100

500

1 0
ext3-async xsyncfs ext3-sync ext3-barrier RAMFS

This figure shows the time to run the PostMark benchmark — This figure shows the time to run the Apache build benchmark.
the y-axis is logarithmic. Each value is the mean of 5 trials — Each value is the mean of 5 trials — the (relatively small) error
the (relatively small) error bars are 90% confidence intervals. bars are 90% confidence intervals.

Figure 4. The PostMark file system benchmark Figure 5. The Apache build benchmark

The last bar in Figure 4 shows the time to complete as the data on which it depends commits, output appears
the benchmark when ext3 is mounted synchronously promptly during the execution of the benchmark.
and write barriers are used to prevent data loss when a
For comparison, the bar at the far right of the graph
power failure occurs. Since write barriers synchronously
shows the time to execute the benchmark using a
flush the drive cache twice for each file system transac-
memory-only file system, RAMFS. This provides a
tion, ext3’s performance is over two orders of magnitude
lower bound on the performance of a local file sys-
slower than that of xsyncfs.
tem, and it isolates the computation requirements of the
Due to the high cost of durability, high end storage sys- benchmark. Removing disk I/O by running the bench-
tems sometimes use specialized hardware such as a non- mark in RAMFS improves performance by only 8% over
volatile cache to improve performance [7]. This elim- xsyncfs because the remainder of the benchmark is dom-
inates the need for write barriers. However, even with inated by computation.
specialized hardware, we expect that the performance
The third bar in Figure 5 shows that ext3 mounted in
of ext3 mounted synchronously would be no better than
synchronous mode is 46% slower than xsyncfs. Since
the third bar in Figure 4, which writes data to a volatile
computation dominates I/O in this benchmark, any dif-
cache. Thus, use of xsyncfs should still lead to substan-
ference in I/O performance is a smaller part of overall
tial performance improvements for synchronous opera-
performance. The fourth bar shows that ext3 mounted
tions even when the hard drive has a non-volatile cache
synchronously with write barriers is over 11 times slower
of the same size as the volatile cache on our drive.
than xsyncfs. If we isolate the cost of I/O by subtracting
the cost of computation (calculated using the RAMFS
4.4 The Apache build benchmark
result), ext3 mounted synchronously is 7.5 times slower
We next ran a benchmark in which we untar the Apache than xsyncfs while ext3 mounted synchronously with
2.0.48 source tree into a file system, run configure in write barriers is over two orders of magnitude slower
an object directory within that file system, run make in than xsyncfs. These isolated results are similar to the
the object directory, and remove all files. The Apache values that we saw for the PostMark experiments.
build benchmark requires the file system to balance
throughput and latency; it displays large amounts of 4.5 The MySQL benchmark
screen output interleaved with disk I/O and computation.
We were curious to see how xsyncfs would perform
Figure 5 shows the total amount of time to run the with an application that implements its own group com-
benchmark, with shadings within each bar showing the mit strategy. We therefore ran a modified version of
time for each stage. Comparing the first two bars in the OSDL TPC-C benchmark [18] using MySQL ver-
the graph, xsyncfs performs within 3% of ext3 mounted sion 5.0.16 and the InnoDB storage engine. Since
asynchronously. Since xsyncfs releases output as soon both MySQL and the TPC-C benchmark client are

10 OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation USENIX Association
New Order Transactions Per Minute 4000 300

Throughput (Kb/s)
3000

200 ext3-async
2000 xsyncfs
xsyncfs
ext3-barrier ext3-sync
ext3-barrier
1000

100
0
0 5 10 15 20
Number of Threads

This figure shows the New Order Transactions Per Minute when
running a modified TPC-C benchmark on MySQL with varying 0
numbers of clients. Each result is the mean of 5 trials — the
error bars are 90% confidence intervals.
This figure shows the mean throughput achieved when running
Figure 6. The MySQL benchmark the SPECweb99 benchmark with 50 simultaneous connections.
Each result is the mean of three trials, with error bars showing
the highest and lowest result.
multi-threaded, this benchmark measures the efficacy of
xsyncfs’s support for shared memory. TPC-C measures Figure 7. Throughput in the SPECweb99 benchmark
the New Order Transactions Per Minute (NOTPM) a
database can process for a given number of simultane- with write barriers matches the performance of xsyncfs.
ous client connections. The total number of transactions From these results, we conclude that even applications
performed by TPC-C is approximately twice the num- such as MySQL that use a custom group commit strat-
ber of New Order Transactions. TPC-C requires that a egy can benefit from external synchrony if the number of
database provide ACID semantics, and MySQL requires concurrent transactions is low to moderate.
either disabling the drive cache or using write barriers to Although ext3 mounted asynchronously without write
provide durability. Therefore, we compare xsyncfs with barriers does not meet the durability requirements for
ext3 mounted asynchronously using write barriers. Since TPC-C, we were still curious to see how its performance
the client ran on the same machine as the server, we mod- compared to xsyncfs. With only 1 or 2 clients, MySQL
ified the benchmark to use UNIX sockets. This allows executes 11% more NOTPM with xsyncfs than it exe-
xsyncfs to propagate commit dependencies between the cutes with ext3 without write barriers. With 4 or more
client and server on the same machine. In addition, we clients, the two configurations yield equivalent perfor-
modified the benchmark to saturate the MySQL server by mance within experimental error.
removing any wait times between transactions and creat-
ing a data set that fits completely in memory. 4.6 The SPECweb99 benchmark
Figure 6 shows the NOTPM achieved as the number
Since our previous benchmarks measured only work-
of clients is increased from 1 to 20. With a single
loads confined to a single computer, we also ran the
client, MySQL completes 3 times as many NOTPM
SPECweb99 [29] benchmark to examine the impact of
using xsyncfs. By propagating commit dependencies
external synchrony on a network-intensive application.
to both the MySQL server and the requesting client,
In the SPECweb99 benchmark, multiple clients issue a
xsyncfs can group commit transactions from a single
mix of HTTP GET and POST requests. HTTP GET re-
client, significantly improving performance. In contrast,
quests are issued for both static and dynamic content up
MySQL cannot benefit from group commit with a single
to 1 MB in size. A single client, emulating 50 simultane-
client because it must conservatively commit each trans-
ous connections, is connected to the server, which runs
action before replying to the client.
Apache 2.0.48, by a 100 Mb/s Ethernet switch. As we
When there are multiple clients, MySQL can group use the default Apache settings, 50 connections are suf-
the commit of transactions from different clients. As ficient to saturate our server.
the number of clients grows, the gap between xsyncfs
We felt that this benchmark might be especially challeng-
and ext3 mounted asynchronously with write barriers
ing for xsyncfs since sending a network message exter-
shrinks. With 20 clients, xsyncfs improves TPC-C per-
nalizes state. Since xsyncfs only tracks causal dependen-
formance by 22%. When the number of clients reaches
cies on a single computer, it must buffer each message
32, the performance of ext3 mounted asynchronously

USENIX Association OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation 11
Request size ext3-async xsyncfs strategies. The output-triggered commit strategy per-
0–1 KB 0.064 (±0.025) 0.097 (±0.002) forms better than the eager commit strategy in every
1–10 KB 0.150 (±0.034) 0.180 (±0.001) benchmark except SPECweb99, which creates so much
10–100 KB 1.084 (±0.052) 1.094 (±0.003) output that the eager commit and output-triggered com-
100–1000 KB 10.253 (±0.098) 10.072 (±0.066) mit strategies perform very similarly. Since the eager
commit strategy attempts to minimize the latency of a
The figure shows the mean time (in seconds) to request a file of a single operation, it sacrifices the opportunity to improve
particular size during three trials of the SPECweb99 benchmark
with 50 simultaneous connections. 90% confidence intervals are throughput. In contrast, the output-triggered commit
given in parentheses. strategy only minimizes latency after output has been
Figure 8. SPECweb99 latency results generated that depends on a transaction; otherwise it
maximizes throughput.
until the file system data on which that message depends
has been committed. In addition to the normal log data 5 Related work
written by Apache, the SPECweb99 benchmark writes a
To the best of our knowledge, xsyncfs is the first local
log record to the file system as a result of each HTTP
file system to provide high-performance synchronous I/O
POST. Thus, small file writes are common during bench-
without requiring specialized hardware support or appli-
mark execution — a typical 45 minute run has approxi-
cation modification. Further, xsyncfs is the first file sys-
mately 150,000 file system transactions.
tem to use the causal relationship between file modifica-
As shown in Figure 7, SPECweb99 throughput us- tions and external output to decide when to commit data.
ing xsyncfs is within 8% of the throughput achieved
While xsyncfs takes a software-only approach to provid-
when ext3 is mounted asynchronously. In contrast to
ing high-performance synchronous I/O, specialized hard-
ext3, xsyncfs guarantees that the data associated with
ware can achieve the same result. The Rio file cache [2]
each POST request is durable before a client receives
and the Conquest file system [31] use battery-backed
the POST response. The third bar in Figure 7 shows
main memory to make writes persistent. Durability is
that SPECweb99 using ext3 mounted synchronously
guaranteed only as long as the computer has power or
achieves 6% higher throughput than xsyncfs. Unlike the
the batteries remain charged.
previous benchmarks, SPECweb99 writes little data to
disk, so most writes are buffered by the drive cache. The Hitz et al. [7] store file system journal modifications on
last bar shows that xsyncfs achieves 7% better through- a battery-backed RAM drive cache, while writing file
put than ext3 mounted synchronously with write barriers. system data to disk. We expect that synchronous oper-
ations on Hitz’s hybrid system would perform no better
Figure 8 summarizes the average latency of individual
than ext3 mounted synchronously without write barriers
HTTP requests during benchmark execution. On aver-
in our experiments. Thus, xsyncfs could substantially
age, use of xsyncfs adds no more than 33 ms of delay
improve the performance of such hybrid systems.
to each request — this value is less than the commonly
cited perception threshold of 50 ms for human users [5]. eNVy [33] is a file system that stores data on flash-based
Thus, a user should perceive no difference in response NVRAM. The designers of eNVy found that although
time between xsyncfs and ext3 for HTTP requests. reads from NVRAM were fast, writes were prohibitively
slow. They used a battery-backed RAM write cache to
4.7 Benefit of output-triggered commits achieve reasonable write performance. The write perfor-
mance issues seen in eNVy are similar to those we ex-
To measure the benefit of output-triggered commits, we perienced writing data to commodity hard drives. There-
also implemented an eager commit strategy for xsyncfs fore, it is likely that xsyncfs could also improve perfor-
that triggers a commit whenever the file system is mod- mance for flash file systems.
ified. The eager commit strategy still allows for group
Xsyncfs’s focus on providing both strong durability and
commit since multiple modifications are grouped into a
reasonable performance contrasts sharply with the trend
single file system transaction while the previous transac-
in commodity file systems toward relaxing durability
tion is committing. The next transaction will only start
to improve performance. Early file systems such as
to commit once the commit of the previous transaction
FFS [14] and the original UNIX file system [22] in-
completes. The eager commit strategy attempts to mini-
troduced the use of a main memory buffer cache to
mize the latency of individual file system operations.
hold writes until they are asynchronously written to
We executed the previous benchmarks using the eager disk. Early file systems suffered from potential corrup-
commit strategy. Figure 9 compares results for the two tion when a computer lost power or an operating sys-
tem crashed. Recovery often required a time consuming

12 OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation USENIX Association
Benchmark Eager Commits Output-Triggered Commits Speedup
PostMark (seconds) 9.879 (±0.056) 8.668 (±0.478) 14%
Apache (seconds) 111.41 (±0.32) 109.42 (±0.71) 2%
MySQL 1 client (NOTPM) 3323 (±60) 4498 (±73) 35%
MySQL 20 clients (NOTPM) 3646 (±217) 4052 (±200) 11%
SPECweb99 (Kb/s) 312 (±1) 311(±2) 0%
This figure compares the performance of output-triggered commits with an eager commit strategy. Each result shows the mean of 5
trials, except SPECweb99, which is the mean of 3 trials. 90% confidence intervals are given in parentheses.

Figure 9. Benefit of output-triggered commits

examination of the entire state of the file system (e.g., Our implementation of external synchrony draws upon
running fsck). For this reason, file systems such as two other techniques from the fault tolerance literature.
Cedar [6] and LFS [23] added the complexity of a write- First, buffering output until the commit is similar to de-
ahead log to enable fast, consistent recovery of file sys- ferring message sends until commit [12]. Second, track-
tem state. Yet, as was shown in our evaluation, journaling causal dependencies to identify what and when to
ing data to a write-ahead log is insufficient to prevent commit is similar to causal tracking in message logging
file system corruption if the drive cache reorders block protocols [4]. We use these techniques in isolation to im-
writes. An alternative to write-ahead logging, Soft Up- prove performance and maintain the appearance of syn-
dates [25], carefully orders disk writes to provide consis- chronous I/O. We also use these techniques in combina-
tent recovery. Xsyncfs builds on this prior work since it tion via output-triggered commits, which automatically
writes data after returning control to the application and balance throughput and latency.
uses a write-ahead log. Thus, external synchrony could Transactions, provided by operating systems such as
improve the performance of synchronous I/O with other
QuickSilver [24], TABS [28], and Locus [32], and by
journaling file systems such as JFS [1] or ReiserFS [16]. transactional file systems [10, 19], also give the strong
Fault tolerance researchers have long defined consis- durability and ordering guarantees that are provided by
tent recovery in terms of the output seen by the outside xsyncfs. In addition, transactions provide atomicity for
world [3, 11, 30]. For example, the output commit prob- a set of file system operations. However, transactional
lem requires that, before a message is sent to the outside systems typically require that applications be modified
world, the state from which that message is sent must be to specify transaction boundaries. In contrast, use of
preserved. In the same way, we argue that the guarantees xsyncfs requires no such modification.
provided by synchronous disk I/O should be defined by
the output seen by the outside world, rather than by the 6 Conclusion
results seen by local processes.
It is challenging to develop simple and reliable software
It is interesting to speculate why the principle of outside
systems if the foundations upon which those systems are
observability is widely known and used in fault tolerance
built are unreliable. Asynchronous I/O is a prime exam-
research yet new to the domain of general purpose appli-
ple of one such unreliable foundation. OS crashes and
cations and I/O. We believe this dichotomy arises from
power failures can lead to loss of data, file system cor-
the different scope and standard of recovery in the two
ruption, and out-of-order modifications. Nevertheless,
domains. In fault tolerance research, the scope of recov-
current file systems present an asynchronous I/O inter-
ery is the entire process; hence not using the principle of
face by default because the performance penalty of syn-
outside observability would require a synchronous disk
chronous I/O is assumed to be too large.
I/O at every change in process state. In general pur-
pose applications, the scope of recovery is only the I/O In this paper, we have proposed a new abstraction, exter-
issued by the application (which can be viewed as an nal synchrony, that preserves the simplicity and reliabil-
application-specific recovery protocol). Hence it is fea- ity of a synchronous I/O interface, yet performs approx-
sible (though still slow) to issue each I/O synchronously. imately as well as an asynchronous I/O interface. Based
In addition, the standard for recovery in fault tolerance on these results, we believe that externally synchronous
research is well defined: a recovery system should lose file systems such as xsyncfs can provide a better founda-
no visible output. In contrast, the standard for recovery tion for the construction of reliable software systems.
in general purpose systems is looser: asynchronous I/O is
common, and even synchronous I/O is usually commit-
ted synchronously only to the volatile hard drive cache.

USENIX Association OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation 13
Acknowledgments [17] N IGHTINGALE , E. B., C HEN , P. M., AND F LINN , J. Spec-
ulative execution in a distributed file system. In Proceedings
We thank Manish Anand, Evan Cooke, Anthony Nicholson, Dan Peek, of the 20th ACM Symposium on Operating Systems Principles
Sushant Sinha, Ya-Yunn Su, our shepherd, Rob Pike, and the anony- (Brighton, United Kingdom, October 2005), pp. 191–205.
mous reviewers for feedback on this paper. The work has been sup- [18] OSDL. OSDL Database Test 2. http://www.osdl.org/.
ported by the National Science Foundation under award CNS-0509093.
Jason Flinn is supported by NSF CAREER award CNS-0346686, and [19] PAXTON , W. H. A client-based transaction system to maintain
Ed Nightingale is supported by a Microsoft Research Student Fellow- data integrity. In Proceedings of the 7th ACM Symposium on Op-
ship. Intel Corp. has provided additional support. The views and con- erating Systems Principles (1979), pp. 18–23.
clusions contained in this document are those of the authors and should [20] P RABHAKARAN , V., BAIRAVASUNDARAM , L. N., A GRAWAL ,
not be interpreted as representing the official policies, either expressed N., G UNAWI , H. S., A RPACI -D USSEAU , A. C., AND A RPACI -
or implied, of NSF, Intel, Microsoft, the University of Michigan, or the D USSEAU , R. H. IRON file systems. In Proceedings of the 20th
U.S. government. ACM Symposium on Operating Systems Principles (Brighton,
United Kingdom, October 2005), pp. 206–220.
References [21] Q IN , F., T UCEK , J., S UNDARESAN , J., AND Z HOU , Y. Rx:
Treating bugs as allergies—a safe method to survive software fail-
[1] B EST, S. JFS overview. Tech. rep., IBM, http://www- ures. In Proceedings of the 20th ACM Symposium on Operating
128.ibm.com/developerworks/linux/library/l-jfs.html, 2000. Systems Principles (Brighton, United Kingdom, October 2005),
pp. 235–248.
[2] C HEN , P. M., N G , W. T., C HANDRA , S., AYCOCK , C., R A -
JAMANI , G., AND L OWELL , D. The Rio file cache: Surviv- [22] R ITCHIE , D. M., AND T HOMPSON , K. The UNIX time-sharing
ing operating system crashes. In Proceedings of the 7th Inter- system. Communications of the ACM 17, 7 (1974), 365–375.
national Conference on Architectural Support for Programming [23] ROSENBLUM , M., AND O USTERHOUT, J. K. The design and im-
Languages and Operating Systems (Cambridge, MA, October plementation of a log-structured file system. ACM Transactions
1996), pp. 74–83. on Computer Systems (TOCS) 10, 1 (February 1992), 26–52.
[3] E LNOZAHY, E. N., A LVISI , L., WANG , Y.-M., AND J OHN - [24] S CHMUCK , F., AND W YLIE , J. Experience with transactions
SON , D. B. A survey of rollback-recovery protocols in message-
in QuickSilver. In Proceedings of the 13th ACM Symposium on
passing systems. ACM Computing Surveys 34, 3 (September Operating Systems Principles (October 1991), pp. 239–53.
2002), 375–408.
[25] S ELTZER , M. I., G ANGER , G. R., M C K USICK , M. K., S MITH ,
[4] E LNOZAHY, E. N., AND Z WAENEPOEL , W. Manetho: Transpar- K. A., S OULES , C. A. N., AND S TEIN , C. A. Journaling versus
ent Rollback-Recovery with Low Overhead, Limited Rollback, soft updates: Asynchronous meta-data protection in file systems.
and Fast Output Commit. IEEE Transactions on Computers C- In USENIX Annual Technical Conference (San Diego, CA, June
41, 5 (May 1992), 526–531. 2000), pp. 18–23.
[5] F LAUTNER , K., AND M UDGE , T. Vertigo: Automatic [26] S ILBERSCHATZ , A., AND G ALVIN , P. B. Operating System
performance-setting for Linux. In Proceedings of the 5th Sympo- Concepts (5th Edition). Addison Wesley, February 1998. p. 27.
sium on Operating Systems Design and Implementation (Boston,
MA, December 2002), pp. 105–116. [27] S LASHDOT. Your Hard Drive Lies to You.
http://hardware.slashdot.org/article.pl?sid=05/05/13/0529252.
[6] H AGMANN , R. Reimplementing the Cedar file system using
logging and group commit. In Proceedings of the 11th ACM [28] S PECTOR , A. Z., DANIELS , D., D UCHAMP, D., E PPINGER ,
Symposium on Operating Systems Principles (Austin, TX, 1987), J. L., AND PAUSCH , R. Distributed transactions for reliable sys-
pp. 155–162. tems. In Proceedings of the 10th ACM Symposium on Operating
Systems Principles (Orcas Island, WA, December 1985), pp. 127–
[7] H ITZ , D., L AU , J., AND M ALCOLM , M. File system design for 146.
an NFS file server appliance. In Proceedings of the Winter 1994
USENIX Technical Conference (1994). [29] S TANDARD P ERFORMANCE E VALUATION C ORPORATION.
SPECweb99. http://www.spec.org/web99.
[8] K ATCHER , J. PostMark: A new file system benchmark. Tech.
Rep. TR3022, Network Appliance, 1997. [30] S TROM , R. E., AND Y EMINI , S. Optimistic Recovery in Dis-
tributed Systems. ACM Transactions on Computer Systems 3, 3
[9] L AMPORT, L. Time, clocks, and the ordering of events in a dis- (August 1985), 204–226.
tributed system. Commun. ACM 21, 7 (1978), 558–565.
[31] WANG , A.-I. A., R EIHER , P., P OPEK , G. J., AND K UENNING ,
[10] L ISKOV, B., AND RODRIGUES , R. Transactional file systems can G. H. Conquest: Better performance through a disk/persistent-
be fast. In Proceedings of the 11th SIGOPS European Workshop RAM hybrid file system. In Proceedings of the 2002 USENIX
(Leuven, Belgium, September 2004). Annual Technical Conference (Monterey, CA, June 2002).
[11] L OWELL , D. E., C HANDRA , S., AND C HEN , P. M. Exploring [32] W EINSTEIN , M. J., T HOMAS W. PAGE , J., L IVEZEY, B. K.,
failure transparency and the limits of generic recovery. In Pro- AND P OPEK , G. J. Transactions and synchronization in a dis-
ceedings of the 4th Symposium on Operating Systems Design and tributed operating system. In Proceedings of the 10th ACM Sym-
Implementation (San Diego, CA, October 2000). posium on Operating Systems Principles (Orcas Island, WA, De-
[12] L OWELL , D. E., AND C HEN , P. M. Persistent messages in local cember 1985), pp. 115–126.
transactions. In Proceedings of the 1998 Symposium on Princi- [33] W U , M., AND Z WAENEPOEL , W. eNVy: A non-volatile,
ples of Distributed Computing (June 1998), pp. 219–226. main memory storage system. In Proceedings of the 6th Inter-
[13] M C K USICK , M. K. Disks from the perspective of a file system. national Conference on Architectural Support for Programming
;login: 31, 3 (June 2006), 18–19. Languages and Operating Systems (San Jose, CA, 1994), pp. 86–
97.
[14] M C K USICK , M. K., J OY, W. N., L EFFLER , S. J., AND FABRY,
R. S. A fast file system for unix. ACM Transactions on Computer
Systems (TOCS) 2, 3 (August 1984), 181–197.
[15] M Y SQL AB. MySQL Reference Manual. http://dev.mysql.com/.
[16] N AMESYS . ReiserFS. http://www.namesys.com/.

14 OSDI ’06: 7th USENIX Symposium on Operating Systems Design and Implementation USENIX Association

Operation Manual: Jesma Filter
100% (4)
Operation Manual: Jesma Filter
50 pages
Krage Et Al, 2020
No ratings yet
Krage Et Al, 2020
17 pages
Route Flap Damping
No ratings yet
Route Flap Damping
13 pages
LPIC-3 Exam 306-300 Mastery: 500 Practice Questions on High Availability & Storage Clusters
From Everand
LPIC-3 Exam 306-300 Mastery: 500 Practice Questions on High Availability & Storage Clusters
Steve Brown
No ratings yet
m5 Datasheet
No ratings yet
m5 Datasheet
1 page
Effect of Axial Load On Plastic Modulus
No ratings yet
Effect of Axial Load On Plastic Modulus
6 pages
IA Operating Systems: I/O, Storage, File Management, Case Studies
No ratings yet
IA Operating Systems: I/O, Storage, File Management, Case Studies
4 pages
Characteristics of FAT32 and NTFS
No ratings yet
Characteristics of FAT32 and NTFS
10 pages
Probability Inequalities: 15.1. Boole's Inequality, Bonferroni Inequalities
No ratings yet
Probability Inequalities: 15.1. Boole's Inequality, Bonferroni Inequalities
14 pages
Gravitation Test Series
No ratings yet
Gravitation Test Series
3 pages
Engineering Data Analysis Comprehensive Notes and Examples
No ratings yet
Engineering Data Analysis Comprehensive Notes and Examples
4 pages
Os Pyq Solution
No ratings yet
Os Pyq Solution
31 pages
Picrosiriusred Protocol
No ratings yet
Picrosiriusred Protocol
8 pages
Keyboard Shortcuts Linux
No ratings yet
Keyboard Shortcuts Linux
1 page
File MGT (Module 11)
No ratings yet
File MGT (Module 11)
99 pages
Hooded Dino Blanket
No ratings yet
Hooded Dino Blanket
2 pages
Automatic Pixel-Level Detection of Vertical Cracks in Asphalt Pavement Based On GPR Investigation and Improved Mask R-CNN
No ratings yet
Automatic Pixel-Level Detection of Vertical Cracks in Asphalt Pavement Based On GPR Investigation and Improved Mask R-CNN
44 pages
Os Cycle Test 3 Answer Key
No ratings yet
Os Cycle Test 3 Answer Key
11 pages
Infire HTC Speed Operating Instruction
No ratings yet
Infire HTC Speed Operating Instruction
56 pages
Osync v1.2
No ratings yet
Osync v1.2
12 pages
Class 7 Speculative Execution
No ratings yet
Class 7 Speculative Execution
21 pages
File MGMT L2
No ratings yet
File MGMT L2
18 pages
CSE203 LP Unit III A
No ratings yet
CSE203 LP Unit III A
41 pages
Isometry: 5.1 Isometry and Isometric Isomorphism
No ratings yet
Isometry: 5.1 Isometry and Isometric Isomorphism
13 pages
537 L22 LFS
No ratings yet
537 L22 LFS
64 pages
Chapter 3
No ratings yet
Chapter 3
18 pages
WH Questions
100% (1)
WH Questions
13 pages
Section 13F - Engine Electrical System PDF
No ratings yet
Section 13F - Engine Electrical System PDF
14 pages
Script Part 4
No ratings yet
Script Part 4
5 pages
Unit 5
No ratings yet
Unit 5
98 pages
Recirculation Pump Sizing
No ratings yet
Recirculation Pump Sizing
3 pages
Chapter 13-Concrete USD
No ratings yet
Chapter 13-Concrete USD
58 pages
CS2510 00 Distributed Storage Overview
No ratings yet
CS2510 00 Distributed Storage Overview
53 pages
Ahmet Ozan HATİPOĞLU Cansu Çalişir Mehmet Özgür TEMUÇİN
100% (1)
Ahmet Ozan HATİPOĞLU Cansu Çalişir Mehmet Özgür TEMUÇİN
14 pages
DELEM Install GB
No ratings yet
DELEM Install GB
81 pages
Unit 3 Part 1,2
No ratings yet
Unit 3 Part 1,2
22 pages
Outline: File System Consistency Issues in The Presence of Failures
No ratings yet
Outline: File System Consistency Issues in The Presence of Failures
4 pages
Outline: Access Control Lists (ACL) : Keep Lists of Access For Each Domain With
No ratings yet
Outline: Access Control Lists (ACL) : Keep Lists of Access For Each Domain With
5 pages
Meeting 14 - IO Management and Disk Scheduling - 2024
No ratings yet
Meeting 14 - IO Management and Disk Scheduling - 2024
69 pages
Stein
No ratings yet
Stein
6 pages
Anesthetic Technique For Inferior Alveolar Nerve Block: A New Approach
No ratings yet
Anesthetic Technique For Inferior Alveolar Nerve Block: A New Approach
5 pages
File System Implementation OS
No ratings yet
File System Implementation OS
54 pages
LOGO! StarterKit
No ratings yet
LOGO! StarterKit
2 pages
The Lecture Contains:: Lecture 9: Performance Issues in Shared Memory
No ratings yet
The Lecture Contains:: Lecture 9: Performance Issues in Shared Memory
7 pages
Meq Model Questions
0% (1)
Meq Model Questions
4 pages
Unit 4 Part 1
No ratings yet
Unit 4 Part 1
66 pages
Ext 2
No ratings yet
Ext 2
12 pages
Lec20 Distributed
No ratings yet
Lec20 Distributed
29 pages
Seismic Micro Zonation Aap PHD
No ratings yet
Seismic Micro Zonation Aap PHD
11 pages
Kleiman 86 V Nodes
No ratings yet
Kleiman 86 V Nodes
10 pages
Dynamic Equilibrium
No ratings yet
Dynamic Equilibrium
4 pages
Distributed File Systems: Arvind Krishnamurthy Spring 2001
No ratings yet
Distributed File Systems: Arvind Krishnamurthy Spring 2001
3 pages
A Quick Introduction To The Domain Name System: David Conrad
No ratings yet
A Quick Introduction To The Domain Name System: David Conrad
74 pages
Core-Stateless Fair Queueing: A Scalable Architecture To Approximate Fair Bandwidth Allocations in High Speed Networks
No ratings yet
Core-Stateless Fair Queueing: A Scalable Architecture To Approximate Fair Bandwidth Allocations in High Speed Networks
56 pages
22 File Systems 2
No ratings yet
22 File Systems 2
28 pages
Chord: A Scalable Peer-to-Peer Lookup Protocol For Internet Applications
No ratings yet
Chord: A Scalable Peer-to-Peer Lookup Protocol For Internet Applications
40 pages
Botnets: Randy Marchany Marchany@vt - Edu VA Tech IT Security Lab VASCAN 2005
No ratings yet
Botnets: Randy Marchany Marchany@vt - Edu VA Tech IT Security Lab VASCAN 2005
41 pages
Chord: A Scalable Peer-to-Peer Lookup Service For Internet Applications
No ratings yet
Chord: A Scalable Peer-to-Peer Lookup Service For Internet Applications
33 pages
Rarest First and Choke Algorithms Are Enough: Arnaud LEGOUT
No ratings yet
Rarest First and Choke Algorithms Are Enough: Arnaud LEGOUT
29 pages
A Case For End System Multicast: Yang-Hua Chu, Sanjay Rao and Hui Zhang Carnegie Mellon University
No ratings yet
A Case For End System Multicast: Yang-Hua Chu, Sanjay Rao and Hui Zhang Carnegie Mellon University
27 pages
Chord: A Scalable Peer-To-Peer Lookup Protocol For Internet Applications
No ratings yet
Chord: A Scalable Peer-To-Peer Lookup Protocol For Internet Applications
25 pages
Clustering and Sharing Incentives in Bittorrent Systems
No ratings yet
Clustering and Sharing Incentives in Bittorrent Systems
23 pages
A Case For End System Multicast
No ratings yet
A Case For End System Multicast
19 pages
Congestion Avoidance and Control: V. Jacobson
No ratings yet
Congestion Avoidance and Control: V. Jacobson
17 pages
Domain Name System: DNS
No ratings yet
Domain Name System: DNS
16 pages
Greening of The Internet
No ratings yet
Greening of The Internet
14 pages
Delayed Internet Routing Convergence: Craig Labovitz Abha Ahuja, Abhijit Bose Farnam Jahanian
No ratings yet
Delayed Internet Routing Convergence: Craig Labovitz Abha Ahuja, Abhijit Bose Farnam Jahanian
13 pages
Narada
No ratings yet
Narada
12 pages
Torrent Clustering
No ratings yet
Torrent Clustering
12 pages
A Multifaceted Approach To Understanding The Botnet Phenomenon
No ratings yet
A Multifaceted Approach To Understanding The Botnet Phenomenon
27 pages
Resilient Overlay Networks: David Andersen, Hari Balakrishnan, Frans Kaashoek, and Robert Morris
No ratings yet
Resilient Overlay Networks: David Andersen, Hari Balakrishnan, Frans Kaashoek, and Robert Morris
15 pages
Congestion Control For High Bandwidth-Delay Product Networks
No ratings yet
Congestion Control For High Bandwidth-Delay Product Networks
14 pages
Jazz Guitar Chords - The Ultimate Guide
No ratings yet
Jazz Guitar Chords - The Ultimate Guide
40 pages
Timer Interaction in Route Flap Damping
No ratings yet
Timer Interaction in Route Flap Damping
11 pages
8 RouterSupport
No ratings yet
8 RouterSupport
36 pages
Internet Architecture: CPS 214 (Nick Feamster) January 14, 2008
No ratings yet
Internet Architecture: CPS 214 (Nick Feamster) January 14, 2008
31 pages
C-Full Programs 001
No ratings yet
C-Full Programs 001
25 pages
Reducing Network Energy Consumption Via Sleeping and Rate-Adaptation
No ratings yet
Reducing Network Energy Consumption Via Sleeping and Rate-Adaptation
14 pages
Unit 1
No ratings yet
Unit 1
67 pages
CSC: Principles of Computer Networks: Demultiplexing
No ratings yet
CSC: Principles of Computer Networks: Demultiplexing
11 pages
Multi Cast
No ratings yet
Multi Cast
38 pages
Distributed File Systems
No ratings yet
Distributed File Systems
22 pages
F2FS: A New File System For Flash Storage
No ratings yet
F2FS: A New File System For Flash Storage
15 pages
Learning Piano by Yourself
No ratings yet
Learning Piano by Yourself
2 pages
XFS - Extended Filesystem
No ratings yet
XFS - Extended Filesystem
46 pages
Bluffs: BSD Logging Updated Fast File System Stephan Uphoff
No ratings yet
Bluffs: BSD Logging Updated Fast File System Stephan Uphoff
52 pages
Buffer Cache Algorithms: Session No:5 Operating System Design @KL University, 2020
No ratings yet
Buffer Cache Algorithms: Session No:5 Operating System Design @KL University, 2020
21 pages
Atm System FINAL
No ratings yet
Atm System FINAL
77 pages
BGP Convergence
No ratings yet
BGP Convergence
24 pages
Back To The Roots Oracle Database IO Management
No ratings yet
Back To The Roots Oracle Database IO Management
35 pages
Ext3/4 File Systems: Don Porter CSE 506
No ratings yet
Ext3/4 File Systems: Don Porter CSE 506
33 pages
On Incremental File System Development
No ratings yet
On Incremental File System Development
33 pages
Juan Ortega 9/23/09 NTW412
No ratings yet
Juan Ortega 9/23/09 NTW412
14 pages
13 File-Systems
No ratings yet
13 File-Systems
69 pages
Linux Block IO Paper Systor13-Final18
No ratings yet
Linux Block IO Paper Systor13-Final18
10 pages
Understanding Operating System Resources
No ratings yet
Understanding Operating System Resources
11 pages
Green Inertia - Firm Presentation-April 2011-1
No ratings yet
Green Inertia - Firm Presentation-April 2011-1
18 pages
Operating Systems With Linux John O'Gorman
100% (1)
Operating Systems With Linux John O'Gorman
313 pages
File Systems in Linux and Freebsd: A Comparative Study: Kuo-Pao Yang, Katie Wallace
No ratings yet
File Systems in Linux and Freebsd: A Comparative Study: Kuo-Pao Yang, Katie Wallace
4 pages
Journal Design PDF
No ratings yet
Journal Design PDF
8 pages
File System Consistency and Exam Review
No ratings yet
File System Consistency and Exam Review
43 pages
Case Study
No ratings yet
Case Study
44 pages
Rethink The Sync
No ratings yet
Rethink The Sync
22 pages
Ocfs2-1 8 2-Manpages
No ratings yet
Ocfs2-1 8 2-Manpages
84 pages
Ext3 Journal Design
No ratings yet
Ext3 Journal Design
8 pages
File Synchronization: Theory and Practice Benjamin C. Pierce
No ratings yet
File Synchronization: Theory and Practice Benjamin C. Pierce
49 pages
Files and File Systems: File Storage Structure File System Implementation Kernel Abstraction Communication Through A Pipe
No ratings yet
Files and File Systems: File Storage Structure File System Implementation Kernel Abstraction Communication Through A Pipe
25 pages
Breaking the Availability Barrier Ii: Achieving Century Uptimes with Active/Active Systems
From Everand
Breaking the Availability Barrier Ii: Achieving Century Uptimes with Active/Active Systems
Dr. Bruce Holenstein
No ratings yet
Operating Systems Io Systems
No ratings yet
Operating Systems Io Systems
25 pages
SEOUC - How To Solve The Wrong Problem
No ratings yet
SEOUC - How To Solve The Wrong Problem
42 pages
Rethink The Sync: Ed Nightingale Kaushik Veeraraghavan Peter Chen Jason Flinn University of Michigan
No ratings yet
Rethink The Sync: Ed Nightingale Kaushik Veeraraghavan Peter Chen Jason Flinn University of Michigan
22 pages
Speculative Execution in A Distributed File System
No ratings yet
Speculative Execution in A Distributed File System
15 pages
A Comparison of Journaling and Transactional File Systems
No ratings yet
A Comparison of Journaling and Transactional File Systems
12 pages
CL205 Lab8
No ratings yet
CL205 Lab8
21 pages
Gsmith Content Postgresql TuningPGWAL
No ratings yet
Gsmith Content Postgresql TuningPGWAL
8 pages
Design and Implementation of The Sun Network Filesystem: R. Sandberg, D. Goldberg S. Kleinman, D. Walsh, R. Lyon
No ratings yet
Design and Implementation of The Sun Network Filesystem: R. Sandberg, D. Goldberg S. Kleinman, D. Walsh, R. Lyon
34 pages
CS2257 OS LAB MANUAL II IT by GBN PDF
No ratings yet
CS2257 OS LAB MANUAL II IT by GBN PDF
91 pages
NFS
No ratings yet
NFS
27 pages

Nightingale 06

Uploaded by

Nightingale 06

Uploaded by

Rethink the Sync

Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn

Abstract In contrast, an asynchronous file system does not block

(a) Synchronous I/O (b) Externally synchronous I/O

Figure 1. Example of externally synchronous file I/O

Figure 3. When is data safe?

Figure 9. Benefit of output-triggered commits

You might also like