Speculative Execution in a Distributed File System
E. B. Nightingale
P. M. Chen
J. Flinn
University of Michigan
Motivation
• Distributed file systems are often much slower than
local file systems
– Due to synchronous operations required for
cache coherence and data safety
– Even true for file systems that weaken
consistency and safety guarantees
• Close-to-open consistency for AFS and most
versions of NFS
A better solution
• Most of these synchronous operations have
predictable outcomes
– We can bet on the outcome and let the client
process go forward (speculation)
• Make operation asynchronous
– Must first take a checkpoint of the process
• Can restart operation if speculation failed
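The idea above can be sketched in a few lines of user-level Python. This is only an illustration under stated assumptions: `run_with_speculation`, `record_version`, and the dictionary used as process state are hypothetical names, and `copy.deepcopy` stands in for a real process checkpoint. In a real system the slow operation would complete asynchronously while the process keeps running.

```python
import copy

def run_with_speculation(state, predicted, slow_operation, continuation):
    """Run `continuation` on a predicted result; roll back if the guess was wrong."""
    checkpoint = copy.deepcopy(state)   # stand-in for the process checkpoint
    continuation(state, predicted)      # speculative work on the guessed result
    actual = slow_operation()           # here called inline; really an async reply
    if actual == predicted:
        return state                    # speculation succeeded: keep the work
    state = checkpoint                  # speculation failed: restore the checkpoint
    continuation(state, actual)         # redo the work with the real result
    return state

def record_version(state, version):
    # Hypothetical continuation: remember which file version the process saw.
    state["seen"].append(version)

ok = run_with_speculation({"seen": []}, 3, lambda: 3, record_version)
redo = run_with_speculation({"seen": []}, 3, lambda: 4, record_version)
print(ok, redo)
```

In the failing case the speculative append of version 3 is discarded and only the work redone with the real result survives.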
Why it works
1. Clients can correctly predict the outcome of many
operations
• Few concurrent accesses to files
2. Time to take a lightweight checkpoint is often less
than network round-trip time
• 52 µs for a small process thanks to
copy-on-write
3. Most clients have free cycles
Speculator
• File system controls when speculations start,
succeed and fail
• Speculator provides a mechanism to ensure
correct execution of speculative code
• No application changes are required
• Speculative state is never visible from the
outside
Correctness rules (I)
• A process that executes in speculative mode
cannot externalize output
– Speculator blocks the process
• Speculator tracks causal dependencies between
kernel objects
– Kernel objects modified by a speculative
process will be put in a speculative state
Correctness rules (II)
• Speculator tracks causal dependencies between
processes
– Processes receiving a message or a signal
from a speculative process will be
checkpointed and become speculative
• In case of doubt, Speculator will block the
execution of the speculative process
An example: conventional NFS
• Linux 2.4.21 NFSv3 implements close-to-open
consistency
– At close time, client sends to server:
1. Asynchronous write calls with the
modified data
2. A synchronous commit call once it
has received replies for all write calls
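The close-time sequence above can be modeled roughly as follows. `FakeServer` and `close_file` are hypothetical names for an in-memory stand-in, not the Linux NFS client code; a thread pool is assumed here only to mimic overlapped write calls.

```python
from concurrent.futures import ThreadPoolExecutor

class FakeServer:
    """Hypothetical in-memory stand-in for an NFS server."""
    def __init__(self):
        self.unstable = {}   # data received but not yet committed
        self.stable = {}     # data committed to stable storage

    def write(self, offset, data):
        self.unstable[offset] = data
        return "ok"

    def commit(self):
        self.stable.update(self.unstable)
        self.unstable.clear()
        return "ok"

def close_file(server, dirty_blocks):
    """Flush dirty blocks at close: overlapped writes, then one blocking commit."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(server.write, offset, data)
                   for offset, data in dirty_blocks.items()]
        replies = [f.result() for f in futures]   # wait for every write reply
    if any(reply != "ok" for reply in replies):
        raise IOError("a write call failed")
    return server.commit()   # the caller blocks here until the commit returns

server = FakeServer()
status = close_file(server, {0: b"aaaa", 4096: b"bbbb"})
print(status, sorted(server.stable))
```

The blocking commit at the end is the synchronous step that speculation aims to take off the critical path.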
An example: SpecNFS
• All calls are non-blocking but force the calling
process to become speculative
• If a call returns an unexpected result, the calling
process is rolled back to its checkpoint and the
call is executed again
– A new speculation starts
Speculation interface
• Three new system calls:
– create_speculation():
• Returns unique spec_id and a list of
previous speculations on which the
speculation depends
– commit_speculation(spec_id)
– fail_speculation(spec_id)
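A toy user-space model may help show how the three calls fit together. It is not the kernel implementation: `SpeculationTable` is a hypothetical name, and the rule that a new speculation depends on every unresolved earlier one is an assumption made to keep the sketch short.

```python
import itertools

class SpeculationTable:
    """Toy model of the three calls; the dependency rule is an assumption."""
    def __init__(self):
        self._next_id = itertools.count(1)
        self.pending = {}   # spec_id -> set of unresolved ids it depends on

    def create_speculation(self):
        spec_id = next(self._next_id)
        depends_on = sorted(self.pending)   # assumed: all unresolved speculations
        self.pending[spec_id] = set(depends_on)
        return spec_id, depends_on

    def commit_speculation(self, spec_id):
        del self.pending[spec_id]
        for deps in self.pending.values():
            deps.discard(spec_id)           # dependents no longer wait on it

    def fail_speculation(self, spec_id):
        # Everything that depends on the failed speculation fails with it.
        failed = sorted(s for s, deps in self.pending.items()
                        if s == spec_id or spec_id in deps)
        for s in failed:
            del self.pending[s]
        return failed

table = SpeculationTable()
first, first_deps = table.create_speculation()
second, second_deps = table.create_speculation()
third, third_deps = table.create_speculation()
table.commit_speculation(first)
failed = table.fail_speculation(second)
print(third_deps, failed)
```

Failing the second speculation also fails the third, because the third was created while the second was still unresolved.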
Implementing checkpoints
• Checkpoints are implemented through
copy-on-write fork
– Speculator also saves the state of any open
file descriptor and copies all pending signals
• Forked child is not placed on the ready queue
– It just waits
• If speculation fails, forked child assumes the
identity of the failed parent
New kernel structures
• Speculation structure:
– Created during create_speculation()
– Tracks the set of kernel objects that depend
on the speculation
• Undo log:
– Associated with each kernel object that has a
speculative state
– Ordered list of speculative modifications
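An undo log of this kind could look roughly like the sketch below. `SpeculativeObject` and its methods are hypothetical, and storing the full previous value for each modification is a simplifying assumption.

```python
class SpeculativeObject:
    """Hypothetical kernel object with an ordered undo log."""
    def __init__(self, value):
        self.value = value
        self.undo_log = []   # (spec_id, previous_value) entries, oldest first

    def modify(self, spec_id, new_value):
        self.undo_log.append((spec_id, self.value))   # remember how to undo
        self.value = new_value

    def commit(self, spec_id):
        # The modification is now permanent, so its undo entries are dropped.
        self.undo_log = [(s, old) for s, old in self.undo_log if s != spec_id]

    def fail(self, spec_id):
        # Restore the value from before the first change made under spec_id;
        # later entries are discarded because they built on that change.
        for i, (s, old) in enumerate(self.undo_log):
            if s == spec_id:
                self.value = old
                del self.undo_log[i:]
                return

obj = SpeculativeObject("v0")
obj.modify(1, "v1")
obj.modify(2, "v2")
obj.fail(2)                  # undo only the second speculation's change
value_after_fail = obj.value
obj.commit(1)                # the first speculation's change becomes permanent
print(value_after_fail, obj.value, obj.undo_log)
```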
Sharing checkpoints
• Letting successive speculations share the same
checkpoint reduces the speculation overhead
• Two limitations
– Speculator limits the amount of rollback work
by not letting a speculation share a checkpoint
that is more than 500 ms old
– A speculation cannot share a checkpoint
with a previous speculation that changes
file system state
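The two limitations amount to a simple admission test, sketched below. The function and parameter names are hypothetical; only the 500 ms limit comes from the text above.

```python
MAX_CHECKPOINT_AGE_MS = 500   # age limit stated above

def can_share_checkpoint(checkpoint_age_ms, previous_changed_fs_state):
    """Decide whether a new speculation may reuse an existing checkpoint."""
    if previous_changed_fs_state:
        return False          # second limitation: file system state was changed
    return checkpoint_age_ms <= MAX_CHECKPOINT_AGE_MS   # first limitation: age

print(can_share_checkpoint(100, False))
print(can_share_checkpoint(501, False))
print(can_share_checkpoint(100, True))
```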
Correctness invariants
1. Speculative state should never be visible to the
user or to any external device
– Speculator prevents all speculative
processes from externalizing output to any
interface
2. A process should never view speculative state
unless it is already speculatively dependent
upon that state
Invariant implementations (I)
• First Implementation:
Block speculative processes whenever they try
to perform a system call
– Always correct
– Limits the amount of work that can be done by
a process in a speculative state
Invariant implementations (II)
• Second Implementation:
Allow speculative processes to perform system
calls that
– Do not modify state
• “Read-only” calls such as getpid()
– Only modify state that is private to the calling
process
• It will be rolled back if speculation fails
Invariant implementations (III)
• Third Implementation:
Allow speculative processes to perform
operations on files in speculative file systems
– With VFS, can have multiple file systems on
the same machine
• Typically NFS plus FFS or ext3
• Must check type of file system
– Have a special bit in superblock
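Taken together, the three implementations suggest a per-call admission policy along these lines. The sets below are illustrative assumptions: `getpid` is the example given above, while `dup2` and the listed file operations are hypothetical entries, and `fs_is_speculative` models the superblock bit.

```python
from enum import Enum

class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"

READ_ONLY_CALLS = {"getpid"}              # example from the text above
PRIVATE_STATE_CALLS = {"dup2"}            # assumed: touches only the caller's state
FILE_CALLS = {"read", "write", "mkdir"}   # assumed sample of file operations

def decide(syscall, fs_is_speculative=False):
    """Return what to do when a speculative process issues `syscall`."""
    if syscall in READ_ONLY_CALLS or syscall in PRIVATE_STATE_CALLS:
        return Action.ALLOW               # second implementation
    if syscall in FILE_CALLS and fs_is_speculative:
        return Action.ALLOW               # third implementation
    return Action.BLOCK                   # first implementation: block when in doubt

print(decide("getpid"), decide("write", fs_is_speculative=True), decide("sendto"))
```

Blocking by default keeps the policy conservative: any call not known to be safe is treated like the first implementation treats every call.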
Multiprocess speculation (I)
• Whenever a speculative process P participates
in interprocess communication with a process Q:
– Process Q must become speculatively
dependent on the speculative state of
process P and get checkpointed
Multiprocess speculation (II)
• Whenever a speculative process P modifies an
object X:
– Object X must become speculatively
dependent on the speculative state of
process P and get an undo log
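Both propagation rules can be sketched with two small classes. `Process`, `KernelObject`, and `modify` are hypothetical names, and counting checkpoints with an integer is a simplification of taking a real checkpoint.

```python
class Process:
    """Hypothetical process that tracks the speculations it depends on."""
    def __init__(self, name):
        self.name = name
        self.speculations = set()
        self.checkpoints = 0

    def send(self, receiver):
        # The receiver inherits any dependencies it does not already have.
        new = self.speculations - receiver.speculations
        if new:
            receiver.checkpoints += 1        # checkpoint before seeing speculative state
            receiver.speculations |= new

class KernelObject:
    """Hypothetical kernel object with an undo log of previous values."""
    def __init__(self, value):
        self.value = value
        self.speculations = set()
        self.undo_log = []

def modify(process, obj, new_value):
    if process.speculations:
        obj.undo_log.append(obj.value)       # remember how to roll back
        obj.speculations |= process.speculations
    obj.value = new_value

p, q = Process("P"), Process("Q")
p.speculations = {7}                         # P depends on speculation 7
p.send(q)                                    # Q becomes speculative too
x = KernelObject("old")
modify(p, x, "new")                          # X becomes speculative too
print(q.speculations, q.checkpoints, x.speculations, x.undo_log)
```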