
University of Pennsylvania SEAS Senior Design

Spring 2016

Daruma: Regaining Trust in Cloud Storage


Doron Shapiro Michelle Socher Ray Lei Sudarshan Muralidhar
CIS 2016 CIS 2016 NETS 2016 NETS 2016

Boon Thau Loo Nadia Heninger


Advisor Advisor

ABSTRACT

Currently, cloud storage services are used by consumers for a wide variety of important documents, including family photos, healthcare information and proprietary corporate data. These services all make promises about their storage solutions, usually including some guarantees of confidentiality, integrity, and availability. However, downtime is a fact of life for cloud services and, for better or worse, many providers openly admit to being able to access customer files for purposes ranging from analytics to law enforcement.

Daruma solves this problem by eliminating the need to trust any cloud provider. We run no servers ourselves - instead, we combine and secure the space on cloud services already used by consumers (like Dropbox and Google) with advanced cryptographic and redundancy algorithms. Our system provides a simple guarantee: no one cloud service provider can read, change, or delete your files - ever. Daruma feels just like an existing service - there are no extra passwords to remember or frustrating workflows to navigate. Daruma handles the complexities of security and reliability for users, allowing them to confidently utilize cloud storage without worrying about its previously inherent risks.

1. INTRODUCTION

Today, millions of people rely on cloud services to store their files. Companies like Dropbox Inc., Box Inc., and Alphabet Inc. have popularized the model of offering free or cheap storage capacity on servers they administer. These companies offer a convenient offsite backup and file-sharing service, and their users often take advantage of this by storing a variety of confidential or otherwise valuable documents on them. However, cloud storage necessarily comes with a large set of risks. The "cloud" is susceptible to a wide variety of failures, ranging from a user's spotty connection to the Internet to sophisticated hacking attempts and government interventions. Non-technical end-users rarely consider - or even know how to consider - what might happen were these cloud services to fail.

There are several naive approaches to guard against potential cloud service failures. One option is to encrypt all files before upload. However, such a process forces users to remember an encryption key; if this key is forgotten, all files are permanently lost. We wanted to build a solution that securely stored files and any encryption keys on the cloud, without leaking any sensitive data.

A potential solution to achieve fault tolerance is to back up files on multiple providers. However, this is very space inefficient, as it stores the complete file on every provider used. We strove to build a system that combined cloud providers in a much more space-efficient manner, while still providing the availability guarantees of a backup protocol.

Tahoe-LAFS is a system implemented by Zooko Wilcox-O'Hearn that also attempts to solve these problems by combining cloud providers [11]. However, Tahoe is targeted at system administrators and network security specialists. It is virtually impossible for an average user to set up, and requires users to be able to run code on their cloud storage providers. Thus, it is not a tool that can be used out of the box - it requires a significant time and money investment before its benefits become apparent.

We designed a system that accomplishes all of these goals in a verifiable manner while abstracting its inherent complexity behind a familiar interface.

2. APPROACH

2.1 Threat Model

One of our first steps was carefully developing a threat model for our application. We started by characterizing our users and split them into two groups based on need:

Business users Likely to be non-technical, but have proprietary corporate information that requires confidentiality and fast availability.

Personal users Similarly non-technical, but might be motivated more by a sense of general privacy in addition to concerns of specific data sensitivity (e.g. tax form storage).

We then developed a list of adversaries and their capabilities:

Rogue Provider Providers are all corporations that administer servers to provide remote data storage. They have the ability to read, modify, or delete all files that a user stores, in addition to the ability to entirely remove a user's account, provide a third party with access, and collaborate with other providers. They may do this for a variety of reasons: they may be hacked by a third party, house malicious employees, be compelled by a government, see opportunities for business value, or make engineering mistakes that lead to these outcomes.

State-Level Actor Governments are increasingly expanding their electronic surveillance capabilities through technical and legal means. Today, it is widely assumed that a state-level actor can read, block, and modify network traffic. They can also in many cases legally compel corporations to perform many of the actions listed under Rogue Provider. These actions may be taken on both a large
scale in a dragnet-style surveillance effort or in a targeted manner for an individual investigation.

With this in mind, our primary goal was to protect the confidentiality, integrity, and accessibility of user file contents on computers other than the users' own. As an additional goal, we wanted to be able to apply these guarantees to file metadata where possible. To do this, we set the following security guidelines:

(1) All network-bound traffic must be encrypted with a key providers do not know (i.e. SSL is insufficient for this purpose).
(2) All encryption must be authenticated to prevent tampering.
(3) It should be difficult for any one non-user actor to hold all the information necessary to reconstruct a key.
(4) It should be difficult for any one non-user actor to, by corrupting or removing access to data, remove access to a user file.

2.2 User Experience

One of our overarching goals was to make our application usable for non-technical end-users, so we had to develop an experience that made our features clear without forcing users to understand any of the behind-the-scenes implementation. We identified four key areas to focus on:

Key Management This project needed to keep track of many keys for both file encryption and provider authentication. However, users rarely make proper key selection or storage choices, so we decided that users should not have to remember any keys beyond their existing provider login credentials. Making all key management happen behind the scenes would also decrease the friction in adopting and using our system.

File Management Users needed a way to add, remove, edit, and read files tracked by the system. The filesystem interface handles all of these operations in a way that computer users are already very familiar with, so we decided we would either present a filesystem-like interface or operate on top of the existing filesystem. While the current implementation only supports the latter due to its ease of use and interoperability, the former may be more suited for certain archival purposes, as it allows an end-user to store files in the cloud that they may not want to store locally (perhaps due to local capacity constraints).

To provide the least friction in usage, we decided to implement the user interface pattern popularized by Dropbox of synchronizing a target directory in a user's filesystem and visually indicating the synchronization status of each file in the operating system file explorer (e.g. Finder on OSX).

Fault Handling This project is intended to be exposed to Byzantine faults across a distributed network, ranging from network failures to state-level attackers. Some of this behavior can be recovered from, while other behavior can result in more catastrophic failure. Moreover, the degree to which a system can recover is often configurable (e.g. by increasing redundancy). Since our target users were non-technical, we wanted our fault handling process to be as automatic as possible while still communicating to the user the status and risks of our system. As a motivating example, we considered what would happen if a file was corrupted by a provider. Our system could have been designed to be resilient to this sort of error by just reconstructing the relevant user file with data from other providers and then notifying the user. However, there would still be many questions and decisions for the user at this point in time. Was the overall system still safe to use? Should they remove the offending provider? What should they do with the recovered file? To that end, we decided to implement a fault recovery algorithm that would go beyond the basics of merely detecting errors and recovering files to also securely fix errors as they came up. To help users understand the overall system health as we did this, we decided to implement a scoring system that would educate users about the behavior of their providers and advise them on how each one was affecting the system. Finally, we decided to make all of our operations atomic with automatic rollback to reduce the number of states our system could be in.

Performance To maintain ease of use, we wanted users of our system to experience the same upload and download speeds that they might see on their existing providers. Similarly, since we expected that many users would have free-tier provider accounts (with relatively low storage capacity), space efficiency was a high priority. We captured both of these constraints by setting as a goal a network cost that increased roughly linearly with plaintext size.

2.3 Architectural Overview

Our architecture can be divided into several modules:

Manifest Module This module presents a filesystem abstraction to the user. Daruma-tracked files have numerous metadata points associated with them and are not stored internally in the same hierarchy that they are on a user's filesystem, so this module translates between our internal representations and a filesystem representation.

Encryption Module This module encrypts and decrypts all user file data, including the manifest. This allows us to guarantee confidentiality of all file data as well as a significant amount of metadata contained in the manifest, such as user file names. By using authenticated encryption, this module allows us to verify the integrity of data upon retrieval. All files are encrypted with different keys, all of which are stored in the manifest.

Distribution Module This module takes encrypted files, splits them into shares using an erasure encoding scheme, and disperses the shares among providers. Under this scheme, a configurable number of shares may be lost while still guaranteeing retrievability of data.

Master Key Module This module administers the randomly generated master key, which is used to encrypt the manifest. The master key is also split into shares using a variant of the Shamir Secret Sharing algorithm that provides both resilience to share deletion and share secrecy. When necessary, the master key is regenerated to protect secrecy.

Provider Module This module provides a common abstraction for all provider APIs.

Resilience Module This module runs on top of all of the other modules and handles recovery from provider errors. If a provider goes down or corrupts files, it can execute the relevant recovery by, for instance, redistributing files among the remaining providers or re-uploading uncorrupted data to the faulty provider.

2.4 Filesystem Abstraction

To implement our system, we began by designing a secure filesystem representation that we would use internally to track user files. We needed the following three properties out of our representation:

(1) The filesystem should support storing files, files in directories, and empty directories. While there are other features that filesystems support (e.g. hard links), we found this to be a minimal set of features that gave us compatibility with average daily use.
(2) The filesystem should hide file metadata (e.g. directory structure and node names).
(3) The filesystem metadata should have a minimal storage overhead.

We decided on the following structure:

(1) All filesystem metadata would be stored in a single manifest file. This file would map user file paths (e.g. "/Users/alice/documents/foo.txt") to an internal codename (e.g. "5DB62955FBCD4228968A046A873A9236") as well as other metadata, such as encryption keys.
(2) User files would be stored in a flat structure in a single directory. Each user file would be stored under its codename rather than its file path, thus hiding file paths from providers.
(3) To achieve atomicity, we would treat user files as being immutable and change the codename references in the manifest after we had confirmed that any given operation had succeeded.

For more details on this implementation, see appendix B.1.
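To make this structure concrete, a single manifest entry might look like the following sketch (the field names and helper are illustrative, not Daruma's actual on-disk format):

```python
import os
import uuid

def new_manifest_entry(user_path):
    """Build one manifest record: a user-visible path mapped to an opaque
    codename plus a fresh per-file key (field names are illustrative)."""
    return {
        "path": user_path,                     # e.g. "/Users/alice/documents/foo.txt"
        "codename": uuid.uuid4().hex.upper(),  # random codename providers see
        "key": os.urandom(32),                 # per-file encryption key (see 2.5)
    }
```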
2.5 Authenticated Encryption

We chose Dan Bernstein's NaCl package for authenticated encryption [3]. We selected this package for several reasons:

Authentication This package offers authenticated encryption, which was pivotal for our ability to detect and punish provider misbehavior. This guarantees the integrity of our system because successful decryption of authenticated ciphertext ensures that the data has not been modified.

Vetted Cryptography Dan Bernstein's work is highly respected within the cryptography community and this package has been vetted by those in that community.

No User Modifications The goal of this package was to provide usable cryptography where all details and decisions are removed from users of the library. It is generally considered poor practice to roll one's own cryptography or even attempt to navigate the various settings and parameters without strong expertise.

Using this package, we generate a random key for each file. The mapping from files to keys is then stored in the manifest. It was important that each file get its own key (rather than a single master key being used to encrypt all files). Otherwise, our scheme would reduce to e.g. AES-ECB, an AES mode that is known to reveal metadata about the data it encrypts because all identical blocks encrypt to the same ciphertext.
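As an illustration, the per-file scheme might be sketched with the PyNaCl binding (an assumption on our part for this sketch; any NaCl binding exposing secretbox behaves the same way):

```python
import nacl.secret
import nacl.utils

def encrypt_file(plaintext: bytes):
    """Encrypt one file under its own random key using NaCl's
    authenticated secretbox (XSalsa20-Poly1305)."""
    key = nacl.utils.random(nacl.secret.SecretBox.KEY_SIZE)
    box = nacl.secret.SecretBox(key)
    ciphertext = box.encrypt(plaintext)  # the nonce is prepended automatically
    return key, ciphertext               # the key goes into the manifest

def decrypt_file(key: bytes, ciphertext: bytes) -> bytes:
    """Raises nacl.exceptions.CryptoError if the ciphertext was tampered
    with, which is the integrity check described above."""
    return nacl.secret.SecretBox(key).decrypt(ciphertext)
```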
2.6 Erasure Encoding

One of our main goals was to efficiently combine the space available to users through a single and secure interface. We therefore needed to distribute user files in such a way that we could maintain confidentiality, reliability, and availability. This corresponds to our core promise to users: no provider can read, delete, or modify their files.

We therefore needed an overall distribution scheme that would guarantee these properties while being both time and space efficient. In our own research, we found the following primitives and schemes available for such use:

All-or-Nothing Transform (AONT) This is an s-of-s threshold scheme; all shares are required for the reconstruction of the original secret. As a result, the algorithm runs quickly, in O(log s) time for a message of length s, and no additional storage costs are incurred. This is often used as a preprocessing step for separable encryption suites such as block ciphers (e.g. AES-CBC) where decryption of a single block of ciphertext results in a single block of plaintext. This scheme ensures that no information is revealed unless all blocks can be decrypted, protecting against certain brute force attacks. [9]

Rabin's Information Dispersal Algorithm (IDA) This dispersal algorithm guarantees that any m of n shares can be used to reconstruct the secret. It runs in O(m^2) time, and the total space required for dispersing data of size F is (n/m) · F. If the parameters are set so that n = m, then the dispersal incurs no additional storage cost, consistent with AONT. Since each share will be 1/m of its original size, this dispersal achieves maximum space efficiency for its threshold. However, this also implies that the scheme does not make strong confidentiality guarantees - that is, some smaller-than-m subset of shares may reveal information about the secret. [6]

Shamir Secret Sharing Shamir Secret Sharing is very similar to Rabin's IDA but with some key differences. In particular, it is a k-of-n threshold scheme such that k shares are required for reconstruction and any subset of fewer shares reveals no information. In order to make this guarantee, it is necessary that each share be the same size as the original secret. Therefore, for data of size F, the total stored data for this scheme would be n · F. [10]

Reed-Solomon Encoding This scheme makes the same guarantees as Rabin's IDA with the same space performance. Technically, Reed-Solomon encoding takes a message of length m and extends it to n symbols such that any m symbols can be used for full reconstruction, while the IDA actually generates shares. However, the IDA is discussed primarily in academia, while there are a plethora of Reed-Solomon implementations (which instead follow the technical behavior of the IDA in terms of share generation). [5], [8]

Secret Sharing Made Short This scheme combines the confidentiality properties of Shamir Secret Sharing with the space efficiency of Rabin's IDA. Specifically, a random key is generated and used to encrypt the message that is meant to be kept confidential. The encrypted data is then distributed with Rabin's IDA, and the key itself is protected and distributed with Shamir Secret Sharing. [4]

We were excited to find in our research that the Secret Sharing Made Short scheme closely mirrored parts of the solution we had independently been discussing. We chose this scheme because it matched the confidentiality guarantees that we needed while still providing optimal space efficiency for user files.

We had to make a few modifications on top of this scheme in order to match our use case. In particular, we implemented a stronger variant of Shamir Secret Sharing (see 2.7) and we used an existing Reed-Solomon encoding library. Additionally, rather than generating a single key in order to encrypt a single file, we had to generate a key for each file and store the mapping in a manifest (see 2.5). The manifest was then encrypted with a randomly generated master key. We then used Reed-Solomon encoding to distribute the encrypted files as well as the encrypted manifest, and distributed the master key with Shamir Secret Sharing.
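The composition can be sketched as follows. Here encrypt(), rs_encode(), and shamir_split() are hypothetical stand-ins for the components described in 2.5 and 2.7, and note that in Daruma's variant the per-file keys actually live in the manifest rather than being Shamir-shared individually:

```python
import os

def store_ssms(plaintext: bytes, providers, k: int):
    """Sketch of the Secret Sharing Made Short composition: encrypt the
    bulk data once, disperse the ciphertext space-efficiently, and
    protect only the small key with a confidentiality-preserving split."""
    n = len(providers)
    key = os.urandom(32)                       # fresh random key
    ciphertext = encrypt(key, plaintext)       # authenticated encryption (see 2.5)
    data_shares = rs_encode(ciphertext, k, n)  # any k of n shares reconstruct
    key_shares = shamir_split(key, k, n)       # fewer than k shares reveal nothing
    for provider, d, s in zip(providers, data_shares, key_shares):
        provider.put("data", d)
        provider.put("key", s)
```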
2.6.1 Dependency Sandboxing. We used the PyECLib library [2], a wrapper around the liberasurecode library [1], to provide our erasure encoding capabilities. During the development of our project, we frequently found that by corrupting the shares provided to the library, we could induce a segfault in it and bring down our entire program alongside it. To mitigate this, we reported the issues upstream to the PyECLib and liberasurecode maintainers, who helpfully developed fixes to the problems we identified. However, since the erasure decoding stage of our pipeline necessarily took raw data from providers as input, we were wary of undiscovered bugs in this library continuing to crash our program. So, we developed a sandbox for this and other library dependencies that would run their stages of our data pipeline in separate processes and report their results back to the main process via inter-process communication channels. With this development, any third-party library we depended on could crash (or be otherwise compromised) without harming our main application logic - any errors would be passed to the resilience subsystem (see 2.9).
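The resulting pattern can be sketched as follows - a minimal version, assuming the wrapped stage is an ordinary picklable Python callable; Daruma's real sandbox plumbing differs in detail:

```python
import multiprocessing as mp

def _worker(result_queue, func, args):
    # Runs in a child process: a segfault in a C extension kills only
    # this process, never the main application.
    result_queue.put(func(*args))

def sandboxed(func, *args, timeout=30):
    """Run one pipeline stage (e.g. erasure decoding of untrusted shares)
    out-of-process and report its result back over an IPC queue, turning
    crashes into ordinary errors."""
    queue = mp.Queue()
    proc = mp.Process(target=_worker, args=(queue, func, args))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        proc.terminate()
    if proc.exitcode != 0 or queue.empty():
        raise RuntimeError("sandboxed stage failed; reporting to resilience layer")
    return queue.get()
```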

2.7 Robust Secret Sharing

In order to encrypt our manifest without forcing users to remember and protect a new credential, we needed to generate a random master key that could be stored across the providers with confidentiality, integrity, and reliability.

For confidentiality and reliability, we needed a threshold scheme such that for shares distributed across n providers, reconstruction of the secret would be possible with any k of these providers, but no information about the secret would be revealed to any subset of fewer than k providers. These requirements were satisfied by Shamir Secret Sharing (see 2.6, [10]). However, Shamir Secret Sharing does not provide the integrity guarantees that we needed. In particular, Shamir Secret Sharing is tolerant to some subset of missing shares, but it is vulnerable to corrupt shares. We therefore needed to apply a verification wrapper around this scheme that would guarantee the integrity of the reconstructed secret.

Such schemes require metadata to be transmitted as part of the shares. Much of the academic work within this space has been targeted at minimizing the size of the resultant shares as the number of players increases. However, since we only needed to apply this scheme to a single encryption key and our target use case would have no more than 10 providers, we favored simplicity over asymptotic efficiency. We therefore chose the "Verifiable Secret Sharing" scheme presented by Rabin and Ben-Or [7].

For additional information on the inner workings of Shamir Secret Sharing and the Robust Secret Sharing wrapper, see appendix A.

Open-source implementations of Shamir Secret Sharing are available and we had originally planned to make use of an existing library. We selected a library based on its apparent code quality and hygiene, but when we implemented our own unit tests we found a security flaw within that library. After communicating with the team behind that library, we decided that our best option would be to implement Shamir Secret Sharing ourselves. Furthermore, we were unable to find any Robust Secret Sharing implementations, as this space has been largely academic. As best we know, ours is the first Robust Secret Sharing implementation in the wild.

2.8 Providers

Daruma currently supports four popular customer-facing cloud storage companies: Dropbox, Google Drive, OneDrive and Box.

We created two flows for creating providers. The first supports providers that implement the OAuth flow, and redirects the user to a link where they can log in on the provider's website. The second supports providers that take a single parameter (for instance, a provider residing on a local disk might require only the provider's path on disk for construction).

After providers are created, all providers share a common interface, so that the internal system can treat them all the same way. All providers support GET, PUT, DELETE, and WIPE operations. These functions each wrap calls to the provider API.

OAuth tokens and credentials for providers are cached in a user credentials JSON file on the user's disk. This allows Daruma to automatically load cached providers on startup, simplifying the user experience.

2.9 Resilience

A major concern when implementing Daruma was ensuring that no coalition of providers, through action or inaction, could corrupt the state of the filesystem. This included cases where certain providers strategically go offline during crucial uploads - for instance, if only some providers are online during an upload, it stands to reason that different parts of the system could be out of sync. For this reason, it was necessary to make all write operations atomic. Similarly, reprovisioning (redistributing files and shares upon changing a threshold or adding/removing a provider) was written such that the system did not become corrupted if a provider failed during the operation.

In order to ensure that our guarantees were maintained as various cloud providers failed and came back online, it was necessary to detect and repair errors on providers as soon as possible. Daruma needed to account for the fact that a provider could corrupt files in one time frame and thereafter behave correctly as some other provider failed. Both of our major recovery/distribution protocols - Robust Secret Sharing and Reed-Solomon, for master keys and files, respectively - were written to provide, upon successful recovery, information about which recovered shares were invalid. The providers with invalid shares (and providers who did not return a share at all) were tagged as failing and, upon recovery, repaired. The repair itself consisted of re-encrypting (with a new key) and redistributing a file (in the case of an invalid Reed-Solomon share), or creating and sharing a new master key and then redistributing the manifest (in the case of an invalid Robust Secret Sharing share). It is important to note that, in order to ensure that information was not leaked, new keys had to be used on all repairs; if not, a malicious provider could pretend to lose shares in order to collect multiple shares of the same plaintext.

In the case of a permanently failing provider, it was important that we have a limit on retries, after which we would decide not to continue repair attempts. It was also crucial that we report to users when providers were failing badly, so that such a provider could be removed. However, we needed a way to differentiate between providers who were experiencing temporary difficulties (and failed several times in a short time span, but resumed normal service afterwards) and providers who exhibited patterns of failure over time. To do this, we used an exponential smoothing formula of the form s = αx + (1 − α)s for a provider score. Here, s is the provider's score, and x is a data point: 1 if the provider responded to the most recent request without errors, 0 otherwise. This score reflects how consistently a provider responds without flaws, and accounts for both past and current behavior, weighting the latter more heavily. Below a certain threshold, a provider is considered "red": retries are no longer attempted, and the user is advised to remove the provider. Below a higher threshold, a provider is considered "yellow": the provider has been experiencing failures, but seems to be mostly okay. Tweaking the thresholds and α enables us to account for a wide variety of provider behaviors - either penalizing harshly or being more tolerant, as necessary.

For more information on the resilience protocols and how the exponential smoothing parameters were chosen, see appendix B.
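To make the provider abstraction from section 2.8 concrete, the common interface might be sketched as follows. The method names mirror the GET, PUT, DELETE, and WIPE operations described above; the actual class layout in Daruma may differ:

```python
from abc import ABC, abstractmethod

class Provider(ABC):
    """Common interface every storage backend implements, so the rest of
    the system can treat Dropbox, Google Drive, a local path, etc. the
    same way (an illustrative sketch, not Daruma's actual class)."""

    @abstractmethod
    def get(self, name: str) -> bytes: ...

    @abstractmethod
    def put(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def delete(self, name: str) -> None: ...

    @abstractmethod
    def wipe(self) -> None:
        """Remove every file this application stored with the provider."""
```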


3. RESULTS AND MEASUREMENT

3.1 User Interface

Our goal was to provide easy integration with a user's existing filesystem. Below is a comparison of our interface with the traditional Dropbox interface:

Fig. 1. Daruma Finder Integration on OSX

Fig. 2. Dropbox Finder Integration on OSX

3.2 Capacity Utilization

We broke capacity down into three possible categories:

Available Secured This represents the space that will be available to users with a redundancy guarantee.

Available Unsecured This is space that is still available to users on their individual providers. They can access this space through the provider interfaces that they have always used, but without any protection through redundancy.

Unavailable Overhead This space is lost and unavailable to the user for storage. This is the cost of redundancy for each scheme.

We also considered two different cases to show how Daruma's comparative performance varies across the capacity distribution of the providers:

Free Tier This shows the capacity utilization in the case where a user has accounts on the free tier of each of the providers. This gives them 2 GB from Dropbox, 5 GB from OneDrive, 10 GB from Box, and 15 GB from Google Drive.

Uniform This assumes a uniform capacity distribution across providers. In order to match the 32 GB total from the Free Tier category, we assumed here that each of the four providers would have 8 GB.

We showed the breakdown of the total capacity usage in four different cases (in two of those cases, the breakdown did not vary from the Free Tier distribution to the Uniform distribution, while in the other two it did, and both versions are therefore shown). We considered the following four distribution schemes:

Without Replication Users store files separately on each of their providers without any redundancy. This represents the way users currently interact with cloud providers. This will give the greatest possible amount of unsecured available space but no secured space.

Replication This is the case of naive full redundancy. Each provider will have a full copy of each user file. This will always be the inverse of Daruma - that is, its overhead will always be equal to the secured space on Daruma, and Daruma's overhead will always be equal to its secured space, when they are applied to the same capacity distribution.

Manual Duplication Each user file is stored in full on two providers. This provides more secured space than Daruma can on a Free Tier, but with significantly more overhead, which consumes the large amount of available unsecured space that Daruma can still offer. It offers significantly less secured space and much more overhead as compared to Daruma used with the Uniform capacity distribution.

Daruma We use Reed-Solomon encoding to efficiently distribute user files under a variety of capacity distribution models. Our goal is to minimize overhead while securing as much space as possible.
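As a back-of-the-envelope illustration of these categories, consider a hypothetical 3-of-4 Reed-Solomon configuration (the thresholds used in our actual measurements may differ). Each file of size F contributes one share of size F/3 to every provider, so secured space is limited by the smallest provider:

```python
def rs_capacity(capacities, m):
    """Capacity breakdown for an m-of-n Reed-Solomon dispersal where every
    file places one share of size F/m on each of the n providers, so the
    smallest provider bounds the secured space (illustrative model only)."""
    n = len(capacities)
    per_provider = min(capacities)       # usable share space on each provider
    secured = m * per_provider           # total plaintext that fits, with redundancy
    overhead = (n - m) * per_provider    # parity beyond the plaintext
    unsecured = sum(capacities) - n * per_provider
    return secured, unsecured, overhead

# Free Tier (2, 5, 10, 15 GB), hypothetical m=3, n=4:
print(rs_capacity([2, 5, 10, 15], m=3))  # (6, 24, 2): 6 GB secured
# Uniform (8, 8, 8, 8 GB), same scheme:
print(rs_capacity([8, 8, 8, 8], m=3))    # (24, 0, 8): 24 GB secured
```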
3.3 Speed

Daruma's measured speed performance is quite good compared to using providers directly. All operations involve parallel network requests to all providers, so Daruma's speed is fundamentally affected by the speed of the slowest provider. However, because of the space efficiency of Reed-Solomon encoding, Daruma shares are smaller than the original files, resulting in some speed gains. In the following tests, Daruma was configured with 5 providers (4 online), and compared to file operations directly on providers. Note that the axes are on a logarithmic scale.


In a download operation, Daruma simply has to download Reed-Solomon shares from providers. The size advantage of using these shares becomes more apparent as file sizes grow larger, as Daruma becomes significantly faster than the slowest provider by the time file sizes approach 1 GB.

In an upload operation, Daruma first uploads the file, and then uploads an updated manifest. For small files, the extra operation causes Daruma to be slower than other providers, but only by a few seconds (about 2 seconds slower for 100 KB files). As files grow larger, the file size advantage described previously becomes more influential, and Daruma again becomes significantly faster than the slowest provider.

4. ETHICAL AND PRIVACY CONSIDERATIONS

4.1 Social Context

Online security and privacy have gained increased public scrutiny recently, due in part to stories of large corporations being hacked and revelations of mass surveillance by governments around the world. While this is a rapidly evolving situation, several points remain clear. First, regardless of the risks, users and businesses are willingly trusting more and more of their data to the cloud. Secondly, the threat model a security-conscious company must maintain needs to include itself as an adversary, either due to the possibility of a rogue employee or because it may become the target of hackers or governments.

This latter point was recently highlighted in the high-profile legal battle between Apple, Inc. and the Federal Bureau of Investigation. During the case, it was revealed that Apple had implemented two versions of PIN protection in different generations of phones it manufactures, and only in the earlier generation could it provide a tool to bypass the protection. While the case sparked a wide debate over whether the FBI should compel Apple to produce such a tool, the conclusion for Apple was clear: in the earlier generation, its position of trust made it an adversary to complete security.

4.2 Architectural Considerations

This conclusion was deeply considered in the architecture of Daruma. First, all Daruma code runs on users' computers. This means that if an adversary were to try to compromise Daruma centrally, they would find no central surface to attack: there are no Daruma servers. There is still, however, the possibility that we as project developers could write or introduce malicious code in the project, either due to malintent, as the result of a legal order, or because of hacking. To protect against this, the entire Daruma codebase is published as an open-source project so that it can be audited before use. Even if an average end-user does not have the technical know-how to verify that our code operates as advertised, this opens up the opportunity for trusted third parties to inspect the code and publish their results.

Finally, cloud security is a very complex landscape that often outstrips the technical understanding of its users. Because of this, we spent significant effort making sure that Daruma could provide all of its features to an entirely non-technical end-user. There are solutions that achieve some of the same goals that Daruma does with significantly more user maintenance and understanding, but we strove to ensure that traditional maintenance details, ranging from key management to error handling, were handled automatically.

5. DISCUSSION

Our current product is capable of replacing an application like the Dropbox client on a user's computer for most non-social tasks on a day-to-day basis (e.g. barring collaboration and link sharing). Users can currently log in with their credentials for Dropbox, Google Drive, OneDrive, or Box, in addition to using standard filesystem paths as local providers (e.g. to use a mounted local backup drive). Once logged in, the application will watch a Daruma folder in the user's home directory for changes and synchronize the files it stores online with the Daruma folder state. When providers go down or otherwise remove access to or corrupt files, we properly recover the system if at all possible.

There are still some inherent weaknesses in the system, as well as areas where we see opportunities for improvement. We break these areas into the following categories:

Threat Model Weaknesses Our system has a threat model that is intentionally limited to only consider parties outside the user's computer as potential attackers. While this covers both providers and ourselves, we do not take significant steps to protect sensitive material on the computer system we are running on. If we were to expand our threat model, additional thought might be put into how to sandbox our application from local threats.

Usability Usability was a high priority for us as we developed, so significant effort was put into making interactions feel familiar for a non-technical user. However, there are other friction points that might be considered, such as our requirement that users have many existing cloud provider accounts. To mitigate these issues, future work could include making it easier to sign up for new provider accounts, as well as more informative communication regarding system statistics such as capacity utilization.

Sharding When sharing files, Daruma currently builds shares for the entire file at once. For large files, this results in a significant memory cost. This can be avoided by cutting files into many small pieces (shards), and sharing these shards individually. The shards would then be reconstructed on download. If parallelized, a sharding operation would also significantly improve speed, as it would allow Daruma to upload multiple parts of a file at once.


Sharing While our system rivals Dropbox and other cloud providers in usability, Daruma lacks certain features that have become fundamental for other cloud providers. For instance, file sharing (allowing other users to view your files) is a feature commonly used on Dropbox, but unavailable in Daruma. Such a feature could be implemented if, on sharing a file with a secondary user, some public key and manifest information were passed to the secondary user. This feature would further help users transition from individual providers to Daruma.

Cross-Computer Usage Currently, our usage model assumes that users will not have concurrent Daruma sessions on two different computers. If this assumption were broken, we would have to implement several safeguards to ensure the systems do not go out of sync or enter a corrupted state if providers maliciously fail. While hard, these problems are not intractable, and solving them would make Daruma more usable for everyday users.

Cross-Platform Interfaces Currently, Daruma's user interface only works on OSX. Making Daruma's GUI usable on all platforms is a major future goal.

6. ACKNOWLEDGMENTS

We would like to thank our advisors, Nadia Heninger and Boon Thau Loo, as well as CIS senior design instructors Ani Nenkova and Jonathan Smith, for the invaluable technical advice and logistical direction they gave us throughout this process. We would also like to thank Brett Hemenway for his inspiration during our brainstorming process and for his in-depth technical guidance as we implemented the project. Finally, we extend our gratitude to Thuy Le for designing our logo and Melanie Wolff for helping us prepare our demonstrations.

7. REFERENCES

[1] liberasurecode. https://bitbucket.org/tsg-/liberasurecode/.
[2] PyECLib. https://bitbucket.org/kmgreen2/pyeclib.
[3] Dan Bernstein. NaCl: Networking and Cryptography library. https://nacl.cr.yp.to/.
[4] Hugo Krawczyk. Secret sharing made short. In CRYPTO '93: Proceedings of the 13th Annual International Cryptology Conference on Advances in Cryptology, pages 136-146, 1993.
[5] R. J. McEliece and D. V. Sarwate. On sharing secrets and Reed-Solomon codes. Communications of the ACM, 24(9):583-584, September 1981.
[6] Michael O. Rabin. Efficient dispersal of information for security, load balancing, and fault tolerance. Journal of the ACM, 36(2):335-348, April 1989.
[7] Tal Rabin and Michael Ben-Or. Verifiable secret sharing and multiparty protocols with honest majority. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pages 73-85, May 1989.
[8] I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, 8(2):300-304, June 1960.
[9] Ronald L. Rivest. All-or-nothing encryption and the package transform. In Fast Software Encryption, volume 1267 of Lecture Notes in Computer Science, pages 210-218, 1997.
[10] Adi Shamir. How to share a secret. Communications of the ACM, 22(11):612-613, November 1979.
[11] Zooko Wilcox-O'Hearn. Tahoe-LAFS. https://www.tahoe-lafs.org.


APPENDIX

A. IMPLEMENTED CRYPTOGRAPHY

For our project, we implemented two core cryptography schemes.

A.1 Shamir Secret Sharing

This is a k-of-n scheme such that any k shares can be used to reconstruct the secret, but any subset of fewer shares reveals no information. This is fundamentally achieved with polynomial evaluation and interpolation [10].

Sharing begins with the generation of a polynomial P with k - 1 random coefficients and the y-intercept set to the secret to be shared. The polynomial will therefore be of degree k - 1, and n x-values will be selected for evaluation. The resultant (x, P(x)) points are then distributed as the shares associated with the secret [10].

For reconstruction, a minimum of k of these points can be used to reconstruct the polynomial P. By taking P(0), it is straightforward from there to recover the secret [10].

We can demonstrate how this scheme works with a motivating example. We consider a case with 3 players where any 2 of them can be used for reconstruction. Our polynomial with the secret as the y-intercept will therefore be of degree 1 (i.e., a line). We see from the diagram (Fig. 3) that with any two shares (points) we can reconstruct the line and therefore the secret. However, an infinite number of lines could pass through a single point, so with one share no information about the secret is revealed.

Fig. 3. n=3, k=2

While this scheme protects well against missing shares, it is vulnerable to corrupt shares. In that case, polynomial interpolation will construct from the points a polynomial different from the one originally generated, preventing the recovery of the secret. This problem is addressed by Robust Secret Sharing.
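For illustration, a minimal (and deliberately unoptimized) version of this scheme over a prime field might look as follows; our actual implementation differs in field choice and share encoding:

```python
import random

P = 2**127 - 1  # a Mersenne prime; the secret must be an integer below P

def split(secret: int, k: int, n: int):
    """Shamir k-of-n split: a random degree k-1 polynomial with the
    secret as its constant term, evaluated at x = 1..n."""
    rng = random.SystemRandom()
    coeffs = [secret] + [rng.randrange(P) for _ in range(k - 1)]
    def poly(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the constant term
    (i.e., the secret) from any k consistent shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        # pow(den, P - 2, P) is the modular inverse, since P is prime
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```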
A.2 Robust Secret Sharing

We used Robust Secret Sharing so that we could tolerate and identify up to k - 1 corrupt shares. The algorithm proceeds as follows:

Shares are generated as before through Shamir Secret Sharing. Each of these shares is then used as the message for n generated check vectors. A check vector is therefore generated for each pair of providers, and this metadata is sent to the providers along with the generated shares.

When the providers return their shares and metadata, we use that metadata to create lists of verified shares from the perspective of each provider. This allows us to use Shamir Secret Sharing on each such list to reconstruct what each provider would think the secret was if it had access to the shares and metadata. We then apply a voting scheme to select the correct secret. If fewer than k providers returned corrupt shares and we have at least k honest shares, the secret returned from this process is guaranteed to be the one that was originally shared.

This scheme as described varies slightly from the algorithm presented by Rabin and Ben-Or [7]. In particular, their algorithm assumes that upon reconstruction all players broadcast their information, and the guarantee they make is that each honest player will correctly recover the secret originally shared [7]. However, we do not ever want the providers learning the original secret, so rather than having players broadcast their data to each other upon reconstruction, we request information back from each provider. We then take a vote on the secrets constructed from the view of each provider, and the correctness of this scheme reduces to the guarantees made in that paper.

B. RESILIENCE

B.1 Atomic Algorithms

In order to guarantee that sudden provider failures would not put the system in an unstable state, all algorithms needed to be atomic (all-or-nothing). For put and get operations, this was achieved by using a manifest update as a "commit" operation.

Algorithm 1 Put file
(1) Share file under new random name
(2) Update manifest to point file to the random name
(3) If file existed previously, delete files with the old random name

Algorithm 2 Delete file
(1) Update manifest to remove file
(2) Delete files with the random name previously pointed to by file

In each of these operations, a failure before or after the manifest update results in a consistent state across all providers (with perhaps some garbage files that need to be deleted). While manifest failures do have the potential to bring the system out of sync, our threat model assumes that at most a threshold number of providers fail at a time. If this is the case, then the manifest operation is committed, and the manifest will be repaired on later operations as the failing providers become operational. This random-name scheme makes any operation reversible because there is a clear "commit" step, without which all providers are left unfinalized.
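As an illustration, Algorithm 1 might be written as follows, with share_to_providers(), upload_manifest(), and the manifest accessors as hypothetical helpers standing in for the machinery described in sections 2.5-2.8:

```python
import uuid

def put_file(path: str, plaintext: bytes, manifest, providers):
    """Sketch of Algorithm 1: upload under a fresh random codename first,
    'commit' by updating the manifest, and only then garbage-collect the
    old codename. A failure before the commit leaves only garbage; a
    failure after it leaves a consistent, committed state."""
    new_name = uuid.uuid4().hex.upper()
    share_to_providers(new_name, plaintext, providers)  # step (1): upload shares
    old_name = manifest.get_codename(path)
    manifest.set_codename(path, new_name)
    upload_manifest(manifest, providers)                # step (2): the commit point
    if old_name is not None:                            # step (3): cleanup
        for p in providers:
            p.delete(old_name)
```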

B.2 Reprovisioning

As a result of our assumption that providers can go offline or out of business at any time, we needed to make it possible for users to remove providers from the system and replace them with new ones. However, it was crucial that we maintain all guarantees after such a replacement. To do this, all files and the master key needed to be reshared across providers with the new threshold parameters. Atomicity was achieved by using a manifest-commit operation similar to Put and Wipe.

Algorithm 3 Reprovision
(1) For every file in the system, recover it from the old provider set, and reshare it to the new provider set with a new name
(2) Create a new manifest with information about all new shares, and share it across new providers
(3) Share the new master key across all providers, and broadcast the new manifest name (commit)

As before, failures in steps (1) and (2) result only in creating some extra garbage, and a failure in step (3) results in a consistent state that can be repaired.

B.3 Exponential Decay

After a successful operation, all providers are collected so that their internal scores can be updated. The scores are updated with an observation x_{p,t+1}, representing the provider's performance in the last operation. If the provider was successful, x_{p,t+1} = 1, and if the provider failed (invalid share, connection failures, etc.), x_{p,t+1} = 0. The score is then updated according to the update rule

score_{p,t+1} = α · score_{p,t} + (1 − α) · x_{p,t+1}

For our purposes, α was chosen to be 0.7. Thus, the score is a reflection of both the provider's past and current behavior. If a provider's score drops below a threshold of 0.05, it is considered RED, and users are alerted that, while the system is fully operational, there are major problems with the provider. For scores between 0.05 and 0.95, users are notified that the provider is experiencing difficulties. If a provider is red, repairs and retries are not attempted. However, any requested operation is attempted at least once in all cases, ensuring that if a failing provider comes back online, its score will rise back up.
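This update rule translates directly into code. A minimal sketch follows; the GREEN label for healthy providers is our shorthand here, not a term used by the system:

```python
ALPHA = 0.7             # weight on the provider's history
RED_THRESHOLD = 0.05
YELLOW_THRESHOLD = 0.95

def update_score(score: float, success: bool) -> float:
    """Exponential smoothing update from B.3: the observation is
    1 on success, 0 on failure."""
    x = 1.0 if success else 0.0
    return ALPHA * score + (1 - ALPHA) * x

def status(score: float) -> str:
    if score < RED_THRESHOLD:
        return "RED"     # retries and repairs stop; user advised to remove
    if score < YELLOW_THRESHOLD:
        return "YELLOW"  # provider is experiencing difficulties
    return "GREEN"       # healthy (our label, for illustration)
```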

B.4 Repair and retry


If an operation fails, all identified failing providers are updated according to the exponential decay update rule. Then, if no providers are RED, the operation is retried. Because a failure necessarily decreases the score of at least one provider, the retry procedure will always halt - either with a successful operation, or with a provider becoming RED.

If an operation is started when some providers are RED, the operation is still tried once. Then, on failure, the operation is not retried, as retries only happen when all providers are not RED.

When a provider loses or corrupts a file/key share, that share needs to be replaced as soon as possible. Upon a successful recovery and diagnosis of failing providers, any necessary repairs are performed. For failed key shares, this involves choosing a new master key, re-encrypting the manifest, and re-sharing and re-distributing the bootstrap keys and manifest. For failed file shares, the repair involves re-uploading the file, encrypted with a new key and stored with a new random name. These repairs follow the same protocols as any other operations, and so ensure that the system stays in a stable state, regardless of any mid-operation provider failures.
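A sketch of this retry policy, reusing the hypothetical update_score() and status() helpers from the B.3 sketch above; operation() is assumed to return its outcome together with the providers diagnosed as failing:

```python
def attempt_with_retries(operation, providers) -> bool:
    """Try the operation at least once; retry only while no provider is
    RED. Each failure lowers at least one score, so the loop terminates
    either with success or with a provider going RED."""
    while True:
        ok, failing = operation()
        for p in failing:
            p.score = update_score(p.score, success=False)
        if ok:
            return True
        if any(status(p.score) == "RED" for p in providers):
            return False  # stop retrying; alert the user instead
```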
