
Parakeet: Practical Key Transparency for

End-to-End Encrypted Messaging


Harjasleen Malvai∗†, Lefteris Kokoris-Kogias‡§, Alberto Sonnino‡¶, Esha Ghosh‖, Ercan Öztürk∗∗, Kevin Lewi∗∗, and Sean Lawlor∗∗
∗UIUC, †IC3, ‡Mysten Labs, §IST Austria, ¶University College London (UCL), ‖Microsoft Research, ∗∗Meta

Abstract—Encryption alone is not enough for secure end-to-end encrypted messaging: a server must also honestly serve public keys to users. Key transparency has been presented as an efficient solution for detecting (and hence deterring) a server that attempts to dishonestly serve keys. Key transparency involves two major components: (1) a username to public key mapping, stored and cryptographically committed to by the server, and (2) an out-of-band consistency protocol for serving short commitments to users. In the setting of real-world deployments and supporting production scale, new challenges must be considered for both of these components. We enumerate these challenges and provide solutions to address them. In particular, we design and implement a memory-optimized and privacy-preserving verifiable data structure for committing to the username to public key store. To make this implementation viable for production, we also integrate support for persistent and distributed storage. We also propose a future-facing solution, termed "compaction", as a mechanism for mitigating practical issues that arise from dealing with infinitely growing server data structures. Finally, we implement a consensusless solution that achieves the minimum requirements for a service that consistently distributes commitments for a transparency application, providing a much more efficient protocol for distributing small and consistent commitments to users. This culminates in our production-grade implementation of a key transparency system (Parakeet), which we have open-sourced, along with a demonstration of feasibility through our benchmarks.

I. INTRODUCTION

The use of end-to-end encrypted (E2EE) messaging ([31], [32]) ensures that conversations between users remain private. It is just as vital, though, to make sure these interactions happen with the right people. To avoid being compromised by a man-in-the-middle attack, users must discover the necessary public keys to identify each other. In practice, this is accomplished either by scanning QR codes in person to validate recipient public keys or by depending on a service provider to provide the relevant public key of a communication partner.

The term key transparency refers to a system in which a service provider stores the public keys of individuals on a publicly accessible key directory server so that users can query this server for messaging individuals in their contact list. However, since the server can supply users with outdated or fabricated keys, this strategy makes the server a single point of failure. To counteract this, three complementary methods have been discussed in existing literature. Firstly, users must be able to verify public keys served by the identity provider, as well as their own key(s), on a regular basis. Secondly, the key changes must be publicly auditable, to ensure that the server is adhering to certain update rules. Finally, a consistency enforcement protocol must be used to prevent the server from serving different versions of the directory to different users.

To address the problem of providing auditable identity bindings, Chase et al. [16] introduced a primitive called verifiable key directory (VKD) and defined its required security properties. Their system, called SEEMless, instantiates a VKD and proves the correctness of the underlying scheme. Although the evaluation of SEEMless indicates that it veers toward practicality, SEEMless makes the following simplifying assumptions, omitting issues that must still be resolved in order to be applicable to a large-scale, real-world deployment. SEEMless assumes that there exists a mechanism for clients to obtain consistent views of a small server commitment at every server epoch. Their implementation relies on local RAM storage to store all of the data ever required for key transparency. Scalable implementations need a separate, optimized storage layer. The data structure that SEEMless uses to keep state, which they call a history tree, assumes that it is feasible to perpetually store enough data to reconstruct the state of the server's cryptographic data structure from any epoch. Even though their data structure is well-optimized to satisfy the requirement of storing this much data, it becomes a bottleneck when scaling to billions of users.

Hence, while a good starting point, these assumptions render SEEMless, as is, infeasible for large-scale deployments. For instance, WhatsApp, one of the most popular end-to-end encrypted messaging apps, has a monthly active user base of 2 billion [2] and has been downloaded on average 0.66 billion times [3] in the last five years. With the rollout of multi-device [1], where each user device has a separate key, daily new key creations can be estimated at roughly 5.4 million (based on an average of 3 devices/browsers per user and download numbers). If we assume that existing users update their keys at a similar volume, a rough estimate doubles this to approximately 10 million. This leads SEEMless's data structure to require approximately 27TB of storage (counting existing and new keys, based on the number of key updates/creations) within the first year (see Appendix I).

A. Our Contributions

In this work, we present Parakeet, a key transparency system which is designed specifically to overcome the scalability challenges that limit prior academic works from being practical for real-world encrypted messaging applications.
We achieve this by constructing a more efficient VKD, extending the VKD to support compaction, and then leveraging a simple consistent broadcast protocol to avoid the reliance on clients having to connect to blockchains (and the scalability issues associated with this dependence) to achieve consistency. Below, we elaborate on these three central contributions:

1) Building an efficient VKD. A VKD is a cryptographic primitive, defined and implemented by [16], which allows an identity provider to store an evolving set of label-value pairs, commit to this set, and respond (using cryptographic proofs) to client queries about this committed set and its updates. At a scale of billions of users, however, the efficiency requirements of a system implementing a VKD become more stringent than [16]'s implementation is able to meet. We address these limitations by implementing a cryptographic construction called an ordered zero-knowledge set (oZKS), which allows us to more efficiently realize a VKD, with storage improvements of up to an order of magnitude when compared against existing solutions. In addition to the storage optimizations for the VKD, we also present a modular and flexible data-layer API, called StorageAPI, which can be implemented using any distributed database solution. Our VKD implementation, written in Rust, has been published as an open-source library [5].

2) Supporting compaction. Many works on verifiable data structures that support updates require append-only data structures ([41], [16]). In large-scale practical contexts, requiring the support of an ever-growing storage and in-memory system that supports append-only data structures can be a barrier for the deployment of key transparency. We introduce the term "compaction" to refer to an operation that allows reducing the data stored on the server by purging ancient and obsolete entries. Secure compaction loosens the requirement for append-only data structures, as introduced in [16], through a minimal additional assumption. By extending the existing oZKS functionality to enable secure deletions, we present our two-phase compaction paradigm for a VKD, which can be brought on par with existing works with comparable performance. The novelty of our oZKS construction consists of the mechanism for enabling secure deletions. We include a theoretical discussion of our construction here, as we believe it is of independent interest for any applications which may be limited by the storage requirements of append-only verifiable data structures.

3) Serving commitments. All of the equivocation-based security definitions for transparency require users being able to access a small shared commitment. Some works assume a shared public ledger, such as a blockchain [41], [13]. Others rely on out-of-band gossip (see, e.g. [33]). However, at a scale of billions of users, these mechanisms all have drawbacks. For example, across platforms and geographic regions, picking good out-of-band mechanisms which do not result in disconnected partitions of sets of users is a challenge. As [41] analyzed, using a blockchain to respond to a large number of queries could eventually result in flooding the open ports of a large fraction of the nodes of this chain. The alternative described by [41] is to use a header relay network, i.e. a small number of nodes serving a specific transparency application, essentially, a centralized service. Since none of these solutions have tackled this problem at a scale of billions of clients, finding the minimal set of requirements for a consistency protocol remains an open question. To this end, we propose a lightweight consensusless consistency protocol that provides these guarantees with low performance overhead.

Paper organization. In Section II, we present an overview of the Parakeet system, which consists of the VKD construction, the consistency protocol, and an interface between the two. In Section III, we review the original VKD definition and present a more storage-efficient construction based on an oZKS. In Section IV, we extend VKDs to support the compaction operation to address storage limitations in practical systems. In Section V, we describe the consistency protocol used by Parakeet to serve commitments. In Section VI, we provide microbenchmarks and comparisons to prior works, run against our open-source implementation of the system. We discuss related works in Section VII and conclude in Section VIII.

II. OVERVIEW

A central entity called the identity provider (IdP) keeps a local database linking users' identifiers (e.g. phone numbers) to their public keys. It periodically sends a commitment of its latest state to a distributed set of authorities, called witnesses, that ensure its correct behavior. The witnesses store the latest commitment and also communicate with clients to make sure the identity provider is not censoring or eclipsing them. The users query the identity provider to look up the public key associated with a specific user identifier. Each user can additionally monitor the history of their own public key. The identity provider is trusted for ensuring that only permissioned users can perform particular actions.

A. System & Threat Model

Participants. Parakeet is run by the following participants:
• Identity provider (IdP): The entity keeping the identifier-to-key binding for every user and replying to users' queries.
• Witnesses: A distributed set of authorities that cross-check the IdP.
• Users: Ask the IdP to store a specific key bound to their identifier, and query it to look up the key bound to specific identifiers. Users can also periodically check the history of their own key, to ensure that the server did not make any unwarranted changes to their key.

The IdP and each of the witnesses generate a key pair consisting of a private signature key and the corresponding public verification key, so that their identity is known.

By definition, an honest witness always follows the Parakeet protocol, while a faulty (or Byzantine) one may deviate arbitrarily. We present the Parakeet protocol for 3f+1 equally-trusted witnesses, assuming a fixed (but unknown) subset of
at most f Byzantine witnesses. In this setting, a quorum is defined as any subset of 2f+1 witnesses. (As for many BFT protocols, our proofs only use the classical properties of quorums and thus apply to all Byzantine quorum systems [30].) When a protocol message is signed by a quorum of witnesses, it is said to be certified: we call such a jointly signed message a certificate. Additionally, we assume the network is fully asynchronous [21]. The adversary may arbitrarily delay and reorder messages; however, messages are eventually delivered. We discuss the various guarantees for these parties below.

Properties. The consistency protocol of Parakeet satisfies the following properties:
• Consistency: For a given epoch t, if a user outputs a commitment com_t and a different user outputs a commitment com'_t, then com_t = com'_t.
• Validity: If an honest witness outputs com_t as valid, then com_t was proposed by the identity provider.
• Termination: If the identity provider runs the protocol honestly for epoch t with commitment com_t, then eventually the identity provider will produce a certificate cert_t for t.

The security properties of the VKD of Parakeet are more complicated to formally state, so we defer a more formal treatment to appendix B and section III. At a high level, we require that the construction of a VKD satisfy:
• Completeness: If an identity provider honestly serves values, then, for any label label registered with the identity provider, all users should receive (and accept) consistent views of the value associated with label.
• Soundness: As stated before, the soundness definition used in this paper is that of non-equivocation, i.e., if, at an epoch t, Bob accepts a value val as Alice's key, the server cannot convince Alice that her key was val' for the same epoch t with val' ≠ val. Note, this is in the presence of auditors.
• Privacy: Privacy for a VKD is defined with respect to all parties who are not the identity provider itself. The responses to all API calls made by parties which are not the identity provider are zero-knowledge with a well-defined, permissible leakage function.

B. Overview of our VKD Solution

Our VKD solution, aimed at real-world, large-scale applications, removes a majority of the assumptions made by previous works in this domain, such as [16], [34]. We modify the constructions and definitions presented in [16] to use a primitive called ordered zero-knowledge set (oZKS) [6], which, together with a secure commitment scheme (cCS) and a verifiable random function (sVRF), allows us to instantiate a VKD in space linear in the number of updates for labels in the VKD. This is in contrast to [16]'s construction, which additionally requires space linear in the number of server epochs, with a high constant factor (linear in the security parameter). We further augment these constructions to provide a method to allow compacting the underlying data structures, i.e., deleting values which are no longer in use, while still requiring very little monitoring from the user to provide security. This combines the space efficiency gains of a VKD construction which relies on the assumption that users monitor their keys constantly (e.g. [34]), with the security gains of a construction based on append-only data structures (e.g. [16]).

C. Overview of our Consistency Protocol

Our first result is to debunk the common belief that such systems require consensus [36], [41], [16]. A first correct but inefficient solution would be to use simple Reliable Broadcast [14] to achieve all required properties in full asynchrony, a setting where deterministic consensus is actually impossible [22]. This, however, is inefficient. Instead, we design a tailor-made solution for Parakeet (Section V). The resulting system has low latency, and additionally the identity provider can be natively sharded across many machines (unlike consensus-based solutions) to allow unbounded horizontal scalability.

D. Bridging the VKD & the Consistency Protocol

For the most part, we handle the consistency and VKD components of Parakeet separately. However, they do need to communicate with each other. Here, we provide an application programming interface (API) to bridge this gap. The IdP must provide small commitments to the witnesses, and the users need to communicate with the witnesses to receive these up-to-date commitments. To this end, we define a simple witness API. As stated above, the witnesses run the consistency protocol to certify a commitment to the internal state of the VKD. So, our witness API includes an algorithm for the IdP to propose a commitment at each server epoch and prove that it correctly updated its VKD (WitnessAPI.ProposeNewEp). We also allow any party to query the witnesses to retrieve the commitment and certificate for an epoch (WitnessAPI.GetCom). Finally, the API call WitnessAPI.VerifyCert verifies the certificate for a commitment. The details of this API are in appendix D.
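To make the shape of this interface concrete, the Rust sketch below models the three WitnessAPI calls as a trait. The Commitment, Certificate, and AppendOnlyProof types, the error type, and the method signatures are illustrative placeholders rather than the names used in the Parakeet codebase.

```rust
/// Illustrative types; a real deployment would use concrete
/// cryptographic commitments, signatures, and proofs.
pub type Epoch = u64;
pub type Commitment = [u8; 32];

pub struct Certificate {
    pub epoch: Epoch,
    pub commitment: Commitment,
    /// Signatures from a quorum of 2f+1 out of 3f+1 witnesses.
    pub witness_signatures: Vec<Vec<u8>>,
}

pub struct AppendOnlyProof(pub Vec<u8>);

/// Sketch of the witness-facing API described in the text.
pub trait WitnessApi {
    type Error;

    /// IdP proposes the commitment for a new epoch, together with a
    /// proof that the VKD was updated correctly from the prior epoch.
    fn propose_new_ep(
        &mut self,
        epoch: Epoch,
        commitment: Commitment,
        update_proof: AppendOnlyProof,
    ) -> Result<(), Self::Error>;

    /// Any party retrieves the certified commitment for an epoch.
    fn get_com(&self, epoch: Epoch) -> Result<(Commitment, Certificate), Self::Error>;

    /// Checks that a certificate carries a quorum of witness signatures
    /// over the stated epoch and commitment.
    fn verify_cert(&self, cert: &Certificate) -> bool;
}
```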
III. VERIFIABLE KEY DIRECTORY

In this section, we will discuss the primitive Verifiable Key Directory (VKD), defined by [16], and its properties. Recall that a VKD consists of three types of parties: (1) an identity provider (or server), (2) users, and (3) independent auditors. In a VKD, each user has an associated label, denoting their username. The server stores a directory Dir of label-value pairs. Each value corresponds to a public key. The clients can request updates to their own public keys, equivalent to requesting a change to the state of Dir. For efficiency, many such requests are batched together, with updates going into effect at discrete timesteps (epochs). So, Dir is stateful, with an ordered sequence of states Dir_t, one state per epoch t.

To support verifiability in a VKD, each state of the directory needs a corresponding commitment com_t. The commitment com_t is made public using the WitnessAPI defined in appendix D. This model assumes that any changes to the directory go into effect when the corresponding epoch t goes into effect and the commitment com_t for this epoch is published.

The security of a VKD crucially relies on at least one honest auditor per epoch checking the latest update for validity. This assumption is common across a majority of the work in
this area (see, e.g. [43], [16]), in order to maintain client-side efficiency. In this work, we realize this assumption using the witnesses, each of which runs the auditing operation as part of executing WitnessAPI.ProposeNewEp. Note that the WitnessAPI solution can be used in addition to any other solution, such as gossip for distributing commitments [33] or client auditing [43], also discussed in Section VII. When serving billions of users across different geographical regions and platforms, it is not always reasonable to expect clients to run audit operations or to participate in a connected gossip network, so using a set of witnesses only increases security.

A. Outline of this Section

The rest of this section is devoted to revisiting the VKD primitive, with improvements targeted for production. In section III-B, we recall the various algorithms for this primitive and its properties. For our storage-optimized VKD, we need the oZKS primitive, whose motivation and properties we describe in section III-C. Then, in section III-D, we describe our VKD construction using the oZKS, which achieves properties identical to [16]. Even with improved efficiency, several practical problems remain, including: (1) allowing users to have multiple devices linked to their accounts, and multiple updates for the same user in an epoch (discussed in section III-E), and (2) introducing a separate, efficient storage layer for a scalable VKD implementation (discussed in section III-F).

B. VKD Definition

The primitive verifiable key directory (VKD) was first defined by Chase et al. [16]. As stated before, the VKD server holds a directory mapping labels to values, with one state per epoch. The commitment to this state (and the proof of correct update) is published by the server, using VKD.Publish. When users want to look up the public key for a particular label, they request this by calling VKD.Query. The response to a lookup comes with a proof of correctness, which the requesting user can verify (with VKD.VerifyQuery). Each user can also check the history of her own key, implicitly, for her label, getting a mapping t → val_t of the state of the associated value at every epoch. The user can get this history and its proof (VKD.KeyHistory), as well as verify it (VKD.VerifyHistory). Any party can verify the output of a publish (VKD.VerifyUpd) or a sequence of publish operations (VKD.Audit).
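As a reference for the algorithm names just listed, the following Rust trait sketches one possible shape of the VKD interface; the types are placeholders and the signatures are simplified relative to the open-source library [5].

```rust
pub type Epoch = u64;
pub type Label = Vec<u8>;  // e.g. a username
pub type Value = Vec<u8>;  // e.g. an encoded set of public keys

pub struct LookupProof(pub Vec<u8>);
pub struct HistoryProof(pub Vec<u8>);
pub struct UpdateProof(pub Vec<u8>);

/// Simplified sketch of the VKD algorithms named in the text.
pub trait Vkd {
    type Error;

    /// Server: apply a batch of updates and publish the new epoch.
    fn publish(&mut self, updates: Vec<(Label, Value)>)
        -> Result<(Epoch, UpdateProof), Self::Error>;

    /// Client: look up the current value bound to a label, with proof.
    fn query(&self, label: &Label) -> Result<(Value, LookupProof), Self::Error>;
    fn verify_query(&self, label: &Label, value: &Value, proof: &LookupProof) -> bool;

    /// Client: retrieve and verify the history of one's own label.
    fn key_history(&self, label: &Label)
        -> Result<(Vec<(Epoch, Value)>, HistoryProof), Self::Error>;
    fn verify_history(&self, label: &Label, history: &[(Epoch, Value)], proof: &HistoryProof) -> bool;

    /// Auditors: verify a single update or a sequence of updates.
    fn verify_upd(&self, epoch: Epoch, proof: &UpdateProof) -> bool;
    fn audit(&self, start: Epoch, end: Epoch, proofs: &[UpdateProof]) -> bool;
}
```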
Soundness and privacy of a VKD. The soundness definition for the VKD primitive is exactly the same as that of [16]. This definition captures the non-equivocation property stated in section II-A, assuming that the server never deletes any existing records. We also define the privacy of a VKD the same way as [16]: all operations are zero-knowledge with a well-defined leakage function.

C. Ordered (Append-Only) Zero-Knowledge Set with Deletion

As a step towards a more space-efficient implementation of a VKD than that of SEEMless, we replace their aZKS building block with a primitive which we call an ordered append-only zero-knowledge set (oZKS). Here, we discuss the properties of the oZKS and briefly describe our implementation.

An oZKS is actually a further generalization of the append-only zero-knowledge set (aZKS) primitive, which was first introduced in [16]. [6] presents an implementation (and corresponding informal definition) of an oZKS construction. The oZKS primitive is used in a setting where a party (often called a server) holds a data store of label-value pairs (with unique labels), and is trusted for privacy, but not to serve consistent views of label-value pairs. The party uses the oZKS to commit to label-value pairs where the labels are all unique. Initial work on zero-knowledge sets (e.g. [17], [35], [15]) committed to static data stores. The aZKS primitive extended this to include insertions only. The recent implementation of an append-only oZKS [6] extends the aZKS to include a strict ordering on when elements are inserted. Thus, an oZKS should support verifiable algorithms for: initially committing to a datastore (oZKS.CommitDS), insertions (oZKS.InsertionToDS), and membership/non-membership queries (oZKS.QueryMem and oZKS.QueryNonMem). We use the oZKS primitive to build our VKD construction in section III-D.

Comparing with the oZKS from [6]. Note that the append-only property of the implementations of both [6]'s oZKS and [16]'s aZKS requires an ever-growing storage requirement on the server. In large-scale practical applications, never purging obsolete data can be infeasible. The novelty of the oZKS construction in our work consists specifically of the mechanism for enabling secure deletions. We present this as a middle-ground solution between fully mutable auditable data structures that require users to be always online (e.g. CONIKS), and append-only auditable data structures that lead to an unreasonable server storage cost for long-running systems (e.g. prior and concurrent work on oZKS and SEEMless).

[18] also formalizes the protocol presented in the implementation [6] and extends it to provide post-compromise security. Their construction is still append-only and does not support secure deletion. So, our contribution is orthogonal to that of [18].

We introduce the secure deletion extension here for completeness and apply it in section IV. For verifiability, the deletion operation of an oZKS with compaction is actually a two-step process: marking nodes as candidates for deletion (a process we call tombstoning, through oZKS.TombstoneElts) and deletion of tombstoned elements (oZKS.DeleteElts). In appendix B, we provide a formal definition of all of these algorithms, and in appendix C, we describe our oZKS construction.
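The trait below sketches these oZKS algorithms, including the two-phase tombstone and deletion operations introduced in this work; the type names and signatures are illustrative, not those of the published implementation.

```rust
pub type Epoch = u64;
pub type Label = [u8; 32];  // e.g. a VRF output
pub type Value = Vec<u8>;

pub struct Root(pub [u8; 32]);
pub struct MembershipProof(pub Vec<u8>);
pub struct NonMembershipProof(pub Vec<u8>);
pub struct AppendOnlyProof(pub Vec<u8>);
pub struct TombstoneProof(pub Vec<u8>);
pub struct DeletionProof(pub Vec<u8>);

/// Sketch of an oZKS extended with two-phase compaction.
pub trait Ozks {
    type Error;

    fn commit_ds(&mut self, entries: Vec<(Label, Value)>) -> Result<Root, Self::Error>;
    fn insertion_to_ds(&mut self, entries: Vec<(Label, Value)>)
        -> Result<(Root, AppendOnlyProof), Self::Error>;
    fn query_mem(&self, label: &Label)
        -> Result<(Value, Epoch, MembershipProof), Self::Error>;
    fn query_non_mem(&self, label: &Label) -> Result<NonMembershipProof, Self::Error>;

    /// Phase 1: mark old-enough labels as candidates for deletion.
    fn tombstone_elts(&mut self, labels: Vec<Label>)
        -> Result<(Root, TombstoneProof), Self::Error>;
    /// Phase 2: delete only labels that were previously tombstoned.
    fn delete_elts(&mut self, labels: Vec<Label>)
        -> Result<(Root, DeletionProof), Self::Error>;
}
```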
Our oZKS implementation. Similarly to [16]'s aZKS, our oZKS instantiation uses a Merkle Patricia Trie (MPT) to commit to label-value pairs. However, instead of using a complicated persistent data structure, which requires storing all states of every MPT node, for the oZKS data structure we simply include the epoch a leaf was inserted as part of the value committed for a node. This allows us to store one MPT which mutates over time, and old states of nodes in this tree can be garbage collected, instead of persisting forever. Also,
note that for privacy, the MPT-based oZKS implementation computes leaf labels using a verifiable random function (VRF) (defined in appendix C), which is a deterministic function computable only by the holder of a secret key for a publicly known public key PK. Any party can, however, check the correct computation of a VRF using PK. For privacy of the actual value associated with a label, we use a hiding commitment scheme. Concretely, this means that to add label with value val to the datastore and update the corresponding commitment at epoch t, the owner of the datastore adds a leaf with label VRF(label) and value (com(val), t) to the MPT. Figure 1 shows an example of the MPT-based aZKS used by [16] as it evolves through various insertions. Figure 2 shows an example of our oZKS construction, with the same leaves being inserted as in Figure 1.

Note that this constitutes only a simple oZKS, without the tombstone or deletion algorithms. We discuss tombstoning and deletion in section IV.

Soundness and privacy. The oZKS without the TombstoneElts and DeleteElts algorithms is said to be sound if, given at least one honest auditor in every epoch, the server cannot delete any elements which were previously committed, i.e., the oZKS without tombstones or deletions is append-only. At first glance, the terms "append-only" and "with deletion" for an oZKS may seem like contradictions. What we intend to capture is a mechanism to commit to a set, in an append-only manner, but with the additional ability to remove very old values which may no longer be needed by the calling application. Hence, the soundness property of this data structure, defined in appendix B, is that values may only be marked for deletion if they are older than an epoch permitted by a system parameter. Other than the fixed epochs for tombstoning old-enough nodes, and correspondingly for deleting nodes marked as tombstoned, the data structure should only allow insertions for labels not already present in the datastore. As in the privacy definition for the aZKS of [16], we require all functions for the oZKS to be zero-knowledge, with a well-defined leakage function.

Leakage for our implementation. Our oZKS construction, when committing to an initial data store (oZKS.CommitDS), leaks the size of the datastore. oZKS.InsertionToDS also leaks the size of the datastore before and after the update. For each inserted element, the adversary also learns whether it queried for this element before, and if it did, this tells the adversary when this element was added. oZKS.TombstoneElts and oZKS.DeleteElts leak when the tombstoned (resp. deleted) elements were inserted and the size of the datastore before and after each call. The responses to oZKS.QueryMem and oZKS.QueryNonMem, in addition to the actual response, also leak the size of the datastore, and for oZKS.QueryMem, when the queried element was added.

D. Revisiting the SEEMless VKD Construction

In this section, we construct a VKD (without compaction) using an oZKS instead of an aZKS, and also summarize the concrete implementation strategy for our VKD implementation. This replacement of an aZKS with an oZKS results in functionality which is equivalent to that of SEEMless [16], with significant space savings. The formal description of this construction is in appendix E.

Concrete implementation. As mentioned in section III-C, our concrete implementation of the oZKS consists of a Merkle Patricia Trie (MPT), used to commit to a leaf's value and bind it to the epoch in which it was inserted. This MPT-based oZKS is an important component of our VKD construction. In addition to this oZKS, the VKD requires a database to store actual username to value mappings, as well as when each value was added. Suppose a user with username Alice first joined the system at a time t with public key value val_1. Thus, val_1 is the first version of Alice's public key, and the server adds the label 'Alice|1' with value val_1 to the oZKS at epoch t. If at a later epoch t', Alice's key is updated from version i to i+1 with new value val', the server inserts the labels 'Alice|i|stale' with value equal to the empty string, and 'Alice|(i+1)' with value equal to val' to the oZKS. At a lower level, this means that at epoch t', the server adds to the MPT the leaves whose labels are 'VRF(Alice|(i+1))' and 'VRF(Alice|i|stale)', with values (com(val'), t') and (com(ε), t'), respectively.
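To make this label and value encoding concrete, the runnable Rust sketch below shows the two MPT leaves created when a user's key moves from version i to i+1. The fake_vrf and fake_commit helpers are illustrative stand-ins (plain hashes), not the VRF or commitment scheme used by Parakeet.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type Epoch = u64;

/// Placeholder for VRF evaluation; a real implementation would use a VRF keypair.
fn fake_vrf(label: &str) -> u64 {
    let mut h = DefaultHasher::new();
    label.hash(&mut h);
    h.finish()
}

/// Placeholder for a hiding commitment over the value bytes.
fn fake_commit(value: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

/// The two leaves inserted at epoch t' when `uname` moves from version i
/// to i+1: one marks version i stale, the other carries the new value.
fn leaves_for_update(uname: &str, i: u64, new_val: &[u8], t_prime: Epoch) -> Vec<(u64, (u64, Epoch))> {
    vec![
        (fake_vrf(&format!("{uname}|{}|stale", i)), (fake_commit(b""), t_prime)),
        (fake_vrf(&format!("{uname}|{}", i + 1)), (fake_commit(new_val), t_prime)),
    ]
}

fn main() {
    // Alice updates from version 1 to 2 at epoch 42.
    for (label, (com, epoch)) in leaves_for_update("Alice", 1, b"pk2", 42) {
        println!("leaf label={label:x} value=({com:x}, {epoch})");
    }
}
```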
Soundness and privacy. The soundness definition of this VKD construction is identical to that of [16]. At a high level, as long as, for all epochs up to epoch t, (1) a client with label label checks the states of its key using the VKD.VerifyHistory algorithm, (2) the WitnessAPI is honest, and (3) at least one honest auditing party verifies each update using VKD.Audit, then the identity provider could not have output a diverging view of the val associated with label at any epoch less than t. In other words, in the presence of auditing and witnesses, Alice and Bob must always agree on the view of val_Alice, the value associated with the label Alice. The leakage functions of this construction match those of [16]'s construction exactly.

Space efficiency improvements. The aZKS construction of [16] allows reconstructing the state of verification data for a data store at any time, by storing all intermediate states ever generated. Both our work and [16] implement a compressed version of a Merkle Patricia Trie, where, for a random set of leaves, the expected depth of a leaf in a tree with n leaves is log(n). Thus, if a new node is added to the MPT of [16], this results in adding O(log(n)) new states to persistent storage. Even with batching, this means that the space complexity of their implementation depends on the number of epochs, in addition to the number of leaves added. However, the space complexity of our oZKS implementation depends only on the total number of leaves added. The impact of this difference is quite significant, as shown in Section VI.

E. Other Practical Considerations

In addition to the compaction extension, we discuss the following additional considerations for practical deployments of key transparency for messaging applications.
[Figure 1 shows three tree snapshots, with panel captions: "Epoch 0"; "Epoch 1: After inserting label-value pairs (000, v_000) and (110, v_110)"; "Epoch 2: After inserting label-value pair (011, v_011)".]
Fig. 1: An example of the evolution of the SEEMless Merkle Patricia Trie-based aZKS construction with 3-bit labels. The aZKS starts with no entries at epoch 0, which is committed in the tree with a single node, with label ε and commitment H(ε). At the first epoch, two new leaves are inserted and at epoch 2, a third leaf is inserted. Note that the values inserted in this tree correspond to the entries inserted in Figure 2. However, in contrast to the oZKS construction, the tree at epoch 2 as is cannot be used to reconstruct any of its previous states. SEEMless optimizes the ability to get values from previous states using storage compression.

[Figure 2 shows the same three snapshots for our construction, with panel captions: "Epoch 0"; "Epoch 1: After inserting label-value pairs (000, v_000) and (110, v_110)"; "Epoch 2: After inserting label-value pair (011, v_011)"; here each leaf commitment also binds the insertion epoch, e.g. H(v_110, 1).]
Fig. 2: An example of the evolution of our Merkle Patricia Trie-based oZKS construction with 3-bit labels. The oZKS starts with no entries at epoch 0, which is committed in the tree with a single node, with label ε and commitment H(ε). At the first epoch, two new leaves are inserted and at epoch 2, a third leaf is inserted. Note that the tree at epoch 2 can be used to reconstruct any of its previous states.

User interfaces. In this work, we assume that the server is trusted for bootstrapping users. For example, if a user loses their phone, there is a mechanism for them to recover their account. We also assume that the client application has interfaces to inform users about various events, including: if witnesses are offline for extended periods of time, if a proof failed to verify, or if a public key they requested to add to their account did not take effect within a specified time period. [38] discusses some of the design considerations for a user-facing transparency system. We leave further study to future work.

Multiple devices per user. A common property of most mainstream messaging applications is the ability to associate one user account with multiple devices owned by the user (e.g. computer, mobile phone), each with a distinct E2EE keypair. This means that a VKD must support a set of public keys belonging to a user, as opposed to just a single key. This simply involves using the set of public keys as the values in the VKD, ensuring that validating client queries involves a membership check instead of an exact match of public key.

Multiple updates per epoch. Depending on the length of the epochs, clients may submit multiple updates to their public key values within the span of a single epoch. Only publishing the latest key received for a user within an epoch can lead to issues with consistency for clients which may have been messaged within the fast key updates. Instead, when sequencing these updates, the server can designate the ordering by adding an ordered list of public keys as opposed to a single key per epoch. Then, clients can check for list membership when querying for proofs, and the server can use the ordering to ascertain the latest key after the next epoch begins.

Together, these adjustments to client public keys result in an ordered list (multiple updates per epoch) of sets (multiple devices) of public keys being hashed in our VKD construction.
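As an illustration of this combined encoding, the sketch below models a user's VKD value as an ordered list of device-key sets and hashes it; the VkdValue type, the separator byte, and hash_value are hypothetical choices for exposition, not the encoding used in the open-source library.

```rust
use std::collections::BTreeSet;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A device public key, abbreviated here as raw bytes.
type PublicKey = Vec<u8>;

/// The value bound to a username: an ordered list of updates submitted
/// within the epoch, each update being the set of current device keys.
type VkdValue = Vec<BTreeSet<PublicKey>>;

/// Illustrative stand-in for the commitment/hash over the encoded value.
fn hash_value(value: &VkdValue) -> u64 {
    let mut h = DefaultHasher::new();
    for update in value {
        for key in update {
            key.hash(&mut h);
        }
        0xFFu8.hash(&mut h); // separator between updates preserves the ordering
    }
    h.finish()
}

fn main() {
    let mut first: BTreeSet<PublicKey> = BTreeSet::new();
    first.insert(b"phone-key".to_vec());
    let mut second = first.clone();
    second.insert(b"laptop-key".to_vec()); // a second update in the same epoch
    let value: VkdValue = vec![first, second];
    println!("committed value digest: {:x}", hash_value(&value));
}
```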
F. Storage API

As discussed in Section I, a large-scale VKD implementation requires a separate storage solution, which we call StorageAPI.

Warmup. As a first attempt at defining StorageAPI, we may want only two operations on top of a simple database:
• val/⊥ ← StorageAPI.GetFromStorage(key): This call takes as input a key (key). If the key is in the database, it returns the associated value (val). Otherwise, it outputs ⊥.
• 1/⊥ ← StorageAPI.SetToStorage(key, val): This call takes as inputs a key key and a corresponding value val. It outputs 1 if the value is successfully stored in the database and ⊥ otherwise.

We could use this simple API for all data types by defining the storage keys for each type as an encoded binary vector with a prefix to demarcate types. The implementer of StorageAPI
determines the underlying storage layer to handle each type. This also maintains unique keys for individual records without changing the storage key type signature.

Handling storage latency. Of course, in a practical implementation, storage will have to consist of a set of geographically spread-out, duplicated nodes. This means that one of the biggest bottlenecks in timely updates of the server's verifiable data structures, as well as in client queries, is memory latency. For large-scale end-to-end encrypted messaging identity providers, what may not be a problem, however, is bandwidth. Hence, we add batched storage APIs that leverage the high bandwidth to reduce latency. We define these APIs as follows (sketched in code after the list):
• {val_i}_i ← StorageAPI.BatchGetFromStorage({key_i}_i): This call should take as input a set of keys key_i and return the values val_i stored in the database associated with the corresponding keys. If key_i is not in the database, val_i = ⊥.
• 1/⊥ ← StorageAPI.BatchSetToStorage({(key_i, val_i)}_i): This call takes as inputs a set of key-value pairs (key_i, val_i), and outputs 1 if all values are successfully set and ⊥ if any errors occur. Note that this implicitly implies atomicity of a single BatchSetToStorage operation.
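The following Rust trait sketches one way the single-record and batched calls described above could be combined into a single StorageAPI abstraction; the names and signatures are illustrative and simplified relative to the published library.

```rust
/// Sketch of the StorageAPI. Key prefixes demarcating record types,
/// the error type, and method names are illustrative.
pub type StorageKey = Vec<u8>;
pub type StorageValue = Vec<u8>;

pub trait StorageApi {
    type Error;

    /// Single-record operations (the "warmup" API).
    fn get_from_storage(&self, key: &StorageKey)
        -> Result<Option<StorageValue>, Self::Error>;
    fn set_to_storage(&mut self, key: StorageKey, val: StorageValue)
        -> Result<(), Self::Error>;

    /// Batched read added to trade bandwidth for latency; returns one
    /// Option per requested key, None when the key is absent.
    fn batch_get_from_storage(&self, keys: &[StorageKey])
        -> Result<Vec<Option<StorageValue>>, Self::Error>;

    /// Batched write; must behave atomically, writing all pairs or none.
    fn batch_set_to_storage(&mut self, pairs: Vec<(StorageKey, StorageValue)>)
        -> Result<(), Self::Error>;
}
```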
Batching-friendly oZKS. Note that batching storage operations is not enough on its own, unless the algorithms we implement are also amenable to batched memory accesses. In particular, underlying our implementation of an oZKS is a compressed Merkle Patricia Trie, whose construction satisfies the following invariant: each node is the longest common prefix of its children. This implies that, for example, just looking at the label for a leaf being inserted does not automatically allow the server to know which nodes (and corresponding key_i's) to use with BatchGetFromStorage. To solve this problem, our implementation starts by assuming that the storage solution has caching capabilities. Under the assumption of a large enough cache, if we can preload nodes for a batched operation, say, VKD.Publish, the operation itself can be done as if in RAM. Later, the cache can be flushed to persistent memory as a single transaction operation. oZKS operations requiring tree traversals are called with a leaf label as input; batched versions of these operations are called with a set of leaf labels, leaves. To optimize remote persistent storage accesses, at a high level, we implement procedures which operate in two steps: (1) compute a set prefixes of all prefixes of values in leaves, (2) starting at the tree's root node, begin a breadth-first search for nodes with labels in the set prefixes, batch-fetching labels at the same depth. This ensures that all of the nodes required to run algorithms for the labels in leaves will be loaded at the end of the preload operation, with only one access to persistent storage per layer of the Merkle Patricia Trie, without overwhelming the cache by having to load the entire tree. In fact, the set of nodes retrieved to cache is exactly the set required for the original oZKS batched operation.

So far, we have omitted any discussion of batching for set (write) storage operations. This is easier, since batching writes is equivalent to flushing an in-memory cache of the updated nodes, an operation our storage interface requires. The only writing operation is VKD.Publish, and once it has completed its changes, we can commit all of the changes in the transaction cache as a single atomic operation (assuming the storage layer supports this). Since all of the modified data will be in the cache, flushing the cache in a timely manner is all that is needed to ensure up-to-date views of storage.

Other storage considerations. Even with our cache-based solution, there is an underlying assumption that the storage layer can support a cache-sized atomic write operation. For very large systems, we would like to further loosen this requirement, by adding support for the situation in which the storage layer cannot atomically (or at least efficiently) commit the entire transaction of changes in a single operation. We make some simple modifications to the data-structure layer to prevent inconsistencies in read operations during an update. All proof generation operations use an epoch value stored in the oZKS data structure to determine the latest epoch, denoted LatestEp. LatestEp must be updated last in order to ensure that the data corresponding to LatestEp has been written for all parts of the data structure, before any reading is permitted. Once the oZKS's latest epoch is updated, all operations will take the new epoch as truth and access consistent values.

With this in mind, we need to change the stored record for a tree node to store two values at a time: the previous and the current value. For atomicity, this means storing a dual-value struct, each value with an associated epoch. Read and write operations read the epoch stored at these values to determine which to designate as current and which as previous. The read operation always reads the value whose epoch is less than or equal to LatestEp, and the write operation overwrites values whose epoch is less than LatestEp. This allows us to update tree nodes without impacting proof generation operations.
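A minimal sketch of this dual-value record, under the assumption that node values are opaque byte strings, is shown below; the struct and method names are illustrative.

```rust
type Epoch = u64;
type NodeValue = Vec<u8>;

/// A tree-node record holding two versions of its value, each tagged
/// with the epoch it was written at.
struct DualValueNode {
    slots: [(Epoch, NodeValue); 2],
}

impl DualValueNode {
    /// Read the newest value whose epoch is <= the published LatestEp,
    /// so in-progress writes for a future epoch are never observed.
    fn read(&self, latest_ep: Epoch) -> Option<&NodeValue> {
        self.slots
            .iter()
            .filter(|(e, _)| *e <= latest_ep)
            .max_by_key(|(e, _)| *e)
            .map(|(_, v)| v)
    }

    /// Overwrite the slot whose epoch is older than LatestEp, leaving
    /// the value that readers may still need untouched.
    fn write(&mut self, latest_ep: Epoch, new_epoch: Epoch, value: NodeValue) {
        let idx = if self.slots[0].0 < latest_ep { 0 } else { 1 };
        self.slots[idx] = (new_epoch, value);
    }
}

fn main() {
    let mut node = DualValueNode { slots: [(1, b"v1".to_vec()), (2, b"v2".to_vec())] };
    node.write(2, 3, b"v3".to_vec()); // prepare epoch 3 while epoch 2 is live
    assert_eq!(node.read(2), Some(&b"v2".to_vec()));
    assert_eq!(node.read(3), Some(&b"v3".to_vec()));
    println!("dual-value reads behave as expected");
}
```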
IV. VKD WITH COMPACTION

Previous constructions, such as [16] and ([43], [34]), seem to fall into two extreme categories: (1) works that assume the server's storage to be linear in the number of updates, and (2) works that assume users are either always online or can retroactively check the server's view of their keys for any epochs they may have missed. This means that the client has to do linear work in the number of server epochs to monitor their keys. Also, at production scale, we must account for the storage implications of being able to serve users who come online infrequently; e.g., in case (2), the server would need to store a large trove of history, perhaps even all of its data.

Even the soundness of our oZKS-based VKD construction depends on the append-only property of the oZKS. In our implementation, we instantiated the oZKS described above using the Merkle Patricia Trie-based SA, as in [16]. For each version added by the user, the server adds a label of the form (uname|i) to the MPT, and for each version retired by the user, it adds a label of the form (uname|i|stale). So, the space complexity of st for our VKD grows linearly in the total number of updates, i.e., |st| ∈ O(Σ_{t=0}^{n} |S_t|), where S_t is the input to the t-th call to VKD.Publish. This poses two problems:
• the labels committed for a given user may never be deleted, so it becomes impossible for a user to request that all data associated with their account be deleted. This is (arguably) in conflict with "right to be forgotten" regulations, such as those laid out in the GDPR [4].
• eventually, traversing a tree which is monotonically growing leads to unreasonable blowups in proof sizes, as well as in the time taken to respond to queries and make server updates.

This motivates the need for a process of secure (and transparent) compaction of storage on the VKD server. Naturally, the best way to do this would be to delete very old data stored on the VKD server, should it no longer be useful.

Overview of our paradigm. To address the issue of ever-growing storage, we extend the VKD functionality to a VKD with compaction, denoted cVKD. The compaction consists of two phases: the tombstone phase, a special epoch when some of the server's data is marked for garbage collection, followed, after a period of time, by the compaction phase, a special epoch where values marked as tombstoned are garbage collected, i.e. deleted. All other epochs are expected to function as before, i.e., internal data structures remain append-only. The auditors contribute global correctness checks to the tombstone and compaction processes by verifying that only data which is "old enough" is tombstoned (VKD.VerifyTombstone), that only tombstoned values are garbage collected (VKD.VerifyCompact), and that in all epochs which are not used for tombstoning or deletion, the server updates its commitments correctly, as before. To ensure that values are properly garbage collected, we require that users monitor their own key history after each tombstone phase, prior to the following compaction phase. Through VKD.KeyHistory, the user checks that any deletions were correct.

Soundness of a VKD with compaction. We provide a formal definition of a cVKD in appendix B. The soundness definition for a cVKD is more general than a single tombstone or deletion epoch; rather, it refers loosely to the following intuition. We start with some requirements on the user for verifying her own key. Specifically, we require that between any consecutive tombstone and deletion, a user checks her key at least once. Then, "if a user Alice verifies her key according to the requirements and believes her own key at an epoch t was PK, then any other user must also believe that Alice's key at epoch t was PK". Note that Alice checking her keys between all "tombstone-deletion" epochs induces a mapping from "epoch t to the freshest public key and version number at t". For this mapping to be unambiguous, and strictly increasing, a version number cannot be deleted and reinserted without detection.

A. Construction

We extend the oZKS-based VKD construction from appendix E to include a compaction phase, to allow garbage collection and verification of data which is no longer needed. The following example and Figure 3 illustrate what kind of data the server may want to delete in order to compact its storage. Recall that when a user with username Alice first joins the system, with public key PK_1, the VKD adds the label Alice|1 with value PK_1 to the oZKS. When Alice updates their key from the first version PK_1 to PK_2, the server simultaneously adds the labels Alice|2 and Alice|1|stale with values PK_2 and ε to the oZKS. This pattern generalizes to any further updates by the user Alice. Suppose, after several years of joining, Alice is on their 10th key version, and has come online and checked their key history a few times in the interim. The data for Alice|1, PK_1 may not be useful, and the server may want to reduce storage costs by deleting such entries. We motivate our final cVKD construction with preliminary attempts to support compaction directly from a VKD.

Attempt 1. The most straightforward attempt at adding a compaction functionality to the VKD built using an oZKS involves simply allowing the deletion of arbitrary, "old enough" oZKS entries by introducing an oZKS.DeleteElts functionality. We define "old enough" via a system parameter called StaleParam, with the requirement that any entries deleted by the server must have been inserted at least StaleParam epochs ago. That is, if oZKS.DeleteElts is called in some epoch t, the only entries the auditors will verify as correctly deleted were added before the epoch t − StaleParam. In our MPT-based oZKS implementation, this is easy to support, since the epoch a leaf was added is included in the leaf's hash. From the auditors' point of view, as long as a deleted node is old enough, its deletion was valid. For privacy, the auditors should not access the plaintexts used to compute the VRF for the MPT.

What if certain users check their key history infrequently? Suppose a user Alice is at version 20 of their public key and version 10 was added a long time ago. The server could mount an attack where it deletes the label Alice|10|stale but not Alice|10. If the user Alice does not come online for a while, the server could serve the stale value at version 10 for Alice, in response to VKD.Query, then delete Alice|10. The server could do all of this before Alice comes online to check their public key. The only way to prevent this seems to be to store all states for a long time, or to have users be always online: both of which significantly degrade efficiency.

Attempt 2. We could try to patch the issues with the oZKS.DeleteElts functionality in the previous attempt by introducing a system parameter called DeletionEpochs such that an oZKS.DeleteElts proof only verifies if it is called in an epoch in the set DeletionEpochs, with the important invariant that the oZKS remains append-only at all non-deletion epochs. We could then rely on the assumption that a user comes online between any two consecutive elements of DeletionEpochs to run VKD.KeyHistory on their own label. Now, if a user Alice comes online, they can ensure that (1) the epochs for version numbers are correctly ordered, for example, if the label Alice|9 was inserted at epoch t_9 and Alice|10 was inserted at t_10, then t_9 < t_10, (2) versions are marked stale at the appropriate epoch, in the previous example, this means that Alice|9|stale was inserted at t_10, and (3) since, for an honest IdP, labels of the form Alice|version are always inserted before the corresponding label Alice|version|stale, they should ensure that
[Figure 3 shows a timeline of epochs 100, 200, ..., 1000: Alice|1 is inserted at epoch 100; Alice|2 and Alice|1|stale at epoch 200; Alice|10 and Alice|9|stale at epoch 1000.]
Fig. 3: An example of labels corresponding to user Alice's key updates. Since Alice has changed their key enough times since their initial entry into the system, and epochs before epoch 200 are considered old enough, Alice|1 and Alice|1|stale can be deleted.

if Alice|version|stale is deleted, then so is Alice|version.

However, an issue still remains: suppose the server has two consecutive deletion epochs del_1 and del_2, with n non-deletion epochs between them. Suppose Alice comes online at epoch del_1 + 1 and checks their key history, and then again at epoch del_2 + 1 checks their key history. Suppose at epoch del_1 + 1 their latest key's version number was 10, and the server then inserted 3 fake keys for them between epochs del_1 + 2 and del_2, i.e. their key's version right before the next deletion is fallaciously 13. As part of del_2, the server could delete the labels it added for Alice's versions 11 through 13 and the label Alice|10|stale. Such an attack could go undetected, unless Alice came online in exactly the epoch del_2 − 1. In the presence of network delays and users who are possibly offline for long periods of time, we consider this requirement too restrictive. Hence, we propose adding the restriction that StaleParam is at least as large as the space between two deletions. Meaning, auditors would not accept deletion of labels which were inserted more recently than the most recent deletion. In the above example, this would mean that StaleParam > del_2 − del_1, and any values inserted between del_1 and del_2 would remain for Alice to check when they come online at epoch del_2 + 1.

Even with the restriction that StaleParam > max(del_{i+1} − del_i), if the user does not immediately come online at each epoch del_i + 1, and her key is old enough, the server could temporarily roll back her key's version. Consider the following case: a user Alice updates their key very infrequently, and their most recent key, say, version 10, is old enough to fit the criteria for deletion. That is, Alice|10 was added at an epoch older than StaleParam. At the next deletion epoch del, the server could delete all record of their most recent key, i.e. the labels Alice|10 and Alice|9|stale, effectively rolling back their key to version 9. It could then re-insert the labels Alice|10 and Alice|9|stale before Alice checks their key history for this round of deletions. All of the checks we mentioned above would pass, and yet, for a period of time, if another user Bob queries for Alice, they would see a different version and value than what Alice will see later. This is a subtle technical issue, but preventing such an attack is integral for a construction to be considered sound!

The auditors could mitigate this issue by storing old labels and ensuring deleted labels were not reinserted. This requires auditors to be stateful and have linear storage complexity in the size of the updates. This would make the auditor's storage as large as the server's, which our construction avoids. We argue that the requirement that the user come online immediately after every deletion epoch is also too strong. In the next construction, we propose a design that fixes this issue.

Final construction. In our final construction, we patch this last issue by including a set of tombstone epochs, denoted TombstoneEpochs, in the public parameters. A tombstone epoch is an epoch when items are marked for deletion, i.e. tombstoned, but not actually deleted. This allows users to check their own key history and ensure that values marked for deletion are appropriate, before they are actually deleted.

For example, if the IdP wants to delete the label Alice|10 at a deletion epoch del, it would first have to set the value of Alice|10 to TOMBSTONE at the preceding tombstone epoch TombEp. Between TombEp and del, Alice can run KeyHistory and see if this is appropriate at the VKD level.

Meanwhile, at TombEp, the auditors ensure that the only oZKS elements which are modified at TombEp are old enough, according to the parameter StaleParam. At all epochs between TombEp and del, the auditors continue checking append-only proofs. At del, auditors check that the only oZKS elements which are deleted have the value TOMBSTONE.

The final construction requires a client to come online between deletion epochs and check the following set of conditions. If the minimum valid version number it receives is min and its current version number is current, then (these checks are sketched in code after the list):
• Correct tombstoning or deletion. For all versions version below min, it gets either (a) a non-membership proof for uname|version, or (b) membership proofs for uname|version and uname|version|stale, with either both of them or neither of them having the value TOMBSTONE.
• Correct ordering. For any version version in the range min to current − 1, that the epoch t_version when uname|version was added is less than the epoch t_{version+1} when uname|version+1 was added.
• Correct version changes. For any version version in the range min to current − 1, that uname|version+1 and uname|version|stale were added at the same epoch.
• Freshness of current value. For the current version current, it checks for the non-membership of uname|current|stale.
• Non-membership of next few entries. For any version ∈ [current + 1, 2^(⌊log(current)⌋+1)), uname|version wasn't added.
• Non-membership of much further entries. For L, the most recent server epoch, for any j ∈ [2^(⌊log(current)⌋+1), 2^(⌊log(L)⌋)], it checks a non-membership proof for uname|2^j.
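The sketch below gathers these client-side checks into one function. It assumes hypothetical helpers added_at and is_tombstoned that stand in for verified (non-)membership queries against the oZKS, and it omits the final marker-version checks over powers of two for brevity; it is an illustration of the checklist, not the library's verification code.

```rust
type Epoch = u64;

/// Sketch of the client-side key-history checks listed above.
/// `added_at` returns the insertion epoch of a label when present;
/// `is_tombstoned` reports whether a present label carries TOMBSTONE.
fn check_key_history(
    uname: &str,
    min: u64,
    current: u64,
    added_at: impl Fn(&str) -> Option<Epoch>,
    is_tombstoned: impl Fn(&str) -> bool,
) -> bool {
    // Correct tombstoning or deletion for all retired versions below `min`.
    for v in 1..min {
        let label = format!("{uname}|{v}");
        let stale = format!("{uname}|{v}|stale");
        if added_at(&label).is_none() {
            continue; // case (a): the version was fully removed
        }
        // case (b): both labels present, tombstoned together or not at all
        if added_at(&stale).is_none() || is_tombstoned(&label) != is_tombstoned(&stale) {
            return false;
        }
    }
    // Correct ordering and correct version changes for live versions.
    for v in min..current {
        let t_v = match added_at(&format!("{uname}|{v}")) {
            Some(t) => t,
            None => return false,
        };
        let t_next = match added_at(&format!("{uname}|{}", v + 1)) {
            Some(t) => t,
            None => return false,
        };
        let t_stale = added_at(&format!("{uname}|{v}|stale"));
        if t_v >= t_next || t_stale != Some(t_next) {
            return false;
        }
    }
    // Freshness of the current version.
    if added_at(&format!("{uname}|{current}|stale")).is_some() {
        return false;
    }
    // Non-membership of the next few versions, up to the next power of two.
    let next_pow = 1u64 << (64 - current.leading_zeros());
    for v in (current + 1)..next_pow {
        if added_at(&format!("{uname}|{v}")).is_some() {
            return false;
        }
    }
    // Marker checks against further powers of two are omitted in this sketch.
    true
}
```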
Soundness. We give a detailed proof for the soundness of this construction in appendix B. The intuition for soundness is as follows: for the most part, this construction with deletion is identical to the one in section III, since all but a designated set of epochs are append-only. The only divergence occurs at tombstone or deletion epochs, and we show that dishonest server behavior cannot go undetected, even in these epochs.
We require that a user must check their key history once for each tombstone epoch, prior to the following deletion. If a user's key is at version, the server can show the wrong key for a user in one of the following ways. The server could try to change the public key committed to the label version: this is mitigated by the auditors checking that the only change being made is tombstoning or deletion of already tombstoned values. If the server tombstones the label uname|version, the user will detect that it is tombstoned, since they will come online before the next deletion. The second option is that the server could show an older key, i.e. show a key for some version version' < version, but this should be detected by one of the key history checks. The third option is that it could try to add a higher version version' > version: this is already detected by the construction in section III. The fourth option is that it could try to re-insert an old version number, which was deleted, with a different key: this particular attack is unique to the scenario with deletions and is detected by the key history check, since if the server re-inserts an old version number, the user will see this the next time they come online, which will be before the server has the opportunity to hide this change.

Assumptions. Recall that previous works, as well as our construction in section III, make two major assumptions. First, they assume that the server's underlying data structure is append-only, i.e. the identity provider can never delete any verification data. Second, that users are not always online. The guarantee is that if a user checks their key history for a period of time, they will catch a cheating server. In fact, there is no upper bound on the time the user can wait before checking their key history while still getting the same guarantee. This construction changes these assumptions as follows:

1) We assume that the system only considers a value stale enough to tombstone if it was added before the previous tombstone. The ability to tombstone and delete values is already an improvement over previous, append-only verifiable data structures. We argue that the compromise of keeping data for a certain period of time is still a major improvement for long-term storage costs.
2) Instead of allowing the user to go for an unbounded amount of time before checking their key history, we require that they come online between each pair of tombstone and corresponding deletion epochs and check their key history to get guarantees for that period. Since large servers only need to run deletions infrequently, while more strict than the previous assumption, this still provides a user with leeway. In fact, this approach also has the added benefit that there is an upper bound on the amount of time the server can

Space efficiency. A compressed MPT is constructed with the invariant that any node with only one child is deleted and replaced by its child. This means that when a new leaf is inserted, somewhere in the tree an additional new node will have to be inserted to accommodate this leaf. When a leaf is deleted, its immediate parent can also be deleted. Hence, for each leaf deleted from this MPT-based VKD, the equivalent of 2 MPT nodes' worth of data can be erased from the server.

V. THE PARAKEET CONSISTENCY PROTOCOL

Existing transparency overlays typically fall into two categories:
• Reactive Client-side Auditing: These protocols [34] offload the consistency protocol to the clients. Clients are expected to gossip among themselves about their views of the system and construct fraud proofs showing that the IdP has misbehaved. Unfortunately, this class of protocols is too optimistic for a security-critical infrastructure and too heavy for the end user. As a result, there is a high risk that users do not eagerly gossip proofs or that the adversary launches targeted eclipse attacks [24] to isolate targeted users. Finally, gossip relies on always-connected nodes forming a connected graph, perpetually routing messages. This is impractical if many of the nodes in the gossip network are small devices with intermittent network connectivity that rely on battery power.
• Using a black-box blockchain as a trust anchor: This approach [41], [16] guarantees consistency as long as the blockchain is secure, i.e., it does not have forks. This alleviates any risks for clients, who only need to trust the black-box blockchain; however, (i) it introduces a significant extension of the trust assumptions, which might not be correct given the multitude of attacks blockchains suffer [11], (ii) it limits the update speed of the system to the finality speed of the blockchain (which is tens of minutes in Bitcoin), and (iii) it doesn't solve the problem of eclipse attacks against clients, as they remain a challenge in blockchains. Also, no existing protocol has provided a detailed security analysis for the interactions with any existing blockchain.

One way to avoid these issues is having auditors run consensus on each update of the IdP, essentially replacing the trust assumptions of the blockchain with a custom-made blockchain just for the transparency layer. However, this is simply overkill. As a final contribution, Parakeet shows a consensusless protocol that achieves all the desired properties for defending against split-view attacks. Section V-C further extends this protocol to provide censorship resistance and read-freshness.

A. Consensus-less Strong Consistency

The Parakeet protocol consists of network messages ex-
cheat for a particular user and go undetected. changed between the participants (Section II-A). The users
Leakage. The leakage of the subset of algorithms of this communicate with the identity provider (IdP) and witnesses.
cVKD construction, which are inherited from the defi- The IdP communicates with the witnesses and users, but the
nition of a VKD is identical for this construction. For witnesses need not communicate directly with each other.
cVKD.TombstoneElts and cVKD.Compact, they both leak Updating the state. All updates to the state of the IdP start
the size of the cVKD at the start of the operation. Also, with an update request sent by a user to the IdP. The request
cVKD.TombstoneElts leaks the number of tombstoned values contains a user identifier id and the new key pknew associated
and cVKD.Compact leaks the updated size of the cVKD. with id. The IdP collects several user requests and runs the

10
process update C. Anti-Censorship and Freshness Subprotocol
notification
2 We provide an optional enhancement of the consistency
witness 1 protocol presented in Section V providing censorship re-
witness 2 sistance from malicious IdP. This protocol allows users to
witness 3 tie the liveness of their update requests with the liveness
witness 4 of the entire system effectively defending against selective
censorship attacks. The protocol works as follows:
IdP • Users whose updates keep not appearing in the IdP’s state
1 3
make update assemble send their update request to the witnesses.
notification certificate • Every epoch witnesses collect these requests and forward
them to the IdP.
Fig. 4: Illustration of the key update protocol. • Once the witness has forwarded a user update request, it
key-update protocol illustrated in Figure 4. The IdP broadcasts stops signing any future IdP’s update notification that do not
an update notification to the witnesses to notify them that come with a proof that the user update has been included
the state of the key directory has been updated (Ê). This in the state.
notification contains the following fields: (i) a commitment As a result the moment the censored client has sent their
to its state comt , (ii) the epoch t, (iii) a proof π Upd that comt update request to f + 1 honest witnesses, the liveness of the
is a valid commitment and that the state is a valid update of the protocol is tied to the update being included. Hence the IdP
previous state, and (iv) a signature by the identity provider over can now only act as crashed and halt the whole system if it
this data. Each witness locally verifies that the epoch number t wants to censor the update (at which point no other requests
has been incremented by 1 and that the proof π Upd is valid; it can be processed).
then counter-signs the update notification (Ë). Finally, the IdP A second enhancement of the consistency protocol is that
collects at 2/3f +1 witnesses’ signatures into a certificate (Ì) during an update the witnesses send their signature over not
which is attesting that a quorum of witnesses verified each past only the hash of the tree but also their local timestamp. Then
update and the IdP could not have equivocated. clients are given the option to ask for 2f +1 timestamps during
Reading the state. The user sends a read request to the a read operation. Given that a median timestamp is robust to
IdP to request the key associated with a specific identifier. f faults and assuming bounded clock-drift clients can deduce
This request contains the identifier key to query. The identity the freshness of the state tree received as reply. They can thus
provider replies to read requests with a message containing the choose to ignore the reply if it is too old (e.g., more than a
following fields: (i) the key key associated with the identifier, day old).
(ii) a certificate from the witnesses over its latest update Given that now the updates are coupled to anti-censorship
request, (iii) a proof that key is included in the certified state, and clients are aware of the time the state is updated in
and (iv) a signature by the identity provider over this data. Parakeet, we guarantee that key updates are included in a
B. Proofs Sketches timely manner in the state and every client quickly receives
the new update (otherwise a malicious IdP is forced to halt
We provide the intuition for the proofs of consistency,
the whole system ).
validity, and termination of the protocol defined in Section II.
Consistency. We prove consistency by contradiction. Let’s VI. I MPLEMENTATION AND B ENCHMARKS
assume two correct users output different commitments comt We implement our VKD scheme presented in Section III in
and comt0 for the same epoch t. Then comt is signed by Rust. We also implement a networked multi-core eventually
2f + 1 witnesses out of which f + 1 are assumed honest. synchronous Parakeet based on our VKD. It uses tokio [7]
Similarly, comt0 is signed by 2f + 1 witnesses out of which for asynchronous networking, ed25519-dalek [8] for el-
f + 1 are honest and did not sign comt0 . But then there should liptic curve based signatures, and data-structures are persisted
be f + 1 + f + 1 honest and f malicious witnesses. But using Rocksdb [9] (unless otherwise specified). It uses TCP
n = 3f + 1 < 3f + 2, hence a contradiction. to achieve reliable point-to-point channels, necessary to cor-
Validity. Validity directly follows from the integrity property rectly implement the distributed system abstractions. We have
of the signature scheme used by the IdP and the soundness open-sourced both our VKD [5] and consistency protocol [5]
of the proof π Upd . Honest witnesses only counter-sign update implementations (which together form Parakeet), including
notifications if they are correctly signed by the IdP and the our Amazon Web Services (AWS) orchestration scripts and
proof π Upd verifies. As a result, there cannot exist a valid measurements data to enable reproducible results [10].
certificate over an invalid update. Benchmarks. We performed several benchmarks to thor-
Termination. Assuming all messages are eventually delivered oughly assess the practicality of Parakeet. To this end, we used
and there exist 2f +1 honest witnesses then the honest IdP will two kinds of AWS instances, both with up to 5Gbps of band-
send a single commitment for each epoch t which all honest width, running Linux Ubuntu server 20.04, running on 2.5GHz
witnesses will counter-sign and produce a valid certificate for. Intel Xeon Platinum 8259CL machines: (1) t3.medium

11
instances with 2 virtual CPUs (1 physical core) and 4GB Measure Mean (ms) Std. (ms)
memory. These machines are relatively cheap due to their Verify notification 68.70 1.50
low specs and we chose them to assess how performant the Create vote 0.02 0.00
Verify certificate 0.16 0.01
witnesses of Parakeet are. (2) t3.2xlarge instances with
8 virtual CPUs and 32GB of memory. These machines are TABLE I: Microbenchmark of single core CPU overhead of the consis-
used to benchmark the most heavyweight IdP operations, tency protocol operations of Parakeet on witness machines. Committee of
4 witnesses; notifications batch 1,024 updates of 64 bytes each; average and
namely publish. We implemented IdP and Witnesses in Rust standard dev. of 10 measurements.
and for signatures we used ed25519_dalek with 32-byte
public/private keys pairs. Measure Mean (ms) Std. (ms)

A. Microbenchmarks Create notification 47.12 9.30


Verify vote 0.07 0.00
We microbenchmark the overhead of the consistency proto- Aggregate certificate 0.00 0.00
col on the AWS t3.medium instance and the IdP capabilities TABLE II: Microbenchmark of witness CPU overhead of the consistency
of our implementation on the t3.2xlarge instances. We set protocol operations of Parakeet on the IdP machine; notifications batch 1,024
the committee size to 4 witnesses, the size of keys and values updates of 64 bytes each; average and standard dev. of 10 measurements.
to 32 bytes each (i.e. a key-value pair is 64 bytes), and a batch At this rate, we would expect roughly 850GB for 1 billion
size of 1,024 key-values per batch. In the tables below, every users – demonstrating the scalability of our solution.
measure displays the mean, standard deviation over 10 runs. Figure 6, shows the time it takes to run a publish operation
CPU analysis. Table II shows the results of microbenchmarks on Parakeet’s VKD. The storage layer of this experiment
for the following VKD operations: implements a cache and uses a MySQL database for persistent
storage. Each of these update operations, as well as the ones
1) Create notification: The IdP generates a batch of (random)
in fig. 5 required only one or two persistent storage accesses
public keys and publishes them. Then it generates an audit
each, and the storage accesses made up a majority of the time
(append-only) proof over this publish operation. The audit
it took for the publish operation to complete. For example,
proof along with the the new tree root and the sequence
for inserting 100k users on top of a VKD containing 500k
number are signed by the IdP. This message constitutes a
existing users, it took about 19 minutes, of which 12 were for
notification. Note that the result in table II only includes the
MySQL writes. We attempted to run similar benchmarks for
cost of the consistency and audit related operations, which
SEEMless. Our implementation of an aZKS included all the
we derive by subtracting the server-side publish cost from
same caching optimizations as the oZKS implementation we
the total cost of publishing and creating a notification.
used for Parakeet. With persistent memory, however, this data
2) Verify notification: A witness verifies a notification. This
structure became infeasible at fairly small scale. For example,
step consists of the verification of the IdP-generated signa-
inserting 100k new users into a set of 300k users and this
ture and the audit proof.
operation took ≈ 66 minutes. Of this, 51 were spent writing
3) Create vote: A witness signs the verified notification and
to persistent storage and about 7 were spent on reading from
creates a vote.
it. This took 47 persistent storage accesses. In appendix I, we
4) Verify vote: A witness verifies a vote.
show that as the number of epochs increases, for the same set
5) Aggregate certificate: A list of votes are combined to
of users, SEEMless sees a large storage size blowup, versus
form a certificate. The number of votes needed to create a
Parakeet, whose storage size remains constant.
certificate is 2f + 1 (see Section V). With a four-witness
setup, this means combining 3 votes. B. End-to-end benchmarks
6) Verify certificate: All 3 votes in a certificate are verified. We evaluate the throughput and latency of our implementa-
The slowest operations are the generation and signing of audit tion of Parakeet through experiments on AWS. We particularly
proofs, dispatching them to witnesses, Create notification, and aim to demonstrate the following claims:
the verification of the signature and audit proof presented in (C1) Parakeet achieves enough throughput to operate at plan-
Create notification, i.e. Verify notification. Similarly, verifica- etary scale.
tion of these two contributed to the second slowest operation (C2) Parakeet achieves low latency even under high load, in
Verify notification. Comparatively, remaining operations were the WAN, and with large committee sizes.
very fast – completed under one millisecond. (C3) Parakeet runs efficiently on cheap machines with low
Storage analysis. To examine the storage costs of Parakeet, specs (comparable to common HSMs).
as well as SEEMless, we inserted 100,000 users at a time (C4) Parakeet is robust when some parts of the system in-
to the respective VKDs with a MySQL-based storage layer, evitably crash-fail. Note that evaluating BFT protocols in the
running locally in a docker container. Figure 5 shows the presence of Byzantine faults is still an open question [12].
storage cost and its breakdown for Parakeet’s VKD, as the We deploy a testbed on AWS, using t3.medium instances
number of users increases. For 5M users, the VKD requires across 5 different AWS regions: N. Virginia (us-east-1), N.
about 4.5GB and this cost grows linearly in number of users. California (us-west-1), Sydney (ap-southeast-2), Stockholm

12
30.0
Total User Data
Parakeet Storage (GB)
4 10 witnesses (batch: 2^10)
Merkle Tree Data Tree Metadata 25.0 20 witnesses (batch: 2^10)
20.0 50 witnesses (batch: 2^10)

Latency (s)
3
15.0
2
10.0
1 5.0
0 0.0
0 10 20 30 40 50 0 200 400 600 800
Number of keys (100K) Throughput (updates/s)

Fig. 5: Memory consumption of Parakeet’s VKD. Measurements at intervals Fig. 7: Throughput-latency performance of Parakeet. WAN measurements with
of 100k new users and up to 4.5M users. Total memory consumption is the 10, 20, 50 witnesses. No faulty witnesses, 1024 maximum batch size, and 64B
sum of memory required by the Merkle Patricia Trie data, some metadata and updates size.
the original database of usernames and public keys. 30.0
10 witnesses (batch: 2^5) 10 witnesses (batch: 2^10)
25.0 10 witnesses (batch: 2^7) 10 witnesses (batch: 2^15)
20.0

Latency (s)
15
Time (mins)

15.0
10 10.0
5.0
5 Total Time to Insert 100K Users Database Write Time 0.0
Database Read Time 0 200 400 600 800 1k
0 Throughput (updates/s)
0 1 2 3 4 5
Number of keys (100K) Fig. 8: Throughput-latency performance of Parakeet. WAN measurements with
10 witnesses. No faulty witnesses, various batch sizes, and 64B updates size.
Fig. 6: Time to publish with insertion of a batch of 100k new users into
Parakeet’s VKD, with an existing VKD of sizes 0-500k. This graph also
includes the time for database reads and writes for each insertion.
make the network to become the system’s bottleneck.
Figure 8 illustrates the performance of Parakeet when vary-
(eu-north-1), and Tokyo (ap-northeast-1). Witnesses are dis- ing the batch size form 25 to 215 . The maximum throughput
tributed across those regions as equally as possible. we observe for batches sizes of 25 and 27 is respectively 100
In the following sections, each measurement in the graphs updates/s and 350 updates/s. This is much lower than the 800
is the average of 3 independent runs, and the error bars updates/s that Parakeet can achieve when configured with a
represent one standard deviation; errors bars are sometimes batch size over 210 . Small batch sizes, however, allow Parakeet
too small to be visible on the graphs. Our baseline experiment to trade throughput for latency. Parakeet configured with a
parameters are 10 honest witnesses, a batch size of 1024, batch size of 25 can process up to 100 updates/s in under
and an update size of 64B. We instantiate one benchmark 800ms, setting the batch size to 27 allows Parakeet to operate
client (collocated on the same machine as the IdP) submitting at scale while robustly maintaining sub-second latency.
transactions at a fixed rate for a duration of 5 minutes. When Benchmark under crash-faults. Figure 9 depicts the perfor-
referring to latency, we mean the time elapsed from when the mance of Parakeet when a committee of 10 witnesses suffers
client submits a request to when the IdP receives confirmation 1 to 3 (crash-)faults (the maximum that can be tolerated in this
that the request is successfully processed. We measure it by setting). It shows that Parakeet’s performance is not affected
tracking sample requests throughout the system. by (crash-)faults, thus satisfying claim (C4).
Benchmark in the common case. Figure 7 illustrates the Contrarily to BFT consensus systems [28], Parakeet main-
latency and throughput of Parakeet for varying numbers of tains a good level of throughput under crash-faults. The
witnesses. The maximum throughput we observe is around 800 underlying reason for steady performance under crash-faults
updates/s while keeping the latency below 3.5 seconds. Based is that Parakeet doesn’t rely on a leader to drive the protocol.
on the system usages estimates for the large-scale end-to-end This is in sharp contrast with related work (e.g. [16], [13],
encrypted messaging service WhatsApp (Section I), we would [41]) that rely on an external blockchain for consistency.
arrive at the requirement to process around 120 updates/s. VII. R ELATED W ORK
Parakeet exceeds by over 6x the throughput required to operate
at this scale, and thus satisfies claims (C1) and (C3). Key transparency. As discussed in Section I, this work
Figure 7 also illustrates that performance do not vary with extends SEEMless [16] by instantiating a light-weight con-
10, 20 or even 50 witnesses. This observation concurs with sistency protocol to prevent server equivocation in a practical
Section VI-A showing that the bottleneck of Parakeet is the setting, while also extending SEEMless to handle real-world
IdP. Increasing the number of geo-distributed witnesses up to constraints on storage capacity, efficiency, and scalability to
50 doesn’t impact the end-to-end performance of the system; billions of users. While Keybase ([23], [26]) was the first
Parakeet thus satisfies claim (C2). We however expect that deployment of an auditable public key directory (created as
keeping increasing the number of witnesses will eventually a user-friendly alternative for PGP), CONIKS [34], was the

13
30.0 to verify a series of key updates from linear to logarithmic,
10 witnesses (batch: 2^10)
25.0 10 witnesses (batch: 2^10) - 1 faulty assuming that key updates can be signed by clients through the
20.0 10 witnesses (batch: 2^10) - 3 faulty use of “signature chains”. However, it is unclear what integrity
Latency (s)

15.0 protection the system will provide if the signature keys of the
clients are lost. Assuming that the clients can maintain long
10.0
term cryptographic secret keys is unrealistic, especially in the
5.0
setting of key transparency, where the focus is on building
0.0
0 200 400 600 800 a PKI for clients who cannot remember cryptographic key
Throughput (updates/s) material. AAD [40], Aardvark [29], Tomescu et. al. [42], Tyagi
Fig. 9: Comparative throughput-latency under crash-faults of Parakeet. WAN
et. al. [43], and Verdict [44] have proposed using accumu-
measurements with 10 witnesses. Zero, one, and three crash-faults, 1024 lators (bilinear and RSA) and SNARKs as commitments in
maximum batch size, and 64B updates size. order to make auditor verification more efficient. However,
the computational overhead incurred from relying on the
first academic work that formalized the notion of a service algebraic assumptions themselves can outweigh the asymptotic
that maintains and periodically commits to a public key improvements over the number of key updates per epoch.
directory. SEEMless itself can be seen as an extension to
Atomic transactions. Atomic transactions [27] allow all-or-
CONIKS. CONIKS essentially relies on a synchronous gossip
none type of execution for a set of operations. In large systems,
protocol among users in order to detect server equivocation.
they are often a necessity since the underlying database can
Unfortunately, this assumption is both hard to scale efficiently
end up in an inconsistent state if operations are not sequen-
for millions of users and easily breakable over the internet.
tially executed (e.g., withdrawing money from account A and
Since then, multiple works have proposed different ways depositing it to account B). To this end, several solutions have
of publishing directory commitments and serving audit proofs been proposed in the literature—[37], [39], [20] to name a few.
to clients. EthIKS [13] and Catena [41] demonstrate how to Although they are a strong primitive for building concurrent
leverage a public blockchain (Ethereum and Bitcoin, resp.) as applications, transactions come with their cost; locks might
a ledger to provide non-equivocation guarantees and support leave the systems in a dead-locked state whereas failure in a
auditability for the commitment to the CONIKS directory, single operation can cancel a transaction and might require
while minimizing bandwidth overhead for its clients. In this re-execution of the whole set; or alternatively they might not
paper we show that the strong assumption of using consensus be supported cross-shard [19].
is unnecessary and instead propose a lightweight consistency
In VKD, we side-step such issues by (1) executing a publish
protocol. Mog [33] takes a different approach by proposing a
operation by a single writer and (2) preserving the previous
gossip protocol and a verifiable registry that allows individual
value of a node. (1) is needed to ensure that concurrent publish
clients to perform their own auditing. The gossip protocol still
operations not overwrite nodes’ updated states and (2) allows
requires synchrony, however, it relies on the assumption of
us to allow concurrent reads (e.g., lookup proof generation)
sleepy committees to solve the consensus problem. In theory,
and writes (i.e., publish operation). In result of these two
this would enable scalability but could hinder liveness for long
properties, we only require that the update order is preserved
periods when there are not enough available witnesses or when
only between the node updates and the latest epoch, i.e., we
the network is unstable. In this work, instead, we have shown
allow the nodes to be updated in any order.
that consensus is unnecessary for guaranteeing a consistent
The downside of this approach is that the storage cost
view of the tree to the clients which enables us to provide
is effectively doubled. Yet, we believe this is an acceptable
consistent views without relying on a good network. Finally,
trade-off due to drastic storage reduction compared to existing
KTACA [45] relies on clients to (1) gossip through out-of-
key transparency solutions such as SEEMless [16] and the
band channels to share consistent commitments to a directory,
flexibility to use any storage as the underlying key directory.
(2) run any global client audits. Our work could be extended
to using sleepy committees or gossip, however we opted to VIII. C ONCLUSION
use highly available witnesses that should be deployed by While much recent effort has focused on various aspects
professional services (e.g., professional blockchain validators) of key transparency, large-scale applications on the order of
and will provide timely security to the users. KTACA addi- billions of users have not previously been considered. We
tionally combines the gossip approach with anonymous key expose the gaps in purely academic scale implementations and
history checks and lookups to prevent an adversarial server bridge these gaps is our design of Parakeet, a key transparency
from causing targetted forked views of its directory. Allowing system with large-scale deployment in mind. Our production-
anonymous lookups, however, goes counter to our privacy grade implementation of Parakeet shows the feasibility of our
requirement, that only authenticated and permitted users be approach, which we further demonstrate through experiments.
permitted to query.
Several recent works on key transparency have focused on ACKNOWLEDGMENTS
improving auditor efficiency. Merkle2 [25] proposes a solution This work is supported by the Novi team at Meta. Harjasleen
which reduces the amount of work required for an auditor Malvai was funded in part by IC3 industry partners and NSF

14
grant 1943499. [25] Y. Hu, K. Hooshmand, H. Kalidhindi, S. J. Yang, and R. A. Popa,
“Merkle2 : A low-latency transparency log system,” in 2021 IEEE
R EFERENCES Symposium on Security and Privacy (SP). IEEE, 2021, pp. 285–303.
[26] Keybase, https://keybase.io/ /api/1.0/merkle/root.json?seqno=1, 2014.
[1] “How whatsapp enables multi-device capability,” https://engineering.fb.
com/2021/07/14/security/whatsapp-multi-device/, [Online; accessed 5- [27] B. W. Lampson, “Atomic transactions,” in Distributed
July-2022]. Systems—Architecture and Implementation. Springer, 1981, pp.
[2] “Most popular global mobile messenger apps as of January 2022, 246–265.
based on number of monthly active users,” https://www.statista. [28] H. Lee, J. Seibert, E. Hoque, C. Killian, and C. Nita-Rotaru, “Turret:
com/statistics/258749/most-popular-global-mobile-messenger-apps/, A platform for automated attack finding in unmodified distributed
[Online; accessed 5-July-2022]. system implementations,” in 2014 IEEE 34th International Conference
[3] “Whatsapp revenue and usage statistics (2022),” https: on Distributed Computing Systems. IEEE, 2014, pp. 660–669.
//www.businessofapps.com/data/whatsapp-statistics/, [Online; accessed [29] D. Leung, Y. Gilad, S. Gorbunov, L. Reyzin, and N. Zeldovich,
5-July-2022]. “Aardvark: A concurrent authenticated dictionary with short proofs,”
[4] https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX: Cryptology ePrint Archive, Report 2020/975, 2020, https://eprint.iacr.
32016R0679&from=EN#d1e2606-1-1, 2015. org/2020/975.
[5] https://github.com/facebook/akd, 2022.
[6] https://github.com/Microsoft/oZKS, 2022. [30] D. Malkhi and M. Reiter, “Byzantine quorum systems,” Distributed
[7] https://github.com/tokio-rs/tokio, 2022. computing, vol. 11, no. 4, pp. 203–213, 1998.
[8] https://github.com/dalek-cryptography/ed25519-dalek, 2022. [31] M. Marlinspike, “Advanced cryptographic ratcheting,” https://signal.org/
[9] https://rocksdb.org/, 2022. blog/advanced-ratcheting/, 2013.
[10] https://github.com/asonnino/key-transparency/tree/main/scripts, 2022. [32] ——, “Whatsapp’s signal protocol integration is now complete,” https:
[11] S. Bano, A. Sonnino, M. Al-Bassam, S. Azouvi, P. McCorry, S. Meik- //signal.org/blog/whatsapp-complete/, 2016.
lejohn, and G. Danezis, “Sok: Consensus in the age of blockchains,”
[33] S. Meiklejohn, P. Kalinnikov, C. S. Lin, M. Hutchinson, G. Belvin,
in Proceedings of the 1st ACM Conference on Advances in Financial
M. Raykova, and A. Cutter, “Think global, act local: Gossip and client
Technologies, 2019, pp. 183–198.
audits in verifiable data structures,” arXiv preprint arXiv:2011.04551,
[12] S. Bano, A. Sonnino, A. Chursin, D. Perelman, Z. Li, A. Ching, and
2020.
D. Malkhi, “Twins: Bft systems made robust,” in 25th International Con-
ference on Principles of Distributed Systems (OPODIS 2021). Schloss [34] M. S. Melara, A. Blankstein, J. Bonneau, E. W. Felten, and M. J.
Dagstuhl-Leibniz-Zentrum für Informatik, 2022. Freedman, “CONIKS: Bringing key transparency to end users,” in 24th
[13] J. Bonneau, “Ethiks: Using ethereum to audit a coniks key transparency USENIX Security Symposium (USENIX Security 15), 2015, pp. 383–398.
log,” in International Conference on Financial Cryptography and Data [35] S. Micali, M. Rabin, and J. Kilian, “Zero-knowledge sets,” in 44th
Security. Springer, 2016, pp. 95–105. Annual IEEE Symposium on Foundations of Computer Science, 2003.
[14] C. Cachin, R. Guerraoui, and L. Rodrigues, Introduction to reliable and Proceedings. IEEE, 2003, pp. 80–91.
secure distributed programming. Springer Science & Business Media, [36] K. Nikitin, E. Kokoris-Kogias, P. Jovanovic, N. Gailly, L. Gasser,
2011. I. Khoffi, J. Cappos, and B. Ford, “{CHAINIAC}: Proactive software-
[15] D. Catalano, D. Fiore, and M. Messina, “Zero-knowledge sets with update transparency via collectively signed skipchains and verified
short proofs,” in Annual International Conference on the Theory and builds,” in 26th {USENIX} Security Symposium ({USENIX} Security
Applications of Cryptographic Techniques. Springer, 2008, pp. 433– 17), 2017, pp. 1271–1287.
450.
[16] M. Chase, A. Deshpande, E. Ghosh, and H. Malvai, “Seemless: Secure [37] D. Peng and F. Dabek, “Large-scale incremental processing using
end-to-end encrypted messaging with less trust,” in Proceedings of distributed transactions and notifications,” in 9th USENIX Symposium
the 2019 ACM SIGSAC conference on computer and communications on Operating Systems Design and Implementation (OSDI 10), 2010.
security, 2019, pp. 1639–1656. [38] K. Pothong, L. Pschetz, R. Catlow, and S. Meiklejohn, “Problematising
[17] M. Chase, A. Healy, A. Lysyanskaya, T. Malkin, and L. Reyzin, transparency through LARP and deliberation,” in DIS ’21: Designing
“Mercurial commitments with applications to zero-knowledge sets,” in Interactive Systems Conference 2021, Virtual Event, USA, 28 June, July
Annual International Conference on the Theory and Applications of 2, 2021, W. Ju, L. Oehlberg, S. Follmer, S. E. Fox, and S. Kuznetsov,
Cryptographic Techniques. Springer, 2005, pp. 422–439. Eds. ACM, 2021, pp. 1682–1694.
[18] B. Chen, Y. Dodis, E. Ghosh, E. Goldin, B. Kesavan, A. Marcedone, and [39] Y. Sovran, R. Power, M. K. Aguilera, and J. Li, “Transactional storage
M. E. Mou, “Rotatable zero knowledge sets: Post compromise secure for geo-replicated systems,” in Proceedings of the Twenty-Third ACM
auditable dictionaries with application to key transparency,” Cryptology Symposium on Operating Systems Principles, 2011, pp. 385–400.
ePrint Archive, Paper 2022/1264, 2022, https://eprint.iacr.org/2022/1264.
[40] A. Tomescu, V. Bhupatiraju, D. Papadopoulos, C. Papamanthou,
[Online]. Available: https://eprint.iacr.org/2022/1264
N. Triandopoulos, and S. Devadas, “Transparency logs via append-only
[19] A. Cheng, X. Shi, L. Pan, A. Simpson, N. Wheaton, S. Lawande,
authenticated dictionaries,” in Proceedings of the 2019 ACM SIGSAC
N. Bronson, P. Bailis, N. Crooks, and I. Stoica, “Ramp-tao: layering
Conference on Computer and Communications Security, 2019, pp.
atomic transactions on facebook’s online tao data store,” Proceedings of
1299–1316.
the VLDB Endowment, vol. 14, no. 12, pp. 3014–3027, 2021.
[20] J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, [41] A. Tomescu and S. Devadas, “Catena: Efficient non-equivocation via
S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild et al., “Spanner: bitcoin,” in 2017 IEEE Symposium on Security and Privacy (SP). IEEE,
Google’s globally distributed database,” ACM Transactions on Computer 2017, pp. 393–409.
Systems (TOCS), vol. 31, no. 3, pp. 1–22, 2013. [42] A. Tomescu, Y. Xia, and Z. Newman, “Authenticated dictionaries with
[21] C. Dwork, N. Lynch, and L. Stockmeyer, “Consensus in the presence cross-incremental proof (dis)aggregation,” Cryptology ePrint Archive,
of partial synchrony,” Journal of the ACM (JACM), vol. 35, no. 2, pp. Report 2020/1239, 2020, https://eprint.iacr.org/2020/1239.
288–323, 1988. [43] N. Tyagi, B. Fisch, J. Bonneau, and S. Tessaro, “Client-auditable
[22] M. J. Fischer, N. A. Lynch, and M. S. Paterson, “Impossibility of verifiable registries,” Cryptology ePrint Archive, 2021.
distributed consensus with one faulty process,” Journal of the ACM
(JACM), vol. 32, no. 2, pp. 374–382, 1985. [44] I. Tzialla, A. Kothapalli, B. Parno, and S. Setty, “Transparency
[23] N. Group, https://keybase.io/docs-assets/blog/NCC Group Keybase dictionaries with succinct proofs of correct operation,” IACR Cryptol.
KB2018 Public Report 2019-02-27 v1.3.pdf, 2019. ePrint Arch., p. 1263, 2021. [Online]. Available: https://eprint.iacr.org/
[24] E. Heilman, A. Kendler, A. Zohar, and S. Goldberg, “Eclipse 2021/1263
attacks on bitcoin’s peer-to-peer network,” in USENIX 2015, [45] T. K. Yadav, D. Gosain, A. Herzberg, D. Zappala, and K. Seamons,
J. Jung and T. Holz, Eds. USENIX Association, 2015, “Automatic detection of fake key attacks in secure messaging,” in
pp. 129–144. [Online]. Available: https://www.usenix.org/conference/ Proceedings of the 2022 ACM SIGSAC Conference on Computer and
usenixsecurity15/technical-sessions/presentation/heilman Communications Security, 2022, pp. 3019–3032.

15
A PPENDIX proof πS that com0 is a commitment to DSt+1 such that
DSt ⊆ DSt+1 and each entry of DSt+1 \ DSt = {(label,
A. VKD with Compaction Glossary
val, t + 1) for some label and val}.
In order to discuss details about our approach to the • 0/1 ← oZKS.VerifyInsertions(com, com0 , πS , t): Verifies
compaction of server-side storage, we will need some notation. the proof πS that com, com0 commit to some datastores
We gather notation for our construction here: DS, DS0 respectively, such that DS ⊆ DS0 and that DS0 \
• T OMBSTONE : A special string to replace the value of an DS = {(label, val, t) for some label, val}.
oZKS entry which is valid to delete. • (π, val, t)/⊥ ← oZKS.QueryMem(st, DS, label): This al-
• TombstoneEpochs: The set of epochs when tombstone gorithm takes a datastore DS and a corresponding inter-
marking is permitted. No insertions should be allowed in nal state st, and a label label. If there exists an entry
this epoch, only tombstoning. (label, val, t) ∈ DS, then the algorithm returns (π, val, t)
• CompactionEpochs or DeletionEpochs: The set of epochs where t is the oZKS epoch when the tuple with the label
when entries with value T OMBSTONE are permitted to be label was inserted. If (label, ·, ·) ∈ / DS, return ⊥.
deleted from the oZKS. • 0/1 ← oZKS.VerifyMem(com, label, val, t, π): This algo-
• StaleParam: A system parameter for calculating the upper rithm takes in a triple label, val, t, the membership proof
bound for the epochs in which a node must have been π for (label, val, t) with respect to com. It outputs 1 if the
inserted in order to be valid for tombstoning. proof verifies and 0 otherwise.
• DeletionParam: A system parameter defining the number • π/⊥ ← oZKS.QueryNonMem(st, DS, label): This algo-
of epochs between a tombstone epoch and the subsequent rithm takes a datastore DS and a corresponding internal
compaction epoch. Assuming state st, and a label label. It returns π, a proof of non-
CompactionEpochs and TombstoneEpochs are sorted in in- membership of label in DS.
creasing order, CompactionEpochs = {ti + DeletionParam • 0/1 ← oZKS.VerifyNonMem(com, label, π): This algo-
| ti is the ith element of TombstoneEpochs}i . rithm takes in a label, com and a non-membership proof
Also note that we use the terms identity provider and server π. It outputs 1 if the proof verifies and 0, if not.
interchangeably. • (com0 , stcom0 , DSt+1 , πS , t + 1) ←
oZKS.TombstoneElts(stcom , DSt , S, t, tstale ): This
B. Formal Definitions algorithm should only be called if t ∈ TombstoneEpochs
Definition 1. Given a datastore, DS, i.e. a set of tuples (labeli , and tstale = t − StaleParam. It algorithm takes in the
vali ), DS.Labels, denotes the set {label | ∃(label, ·) ∈ DS}. current datastore DSt , the internal server state stcom , the
current server epoch t, a stale epoch parameter tstale and a
Definition 2. An ordered append-only zero-knowledge set set S of triples {(labeli , vali , ti )} for an update, such that
(with compaction) (oZKS) is a data-structure with a set of (labeli , vali , ti ) ∈ DSt . The algorithm checks that for each
public parameters p, where p includes a set TombstoneEpochs (labeli , vali , tj ) ∈ S, tj ≤ tstale . It initializes DSt+1 = DSt
of epochs, and parameters StaleParam, DeletionParam. An and for each (labeli , vali , tj ) ∈ S, it replaces the entry
oZKS supports the following algorithms (oZKS.CommitDS, (labeli , vali , tj ) of DSt+1 with (labeli , T OMBSTONE , tj ).
oZKS.QueryMem, oZKS.QueryNonMem, oZKS.VerifyMem, It returns this updated DSt+1 , com0 , the commitment to
oZKS.VerifyNonMem, oZKS.InsertionToDS, DSt+1 , stcom0 the updated internal state of the server and
oZKS.VerifyInsertions, oZKS.TombstoneElts, a proof πS that com0 is a commitment to the data store
oZKS.VerifyTombstone, oZKS.DeleteElts, oZKS.VerifyDel) DSt+1 such that (1) if (labeli , T OMBSTONE , tj ) ∈ DSt+1 ,
described below: then (labelj , valj , tj ) ∈ DSt , for some valj , with
• (com, stcom , DS1 ) ← oZKS.CommitDS(1λ , DS): This algo- tj ≤ tstale ; (2) if (labelk , valk , tk ) ∈ DSt with
rithm takes in a datastore DS consisting of label-value pairs tk > tstale , then (labelk , valk , tk ) ∈ DSt+1 and, (3)
(labeli , vali ) with unique keys, and a security parameter λ DSt+1 .Labels = DSt .Labels.
and outputs: stcom , an internal state; com, a commitment to • 0/1 ← oZKS.VerifyTombstone(com, com0 , πS , t, tstale ):
DS; DS1 = {(labeli , vali , 1) | (labeli , vali ) ∈ DS}, here 1 is Verifies the proof πS that com, com0 commit to some
the oZKS epoch, at which each tuple from DS was inserted datastores DS, DS0 respectively, such that DS0 .Labels =
and aux is any extra information needed for for this tuple. DS.Labels, and that for any label ∈ DS0 .Labels such
• (com0 , stcom0 , DSt+1 , πS , t + 1) ← that (label, val, tj ) ∈ DS and (label, val0 , tj ) ∈ DS0 , with
oZKS.InsertionToDS(stcom , DSt , S, t): This algorithm val 6= val0 , then tj ≤ tstale . If all these checks pass, and com
takes in the current datastore DSt , the internal server is the commitment to the epoch t and tstale = t−StaleParam,
state stcom , the current server epoch t and a set S then this algorithm outputs 1, otherwise it outputs 0.
of label-value pairs {(labeli , vali )} for an update, • (com0 , stcom0 , DSt+1 , πS , t + 1) ← oZKS.DeleteElts(stcom ,
such that (labeli , ·, ·) ∈ / DSi . The algorithm computes DSt , t): This algorithm is only run if t − DeletionParam ∈
S 0 = {(label, val, t + 1) | (label, val) ∈ S}. It returns TombstoneElts. If so, takes in the current datas-
DSt+1 = DSt ∪ S 0 , com0 , the commitment to DSt+1 , tore DSt , the internal server state stcom , the cur-
stcom0 the updated internal state of the server and a rent server epoch t and computes the set S =

16
{(label, ·, ·)|(label, T OMBSTONE , ·) ∈ DSt }. It returns
DSt+1 = DSt \ S 0 , com0 , the commitment to DSt+1 ,
stcom0 the updated internal state of the server and a Pr[(label, {ti , comi , ΠUpd
i , auxi }n−1
i=1 , comn , tstale ,
proof πS that com0 is a commitment to DSt+1 such that ΠVer Ver λ
1 , (val1 , Epoch1 ), Π2 , (val2 , Epoch2 )) ← A(1 , p) :
DSt+1 ⊆ DSt and each entry of DSt \ DSt+1 = {(label,
T OMBSTONE , ·)|label ∈ DSt .Labels}. {oZKS.VerifyUpd(comi , comi+1 , ΠUpd i , ti + 1, auxi )}n−1
i=1
^
{ti+1 = ti + 1}n−1i=1
^
∀i ∈ [1, n], ti ∈/ TombstoneEpochs
^
∀i ∈ [1, n], ti − DeletionParam ∈ / TombstoneEpochs
^
1 ≤ l ≤ k ≤ n)
• 0/1 ← oZKS.VerifyDel(com, com0 , πS , t): If ^
t − DeletionParam ∈
/ TombstoneElts, then this oZKS.VerifyMem(coml , label, val1 , Epoch1 , ΠVer 1 )
algorithm outputs 0. Otherwise, it verifies the proof ^
(
πS that com, com0 commit to some datastores
DS, DS0 respectively, such that DS0 ⊆ DS and that (
DS \ DS0 = {(label, T OMBSTONE , ·)|label ∈ DSt .Labels}. (([l, k] ∩ TombstoneEpochs = [l, k] ∩ DeletionEpochs = ∅)∧
((val1 , Epoch1 ) 6= (val2 , Epoch2 ) ∧ oZKS.VerifyMem(comk ,
label, val2 , Epoch2 , ΠVer
2 ))
_
(oZKS.VerifyNonMem(comk , label, ΠVer
2 ))

))
• 0/1 ← oZKS.VerifyUpd(t1 , tn , {comi , πSi , auxi }n−1
i=1 , comn , πS ):
_
(
First, this algorithm checks that tn − t1 is the number
of commitments it received. It denotes ti+1 = ti + 1, [l, k] ∩ TombstoneElts = {j} ∧ [l, k] ∩ DeleteElts = ∅∧
for i ∈ [1, n − 1]. Then it outputs 1, if all of tstale = tj − StaleParam∧
the following checks pass. For i ∈ [1, n − 1]: if
(((val1 , Epoch1 ) 6= (val2 , Epoch2 ) ∧ Epoch1 > tstale
ti ∈ TombstoneEpochs, it parses auxi = tstale and runs the
check oZKS.VerifyTombstone(comi , comi+1 , πSi , ti , tstale ). ∧oZKS.VerifyMem(comk , label, val2 , Epoch2 , ΠVer
2 ))
ti −
_
Else, it ignores auxi and if ((val1 , Epoch1 ) 6= (val2 , Epoch2 )
DeletionParam ∈ TombstoneEpochs, it runs
oZKS.VerifyDel(comi , comi+1 , πSi , ti ), else, it runs ∧oZKS.VerifyMem(comk , label, val2 , Epoch2 , ΠVer
2 )
the check oZKS.VerifyInsertions(comi , comi+1 , πSi , ti ). ∧Epoch1 ≤ tstale ∧ val2 6= T OMBSTONE )
_
(oZKS.VerifyNonMem(comk , label, ΠVer
2 ))

))
_
(
[l, k] ∩ DeleteElts = {j}
∧(oZKS.VerifyNonMem(comk , label, ΠVer
2 )
∧val1 6= T OMBSTONE )
))].

Definition of a VKD. [16] defined a VKD as a collection of


the following algorithms (included here for completeness):
Soundness for oZKS with Compaction.
• (Dirt , stt , ΠUpd )/⊥ ← VKD.Publish(Dirt−1 , stt−1 , St ):
This algorithm takes as input a directory Dirt−1 , an
internal server sstate stt−1 and a set of updates St
consisting of (label, val) pairs. It updates Dirt−1 and stt−1
to reflect the updates in St . If this update is successful, the
algorithm updates the commitment using the WitnessAPI
Definition 3. An oZKS with compaction, defined in sec- and returns the next state stt , the updated directory Dirt
tion III-C is said to be sound if, for any PPT adversary A, and a proof that this update was correct. Otherwise it
each of the following is less than or equal to negl(λ): outputs ⊥.

17
• (π, val, α) ← VKD.Query(stt , Dirt , label): This algo- consecutive epochs and a parameter tstale which tells it
rithm takes as input a server internal state stt , a directory which values are old enough for tombstoning, and verifies
Dirt and a label label and if label is in Dirt , it returns the proof ΠUpd of correct tombstoning.
the value val, the version number α and the proof π that • (comt+1 , Dirt+1 , stt+1 , ΠUpd , t + 1) ← cVKD.Compact(
the provided information is correct with respect to the Dirt , stt , t): This algorithm takes in a directory Dirt , an
commitment for the epoch t. internal state stt , the epoch t and outputs the updated
• 0/1 ← VKD.VerifyQuery(t, label, val, π, α): This algo- directory, state, commitment and epoch, with tombstoned
rithm takes as input an epoch t, a label label, a purported values deleted. It also includes a proof for this update.
version number (α) and value val, corresponding to label • 0/1 ← cVKD.VerifyCompact(comt , comt+1 , ΠUpd ): This
and a proof π that val and α are correct. It retrieves the algorithm takes as input a server’s commitment before and
commitment comt for the epoch t using the WitnessAPI after a compaction and checks the proof ΠUpd that the only
and verifies π with respect to this commitment. It returns change made by the server was removing data associated
1 if the proof verifies and 0 in any other case. with tombstoned values.
• ((vali , ti )ni=1 , ΠVer ) ←
VKD.KeyHistory(stt , Dirt , t, label): This algorithm
takes as input the internal state (stt ) and directory (Dirt ) The constraints on TombstoneEpochs, CompactionEpochs
at epoch t and a label label. It outputs an ordered set and StaleParam for a cVKD are as follows:
of tuples (vali , ti )ni=1 , where vali is the purported value
corresponding to thatcameintoef f ectatepochti . It also
returns the proof ΠVer to attest to these state changes for
the label label. • If ti and ti+1 are two epochs in the ordered set
• 0/1 ← VKD.VerifyHistory(t, label, (vali , ti )ni=1 , ΠVer ): TombstoneEpochs, then, there exists a deletion epoch tdel ∈
This algorithm takes as input the ordered set of tuples DeleteElts, between them. That is, ti < tdel < ti+1 .
(vali , ti )ni=1 and proof ΠVer , retrieves the commitment for • A piece of data can only be marked as tombstoned in
the epoch t and the label label, and verifies ΠVer with a particular epoch ti ∈ TombstoneEpochs if it was in-
respect to the commitment. serted prior to the last tombstone epoch ti−1 . That is,
• 0/1 ← VKD.VerifyUpd(t, com, com0 , ΠUpd ): This algo- StaleParam > min{|t − t0 | : t, t0 ∈ TombstoneEpochs}.
rithm verifies the output of a single publish operation
with respect to an initial and  final commitment.
tn−1
• 0/1 ← VKD.Audit(t1 , tn , ΠUpd t ): This algorithm Since we have introduced two new kinds of updates to the
t=t1 server, we modify the key history and audit algorithms for a
takes as input a starting epoch t1 and an ending epoch tn ,
cVKD as follows:
retrieves the required commitments to verify the proofs
ΠUpd
t attesting to the correct evolution of the server’s
state.
• 0/1 ← cVKD.VerifyHistory(t, label, mint , maxt ,
Definition of a VKD with compaction. Our modified primi-
(vali , ti )maxt
i=mint , Π Ver
): This algorithm takes as input
tive, which we call a VKD with compaction, denoted as cVKD,
an epoch t, with respect to which history is being verified
is defined below. Note the additional set of public parameters:
and a label label whose values are being checked, mint ,
(TombstoneEpochs, CompactionEpochs, StaleParam). These
which represents a minimum version number which is not
determine when tombstoning (ordered set TombstoneEpochs)
deleted or tombstoned at epoch t and maxt , the maximum
or compaction (ordered set CompactionEpochs) are permitted,
version number for label. It also includes the values and
and how old a piece of data needs to be, in order to be eligible
epochs at which each version in [mint , maxt ] was inserted.
for tombstoning (StaleParam).
A VKD with compaction (cVKD) is a VKD with It obtains the server commitment at epoch t and verifies
public parameters (TombstoneEpochs, CompactionEpochs, the proof ΠVer that all versions below mint were correctly
StaleParam) and following additional algorithms: tombstoned or deleted and that that the history presented
Upd for versions from mint onwards is correct.
• (comt , Dirt , stt , Π , t) ← cVKD.TombstoneElts(Dirt−1 , tn −1
• 0/1 ← cVKD.Audit(t1 , tn , {ΠUpd t }t=t1 ): For each epoch
stt−1 , t−1, S, tstale ): This algorithm takes as input an epoch
t ∈ [t1 , tn − 1], if t ∈ TombstoneEpochs, this algorithm
t − 1, the server’s directory Dirt−1 at this epoch, its internal
gets the corresponding commitments and proof, passes it
state stt−1 , a set S of items it wants to tombstone and a
to cVKD.VerifyTombstone, if t ∈ CompactionEpochs,
parameter tstale to determine the cutoff for tombstoning. It
it passes the corresponding commitments and proof to
updates the data in the state and directory by tombstoning
cVKD.VerifyCompact, else, it passes the appropriate inputs
the appropriate elements of S and returns the updated
to cVKD.VerifyUpd.
Dirt , stt and corresponding commitment comt .
Upd
• 0/1 ← cVKD.VerifyTombstone(comt , comt+1 , Π , tstale ):
This algorithm takes as input two commitments from Definition 4. A VKD with compaction is said to be sound if,

18
for any PPT adversary A, internal state stt+1 and commitment comt+1 and returns
comt+1 , DSt+1 , stt+1 as well as a proof ΠUpd , that
Pr[(label, {t, {(valtk , Epochtk )}max Ver
k=mint , Πt }t∈{t1 ,...,tn } ,
t
comt+1 is indeed the commitment to an update to the
{(comk , ΠUpd tcurrent ∗ λ
k )}k=t1 , t , j, (π, val, β)) ← A(1 ) : data store comt commited to, with update set S.
∧{comk ← WitnessAPI.GetCom(k)}tk=t
current • 0/1 ← SA.VerifyUpd(com, com0 , S, ΠUpd ): This algo-
1
tcurrent −1
rithm takes as input a commitment com to a data store
∧VKD.Audit(t1 , tcurrent , {(ΠUpd
k )}k=t1 ) DS, another commitment com0 to an update to DS, the
∧∀t ∈ [t1 , ..., tn ], VKD.VerifyHistory(t, label, set of updates S and the proof that com0 was indeed the
{(valti , Epochti )}max Ver result of updating DS with S. It outpus 1 if ΠUpd verifies,
i=mint , Πt )
t

otherwise, it outputs 0.
∧∀t ∈ TombstoneEpochs, s.t.
• (comt+1 , DSt1 , stt+1 , ΠUpd ) ←
[t, t + DeletionParam] ∩ [t1 , ..., tcurrent ] 6= ∅ SA.DeleteElts(DSt , stt , S): This algorithm takes in
=⇒ {t1 , ..., tn } ∩ [t, t + DeletionParam] 6= ∅ a datastore DSt , corresponding stt , comt , as well as a
∧t1 ≤ t2 ≤ ... ≤ tn ≤ tcurrent set S of deletions. It initializes DSt+1 = DSt . For all
(l, v) ∈ S, if (l, v) ∈ DSt , for some v, it removes (l, v)
∧StaleParam >
from DSt+1 . It computes the corresponding updated
0 0
min{|t − t | | t, t ∈ TombstoneEpochs} internal state stt+1 and commitment comt+1 and returns
∧( comt+1 , DSt+1 , stt+1 as well as a proof ΠUpd , that
0 0 0 comt+1 is indeed the commitment to an update to the
(∃t, t ∈ {t1 , ..., tn }, γ, (valtγ , Epochtγ ) 6= (valtγ , Epochtγ ))
data store comt commited to, with deletion set S.
∨ • 0/1 ← SA.VerifyDel(com, com0 , S, ΠUpd ): This algo-

(tj−1 ≤ t < tj rithm takes as input a commitment com to a data store
∧(Epochtαj ≤ t∗ < Epochα+1
j t
∨ α = maxtj ) DS, another commitment com0 to an update to DS, the
set of deletions S and the proof that com0 was indeed the
∧valtαj 6= val ∧ t∗ < t0j result of deleting the entries of S from DS. It outputs 1
∧VKD.VerifyQuery(t∗ , label, val, π)) if ΠUpd verifies, otherwise, it outputs 0.
)] ≤ negl(λ). 2) oZKS from a (SA, sVRF, sCS) triple.
C. Constructing an oZKS Our oZKS construction, first implemented by [6] is very
similar to the aZKS construction of [16], based on a strong
accumulator (SA), a simulatable commitment scheme (sCS)
1) Strong Accumulators and a strong VRF (sVRF).
• (com1 , DS1 , st1 ) ← SA.CommitDS(1λ , DS): This algo- The main distinction between the oZKS construction and the
rithm takes in a data store DS, a security parameter 1λ aZKS construction of [16] becomes clear when we consider
and outputs a commitment com1 to DS, a copy DS1 of a call of the form oZKS.UpdateDS(stcom , DSt , S, t). The al-
the data store and st1 , the internal state corresponding to gorithm computing oZKS.UpdateDS computes S 0 = {(label,
DS1 and com1 . val, t + 1) | (label, val) ∈ S}, then it computes S 00 = {(l, v) |
• (v, π) ← SA.Query(DSt , stt , comt , l): This algorithm l = sVRF.Compute(SK, label), v = (sCS(val; r), t + 1), for
takes in a data store DSt , the corresponding internal state (label, val) ∈ S 0 }, DSt+1 = DSt ∪ S 0 , DSSA SA
t+1 = DSt ∪ S .
00
SA 00
and commitment to it stt , comt , and a queried label l. If Finally, it calls SA.UpdateDS(DSt , S ). [16]’s aZKS simply
there exists a pair (l, v) ∈ DSt , the algorithm outputs v omits including the epoch t + 1 in the value committed in the
and the proof π that (l, v) ∈ DSt . Otherwise it returns SA.
v = ⊥ and π is a non-membership proof for l in DSt . The proof of correct update includes the newly inserted pairs
• 0/1 ← SA.Verify(com, l, v, π): Given a commitment com S 00 = {(l, v)} and the auditor of the update must additionally
to a datastore, a label l and corresponding string v, if parse each v to ensure that it contains the correct epoch (t+1).
v = ⊥, the algorithm parses π as a non-membership proof The main novelty of our oZKS construction is in the
for l, with respect to the commitment com, otherwise, it tombstone and deletion paradigm. The additional functions to
parses it as a membership proof for (l, v). The algorithm support compaction are as follows.
0
outputs 1 if π verifies and 0 otherwise. • (com , stcom0 , DSt+1 , πS , t + 1) ←
• (comt+1 , DSt1 , stt+1 , ΠUpd ) ← oZKS.TombstoneElts(stcom , DSt , S, t, tstale ): This
SA.UpdateDS(DSt , stt , S): This algorithm takes in algorithm should only be called if t ∈ TombstoneEpochs
a datastore DSt , corresponding stt , comt , as well as a and tstale = t − StaleParam. It algorithm takes in the
set S of updates. It initializes DSt+1 = DSt . For all current datastore DSt , the internal server state stcom ,
(l, v) ∈ S, if (l, v 0 ) ∈ DSt , for some v 0 , it replaces the current server epoch t, a stale epoch parameter
this string in DSt+1 with (l, v), else, it adds (l, v) tstale and a set S of triples {(labeli , vali , ti )} for
to DSt+1 . It computes the corresponding updated an update, such that (labeli , vali , ti ) ∈ DSt . The

19
The main novelty of our oZKS construction is in the tombstone and deletion paradigm. The additional functions to support compaction are as follows.
• (com′, st_com′, DS_{t+1}, π_S, t + 1) ← oZKS.TombstoneElts(st_com, DS_t, S, t, t_stale): This algorithm should only be called if t ∈ TombstoneEpochs and t_stale = t − StaleParam. The algorithm takes in the current datastore DS_t, the internal server state st_com, the current server epoch t, a stale epoch parameter t_stale and a set S of triples {(label_i, val_i, t_i)} for an update, such that (label_i, val_i, t_i) ∈ DS_t. The algorithm checks that for each (label_i, val_i, t_j) ∈ S, t_j ≤ t_stale. Then, this algorithm computes S′ = {(sVRF(SK, label_i), TOMBSTONE, t_j) | (label_i, val_i, t_j) ∈ S, r ←$ {0, 1}^λ}. It calls SA.UpdateDS(DS^SA_t, st_t, S′) to obtain DS^SA_{t+1}, st^SA_{t+1}, com_{t+1}, π. It instantiates DS_{t+1} = DS_t and, for each (label_i, val_i, t_j) ∈ S, it replaces the entry (label_i, val_i, t_j) of DS_{t+1} with (label_i, TOMBSTONE, t_j). It returns this updated DS_{t+1}, com′, the commitment to DS_{t+1}, st_com′, the updated internal state of the server, and a proof π_S that com′ is a commitment to the data store DS_{t+1} such that (1) if (label_i, TOMBSTONE, t_j) ∈ DS_{t+1}, then (label_i, val_i, t_j) ∈ DS_t for some val_i, with t_j ≤ t_stale; (2) if (label_k, val_k, t_k) ∈ DS_t with t_k > t_stale, then (label_k, val_k, t_k) ∈ DS_{t+1}; and (3) DS_{t+1}.Labels = DS_t.Labels.
• 0/1 ← oZKS.VerifyTombstone(com, com′, π_S, t, t_stale): This algorithm parses π_S to get the set S of entries {(label_i, TOMBSTONE, t_j)} and a proof π of strong accumulator update. It verifies the proof π that com, com′ include commitments com_SA, com′_SA such that SA.VerifyUpd(com_SA, com′_SA, S, π) outputs 1. It also checks that com, com′ commit to datastores DS, DS′ such that DS′.Labels = DS.Labels, and that for any label ∈ DS′.Labels such that (label, val, t_j) ∈ DS and (label, val′, t_j) ∈ DS′ with val ≠ val′, then t_j ≤ t_stale and label ∈ S. If all these checks pass, and com is the commitment to the epoch t and t_stale = t − StaleParam, then this algorithm outputs 1, otherwise it outputs 0.
• (com′, st_com′, DS_{t+1}, π_S, t + 1) ← oZKS.DeleteElts(st_com, DS_t, t): This algorithm is only run if t − DeletionParam ∈ TombstoneEpochs. If so, it takes in the current datastore DS_t, the internal server state st_com and the current server epoch t, and computes the set S = {(sVRF(SK, label), ·, ·) | (label, TOMBSTONE, ·) ∈ DS_t}. It returns DS_{t+1} = DS_t \ S; it gets com′ and st_com′ by computing the subroutine SA.DeleteElts(DS^SA_t, st^SA_t, S) and correspondingly updates its state. It outputs DS_{t+1}, st_com′, the updated internal state of the server, and a proof π_S that com′ is a commitment to DS_{t+1} such that DS_{t+1} ⊆ DS_t and each entry of DS_t \ DS_{t+1} is of the form (label, TOMBSTONE, ·) with label ∈ DS_t.Labels.
• 0/1 ← oZKS.VerifyDel(com, com′, π_S, t): If t − DeletionParam ∉ TombstoneEpochs, then this algorithm outputs 0. Otherwise, it verifies the proof π_S that com, com′ commit to some datastores DS, DS′ respectively, such that DS′ ⊆ DS and that DS \ DS′ = {(label, TOMBSTONE, ·) | label ∈ DS.Labels}, by verifying SA.VerifyDel.
3) Efficiency improvements due to oZKS
The major advantage of constructing an oZKS and including the epoch a leaf was inserted in the tree is that, if audits are honestly verified, a SA membership proof includes when the leaf was inserted. This means that only the latest commitment of the SA needs to be verified, even in the case of historical queries about some label. This alleviates the need to store all states of an evolving oZKS for the VKD construction, making the space complexity of the VKD's state st linear in the number of leaves, rather than in the number of epochs, unlike the construction using the aZKS of [16]. Additionally, this means that the RAM complexity of multiple, simultaneous history queries can be amortized over the number of queries and the number of proofs π_ij being proven over all the history queries.
D. Witness API
The witness API is a set of algorithms (GetCom, VerifyCert, ProposeNewEp) and a variable Epoch, initialized to 0.
• (com_t, cert_t)/⊥ ← WitnessAPI.GetCom(t): This algorithm, callable by any party, takes as input an epoch t and, if it has a commitment for this epoch, it returns the commitment com_t and the corresponding certificate cert_t. Else, it returns ⊥.
• 0/1 ← WitnessAPI.VerifyCert(com_t, cert_t, p): This algorithm verifies the certificate cert_t for com_t, with respect to public parameters p.
• cert_t/⊥ ← WitnessAPI.ProposeNewEp(t, com_t, π): This algorithm, called by a server, takes as input a proposed commitment com_t, an epoch t, as well as a proof π. It verifies π using the Audit algorithm of the VKD solution implemented by the server and, if this verifies and WitnessAPI.Epoch = t − 1, this algorithm sets WitnessAPI.Epoch = t and outputs cert_t. Else, it outputs ⊥ and leaves Epoch = t − 1.
Note that we assume an implicit setup (which includes determining the various cryptographic operations, the identity of the IdP, etc.). We also assume that the identity of the parties jointly supplying the WitnessAPI is public and they each have known public keys.
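A minimal sketch of the witness API state machine is given below, assuming a single in-memory witness; the audit check and certificate generation are stubbed out, and all names are illustrative.

```rust
// Sketch of the WitnessAPI described above: commitments are accepted only in
// strict epoch order and only with a verifying audit proof.
use std::collections::HashMap;

type Commitment = [u8; 32];
type Certificate = Vec<u8>; // e.g., a (quorum) signature from the witnesses
type AuditProof = Vec<u8>;

struct WitnessApi {
    epoch: u64,                                       // the Epoch variable, initialized to 0
    coms: HashMap<u64, (Commitment, Certificate)>,    // accepted commitments per epoch
}

impl WitnessApi {
    /// WitnessAPI.GetCom
    fn get_com(&self, t: u64) -> Option<&(Commitment, Certificate)> {
        self.coms.get(&t)
    }

    /// WitnessAPI.ProposeNewEp
    fn propose_new_epoch(&mut self, t: u64, com: Commitment, proof: &AuditProof) -> Option<Certificate> {
        // The proof must verify under the server's Audit algorithm, and
        // epochs must be consumed strictly in order (Epoch = t - 1).
        if t == 0 || self.epoch != t - 1 || !vkd_audit(self.epoch, t, &com, proof) {
            return None; // Epoch is left unchanged
        }
        self.epoch = t;
        let cert = sign_commitment(t, &com);
        self.coms.insert(t, (com, cert.clone()));
        Some(cert)
    }
}

// Stubs standing in for the real audit check and witness signing procedure.
fn vkd_audit(_prev: u64, _t: u64, _com: &Commitment, _proof: &AuditProof) -> bool { true }
fn sign_commitment(t: u64, com: &Commitment) -> Certificate {
    let mut cert = t.to_be_bytes().to_vec();
    cert.extend_from_slice(com);
    cert
}
```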
E. oZKS-based VKD Construction
Below, we discuss the details of each of the API calls of our VKD construction, using an oZKS. We assume that the server's identity (public key) is public and that the server signs its responses to user/auditor queries, so they cannot be impersonated.
• (Dir_t, st_t, Π^Upd)/⊥ ← VKD.Publish(Dir_{t−1}, st_{t−1}, S_t): The server receives as input a set S_t containing (label, val) pairs. The algorithm instantiates a set S′_t = ∅. For each (label, val) ∈ S_t, if label does not exist in the VKD, it sets α = 1; else, label must have a version number α − 1. In either case, the algorithm adds (label|α, val) to S′_t. If α > 1, it also adds (label|α − 1|'stale', 0) to S′_t. It retrieves (st^oZKS_{t−1}, DS_{t−1}) from st_{t−1}, gets the output (com_t, st^oZKS_t, DS_t, π, t) of oZKS.InsertionToDS(st^oZKS_{t−1}, DS_{t−1}, S′_t, t − 1), updates st_{t−1} with st^oZKS_t, DS_t to get st_t, and sets Π^Upd_t ← π. Dir_t starts off as equal to Dir_{t−1}. For each label ∈ S_t, if α (set above) is 1, the algorithm initializes an empty list L_label; else, it retrieves (label, L_label). It appends (α, val, t + 1) to the front of L_label and includes this updated (label, L_label) in Dir_t. Then, the algorithm calls WitnessAPI.ProposeNewEp(t, com_t, Π^Upd). If WitnessAPI.ProposeNewEp outputs ⊥, the algorithm reverts all its internal changes to st_{t−1}, Dir_{t−1}, and outputs ⊥. Finally, the algorithm returns (Dir_t, st_t, Π^Upd).
• (π, val, α) ← VKD.Query(st_t, Dir_t, label): If label ∉ Dir_t, this algorithm outputs ⊥. Else, it recovers α, the latest version number for label, i.e. the first entry of L_label. It sets β = 2^{⌊log(α)⌋} and parses (st^oZKS_t, DS_t) out of st_t. It computes (1) (π_mem, val, i) ← oZKS.QueryMem(st^oZKS_t, DS_t, label|α), (2) π_fresh ← oZKS.QueryNonMem(st^oZKS_t, DS_t, label|α|'stale') and (3) (π_hist, val_hist, i_hist) ← oZKS.QueryMem(st^oZKS_t, DS_t, label|β). The algorithm sets π = (π_mem, π_fresh, π_hist, i, val_hist, i_hist) and returns (π, val, α).
• 0/1 ← VKD.VerifyQuery(t, label, val, π, α): The client running this algorithm calls WitnessAPI.GetCom(t) to obtain (com_t, cert_t). If WitnessAPI.VerifyCert(com_t, cert_t, p) outputs 0, then this algorithm outputs 0. Else, it parses π as (π_mem, π_fresh, π_hist, i, val_hist, i_hist), sets β = 2^{⌊log(α)⌋} and returns 1 if all of the following return 1: (1) oZKS.VerifyMem(com_t, label|α, val, i, π_mem), (2) oZKS.VerifyNonMem(com_t, label|α|'stale', π_fresh) and (3) oZKS.VerifyMem(com_t, label|β, val_hist, i_hist, π_hist).
• ((val_i, t_i)^n_{i=1}, Π^Ver) ← VKD.KeyHistory(st_t, Dir_t, t, label): If label is not in Dir_t, then the algorithm returns ⊥. Else, it computes st^oZKS_t, DS_t from st_t and retrieves (label, L_label) from Dir_t. Let L_label = {(i, val_i, t_i)}^n_{i=1}. Then, for each i = 2, ..., n, the algorithm computes
– (π^i_1, val_i, t_i) ← oZKS.QueryMem(st, DS, label|i),
– (π^i_2, 0, t_i) ← oZKS.QueryMem(st, DS, label|i − 1|'stale').
It also computes (π^1_1, val_1, t_1) ← oZKS.QueryMem(st, DS, label|1). With a = ⌊log(n)⌋ + 1, b = ⌊log(t)⌋ and α = 2^a − 1, the algorithm computes π^j_3 ← oZKS.QueryNonMem(st, DS, label|j) for j = n + 1, ..., α. It also gets π^k_4 ← oZKS.QueryNonMem(st, DS, label|2^k) for k = a, ..., b. Finally, it sets Π^Ver = ((π^i_1)^n_{i=1}, (π^i_2)^n_{i=2}, (π^j_3)^α_{j=n+1}, (π^k_4)^b_{k=a}) and outputs ((val_i, t_i)^n_{i=1}, Π^Ver).
• 0/1 ← VKD.VerifyHistory(t, label, (val_i, t_i)^n_{i=1}, Π^Ver): This algorithm calls WitnessAPI.GetCom(t) to obtain (com_t, cert_t). If WitnessAPI.VerifyCert(com_t, cert_t, p) outputs 0, then this algorithm outputs 0. Else, this algorithm parses Π^Ver as ((π^i_1)^n_{i=1}, (π^i_2)^n_{i=2}, (π^j_3)^α_{j=n+1}, (π^k_4)^b_{k=a}), where a = ⌊log(n)⌋ + 1, b = ⌊log(t)⌋ and α = 2^a − 1. It outputs 1 if t_1 < t_2 < ... < t_n and all of the following output 1:
– oZKS.VerifyMem(com_t, label|i, val_i, t_i, π^i_1) for i ∈ [1, n].
– oZKS.VerifyMem(com_t, label|i − 1|'stale', 0, t_i, π^i_2) for i ∈ [2, n].
– oZKS.VerifyNonMem(com_t, label|j, π^j_3) for j ∈ [n + 1, α].
– oZKS.VerifyNonMem(com_t, label|2^k, π^k_4) for k ∈ [a, b].
• 0/1 ← VKD.VerifyUpd(t_1, t_n, (com_t, Π^Upd_t)^{t_n−1}_{t=t_1}, com_{t_n}): This algorithm outputs 1 if oZKS.VerifyInsertions(com_{t_i}, com_{t_i+1}, π_{t_i}, t_i + 1) outputs 1 for each i ∈ [1, n − 1].
• 0/1 ← VKD.Audit(t_1, t_n, (Π^Upd_t)^{t_n−1}_{t=t_1}): This algorithm takes as input a starting time t_1 and an end time t_n. It checks that t_1 < t_n and that the tuple (Π^Upd_t)^{t_n−1}_{t=t_1} parses to exactly t_n − t_1 proofs Π^Upd_t. If this check fails, it outputs 0; if it passes, the algorithm obtains (com_t, cert_t) ← WitnessAPI.GetCom(t) for t ∈ [t_1, t_n]. It verifies WitnessAPI.VerifyCert(com_t, cert_t, p) for t ∈ [t_1, t_n], and if it outputs 0, the algorithm outputs 0. Else, the algorithm outputs 1 if VKD.VerifyUpd(t_i, t_i + 1, (com_{t_i}, Π^Upd_{t_i}), com_{t_i+1}) outputs 1 for each i ∈ [1, n − 1].
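To make the lookup path concrete, the sketch below shows which oZKS labels a single lookup at version α touches, following the VKD.Query description above. The string-based label encoding is purely illustrative; a deployment would use a fixed binary encoding together with the VRF.

```rust
// Sketch: the three oZKS labels involved in a lookup at version `alpha`,
// mirroring VKD.Query / VKD.VerifyQuery above.
fn lookup_labels(label: &str, alpha: u64) -> (String, String, String) {
    assert!(alpha >= 1);
    let beta = 1u64 << alpha.ilog2();                   // beta = 2^{floor(log alpha)}
    let membership = format!("{label}|{alpha}");        // must be present (current version)
    let freshness = format!("{label}|{alpha}|stale");   // must be absent (version not stale)
    let history_marker = format!("{label}|{beta}");     // marker membership proof
    (membership, freshness, history_marker)
}

fn main() {
    // e.g. version 10: prove "alice|10" present, "alice|10|stale" absent,
    // and the marker "alice|8" present.
    println!("{:?}", lookup_labels("alice", 10));
}
```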
F. cVKD: VKD with Secure Compaction
1) Secure Compaction: Attempt 1
As a first attempt, let us extend the oZKS data structure of [6] to support two additional algorithms:
• (com′, st_com′, DS_{t+1}, π_S, t + 1) ← oZKS.DeleteElts1(st_com, DS_t, S, t, t_stale): This algorithm takes in the current datastore DS_t, the internal server state st_com, the current server epoch t, a stale epoch parameter t_stale and a set S of triples {(label_i, val_i, t_i)} for an update, such that (label_i, val_i, t_i) ∈ DS_t. The algorithm checks that for each (·, ·, t_j) ∈ S, t_j ≤ t_stale. It returns DS_{t+1} = DS_t \ S, com′, the commitment to DS_{t+1}, st_com′, the updated internal state of the server, and a proof π_S that com′ is a commitment to DS_{t+1} such that DS_{t+1} ⊆ DS_t and each entry of DS_t \ DS_{t+1} is of the form (label_j, val_j, t_j) for some t_j ≤ t_stale.
• 0/1 ← oZKS.VerifyDel1(com, com′, π_S, t_stale): Verifies the proof π_S that com, com′ commit to some datastores DS, DS′ respectively, such that DS′ ⊆ DS and that DS \ DS′ = {(label, val, t) for some t ≤ t_stale}.
Based on these two algorithms, we could construct compaction as follows:
• (Dir_t, st_t, Π^Upd)/⊥ ← VKD.Compact1(Dir_{t−1}, st_{t−1}, t − 1, t_stale): The algorithm initializes S_t = ∅. For (label, L_label) ∈ Dir_{t−1}, for (α, val, t_α) ∈ L_label, if t_α ≤ t_stale, it adds (label|α) to S_t; if α > 1, it also adds (label|α − 1|'stale') to S_t. Then, it obtains (com^oZKS_t, st^oZKS_t, DS^oZKS_t, π_{S_t}, t) ← oZKS.DeleteElts1(st_com, DS_{t−1}, S_t, t_stale), updates the corresponding state st_t = (DS^oZKS_t, st^oZKS_t), deletes the corresponding (α, val, t_α) tuples from L_label for each label to get Dir_t, and sets com_t = com^oZKS_t. Then, it calls WitnessAPI.ProposeNewEp(t, com_t, Π^Upd). If WitnessAPI.ProposeNewEp outputs ⊥, the algorithm reverts all its internal changes to st_{t−1}, Dir_{t−1}, and outputs ⊥. Otherwise it returns (Dir_t, st_t, Π^Upd).
The auditing algorithm VKD.VerifyCompact1, which verifies the proof output by VKD.Compact1, would work in the obvious way, calling oZKS.VerifyDel1 as a subroutine. Correspondingly, VKD.Audit must be modified so that, at epochs which include deletions, it calls VKD.VerifyCompact1 instead of VKD.VerifyUpd.
This construction, however, creates a problem if a user is not always online. Consider the following attack: suppose the server is at epoch 1000, and the label Alice is at version 10, with value val_10. Also suppose that the entry val_10 for Alice was inserted in epoch 100. Now, if t_stale = 101, the server could compact the oZKS label Alice|10 and roll back her key to its previous version.
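For concreteness, the selection step of VKD.Compact1 above can be sketched as follows; the directory layout and label encoding are simplified placeholders.

```rust
// Sketch of how Compact1 selects oZKS labels to delete: every version whose
// insertion epoch is at most t_stale, plus the matching staleness marker.
use std::collections::HashMap;

// label -> list of (version, value, insertion epoch), as in Dir above
type Directory = HashMap<String, Vec<(u64, Vec<u8>, u64)>>;

fn compact1_selection(dir: &Directory, t_stale: u64) -> Vec<String> {
    let mut to_delete = Vec::new();
    for (label, versions) in dir {
        for (alpha, _val, t_alpha) in versions {
            if *t_alpha <= t_stale {
                // delete the stale version itself ...
                to_delete.push(format!("{label}|{alpha}"));
                // ... and, if it exists, the staleness marker of its predecessor
                if *alpha > 1 {
                    to_delete.push(format!("{label}|{}|stale", alpha - 1));
                }
            }
        }
    }
    to_delete
}
```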
2) Secure Compaction: Attempt 2
Recall that when a client performs a lookup for label label, the IdP internally calls VKD.Query to obtain parameters including a value val, a version α and proofs π = (π_mem, π_fresh, π_hist). Here π_fresh is a non-membership proof for the label label|α|'stale' in the underlying oZKS. If compaction is simply designed as VKD.Compact1 above, a malicious server may only add the label|α|'stale' label to S_t for some time-step t and never the label|α label, which makes it possible for a lookup proof to pass with a stale (or compromised) key returned upon a lookup.
We attempt to mitigate this issue by introducing a special public parameter DeletionEpochs, such that calls to oZKS deletion will only verify if they are made at an epoch in the set DeletionEpochs. We propose an updated deletion for the oZKS.
• (com′, st_com′, DS_{t+1}, π_S, t + 1) ← oZKS.DeleteElts2(st_com, DS_t, S, t, t_stale): This algorithm runs if and only if t ∈ DeletionEpochs. All other operations are the same as oZKS.DeleteElts1.
• 0/1 ← oZKS.VerifyDel2(com, com′, π_S, t_stale, t): In addition to the checks in oZKS.VerifyDel1, this algorithm also ensures that the version number of DS is t and t ∈ DeletionEpochs.
Since deletions in the updated oZKS API are only possible at epochs in the set DeletionEpochs, we can update our VKD to include a corresponding public parameter CompactionEpochs, and introduce the following assumption: all users come online at deletion epochs to ensure their stale keys are deleted appropriately.
As in the previous construction, VKD.Compact2 and VKD.VerifyCompact2 call oZKS.DeleteElts and oZKS.VerifyDel as subroutines. VKD.Compact2 only runs at epochs in CompactionEpochs and VKD.VerifyCompact2 only verifies if the epoch presented is in CompactionEpochs. Finally, we augment VKD.Audit as follows:
• VKD.Audit(t_1, t_n, (Π^Upd_t, aux_t)^{t_n−1}_{t=t_1}): Now, for t = t_1, ..., t_n − 1, if t ∈ CompactionEpochs, this algorithm parses aux = t_stale and calls VKD.VerifyCompact2(com_t, com_{t+1}, Π^Upd, t_stale). Else, it calls VKD.VerifyUpd(t, t + 1, (com_t, Π^Upd_t), com_{t+1}). It outputs 1 if all subroutines output 1, otherwise it outputs 0.
The algorithm VKD.VerifyHistory gets modified to the following:
• 0/1 ← VKD.VerifyHistory1(t, label, (val_i, t_i)^n_{i=j}, Π^Ver): This algorithm calls WitnessAPI.GetCom(t) to obtain (com_t, cert_t). If WitnessAPI.VerifyCert(com_t, cert_t, p) outputs 0, then this algorithm outputs 0. The algorithm parses Π^Ver as (π^deleted, π^undeleted). It parses π^deleted as ((π^i_1)^j_{i=1}, (π^i_2)^j_{i=2}). Then, it verifies
– oZKS.VerifyNonMem(com_t, label|i, π^i_1) for 1 ≤ i < j.
– oZKS.VerifyNonMem(com_t, label|i|'stale', π^i_2) for 1 ≤ i < j.
This algorithm parses π^undeleted as ((π^i_1)^n_{i=j}, (π^i_2)^n_{i=j+1}, (π^j_3)^α_{j=n+1}, (π^k_4)^b_{k=a}), where a = ⌊log(n)⌋ + 1, b = ⌊log(t)⌋ and α = 2^a − 1. It outputs 1 if t_1 < t_2 < ... < t_n and all of the following output 1:
– oZKS.VerifyMem(com_t, label|i, val_i, t_i, π^i_1) for i ∈ [j, n].
– oZKS.VerifyMem(com_t, label|i − 1|'stale', 0, t_i, π^i_2) for i ∈ [j + 1, n].
– oZKS.VerifyNonMem(com_t, label|j, π^j_3) for j ∈ [n + 1, α].
– oZKS.VerifyNonMem(com_t, label|2^k, π^k_4) for k ∈ [a, b].
While the above patch mitigates the previous attack of arbitrary mutations, it would only satisfy soundness if the user audits her own key history at every epoch t ∈ CompactionEpochs. If not, there is at least one epoch where the server could cheat by deleting her latest version, rolling back to a previous one, then reinserting the correct version. Recall that our original assumption was that a user should be able to check their entire key history when coming online at any epoch, after any amount of time offline.
Besides, even if each user came online at each epoch in CompactionEpochs, the server losing part of its history in compaction epochs means that if the server deletes a user's latest key (if it was old enough), the audit would pass and the user would have no way to show that her latest key was deleted.
In the following design, we (1) slacken the requirement for when a user needs to come online to check that her key is going to be correctly deleted, and (2) give the user a way to contest the changes to her own existing keys.
3) Two-phase Compaction
In this section, we present our final construction, which we call two-phase compaction; it allows a VKD to support the algorithm Compact without additional privacy leakage. As stated before, even if compaction were supported with fixed epochs which are demarcated for compaction, since after compaction there is no record of the changes made, unless a user is online at the compaction epoch, she cannot ensure that any oZKS entries associated with her label were not modified. We weaken this requirement by allowing a grace period for a user to check her key history before a compaction, by marking oZKS entries slated for deletion as tombstoned for a while, before they are actually deleted.
Now, we use the algorithms (oZKS.TombstoneElts, oZKS.VerifyTombstone, oZKS.DeleteElts, oZKS.VerifyDel), defined in section III-C, to construct a VKD with compaction. Recall that when a client performs a lookup for label label, the IdP internally calls VKD.Query to obtain parameters including a value val, a version α and proofs π = (π_mem, π_fresh, π_hist). Here π_fresh is a non-membership proof for the label label|α|'stale' in the underlying oZKS. If compaction is simply designed as VKD.Compact1 above, a malicious server may only add the label|α|'stale' label to S_t for some time-step t and never the label|α label, which makes it possible for a lookup proof to pass with a stale (or compromised) key returned upon a lookup.
Finally, we can define compaction as follows:
• (com_t, Dir_t, st_t, Π^Upd, t) ← VKD.TombstoneElts(Dir_{t−1}, st_{t−1}, t − 1, t_stale): As in VKD.Compact1 and VKD.Compact2, the algorithm initializes S^oZKS_t = ∅, S^AVD_t = ∅. For (label, L_label) ∈ Dir_{t−1}, for (α, val, t_α) ∈ L_label:
– If t_α ≤ t_stale, it adds (label|α) to S_t.
– If α > 1, it also adds (label|α − 1|'stale') to S_t.
Then, it obtains the required st^oZKS_t and DS_t and calls oZKS.TombstoneElts(st_com, DS_t, S_t, t_stale). It updates the corresponding state and deletes the corresponding (α, val, t_α) tuples from L_label for each label. Then, the algorithm calls WitnessAPI.ProposeNewEp(t, com_t, Π^Upd). If WitnessAPI.ProposeNewEp outputs ⊥, the algorithm reverts all its internal changes to st_{t−1}, Dir_{t−1}, and outputs ⊥. Finally, this algorithm returns the updated commitments and proof output by oZKS.TombstoneElts.
• 0/1 ← VKD.VerifyTombstone(com_t, com_{t+1}, Π^Upd, t_stale): This algorithm calls WitnessAPI.GetCom(t) to obtain (com_t, cert_t) and WitnessAPI.GetCom(t + 1) to get (com_{t+1}, cert_{t+1}). It parses Π^Upd to get Π^Upd_oZKS. If either WitnessAPI.VerifyCert(com_t, cert_t, p) or WitnessAPI.VerifyCert(com_{t+1}, cert_{t+1}, p) outputs 0, the algorithm outputs 0. Otherwise, it returns the output of oZKS.VerifyTombstone(com_oZKS, com′_oZKS, Π^Upd_oZKS, t, t_stale).
• (com_{t+1}, Dir_{t+1}, st_{t+1}, Π^Upd, t + 1) ← VKD.Compact(Dir_t, st_t, t, t_stale): This algorithm parses out DS^oZKS_t, st^oZKS_t from st_t and calls oZKS.DeleteElts(st^oZKS_t, DS^oZKS_t, t, t_stale) to obtain (com^oZKS_{t+1}, st^oZKS_{t+1}, DS^oZKS_{t+1}, π_S, t + 1), which it uses to update com_t, st_t, Dir_t to get com_{t+1}, st_{t+1}, Dir_{t+1}. It sets Π^Upd = π_S and returns (com_{t+1}, Dir_{t+1}, st_{t+1}, Π^Upd, t + 1).
• 0/1 ← VKD.VerifyCompact(com_t, com_{t+1}, Π^Upd, t_stale): This algorithm simply parses out com^oZKS_t, com^oZKS_{t+1} from com_t, com_{t+1}, respectively, and outputs the output of oZKS.VerifyDel(com^oZKS_t, com^oZKS_{t+1}, Π^Upd, t_stale).
The VerifyHistory algorithm has the following extra checks:
• 0/1 ← VKD.VerifyHistory(t, label, (val_i, t_i)^n_{i=α_min}, Π^Ver): This algorithm is the same as VKD.VerifyHistory1, except it also ensures that if val_i = TOMBSTONE, then both the labels label|i and label|i|'stale' have the value TOMBSTONE. It also ensures that if label|i does not correspond to the value TOMBSTONE, then neither does label|i|'stale'. Finally, it ensures that val_n ≠ TOMBSTONE, i.e. the most recent value of this user is not tombstoned. If all checks pass, this algorithm outputs 1, otherwise it outputs 0.
The updated algorithms mean that VKD.Audit is defined as follows:
• VKD.Audit(t_1, t_n, (Π^Upd_t)^{t_n−1}_{t=t_1}): If, for any of the epochs t ∈ [t_1, t_n − 1], t is in TombstoneEpochs, the auditor calls VKD.VerifyTombstone(com_t, com_{t+1}, Π^Upd_t, t − StaleParam); else, if t − DeletionParam ∈ TombstoneEpochs, i.e. t ∈ CompactionEpochs, it calls VKD.VerifyCompact(com_t, com_{t+1}, Π^Upd_t, t − DeletionParam − StaleParam). For all other t ∈ [t_1, t_n − 1], this algorithm calls VKD.VerifyUpd(t, t + 1, (com_t, Π^Upd_t), com_{t+1}). It outputs 1 if all subroutines output 1, otherwise it outputs 0.
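The auditor's dispatch over epoch types can be sketched as follows. We model TombstoneEpochs as a fixed period purely for illustration; the parameter names mirror the construction above, but the concrete schedule is an assumption.

```rust
// Sketch of the epoch classification driving VKD.Audit in the two-phase scheme.
#[derive(Debug, PartialEq)]
enum EpochKind {
    Tombstone { t_stale: u64 },   // verified with VKD.VerifyTombstone
    Compaction { t_stale: u64 },  // verified with VKD.VerifyCompact
    Regular,                      // verified with VKD.VerifyUpd
}

const TOMBSTONE_PERIOD: u64 = 1_000; // hypothetical: a tombstone epoch every 1000 epochs
const STALE_PARAM: u64 = 500;
const DELETION_PARAM: u64 = 100;     // grace period before actual deletion

fn classify_epoch(t: u64) -> EpochKind {
    let in_tombstone_epochs = |e: u64| e > 0 && e % TOMBSTONE_PERIOD == 0;
    if in_tombstone_epochs(t) {
        // t in TombstoneEpochs: entries stale at t - StaleParam are tombstoned.
        EpochKind::Tombstone { t_stale: t - STALE_PARAM }
    } else if t >= DELETION_PARAM && in_tombstone_epochs(t - DELETION_PARAM) {
        // t in CompactionEpochs: entries tombstoned DELETION_PARAM epochs ago
        // (which were stale at that time) may now be deleted.
        EpochKind::Compaction { t_stale: t - DELETION_PARAM - STALE_PARAM }
    } else {
        EpochKind::Regular
    }
}
```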
Soundness for VKD with Compaction.
Theorem 1. The construction for a VKD with compaction in appendix F3 satisfies Definition 4.
Proof. First, note that for any epoch t, the value com_t received by all pairs of parties must be equal. If not, an adversary against the WitnessAPI can use such a VKD adversary to output inconsistent values, breaking the security of the WitnessAPI.
Suppose that, for every t ∈ TombstoneEpochs such that [t, t + DeletionParam] ∩ [t_1, t_current] is non-empty, the user receives a verifying key history proof at some epoch t′ ∈ [t, t + DeletionParam]. Also, assume that the system parameter StaleParam is large enough that no data relating to changes in the label-value set of the VKD made between two consecutive tombstone epochs is tombstoned.
Now, suppose the user receives two proofs (t, {(val^t_k, Epoch^t_k)}^{max_t}_{k=min_t}, Π^Ver_t) and (u, {(val^u_k, Epoch^u_k)}^{max_u}_{k=min_u}, Π^Ver_u) in response to VKD.KeyHistory, with t_1 ≤ t < u ≤ t_current. For any version number β ∈ [min_t, max_t] ∩ [min_u, max_u], we claim that (val^t_β, Epoch^t_β) = (val^u_β, Epoch^u_β). Suppose not: to generate such a pair of proofs, the adversary would have to include membership proofs for diverging views of the same label 'label|β' in the oZKS at times t and u. Recall that we assume that if any deletion epochs occurred between t and u, the user checked correct tombstoning of any deleted values. This means that if the label label|β is included in both proofs, the minimum version number min_u ≤ β and the value committed with the label label|β must have remained unchanged in the epochs between t and u. Thus, for val^u_β and val^t_β to diverge, the adversary must have produced proofs of diverging views in the oZKS.
Fact. For two history proofs from epochs t and u, for any version β included in both proofs, (val^t_β, Epoch^t_β) = (val^u_β, Epoch^u_β).
We have established that if a user is checking once between any pair of tombstone epochs in [t_1, t_current], any two VKD.KeyHistory proofs which pass must contain the same (val, Epoch) for a given version number β.
Now, suppose a call to VKD.Query returns a proof π which verifies at some epoch t* ∈ [t_1, t_n], with version α and the associated value-epoch pair (val, Epoch). Also, suppose that for some t ∈ {t_1, ..., t_n}, the proof Π^Ver_t with associated values {(val_k, Epoch_k)}^{max_t}_{k=min_t} passes, such that t* ∈ [Epoch_β, Epoch_{β+1}] for some version number β ∈ [min_t, max_t].
We only consider the case where val ≠ TOMBSTONE, since users should not accept TOMBSTONE as a value in response to VKD.Query. At a high level, our soundness definition requires that, with high probability, val = val_β.
Case I: t*, t ∈ [TombEp_k, TombEp_{k+1}), for some k. An adversary could cause a disparity between val and val_β in one of the following ways:
• α = β, val ≠ val_β: In this case, the adversary must produce membership proofs for label|α in the oZKS with values val, val_β which are unequal. Recall that, except for tombstone and deletion epochs, the oZKS must remain append-only. Further, any oZKS values which are deleted in a deletion epoch must be tombstoned. If val ≠ TOMBSTONE at the epoch TombEp_k, this breaks the security of the oZKS, since an oZKS adversary can use this VKD adversary as a subroutine to produce proofs for disparate values within epochs which are not tombstone epochs.
• α < β: For VKD.VerifyHistory to output 1 with Epoch_β ≤ t*, for all α < β, it should either (1) receive a non-membership proof of label|α in the oZKS, or (2) a membership proof of label|α|'stale' that should have been added at epoch Epoch_α ≤ Epoch_β. On the other hand, for VKD.VerifyQuery to pass, it should have verified a membership proof of label|α as well as a non-membership proof of label|α|'stale' at epoch t*. Hence, if VKD.VerifyQuery returned 1, and either (1) or (2) is true, the adversary violated the requirement that the auditors ensure that the epoch when a node is inserted is committed with its value in the oZKS, and that both membership and non-membership checks of non-tombstoned values passed.
• 2^{⌊log(β)⌋+1} > α > β: In order for Π^Ver to verify with the history at epoch t, with the version number β for epochs [Epoch_β, Epoch_{β+1}), it needs to present one of the following: (1) a proof of oZKS membership for the label label|α with value (val, t*), with t* > t, or (2) a non-membership proof in the oZKS of the label label|α at the epoch t. If either of these oZKS proofs was generated, then in order for π to also verify, the adversary would have to also generate either (1) both a membership and a non-membership proof for the same label, or (2) two membership proofs mapping to distinct value-epoch pairs. This would allow constructing an oZKS adversary.
• α ≥ 2^{⌊log(β)⌋+1}: In this case, for the proof π to verify at epoch t*, the adversary must have generated a proof of oZKS membership for the label label|2^{⌊log(α)⌋} mapping to a tuple of the form (·, t′′) such that t′′ < t. As in the previous case, the adversary must have also generated either (1) a proof of oZKS membership for the label label|2^{⌊log(α)⌋} but mapping to a tuple (·, t′′′) with t′′′ > t*, or (2) an oZKS non-membership proof for the same label. All of these violate oZKS soundness.
Case II: t* ∈ [TombEp_k, TombEp_{k+1}) and t ∈ [TombEp_{k+1}, TombEp_{k+2}), for some k. Again, we consider the following cases:
• α = β, val ≠ val_β: Recall that (val_β, Epoch_β) are only included in the output of KeyHistory if val_β ≠ TOMBSTONE. Also recall that the oZKS auditors check that any value which is not marked tombstoned is not mutated. Hence, this reduces to the problem of the adversary showing diverging views for the label label|α in the oZKS; the probability of the adversary succeeding is negligible due to the security of the oZKS.
• α < β: This reduces to two cases: (1) min_t ≤ α < β ≤ max_t and (2) α < min_t. In case (1), if the proof Π^Ver verifies, we know that for i < j, Epoch_i < Epoch_j, and the label label|i|'stale' must include a proof of being inserted at the epoch Epoch_{i+1}. Hence, at epoch t*, the adversary must have shown a non-membership proof for label|α|'stale' as part of π, but a membership proof for label|α|'stale' with epoch Epoch_{α+1} ≤ t* ≤ Epoch_β as part of Π^Ver. This violates the oZKS security. For case (2), this means that α was never included with the proof Π^Ver and hence we do not need to consider it.
• β < α ≤ max_t: By the same argument as in the previous case, for VerifyQuery to verify with version number α at epoch t*, Epoch ≤ t*. However, for VerifyHistory to pass, it must include a membership proof of the label label|β|'stale', inserted at an epoch Epoch_β < e ≤ Epoch ≤ t*. Thus, the adversary either included membership proofs for label|α with epochs Epoch_α ≠ Epoch in Π^Ver, π, or got an Audit to pass with the wrong epoch committed at a leaf.
• max_t < α ≤ 2^{⌊log β⌋+1} − 1 or α ≥ 2^{⌊log β⌋+1}: These cases are identical to the corresponding cases in Case I above.
Hence, with all but negligible probability, α = β and val = val_β in both cases. Note that, for every pair of consecutive tombstone epochs whose interval intersects [t_1, t_current], we require some element of {t_1, ..., t_n} to be in that interval. We also showed that the values for the same version number during two key history checks should match, hence covering these two cases is exhaustive.
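The proof's assumption on user behavior translates into a simple client-side cadence: the user must obtain a verifying key-history proof within the grace window of every tombstone epoch since her last check, before the tombstoned entries are deleted. A sketch, again assuming a periodic tombstone schedule for illustration:

```rust
// Sketch of the checking cadence implied by the soundness argument above.
const TOMBSTONE_PERIOD: u64 = 1_000; // hypothetical schedule
const DELETION_PARAM: u64 = 100;

/// Returns the grace windows [t, t + DeletionParam] of tombstone epochs the
/// client still has to cover with a key-history check, given the epoch of her
/// last verified check and the current epoch.
fn pending_history_checks(last_checked: u64, current: u64) -> Vec<(u64, u64)> {
    let mut windows = Vec::new();
    // First tombstone epoch strictly after the last verified check.
    let mut t = (last_checked / TOMBSTONE_PERIOD + 1) * TOMBSTONE_PERIOD;
    while t <= current {
        windows.push((t, t + DELETION_PARAM));
        t += TOMBSTONE_PERIOD;
    }
    windows
}
```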
G. Summary of the aZKS Construction from [16]
Append-only Zero-Knowledge Set (aZKS) was introduced in [16] (Section 3). We summarize the construction below. Part of the text is copied verbatim from [16].
aZKS is a primitive that lets a (malicious) prover commit to an append-only dictionary of (label, value) pairs (where the labels form a set) such that: 1) the commitment is succinct and does not leak any information about the committed dictionary; 2) the prover can prove statements about membership and non-membership of (label, value) pairs with respect to the succinct commitment; 3) the prover can prove that, for two dictionaries D_1, D_2, D_1 ⊆ D_2 with respect to their respective succinct commitments; and 4) the proofs are efficient and do not leak any information about the rest of the committed dictionary.
aZKS has two security properties: Soundness and Zero-Knowledge Privacy (with leakage). Soundness ensures that a malicious prover 1) will not be able to produce two verifying proofs for two different values for the same label with respect to a commitment, and 2) should not be able to modify an existing label. The Zero-Knowledge privacy property captures that the query proofs and append proofs leak no information beyond the query answer (which is a value or ⊥ for membership queries and a bit indicating validity of an append operation) and a well-defined leakage function.
Fig. 10: Schematic of the building blocks for SEEMless.
H. Summary of SEEMless construction [16]
In this section, we summarize the construction from [16], Section 4. Part of the text is copied verbatim from [16]. SEEMless assumes that the server's identity and public key are known to each user and auditor, and that all messages from the server are signed under the server's key, so that the server cannot be impersonated.
SEEMless uses two aZKS: one "all" aZKS to store all versions of all (label, val) pairs, and a second "old" aZKS that stores all of the out-of-date label versions. SEEMless also uses a hash chain. A hash chain is a classical authenticated data structure that chains multiple data elements by successively applying cryptographic hashes; e.g., a hash chain that hashes elements a, b, c is H(c, H(b, H(a))).
• VKD.Publish(Dir_{t−1}, st_{t−1}, S_t): At every epoch, the server gets a set S_t of (label, value) pairs that have to be added to the VKD. The server first checks if the label already exists for some version α − 1, else it sets α = 1. It adds a new entry (label|α, val) to the "all" aZKS and also adds (label|α − 1, null) to the "old" aZKS if α > 1. If the new version α = 2^i for some i, then the server adds a marker entry (label|mark|i, "marker") to the "all" aZKS. The server computes commitments to both aZKS and adds them to the hash chain to obtain a new head com_t. It also produces a proof Π^Upd consisting of the previous and new pair of aZKS commitments com_all,t−1, com_all,t and com_old,t−1, com_old,t and the corresponding aZKS update proofs.
• VKD.Query(st_t, Dir_t, label): When a client Bob queries for Alice's label, he should get the val corresponding to the latest version α of Alice's label and a proof of correctness. Bob gets three proofs in total: first, the membership proof of (label|α, val) in the "all" aZKS; second, the membership proof of the most recent marker entry (label|mark|a) for α ≥ 2^a; and third, a non-membership proof of label|α in the "old" aZKS. Proof 2 ensures that Bob is not getting a value higher than Alice's current version, and proof 3 ensures that Bob is not getting an old version for Alice's label.
• VKD.VerifyQuery(t, label, val, π, α): The client checks each membership or non-membership proof, and the hash chain. It also checks that the version α given as part of the proof is less than the current epoch t.
• VKD.KeyHistory(st_t, Dir_t, t, label): The server first retrieves all the update epochs t_1, ..., t_α for label versions 1, ..., α from T, the corresponding com_all,t_1−1, com_all,t_1, ..., com_all,t_α−1, com_all,t_α and com_old,t_1, ..., com_old,t_α, and the hashes necessary to verify the hash chain: H(com_all,0, com_old,0), ..., H(com_all,t, com_old,t). For versions i = 1 to n, the server retrieves the val_i for t_i and version i of label from Dir_{t_i}. Let 2^a ≤ α < 2^{a+1} for some a, where α is the current version of the label. The server generates the following proofs (together called Π):
1) Correctness of com_{t_i} and com_{t_i−1}: For each i, output com_{t_i}, com_{t_i−1}. Also output the values necessary to verify the hash chain: H(com_all,0, com_old,0), ..., H(com_all,t, com_old,t).
2) Correct version i is set at epoch t_i: For each i: membership proof for (label|i) with value val_i in the "all" aZKS with respect to com_{t_i}.
3) Server couldn't have shown version i − 1 at or after t_i: For each i: membership proof in the "old" aZKS with respect to com_{t_i} for (label|i − 1).
4) Server couldn't have shown version i before epoch t_i: For each i: non-membership proof for (label|i) in the "all" aZKS with respect to com_{t_i−1}.
5) Server can't show any version from α + 1 to 2^{a+1} at epoch t or any earlier epoch: non-membership proofs in the "all" aZKS with respect to com_t for (label|α + 1), (label|α + 2), ..., (label|2^{a+1} − 1).
6) Server can't show any version higher than 2^{a+1} at epoch t or any earlier epoch: non-membership proofs in the "all" aZKS with respect to com_t for marker nodes (label|mark|a + 1) up to (label|mark|log t).
• VKD.VerifyHistory(t, label, (val_i, t_i)^n_{i=1}, Π^Ver): Verify each of the above proofs.
• VKD.Audit(t_1, t_n, (Π^Upd_t)^{t_n−1}_{t=t_1}): Auditors will audit the commitments and proofs to make sure that no entries ever get deleted in either aZKS. They do so by verifying the update proofs Π^Upd output by the server. They also check that at each epoch both aZKS commitments are added to the hash chain. Note that, while the Audit interface gives a monolithic audit algorithm, our audit just checks the updates between each adjacent pair of aZKS commitments, so it can be performed by many auditors in parallel.
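A minimal sketch of the per-epoch hash chain used above is given below; DefaultHasher stands in for a collision-resistant hash purely for illustration, and the function names are ours, not SEEMless's API.

```rust
// Sketch: at each epoch the pair of aZKS commitments (com_all, com_old) is
// folded into the running head, so a chain over elements a, b, c has head
// H(c, H(b, H(a))).
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type Commitment = u64; // placeholder for a real hash output

fn h(parts: &[&[u8]]) -> Commitment {
    let mut hasher = DefaultHasher::new();
    for p in parts {
        p.hash(&mut hasher);
    }
    hasher.finish()
}

/// New chain head after publishing epoch t: fold both aZKS commitments for
/// this epoch into the previous head.
fn publish_head(prev_head: Commitment, com_all_t: &[u8], com_old_t: &[u8]) -> Commitment {
    h(&[com_all_t, com_old_t, &prev_head.to_be_bytes()])
}
```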
I. Storage
In the next two subsections we go over the specific storage requirements of SEEMless and of our solution, Parakeet, respectively. We first provide high-level statistics about a world-scale key transparency solution based on compressed Merkle trees, and use WhatsApp (the most popular E2EE messaging app), discussed in Section I, as an example. We assume that WhatsApp has about two billion existing keys (K_I) and roughly ten million daily key updates (K_D). To simplify our calculations, we assume that after initially setting up the key transparency solution, the server only receives requests to update keys for existing users. Note that updates to keys for existing users require more values to be inserted into the tree (to mark old version numbers as stale), so our assumptions are closer to a worst case scenario. This means that:
• The total number of keys added to the system in the first year is K_T = 2 × 10^9 + 365 × 10^7 = 5.65 billion keys.
• In both SEEMless [16] and Parakeet, which are based on a (compressed) Patricia Merkle trie, the number of nodes needed for key transparency after one year is roughly N_T = 2 × K_I + 4 × 365 × K_D ≈ 18.6 billion, by the following reasoning:
– Adding a leaf node results in one additional node (i.e., two nodes in total) for the longest-common-prefix parent.
– When initially setting up, all keys are treated as a user's initial version and hence require only one leaf of the form uname|1 for each user name.
– For all subsequent updates, two leaves must be added for each key: one of the form uname|i + 1, to add the new key, and the other of the form uname|i|stale, to mark the old key as stale.
• The number of nodes initially is N_I = 2 × K_I = 4 billion.
• The number of nodes created daily is N_D = 4 × K_D = 40 million.
• The number of nodes that need to be updated for a new leaf node is upper-bounded by the depth of the tree. The amortized depth of any inserted node is log(n), where n is the number of leaves in the tree before this insertion.
Fig. 11: Schematic of the building blocks for Parakeet. Note that the shaded components are different from the corresponding components of SEEMless, which are shaded in Figure 10.
Fig. 12: Comparison of storage costs of Parakeet and SEEMless in the first year with varying number of epochs per day, ranging from daily to every 10 minutes.
Fig. 13: Comparison of storage costs of Parakeet and SEEMless in the first five years.
1) SEEMless Storage
SEEMless relies on saving the state of a node every time the node is updated. This makes the storage cost highly dependent on the number of epochs and the number of nodes in the tree. Let us use the number of leaves in the tree initially as a lower bound on how many node states need to be updated, and assume there is one epoch per day. The total number of nodes that SEEMless needs to store states for in this case is N_I nodes initially and N_D · log N_I daily. In total, 2 × 10^9 + 365 × 4 × 10^7 × log_2(4 × 10^9) is approximately 470 billion in the first year. With 64-byte node states, accounting for the hash function output (32 bytes) along with other node info (32 bytes) such as the parent and children, the total storage requirement is around 27TB.
2) Parakeet Storage
Parakeet's main advantage is that only the latest state of a node needs to be stored. Essentially, the final cost depends on the total number of nodes in the tree at the end of the year. The total cost, considering 64-byte node states (the same as SEEMless), is N_T × 64B ≈ 1.1TB. Since Parakeet also stores one previous node state to allow concurrent proof generation, the final cost is 2.2TB, an order of magnitude more efficient than the previous best solution. Furthermore, our compaction mechanism (see Section III) can further reduce the storage requirements for Parakeet.
In comparison, the storage cost of the actual keys and their owners' information (e.g., phone number) is around 360GB (considering 64B record sizes). An efficient key transparency solution with Parakeet is therefore highly feasible.
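The estimates above can be reproduced with a short back-of-the-envelope calculation; the constants are the same worked-example assumptions as in the text (not new measurements), and the output is reported in tebibytes to match the quoted "TB" figures.

```rust
// Reproduces the storage estimates from Section I (Storage) above.
fn main() {
    let k_i: f64 = 2.0e9;      // existing keys (K_I)
    let k_d: f64 = 1.0e7;      // daily key updates (K_D)
    let days = 365.0;
    let node_bytes = 64.0;     // 32-byte hash + 32 bytes of node metadata
    let tb = 1024f64.powi(4);  // tebibytes, reported as "TB" in the text

    let n_i = 2.0 * k_i;        // initial nodes N_I
    let n_d = 4.0 * k_d;        // nodes created per day N_D
    let n_t = n_i + days * n_d; // ~18.6 billion nodes after one year

    // SEEMless: leaves initially, plus N_D * log2(N_I) node states per day.
    let seemless_states = k_i + days * n_d * n_i.log2();
    // Parakeet: latest state only (x2 for the retained previous state).
    let parakeet_bytes = 2.0 * n_t * node_bytes;

    println!("N_T ≈ {:.1} billion nodes", n_t / 1e9);
    println!("SEEMless ≈ {:.0} billion node states ≈ {:.0} TB",
             seemless_states / 1e9, seemless_states * node_bytes / tb);
    println!("Parakeet ≈ {:.1} TB", parakeet_bytes / tb);
}
```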