An Efficient and Secure Dynamic Auditing Protocol For Data Storage in Cloud Computing
Abstract—In cloud computing, data owners host their data on cloud servers and users (data consumers) can access the data
from cloud servers. Due to the data outsourcing, however, this new paradigm of data hosting service also introduces new security
challenges, which requires an independent auditing service to check the data integrity in the cloud. Some existing remote integrity
checking methods can only serve for static archive data and thus cannot be applied to the auditing service since the data in the
cloud can be dynamically updated. Thus, an efficient and secure dynamic auditing protocol is desired to convince data owners
that the data are correctly stored in the cloud. In this paper, we first design an auditing framework for cloud storage systems
and propose an efficient and privacy-preserving auditing protocol. Then, we extend our auditing protocol to support the data
dynamic operations, which is efficient and provably secure in the random oracle model. We further extend our auditing protocol
to support batch auditing for both multiple owners and multiple clouds, without using any trusted organizer. The analysis and
simulation results show that our proposed auditing protocols are secure and efficient; in particular, they reduce the computation
cost of the auditor.
Index Terms—Storage Auditing, Dynamic Auditing, Privacy-Preserving Auditing, Batch Auditing, Cloud Computing.
TABLE 1
Comparison of Remote Integrity Checking Schemes
auditing. Another drawback is that their scheme requires an additional trusted organizer to send a commitment to the auditor during the multi-cloud batch auditing, because their scheme applies the mask technique to ensure the data privacy. However, such an additional organizer is not practical in cloud storage systems. Furthermore, both Wang's schemes and Zhu's schemes incur heavy computation cost on the auditor, which makes the auditor a performance bottleneck.

In this paper, we propose an efficient and secure dynamic auditing protocol that can meet the above requirements. To solve the data privacy problem, our method generates an encrypted proof with the challenge stamp by using the bilinearity property of the bilinear pairing, such that the auditor cannot decrypt it but can verify the correctness of the proof. Without the mask technique, our method does not require any trusted organizer during the batch auditing for multiple clouds. On the other hand, in our method we let the server compute the proof as an intermediate value of the verification, such that the auditor can directly use this intermediate value to verify the correctness of the proof. Therefore, our method can greatly reduce the computing load of the auditor by moving it to the cloud server.

Our original contributions can be summarized as follows.
1) We design an auditing framework for cloud storage systems and propose a privacy-preserving and efficient storage auditing protocol. Our auditing protocol ensures the data privacy by using a cryptographic method and the bilinearity property of the bilinear pairing, instead of using the mask technique. Our auditing protocol incurs less communication cost and less computation cost of the auditor.
2) We extend our auditing protocol to support dynamic auditing and batch auditing for both multiple owners and multiple clouds. Our multi-cloud batch auditing does not require any additional trusted organizer, and the multi-owner batch auditing can greatly improve the auditing performance, especially in large scale cloud storage systems.

The remainder of this paper is organized as follows. In Section 2, we describe the definitions of the system model and security model. In Section 3, we propose an efficient and inherently secure auditing protocol and extend it to support the dynamic auditing in Section 4. We further extend our auditing protocol to support the batch auditing for multiple owners and multiple clouds in Section 5. Section 6 gives the performance analysis of our proposed auditing protocols in terms of communication cost and computation cost. The security proof is given in the supplemental file. In Section 7, we review the related work on storage auditing. Finally, the conclusion is given in Section 8.

2 PRELIMINARIES AND DEFINITIONS
In this section, we first describe the system model and give the definition of the storage auditing protocol. Then, we define the threat model and security model for the storage auditing system.

2.1 Definition of System Model
[Fig. 1: system model of data storage auditing — the owner initializes its data on the servers; the auditor sends a Challenge to the servers, which return a Proof.]

The system model of data storage auditing, as illustrated in Fig. 1, involves three entities: the data owner (owner), the cloud server (server) and the third party auditor (auditor). The owners create the data and host their data in the cloud. The cloud server stores the owners' data and provides the data access to users (data consumers). The auditor is a trusted third party that has the expertise and capabilities to provide data storage auditing service for both the owners and servers. The auditor can be a trusted organization managed by the government, which can provide unbiased auditing results for both data owners and cloud servers.

Before describing the auditing protocol definition, we first define some notations as listed in Table 2.

TABLE 2
Notations

Symbol   Physical Meaning
skt      secret tag key
pkt      public tag key
skh      secret hash key
M        data component
T        set of data tags
n        number of blocks in each component
s        number of sectors in each data block
Minfo    abstract information of M
C        challenge generated by the auditor
P        proof generated by the server

Definition 1 (Storage Auditing Protocol). A storage auditing protocol consists of the following five algorithms: KeyGen, TagGen, Chall, Prove and Verify.

KeyGen(λ) → (skh, skt, pkt). This key generation algorithm takes no input other than the implicit security parameter λ. It outputs a secret hash key skh and a pair of secret-public tag key (skt, pkt).

TagGen(M, skt, skh) → T. The tag generation algorithm takes as inputs an encrypted file M, the secret tag key skt and the secret hash key skh. For each data block mi, it computes a data tag ti based on skh and skt. It outputs a set of data tags T = {ti}i∈[1,n].

Chall(Minfo) → C. The challenge algorithm takes as input the abstract information of the data Minfo (e.g., file identity, total number of blocks, version number, timestamp, etc.). It outputs a challenge C.

Prove(M, T, C) → P. The prove algorithm takes as inputs the file M, the tags T and the challenge C from the auditor. It outputs a proof P.

Verify(C, P, skh, pkt, Minfo) → 0/1. The verification algorithm takes as inputs the proof P from the server, the secret hash key skh, the public tag key pkt and the abstract information of the data Minfo. It outputs the auditing result as 0 or 1.

2.2 Definition of Security Model
We assume the auditor is honest-but-curious: it performs honestly during the whole auditing procedure, but it is curious about the received data. The server, however, could be dishonest and may launch the following attacks:
1) Replace Attack. The server may choose another valid and uncorrupted pair of data block and data tag (mk, tk) to replace the challenged pair of data block and data tag (mi, ti), when it has already discarded mi or ti.
2) Forge Attack. The server may forge the data tag of a data block and deceive the auditor, if the owner's secret tag keys are reused for different versions of the data.
3) Replay Attack. The server may generate the proof from a previous proof or other information, without retrieving the actual owner's data.

3 EFFICIENT AND PRIVACY-PRESERVING AUDITING PROTOCOL
In this section, we first present some techniques we applied in the design of our efficient and privacy-preserving auditing protocol. Then, we describe the algorithms and the detailed construction of our auditing protocol for cloud storage systems. The correctness proof is given in the supplemental file.

3.1 Overview of Our Solution
The main challenge in the design of a data storage auditing protocol is the data privacy problem (i.e., the auditing protocol should protect the data privacy against the auditor). This is because: 1) for public data, the auditor may obtain the data information by recovering the data blocks from the data proof; 2) for encrypted data, the auditor may obtain the content keys through some special channel and could then decrypt the data. To solve the data privacy problem, our method generates an encrypted proof with the challenge stamp by using the bilinearity property of the bilinear pairing, such that the auditor cannot decrypt it, but can verify the correctness of the proof without decrypting it.

Although the auditor has sufficient expertise and capabilities to conduct the auditing service, the computing ability of an auditor is not as strong as that of cloud servers. Since the auditor needs to audit for many cloud servers and a large number of data owners, the auditor could become the performance bottleneck. In our method, we let the server compute the proof as an intermediate value of the verification (calculated from the challenge stamp and the linear combinations of data blocks), such that the auditor can use this intermediate value directly to verify the proof. Therefore, our method can greatly reduce the computing load of the auditor by moving it to the cloud server.

To improve the performance of the auditing system, we apply the Data Fragment Technique and Homomorphic Verifiable Tags in our method. The data fragment technique reduces the number of data tags, which lowers the storage overhead and improves the system performance. By using the homomorphic verifiable tags, no matter how many data blocks are challenged, the server only responds with the sum of the data blocks and the product of the tags, whose size is constant and equal to only one data block. Thus, it reduces the communication cost.
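The effect of homomorphic verifiable tags described above can be illustrated with a deliberately simplified sketch. Here each tag is just t_i = g^(m_i) mod p in an ordinary multiplicative group — no bilinear pairing, no identifier hashing, and therefore none of the privacy properties of the actual protocol — but it shows why a single aggregated pair (the weighted sum of blocks and the product of tags) suffices to check any number of challenged blocks. All names and parameters below are our own toy choices, not the paper's construction.

```python
import random

# Toy public parameters: a prime p and a base g (illustrative only;
# the paper's construction works in bilinear pairing groups).
p = 1_000_003
g = 5

def tag(m: int) -> int:
    """Simplified homomorphic tag t = g^m mod p for block value m."""
    return pow(g, m, p)

def prove(blocks, tags, challenge):
    """Server side: aggregate the challenged blocks and tags into a
    constant-size response (one sum, one product), no matter how many
    blocks are challenged."""
    dp = sum(v * blocks[i] for i, v in challenge)   # data proof
    tp = 1                                          # tag proof
    for i, v in challenge:
        tp = (tp * pow(tags[i], v, p)) % p
    return dp, tp

def verify(dp, tp) -> bool:
    """Auditor side: by the homomorphic property, the product of tags
    must equal g raised to the weighted sum of the blocks."""
    return tp == pow(g, dp, p)

blocks = [random.randrange(1000) for _ in range(100)]
tags = [tag(m) for m in blocks]
challenge = [(i, random.randrange(1, 100)) for i in random.sample(range(100), 30)]
dp, tp = prove(blocks, tags, challenge)
assert verify(dp, tp)
```

Note that the response (dp, tp) has the same size whether 30 or 30,000 blocks are challenged, which is exactly the communication-cost argument made above.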
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
confirmed, the owner can choose to delete the local copy of the data. Then, the auditor conducts the sampling auditing periodically to check the data integrity.

[Fig. 2: framework of the auditing protocol — Owner Initialization (KeyGen, TagGen) between the owner and the server; Confirmation Auditing (Chall, Prove, Verify) between the auditor and the server.]

Phase 3: Sampling Auditing
The auditor carries out the sampling auditing periodically by challenging a sample set of data blocks. The frequency of the auditing operation depends on the service agreement between the data owner and the auditor (and also on how much trust the data owner has in the server). Similar to the confirmation auditing in Phase 2, the sampling auditing procedure also consists of two-way communication, as illustrated in Fig. 2.

Suppose each sector on the server is corrupted with probability ρ. For a sampling auditing involving t challenged data blocks, the probability of detection can be calculated as

Pr(t, s) = 1 − (1 − ρ)^(t·s).
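The detection probability above is easy to evaluate numerically. The sketch below (plain Python, not part of the protocol; function names are ours) computes Pr(t, s) and the smallest number of challenged blocks that reaches a target detection probability.

```python
import math

def detection_probability(t: int, s: int, rho: float) -> float:
    """Pr(t, s) = 1 - (1 - rho)^(t*s): the chance that at least one of
    the t*s challenged sectors is corrupted, given a per-sector
    corruption probability rho."""
    return 1.0 - (1.0 - rho) ** (t * s)

def blocks_needed(target: float, s: int, rho: float) -> int:
    """Smallest t with detection_probability(t, s, rho) >= target,
    obtained by solving 1 - (1 - rho)^(t*s) >= target for t."""
    t = math.log(1.0 - target) / (s * math.log(1.0 - rho))
    return math.ceil(t)

# With 1% per-sector corruption and s = 50 sectors per block,
# challenging only 10 blocks already detects corruption with
# probability above 0.99.
p_detect = detection_probability(10, 50, 0.01)
t_min = blocks_needed(0.99, 50, 0.01)
```

This is why sampling auditing stays cheap even for very large files: the required sample size depends on ρ and the target confidence, not on the total number of blocks.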
TABLE 3
ITable of the Abstract Information of Data M

(a) Initial abstract information of M:
Index  Bi   Vi  Ti
1      1    1   T1
2      2    1   T2
3      3    1   T3
...
n      n    1   Tn

(b) After modifying m2, V2 and T2 are updated:
Index  Bi   Vi  Ti
1      1    1   T1
2      2    2   T2*
3      3    1   T3
...
n      n    1   Tn

(c) After inserting a new block before m2, the items from m2 onward move backward with the index increased by 1:
Index  Bi   Vi  Ti
1      1    1   T1
2      n+1  1   Tn+1
3      2    1   T2
...
n+1    n    1   Tn

(d) After deleting m2, all items after m2 move forward with the index decreased by 1:
Index  Bi   Vi  Ti
1      1    1   T1
2      3    1   T3
3      4    1   T4
...
n−1    n    1   Tn
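The ITable transitions shown in Table 3 are plain list operations. The following sketch is our own Python rendering (the record fields follow Table 3, with the index kept implicit as the 1-based list position); it reproduces the four states of Table 3 for a small n.

```python
from dataclasses import dataclass

@dataclass
class Record:
    B: int   # original block number B_i
    V: int   # version number V_i
    T: str   # timestamp T_i

def imodify(itable, i, V_new, T_new):
    """Replace the version number and timestamp of the i-th record."""
    itable[i - 1].V = V_new
    itable[i - 1].T = T_new

def iinsert(itable, i, B_new, V_new, T_new):
    """Insert a new record at the i-th position; records from the old
    i-th position onward shift backward, their index growing by one."""
    itable.insert(i - 1, Record(B_new, V_new, T_new))

def idelete(itable, i):
    """Remove the i-th record; later records shift forward, their
    index dropping by one."""
    del itable[i - 1]

# Initial ITable for M = {m1, m2, m3}, cf. Table 3(a) with n = 3.
itable = [Record(1, 1, "T1"), Record(2, 1, "T2"), Record(3, 1, "T3")]
imodify(itable, 2, 2, "T2*")     # Table 3(b): m2 modified
iinsert(itable, 2, 4, 1, "T4")   # Table 3(c): new block (B = n+1) before m2
idelete(itable, 3)               # Table 3(d)-style deletion of the old m2
```

Because the logical index is the list position while Bi records the original block number, insertions and deletions never require re-tagging the unchanged blocks.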
4.2 Algorithms and Constructions for Dynamic Auditing
The dynamic auditing protocol consists of four phases: Owner Initialization, Confirmation Auditing, Sampling Auditing and Dynamic Auditing.

Fig. 3. Framework of Auditing for Dynamic Operations [Data Update: Modify/Insert/Delete messages between the owner and the server; Index Update: IModify/IInsert/IDelete on the auditor; Update Confirmation: Chall, Prove, Verify.]

The first three phases are similar to those of our privacy-preserving auditing protocol as described in the above section. The only differences are the tag generation algorithm TagGen and the ITable generation during the owner initialization phase. Here, as illustrated in Fig. 3, we only describe the dynamic auditing phase, which contains three steps: Data Update, Index Update and Update Confirmation.

Step 1: Data Update
There are three types of data update operations that can be used by the owner: Modification, Insertion and Deletion. For each update operation, there is a corresponding algorithm in the dynamic auditing to process the operation and facilitate the future auditing, defined as follows.

Modify(mi*, skt, skh) → (Msgmodify, ti*). The modification algorithm takes as inputs the new version of data block mi*, the secret tag key skt and the secret hash key skh. It generates a new version number Vi*, a new timestamp Ti* and calls TagGen to generate a new data tag ti* for the data block mi*. It outputs the new tag ti* and the update message Msgmodify = (i, Bi, Vi*, Ti*). Then, it sends the new pair of data block and tag (mi*, ti*) to the server and sends the update message Msgmodify to the auditor.

Insert(mi*, skt, skh) → (Msginsert, ti*). The insertion algorithm takes as inputs the new data block mi*, the secret tag key skt and the secret hash key skh. It inserts a new data block mi* before the i-th position. It generates an original number Bi*, a new version number Vi* and a new timestamp Ti*. Then, it calls TagGen to generate a new data tag ti* for the new data block mi*. It outputs the new tag ti* and the update message Msginsert = (i, Bi*, Vi*, Ti*). Then, it inserts the new pair of data block and tag (mi*, ti*) on the server and sends the update message Msginsert to the auditor.

Delete(mi) → Msgdelete. The deletion algorithm takes as input the data block mi. It outputs the update message Msgdelete = (i, Bi, Vi, Ti). It then deletes the pair of data block and its tag (mi, ti) from the server and sends the update message Msgdelete to the auditor.

Step 2: Index Update
Upon receiving one of the three types of update messages, the auditor calls the corresponding algorithm to update the ITable. Each algorithm is designed as follows.

IModify(Msgmodify). The index modification algorithm takes the update message Msgmodify as input. It replaces the version number Vi by the new one Vi* and modifies Ti by the new timestamp Ti*.

IInsert(Msginsert). The index insertion algorithm takes as input the update message Msginsert. It inserts a new record (i, Bi*, Vi*, Ti*) at the i-th position in the ITable. It then moves the original i-th record and the other records after the i-th position in the previous ITable backward in order, with the index number increased by one.

IDelete(Msgdelete). The index deletion algorithm takes as input the update message Msgdelete. It deletes the i-th record (i, Bi, Vi, Ti) in the ITable, and all the records after the i-th position in the original ITable are moved forward in order, with the index number decreased by one.

Table 3 shows the change of the ITable according to the different types of data update operations. Table 3(a) describes the initial table of the data M = {m1, m2, ..., mn} and Table 3(b) describes the ITable after m2 is updated. Table 3(c) is the ITable after a new data block is inserted before m2 and Table 3(d) shows the ITable after m2 is deleted.

Step 3: Update Confirmation
After the auditor updates the ITable, it conducts a confirmation auditing for the updated data and sends the result to the owner. Then, the owner can choose to delete the local version of the data according to the update confirmation auditing result.

5 BATCH AUDITING FOR MULTI-OWNER AND MULTI-CLOUD
Data storage auditing is a significant service in cloud computing which helps the owners check the data integrity on the cloud servers. Due to the large number of data owners, the auditor may receive many auditing requests from multiple data owners. In this situation, it would greatly improve the system performance if the auditor could combine these auditing requests together and conduct the batch auditing for multiple owners simultaneously. The previous work [25] cannot support the batch auditing for multiple owners. That is because the parameters for generating the data tags used by each owner are different, and thus the auditor cannot combine the data tags from multiple owners to conduct the batch auditing.

On the other hand, some data owners may store their data on more than one cloud server. To ensure the owner's data integrity in all the clouds, the auditor will send the auditing challenges to each cloud server which hosts the owner's data, and verify all the proofs from them. To reduce the computation cost of the auditor, it is desirable to combine all these responses together and do the batch verification.

In the previous work [25], the authors proposed a cooperative provable data possession for integrity verification in multi-cloud storage. In their method, the authors apply the mask technique to ensure the data privacy, such that it requires an additional trusted organizer to send a commitment to the auditor during the commitment phase in the multi-cloud batch auditing. In our method, we apply the encryption method with the bilinearity property of the bilinear pairing to ensure the data privacy, rather than the mask technique. Thus, our multi-cloud batch auditing protocol does not have any commitment phase, such that our method does not require any additional trusted organizer.

5.1 Algorithms for Batch Auditing for Multi-owner and Multi-cloud
Let O be the set of owners and S be the set of cloud servers. The batch auditing for multi-owner and multi-cloud can be constructed as follows.

Phase 1: Owner Initialization
Each owner Ok (k ∈ O) runs the key generation algorithm KeyGen to generate the pair of secret-public tag key (skt,k, pkt,k) and a set of secret hash keys {skh,kl}l∈S. That is, for different cloud servers, the owner has different secret hash keys. We denote each data component as Mkl, which means that this data component is owned by the owner Ok and stored on the cloud server Sl. Suppose the data component Mkl is divided into nkl data blocks and each data block is further split into s sectors. (Here we assume that each data block is split into the same number of sectors. We can use the similar technique proposed in Section 3.2 to deal with the situation where data blocks are split into different numbers of sectors.) The owner Ok runs the tag generation algorithm TagGen to generate the data tags Tkl = {tkl,i}i∈[1,nkl] as

tkl,i = ( h(skh,kl, Wkl,i) · ∏_{j=1}^{s} uj^{mkl,ij} )^{skt,k},

where Wkl,i = FIDkl||i||Bkl,i||Vkl,i||Tkl,i.

After all the data tags are generated, each owner Ok (k ∈ O) sends the data component Mkl = {mkl,ij}i∈[1,nkl],j∈[1,s] and the data tags Tkl = {tkl,i}i∈[1,nkl] to the corresponding server Sl. Then, it sends the public tag key pkt,k, the set of secret hash keys {skh,kl}l∈S and the abstract information of the data {Minfo,kl}l∈S to the auditor.

Phase 2: Batch Auditing for Multi-owner and Multi-cloud
Let Ochal and Schal denote the sets of owners and cloud servers involved in the batch auditing, respectively. The batch auditing consists of three steps: Batch Challenge, Batch Proof and Batch Verification.

Step 1: Batch Challenge
During this step, the auditor runs the batch challenge algorithm BChall to generate a batch challenge C for a set of challenged owners Ochal and a set of clouds Schal. The batch challenge algorithm is defined as follows.

BChall({Minfo,kl}k∈O,l∈S) → C. The batch challenge algorithm takes all the abstract information as input. It selects a set of owners Ochal and a set of cloud servers Schal. For each data owner Ok (k ∈ Ochal), it chooses a set of data blocks as the challenged subset Qkl from each server Sl (l ∈ Schal). It then generates a random number vkl,i for each chosen data block mkl,i (k ∈ Ochal, l ∈ Schal, i ∈ Qkl). It also chooses a random number r ∈ Zp* and computes the set of challenge stamps {Rk}k∈Ochal = {pkt,k^r}. It outputs the challenge as

C = ({Cl}l∈Schal, {Rk}k∈Ochal),

where Cl = {(k, l, i, vkl,i)}k∈Ochal, i∈Qkl.

Then, the auditor sends each Cl to the corresponding cloud server Sl (l ∈ Schal) together with the challenge stamps {Rk}k∈Ochal.

Step 2: Batch Proof
Upon receiving the challenge, each server Sl (l ∈ Schal) generates a proof Pl = (TPl, DPl) by using the following batch prove algorithm BProve and sends the proof Pl to the auditor.

BProve({Mkl}k∈Ochal, {Tkl}k∈Ochal, Cl, {Rk}k∈Ochal) → Pl. The batch prove algorithm takes as inputs the data {Mkl}k∈Ochal, the data tags {Tkl}k∈Ochal, the received challenge Cl and the challenge stamps {Rk}k∈Ochal. It generates the tag proof TPl as

TPl = ∏_{k∈Ochal} ∏_{i∈Qkl} tkl,i^{vkl,i}.

Then, for each j ∈ [1, s], it computes the sector linear combination MPkl,j of all the chosen data blocks of each owner.
It outputs the proof Pl = (TPl, DPl).

Step 3: Batch Verification
Upon receiving all the proofs from the challenged servers, the auditor runs the following batch verification algorithm BVerify to check the correctness of the proofs.

BVerify(C, {Pl}, {skh,kl}, {pkt,k}, {Minfo,kl}) → 0/1. The batch verification algorithm takes as inputs the challenge C, the proofs {Pl}l∈Schal, the set of secret hash keys {skh,kl}k∈Ochal,l∈Schal, the public tag keys {pkt,k}k∈Ochal and the abstract information of the challenged data blocks {Minfo,kl}k∈Ochal,l∈Schal. For each owner Ok (k ∈ Ochal), it computes the set of identifier hash values {h(skh,kl, Wkl,i)}l∈Schal,i∈Qkl for all the chosen data blocks from each challenged server, and uses these hash values to compute a challenge hash Hchal,k as

Hchal,k = ∏_{l∈Schal} ∏_{i∈Qkl} h(skh,kl, Wkl,i)^{r·vkl,i}.

When the challenge hashes {Hchal,k}k∈Ochal of all the data owners have been calculated, it verifies the proofs by the batch verification equation

∏_{l∈Schal} DPl = e(∏_{l∈Schal} TPl, g2^r) / ∏_{k∈Ochal} e(Hchal,k, pkt,k).   (2)

If Eq. (2) holds, it outputs 1. Otherwise, it outputs 0.

In Table 4, t is the number of challenged data blocks from each owner on each cloud server, s is the number of sectors in each data block, and n is the total number of data blocks of a file in Wang's scheme. For a fair comparison with schemes that do not split blocks into sectors, the number of challenged data blocks from each owner on each cloud server should be st. The result is described in Table 4.

From the table, we can see that the communication cost in Wang's auditing scheme is not only linear in C, K, t and s, but also linear in the total number of data blocks n. As we know, in large scale cloud storage systems, the total number of data blocks could be very large. Therefore, Wang's auditing scheme may incur a high communication cost.

Our scheme and Zhu's IPDP have the same total communication cost during the challenge phase. During the proof phase, the communication cost of the proof in our scheme is only linear in C, but in Zhu's IPDP, the communication cost of the proof is not only linear in C and K, but also linear in s. That is because Zhu's IPDP uses the mask technique to protect the data privacy, which requires sending both the masked proof and the encrypted mask to the auditor. In our scheme, the server is only required to send the encrypted proof to the auditor, and thus it incurs less communication cost than Zhu's IPDP.
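The shape of the batch challenge C = ({Cl}, {Rk}) generated in Step 1 above can be made concrete with a small sketch. We use a toy multiplicative group as a stand-in for the pairing group, so pkt,k = g^(skt,k) mod p and Rk = pkt,k^r mod p are only illustrative; the function name, data layout and parameters are ours, not the paper's.

```python
import random

p = 1_000_003   # toy prime modulus (stand-in for the pairing group order)
g = 5           # toy base element

def bchall(minfo, owners, servers, blocks_per_pair, pkt):
    """Toy BChall: for each challenged (owner, server) pair, pick a
    subset of block indices with a random coefficient v per block, and
    compute one challenge stamp R_k = pkt_k^r per owner for a single
    random exponent r."""
    r = random.randrange(1, p - 1)
    C = {}  # C_l = list of (k, l, i, v) tuples, one per challenged block
    for l in servers:
        C[l] = [(k, l, i, random.randrange(1, p))
                for k in owners
                for i in random.sample(range(minfo[(k, l)]), blocks_per_pair)]
    R = {k: pow(pkt[k], r, p) for k in owners}  # challenge stamps
    return C, R

# Two owners, two servers, each data component holding 100 blocks.
minfo = {(k, l): 100 for k in (0, 1) for l in (0, 1)}
pkt = {k: pow(g, sk, p) for k, sk in {0: 123, 1: 456}.items()}
C, R = bchall(minfo, owners=[0, 1], servers=[0, 1], blocks_per_pair=5, pkt=pkt)
```

Each server Sl then receives only its own Cl together with all stamps {Rk}; the Batch Proof and Batch Verification steps operate on exactly these structures.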
[Fig. 4: comparison of computation time on the auditor — (a) Single Owner, Single Cloud (vs. number of challenged data blocks); (b) Single Owner, 5 blocks/Cloud (vs. number of challenged clouds); (c) Single Cloud, 5 blocks/Owner (vs. number of challenged owners); curves: Our Scheme vs. Zhu's IPDP.]

[Fig. 5: Comparison of Computation Cost on the Server (s = 50) — (a) Single Owner, Single Cloud; (b) Single Cloud, 5 blocks/Owner; curves: Our Scheme vs. Zhu's IPDP.]

Although in our simulation the number of challenged data blocks only goes to 500 (i.e., the challenged data size equals 500 KByte), it illustrates the linear relationship between the computation cost of the auditor and the challenged data size. From Fig. 4(a), we can see that our scheme incurs less computation cost on the auditor than Zhu's IPDP.

In real cloud storage systems, the data size is very large (e.g., petabytes); our scheme applies the sampling auditing method to ensure the integrity of such large data.
The sample size and the frequency are determined by the service level agreement. From the simulation results, we can estimate that it requires about 800 seconds to audit 1 GByte of data. However, the computing abilities of the cloud server and the auditor are much more powerful than our simulation PC, so the computation time can be much smaller in practice. Therefore, our auditing scheme is practical in large scale cloud storage systems.

Fig. 4(b) describes the computation cost of the auditor in the multi-cloud batch auditing scheme versus the number of challenged clouds. It is easy to see that our scheme incurs less computation cost on the auditor than Zhu's IPDP scheme, especially when there are a large number of clouds in the large scale cloud storage system.

Because Zhu's IPDP does not support the batch auditing for multiple owners, in our simulation we repeat the computation a number of times equal to the number of data owners. Then, as shown in Fig. 4(c), we compare the computation cost of the auditor between our multi-owner batch auditing and a general auditing protocol which does not support the multi-owner batch auditing (e.g., Zhu's IPDP). Fig. 4(c) also demonstrates that the batch auditing for multiple owners can greatly reduce the computation cost. Although in our simulation the number of data owners only goes to 500, it illustrates the trend of the computation cost of the auditor: our scheme is much more efficient than Zhu's scheme in large scale cloud storage systems that may have millions to billions of data owners.

6.2.2 Computation Cost of the Server
We compare the computation cost of the server versus the number of data blocks in Fig. 5(a) and the number of data owners in Fig. 5(b). Our scheme moves the computing loads of the auditing from the auditor to the server, such that it can greatly reduce the computation cost of the auditor.

7 RELATED WORK
To support the dynamic auditing, Ateniese et al. developed a dynamic provable data possession protocol [29] based on cryptographic hash functions and symmetric key encryption. Their idea is to pre-compute a certain number of metadata during the setup period, so that the number of updates and challenges is limited and fixed beforehand. In their protocol, each update operation requires recreating all the remaining metadata, which is problematic for large files. Moreover, their protocol cannot perform block insertions anywhere (only append-type insertions are allowed). Erway et al. [22] also extended the PDP model to support dynamic updates on the stored data and proposed two dynamic provable data possession schemes by using a new version of authenticated dictionaries based on rank information. However, their schemes may cause a heavy computation burden on the server since they rely on the PDP scheme proposed by Ateniese.

In [23], the authors proposed a dynamic auditing protocol that can support the dynamic operations of the data on the cloud servers, but this method may leak the data content to the auditor because it requires the server to send the linear combinations of data blocks to the auditor. In [24], the authors extended their dynamic auditing scheme to be privacy-preserving and to support the batch auditing for multiple owners. However, due to the large number of data tags, their auditing protocols will incur a heavy storage overhead on the server. In [25], Zhu et al. proposed
a cooperative provable data possession scheme that can support the batch auditing for multiple clouds, and also extended it to support the dynamic auditing in [26]. However, it is impossible for their scheme to support the batch auditing for multiple owners. That is because the parameters for generating the data tags used by each owner are different, and thus they cannot combine the data tags from multiple owners to conduct the batch auditing. Another drawback is that their scheme requires an additional trusted organizer to send a commitment to the auditor during the batch auditing for multiple clouds, because their scheme applies the mask technique to ensure the data privacy. However, such an additional organizer is not practical in cloud storage systems. Furthermore, both Wang's schemes and Zhu's schemes incur heavy computation cost on the auditor, which makes the auditing system inefficient.

8 CONCLUSION
In this paper, we proposed an efficient and inherently secure dynamic auditing protocol. It protects the data privacy against the auditor by combining the cryptography method with the bilinearity property of the bilinear pairing, rather than using the mask technique. Thus, our multi-cloud batch auditing protocol does not require any additional organizer. Our batch auditing protocol can also support the batch auditing for multiple owners. Furthermore, our auditing scheme incurs less communication cost and less computation cost of the auditor by moving the computing loads of auditing from the auditor to the server, which greatly improves the auditing performance and can be applied to large scale cloud storage systems.

REFERENCES
[1] P. Mell and T. Grance, "The NIST definition of cloud computing," National Institute of Standards and Technology, Tech. Rep., 2009.
[2] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz, A. Konwinski, G. Lee, D. A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, "A view of cloud computing," Commun. ACM, vol. 53, no. 4, pp. 50–58, 2010.
[3] T. Velte, A. Velte, and R. Elsenpeter, Cloud Computing: A Practical Approach, 1st ed. New York, NY, USA: McGraw-Hill, Inc., 2010, ch. 7.
[4] J. Li, M. N. Krohn, D. Mazières, and D. Shasha, "Secure untrusted data repository (SUNDR)," in Proceedings of the 6th Conference on Operating Systems Design and Implementation (OSDI), 2004.
[10] Y. Deswarte, J. Quisquater, and A. Saidane, "Remote integrity checking," in The Sixth Working Conference on Integrity and Internal Control in Information Systems (IICIS). Springer Netherlands, November 2004.
[11] M. Naor and G. N. Rothblum, "The complexity of online memory checking," J. ACM, vol. 56, no. 1, 2009.
[12] A. Juels and B. S. Kaliski Jr., "PORs: proofs of retrievability for large files," in ACM Conference on Computer and Communications Security, P. Ning, S. D. C. di Vimercati, and P. F. Syverson, Eds. ACM, 2007, pp. 584–597.
[13] T. J. E. Schwarz and E. L. Miller, "Store, forget, and check: Using algebraic signatures to check remotely administered storage," in ICDCS. IEEE Computer Society, 2006, p. 12.
[14] D. L. G. Filho and P. S. L. M. Barreto, "Demonstrating data possession and uncheatable data transfer," IACR Cryptology ePrint Archive, vol. 2006, p. 150, 2006.
[15] F. Sebé, J. Domingo-Ferrer, A. Martínez-Ballesté, Y. Deswarte, and J.-J. Quisquater, "Efficient remote data possession checking in critical information infrastructures," IEEE Trans. Knowl. Data Eng., vol. 20, no. 8, pp. 1034–1038, 2008.
[16] G. Yamamoto, S. Oda, and K. Aoki, "Fast integrity for large data," in Proceedings of the ECRYPT Workshop on Software Performance Enhancement for Encryption and Decryption. Amsterdam, the Netherlands: ECRYPT, June 2007, pp. 21–32.
[17] M. A. Shah, M. Baker, J. C. Mogul, and R. Swaminathan, "Auditing to keep online storage services honest," in HotOS, G. C. Hunt, Ed. USENIX Association, 2007.
[18] C. Wang, K. Ren, W. Lou, and J. Li, "Toward publicly auditable secure cloud data storage services," IEEE Network, vol. 24, no. 4, pp. 19–24, 2010.
[19] K. Yang and X. Jia, "Data storage auditing service in cloud computing: challenges, methods and opportunities," World Wide Web, vol. 15, no. 4, pp. 409–428, 2012.
[20] G. Ateniese, R. C. Burns, R. Curtmola, J. Herring, L. Kissner, Z. N. J. Peterson, and D. X. Song, "Provable data possession at untrusted stores," in ACM Conference on Computer and Communications Security, P. Ning, S. D. C. di Vimercati, and P. F. Syverson, Eds. ACM, 2007, pp. 598–609.
[21] H. Shacham and B. Waters, "Compact proofs of retrievability," in ASIACRYPT, ser. Lecture Notes in Computer Science, J. Pieprzyk, Ed., vol. 5350. Springer, 2008, pp. 90–107.
[22] C. C. Erway, A. Küpçü, C. Papamanthou, and R. Tamassia, "Dynamic provable data possession," in ACM Conference on Computer and Communications Security, E. Al-Shaer, S. Jha, and A. D. Keromytis, Eds. ACM, 2009, pp. 213–222.
[23] Q. Wang, C. Wang, K. Ren, W. Lou, and J. Li, "Enabling public auditability and data dynamics for storage security in cloud computing," IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 5, pp. 847–859, 2011.
[24] C. Wang, Q. Wang, K. Ren, and W. Lou, "Privacy-preserving public auditing for data storage security in cloud computing," in INFOCOM. IEEE, 2010, pp. 525–533.
[25] Y. Zhu, H. Hu, G. Ahn, and M. Yu, "Cooperative provable data possession for integrity verification in multi-cloud storage," IEEE Transactions on Parallel and Distributed Systems, pp. 1–14, 2011.
Symposium on Operating Systems Design & Implementation, Berke- [26] Y. Zhu, H. Wang, Z. Hu, G.-J. Ahn, H. Hu, and S. S. Yau, “Dynamic
ley, CA, USA, 2004, pp. 121–136. audit services for integrity verification of outsourced storages in
[5] G. R. Goodson, J. J. Wylie, G. R. Ganger, and M. K. Reiter, clouds,” in SAC, W. C. Chu, W. E. Wong, M. J. Palakal, and C.-
“Efficient byzantine-tolerant erasure-coded storage,” in DSN. IEEE C. Hung, Eds. ACM, 2011, pp. 1550–1557.
Computer Society, 2004, pp. 135–144. [27] K. Zeng, “Publicly verifiable remote data integrity,” in ICICS, ser.
[6] V. Kher and Y. Kim, “Securing distributed storage: challenges, Lecture Notes in Computer Science, L. Chen, M. D. Ryan, and
techniques, and systems,” in StorageSS, V. Atluri, P. Samarati, G. Wang, Eds., vol. 5308. Springer, 2008, pp. 419–434.
W. Yurcik, L. Brumbaugh, and Y. Zhou, Eds. ACM, 2005, pp. [28] G. Ateniese, S. Kamara, and J. Katz, “Proofs of storage from
9–25. homomorphic identification protocols,” in ASIACRYPT, ser. Lecture
[7] L. N. Bairavasundaram, G. R. Goodson, S. Pasupathy, and Notes in Computer Science, M. Matsui, Ed., vol. 5912. Springer,
J. Schindler, “An analysis of latent sector errors in disk drives,” in 2009, pp. 319–333.
SIGMETRICS, L. Golubchik, M. H. Ammar, and M. Harchol-Balter, [29] G. Ateniese, R. D. Pietro, L. V. Mancini, and G. Tsudik, “Scalable
Eds. ACM, 2007, pp. 289–300. and efficient provable data possession,” IACR Cryptology ePrint
[8] B. Schroeder and G. A. Gibson, “Disk failures in the real world: Archive, vol. 2008, p. 114, 2008.
What does an mttf of 1, 000, 000 hours mean to you?” in FAST. [30] P. Ning, S. D. C. di Vimercati, and P. F. Syverson, Eds., Proceedings
USENIX, 2007, pp. 1–16. of the 2007 ACM Conference on Computer and Communications
[9] M. Lillibridge, S. Elnikety, A. Birrell, M. Burrows, and M. Isard, “A Security, CCS 2007, Alexandria, Virginia, USA, October 28-31,
cooperative internet backup scheme,” in USENIX Annual Technical 2007. ACM, 2007.
Conference, General Track. USENIX, 2003, pp. 29–41.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
Xiaohua Jia received his BSc (1984) and MEng (1987) from the University of Science and Technology of China, and his DSc (1991) in Information Science from the University of Tokyo. He is currently Chair Professor in the Department of Computer Science at City University of Hong Kong. His research interests include cloud computing and distributed systems, computer networks, wireless sensor networks, and mobile wireless networks. Prof. Jia is an editor of IEEE Transactions on Parallel and Distributed Systems (2006–2009), Wireless Networks, Journal of World Wide Web, Journal of Combinatorial Optimization, and other journals. He was the General Chair of ACM MobiHoc 2008, TPC Co-Chair of IEEE MASS 2009, Area Chair of IEEE INFOCOM 2010, TPC Co-Chair of the IEEE GLOBECOM 2010 Ad Hoc and Sensor Networking Symposium, and Panel Co-Chair of IEEE INFOCOM 2011.