
Dynamic Audit Services for Integrity Verification of Outsourced Storages in Clouds

Yan Zhu¹,², Huaixi Wang³, Zexing Hu¹, Gail-Joon Ahn⁴, Hongxin Hu⁴, Stephen S. Yau⁴

¹Institute of Computer Science and Technology, Peking University, Beijing 100871, China
²Key Laboratory of Network and Software Security Assurance (Peking University), Ministry of Education
³School of Mathematical Sciences, Peking University, Beijing 100871, China
⁴School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85287, USA

{yan.zhu,wanghx,huzx}@pku.edu.cn, {gahn,hxhu,yau}@asu.edu

ABSTRACT

In this paper, we propose a dynamic audit service for verifying the integrity of an untrusted and outsourced storage. Our audit service is constructed based on the techniques of fragment structure, random sampling, and index-hash table, and supports provable updates to outsourced data and timely abnormal detection. In addition, we propose a probabilistic query and periodic verification for improving the performance of audit services. Our experimental results not only validate the effectiveness of our approaches, but also show that our audit system verifies the integrity with lower computation overhead, requiring less extra storage for audit metadata.

Categories and Subject Descriptors
H.3.2 [Information Storage and Retrieval]: Information Storage; E.3 [Data]: Data Encryption

General Terms
Design, Performance, Security

Keywords
Dynamic Audit, Storage Security, Integrity Verification

1. INTRODUCTION

Cloud computing provides a scalable environment for growing amounts of data and processes that work on various applications and services by means of on-demand self-services. In particular, the outsourced storage in clouds has become a new profit growth point by providing a comparably low-cost, scalable, location-independent platform for managing clients' data. The cloud storage service (CSS) relieves the burden of storage management and maintenance. However, if such an important service is vulnerable to attacks or failures, it would bring irretrievable losses to the clients, since their data or archives are stored in an uncertain storage pool outside the enterprises. These security risks come from the following reasons: although cloud infrastructures are much more powerful and reliable than personal computing devices, they are still susceptible to internal and external threats; moreover, for their own benefit, there exist various motivations for cloud service providers (CSP) to behave unfaithfully towards the cloud users; furthermore, disputes occasionally suffer from the lack of trust in the CSP, whose behaviors may not be known by the cloud users, even if a dispute results from the users' own improper operations. Therefore, it is necessary for cloud service providers to offer an efficient audit service to check the integrity and availability of the stored data [10].

Security audit is an important solution enabling trace-back and analysis of any activities, including data accesses, security breaches, application activities, and so on. Data security tracking is crucial for all organizations that must comply with a wide range of federal regulations, including the Sarbanes-Oxley Act, Basel II, HIPAA, and so on¹. Furthermore, compared to the common audit, the audit service for cloud storages should provide clients with a more efficient proof for verifying the integrity of stored data.

In this paper, we introduce a dynamic audit service for integrity verification of untrusted and outsourced storages. Our audit system can support dynamic data operations and timely abnormal detection with the help of several effective techniques, such as fragment structure, random sampling, and index-hash table. Furthermore, we propose an efficient approach based on probabilistic query and periodic verification for improving the performance of audit services. A proof-of-concept prototype is also implemented to evaluate the feasibility and viability of our proposed approaches. Our experimental results not only validate the effectiveness of our approaches, but also show that our system does not create any significant computation cost while requiring less extra storage for integrity verification.

The rest of the paper is organized as follows. Section 2 describes the research background and related work. Section 3 addresses our audit system architecture and main techniques, and Section 4 describes the construction of the corresponding algorithms. In Section 5, we present the performance of our schemes and the experimental results. Finally, we conclude this paper in Section 6.

¹http://www.hhs.gov/ocr/privacy/

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SAC'11 March 21-25, 2011, TaiChung, Taiwan.
Copyright 2011 ACM 978-1-4503-0113-8/11/03 ...$10.00.

2. BACKGROUND AND RELATED WORK

The traditional cryptographic technologies for data integrity and availability, based on hash functions and signature schemes [4, 11, 13], cannot work on outsourced data without a local copy of the data. In addition, downloading the data for validation is not a practical solution, due to the expensive communications, especially for large-size files. Moreover, the ability to audit the correctness of the data in a cloud environment can be formidable and expensive for the cloud users. Therefore, it is crucial to realize public auditability for CSS, so that data owners may resort to a third party auditor (TPA), who has expertise and capabilities that a common user does not have, to periodically audit the outsourced data. This audit service is significantly important for digital forensics and data assurance in clouds.

To implement public auditability, the notions of proof of retrievability (POR) [5] and provable data possession (PDP) [1] have been proposed by some researchers. Their approach was based on a probabilistic proof technique for a storage provider to prove that clients' data remain intact. For ease of use, some POR/PDP schemes work in a publicly verifiable way, so that anyone can use the verification protocol to prove the availability of the stored data. Hence, this provides us with an effective approach to accommodate the requirements of public auditability. POR/PDP schemes designed for untrusted storage offer a publicly accessible remote interface to check a tremendous amount of data.

There exist some solutions for audit services on outsourced data. For example, Xie et al. [9] proposed an efficient method based on content comparability for outsourced databases, but it was not suitable for irregular data. Wang et al. [8] also provided a similar architecture for public audit services. To support their architecture, a public audit scheme was proposed with a privacy-preserving property. However, the lack of a rigorous performance analysis for the constructed audit system greatly affects the practical application of this scheme. For instance, in this scheme an outsourced file is directly split into n blocks, and then each block generates a verification tag. In order to maintain security, the length of a block must be equal to the size of the cryptosystem, that is, 160 bits (20 bytes). This means that a 1MB file is split into 50,000 blocks and generates 50,000 tags [7], so the storage of tags is at least 1MB. It is clearly inefficient to build an audit system based on this scheme. To address such a problem, we introduce a fragment technique to improve performance and reduce the extra storage (see Section 3.1).

Another major concern is the security of dynamic data operations for public audit services. In clouds, one of the core design principles is to provide dynamic scalability for various applications. This means that remotely stored data might be not only accessed by the clients but also dynamically updated by them, for instance, through block operations such as modification, deletion and insertion. However, these operations may raise security issues in most existing schemes, e.g., the forgery of the verification metadata (called tags) generated by data owners and the leakage of the user's secret key. Hence, it is crucial to develop a more efficient and secure mechanism for dynamic audit services, in which a potential adversary's advantage through dynamic data operations should be prohibited.

Note that this paper only addresses the problems of integrity checking and auditing. Other security services, such as user authentication and data encryption, are orthogonal to and compatible with audit services.

3. ARCHITECTURE AND TECHNIQUES

We introduce an audit system architecture for the outsourced data in clouds, as shown in Figure 1. In this architecture, we consider that a data storage service involves four entities: the data owner (DO), who has a large amount of data to be stored in the cloud; the cloud service provider (CSP), who provides the data storage service and has sufficient storage space and computation resources; the third party auditor (TPA), who has the capabilities to manage or monitor the outsourced data under the delegation of the data owner; and authorized applications (AA), which have the right to access and manipulate the stored data. Finally, application users can enjoy various cloud application services via these authorized applications.

Figure 1: The audit system architecture.

We assume the TPA is reliable and independent through the following audit functions: TPA should be able to make regular checks on the integrity and availability of the delegated data at appropriate intervals; TPA should be able to organize, manage, and maintain the outsourced data on behalf of data owners, and support dynamic data operations for authorized applications; and TPA should be able to take the evidence for disputes about the inconsistency of data in terms of authentic records of all data operations.

To realize these functions, our audit service is comprised of three processes:

Tag Generation: the client (data owner) uses the secret key sk to pre-process a file, which consists of a collection of n blocks, generates a set of public verification parameters (PVP) and an index-hash table (IHT) that are stored in TPA, transmits the file and some verification tags to CSP, and may delete its local copy (see Figure 2(a));

Periodic Sampling Audit: by using an interactive proof protocol of retrievability, TPA (or other applications) issues a "Random Sampling" challenge to audit the integrity and availability of the outsourced data in terms of the verification information (involving the PVP and IHT) stored in TPA (see Figure 2(b)); and

Audit for Dynamic Operations: an authorized application, which holds a data owner's secret key sk, can manipulate the outsourced data and update the associated index-hash table (IHT) stored in TPA. The privacy of sk and the checking algorithm ensure that the storage server cannot cheat the authorized applications and forge valid audit records (see Figure 2(c)).
Figure 2: Three processes of audit system. (a) Tag generation for an outsourced file; (b) periodic sampling audit; (c) dynamic data operations and audit.

In general, the authorized applications should be cloud application services inside clouds for various application purposes, but they must be specifically authorized by data owners for manipulating the outsourced data. Since the acceptable operations require that the authorized applications present authentication information to TPA, any unauthorized modifications of data will be detected in audit processes or verification processes. Based on this kind of strong authorization-verification mechanism, we assume that neither is CSP trusted to guarantee the security of the stored data, nor does a data owner have the capability to collect evidence of CSP's faults after errors have been found.

The ultimate goal of this audit infrastructure is to enhance the credibility of cloud storage services, not to increase the data owner's burden and overheads. For this purpose, TPA should be constructed in clouds and maintained by a cloud storage provider (CSP). In order to ensure the trust and security, TPA must be secure enough to resist malicious attacks, and it should also be strictly controlled to prevent unauthorized accesses, even by internal members in clouds. A more practical way is that TPA in clouds should be mandated by a trusted third party (TTP). This mechanism not only improves the performance of audit services, but also provides the data owner with maximum access transparency. This means that data owners are entitled to utilize the audit service without additional costs.

3.1 Fragment Structure and Secure Tags

To maximize the storage efficiency and audit performance, our audit system introduces a general fragment structure for the outsourced storage. An instance of this framework, used in our approach, is shown in Figure 3: an outsourced file F is split into n blocks {m1, m2, ..., mn}, and each block mi is split into s sectors {mi,1, mi,2, ..., mi,s}. The fragment framework consists of n block-tag pairs (mi, σi), where σi is a signature tag of a block mi generated from some secrets τ = (τ1, τ2, ..., τs). Finally, these block-tag pairs are stored in CSP, and the encryption of the secrets τ (called the PVP) is kept by the TTP. Although this fragment structure is simple and straightforward, the file is split into n×s sectors while each block (s sectors) corresponds to only one tag, so that the storage of signature tags can be reduced with the increase of s. Hence, this structure can reduce the extra storage for tags and improve the audit performance.

There exist some schemes for the convergence of s sectors to generate a secure signature tag, e.g., MAC-based, ECC or RSA schemes [1, 7]. These schemes, built from collision-resistant signatures (see Appendix A) and the random oracle model, support scalability, performance and security.

Figure 3: Fragment structure and sampling audit.
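To make the storage saving concrete, the following C sketch (our illustration, not part of the paper's prototype) computes the number of block-tag pairs and the resulting tag storage for a given file size and sector count s; the 20-byte sector and tag sizes follow the 160-bit cryptosystem discussed in Section 2.

```c
#include <stdio.h>

/* Sketch: tag-storage accounting for the fragment structure.
 * A file of `size` bytes is split into n blocks of s sectors each;
 * every sector holds SECTOR_BYTES bytes (20 bytes = 160 bits, the
 * element size of the underlying cryptosystem), and each block of
 * s sectors carries ONE tag of TAG_BYTES bytes. */
#define SECTOR_BYTES 20
#define TAG_BYTES    20

static void tag_storage(long size, int s) {
    long sectors = (size + SECTOR_BYTES - 1) / SECTOR_BYTES; /* n*s */
    long n = (sectors + s - 1) / s;                          /* blocks */
    printf("file=%7ldKB s=%3d -> n=%6ld tags, tag storage=%7ld bytes (%.2f%%)\n",
           size / 1024, s, n, n * TAG_BYTES,
           100.0 * (double)(n * TAG_BYTES) / (double)size);
}

int main(void) {
    tag_storage(1L << 20, 1);   /* s=1: one tag per sector, ~50,000 tags, ~1MB */
    tag_storage(1L << 20, 50);  /* s=50: tag storage shrinks by a factor of 50 */
    tag_storage(1L << 20, 250); /* s=250: roughly 0.4% extra storage           */
    return 0;
}
```

For s = 1 this reproduces the 50,000-tag, 1MB overhead of the strawman scheme from Section 2; raising s to 250 shrinks the extra storage to a fraction of a percent.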
3.2 Periodic Sampling Audit

In contrast with "whole" checking, random "sampling" checking greatly reduces the workload of audit services, while still achieving effective detection of misbehavior. Thus, a probabilistic audit based on sampling checking is preferable to realize abnormal detection in a timely manner, as well as to rationally allocate resources. The fragment structure shown in Figure 3 supports probabilistic audit as well: given a randomly chosen challenge (or query) Q = {(i, vi)}_{i∈I}, where I is a subset of the block indices and vi is a random coefficient, an efficient algorithm is used to produce a constant-size response (µ1, µ2, ..., µs, σ′), where µj comes from all {m_{k,j}, vk}_{k∈I} and σ′ is from all {σk, vk}_{k∈I}. Generally, this algorithm relies on homomorphic properties to aggregate data and tags into a constant-size response, which minimizes network communication costs.

Since a single sampling checking may overlook a very small number of data abnormalities, we propose a periodic sampling approach to audit the outsourced data, which we name Periodic Sampling Audit. In this way, the audit activities are efficiently scheduled within an audit period, and a TPA needs to access only small portions of files to perform an audit in each activity. Therefore, this method can detect exceptions periodically and reduce the number of sampled blocks in each audit.
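The following C sketch (an illustration we add, using a small toy modulus in place of the group order of the real scheme in Appendix A) shows the shape of this aggregation: the sampled blocks are folded into one s-element vector µ, so the response size depends on s but not on the number of sampled blocks.

```c
#include <stdio.h>

/* Toy sketch of the sampling response: the challenged blocks are
 * aggregated into one s-element vector mu, with
 *   mu[j] = sum_k v_k * m[k][j]  (mod P),
 * so the response size is O(s) no matter how many blocks were
 * sampled. A real implementation works in Z_p of a pairing group
 * (GMP/PBC); here P is just a small toy prime. */
#define P 1000003ULL

typedef struct { unsigned idx; unsigned long long v; } challenge_t;

void aggregate(const unsigned long long *m, int s,      /* n x s sectors */
               const challenge_t *q, int t,             /* t challenges  */
               unsigned long long *mu) {                /* s outputs     */
    for (int j = 0; j < s; j++) mu[j] = 0;
    for (int k = 0; k < t; k++)                 /* for each sampled block */
        for (int j = 0; j < s; j++)             /* fold its sectors in    */
            mu[j] = (mu[j] + (q[k].v % P) * (m[q[k].idx * s + j] % P)) % P;
}
```

The tags are compressed the same way, into the single group element σ′; Appendix A gives the exact exponents used.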
3.3 Index-Hash Table

In order to support dynamic data operations, we introduce a simple index-hash table (IHT) to record the changes of file blocks, as well as to generate the hash value of each block in the verification process.
The structure of our index-hash table is similar to that of the file block allocation table in file systems. Generally, the index-hash table χ consists of a serial number, a block number, a version number, a random integer, and so on (see Table 1 in Appendix A). Note that we must ensure that all records in the index-hash table differ from one another, to prevent the forgery of data blocks and tags. In addition to recording data changes, each record χi in the table is used to generate a unique hash value, which in turn is used for the construction of a signature tag σi under the secret key sk. The relationship between χi and σi must be cryptographically secure, and we make use of it to design our verification protocol.

Although the index-hash table may increase the complexity of an audit system, it provides higher assurance for monitoring the behavior of an untrusted CSP, as well as valuable evidence for computer forensics, since no one can forge a valid χi (in TPA) and σi (in CSP) without the secret key sk. In practical applications, the designer should consider keeping the index-hash table within the virtualization infrastructure of cloud-based storage services.
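A possible in-memory layout of one IHT record, mirroring the columns of Table 1 in Appendix A, is sketched below in C; the field names Bi/Vi/Ri follow the paper, while the concrete types are our own assumption.

```c
#include <stdint.h>

/* One record chi_i of the index-hash table (see Table 1, Appendix A).
 * The string "Bi||Vi||Ri" hashed with H_{xi^(1)} yields xi_i^(2), from
 * which the block tag sigma_i is built; uniqueness of (Bi, Vi, Ri)
 * across all records is what blocks replay and forgery attacks. */
typedef struct {
    uint64_t Bi;       /* original block (sequence) number         */
    uint64_t Vi;       /* version number, bumped on every update   */
    uint8_t  Ri[16];   /* fresh random value, prevents collisions  */
} iht_record;
```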
4. ALGORITHMS FOR AUDIT SYSTEM

In this section, we describe the construction of the algorithms in our audit architecture. Firstly, we present the definitions for the tag generation process as follows:

KeyGen(1^κ): takes a security parameter κ as input, and returns a public/secret keypair (pk, sk); and

TagGen(sk, F): takes as inputs the secret key sk and a file F, and returns the triple (τ, ψ, σ), where τ denotes the secret used to generate the verification tags, ψ is a set of public verification parameters u and an index-hash table χ, i.e., ψ = (u, χ), and σ denotes a set of tags.

The data owner or authorized applications only need to save the secret key sk; that is, sk is not necessary for the verification/audit process. The secret of the processed file, τ, can be discarded after the tags are generated, thanks to the public verification parameters u.

Figure 4: The workflow of audit system. (a) Tag generation and user's verification; (b) periodic sampling audit; (c) dynamic data operations and audit.

Figure 4 demonstrates the workflow of our audit system. Suppose a data owner wants to store a file on a storage server and maintain a corresponding authenticated index structure at a TPA. In Figure 4(a), we describe this process as follows: firstly, using KeyGen(), the owner generates a public/secret keypair (pk, sk) by himself or via a system manager, and then sends the public key pk to TPA; note that TPA cannot obtain the client's secret key sk. Secondly, the owner chooses the random secret τ and then invokes the algorithm TagGen() to produce the public verification information ψ = (u, χ) and the signature tags σ, where τ is unique for each file. Finally, the owner sends ψ and (F, σ) to TPA and CSP, respectively, where χ is an index-hash table.

4.1 Supporting Periodic Sampling Audit

At any time, TPA can check the integrity of a file F as follows: TPA first queries the database to obtain the verification information ψ; then it initializes an interactive protocol Proof(CSP, TPA) and performs a 3-move proof protocol in a random sampling way: Commitment, Challenge, and Response; finally, TPA verifies the interactive data to get the results. In fact, since our scheme is a publicly verifiable protocol, anyone can run this protocol, but s/he is unable to gain any advantage to break the cryptosystem, even if TPA and CSP cooperate in an attack. Let P(x) denote that the subject P holds the secret x, and ⟨P, V⟩(x) denote that both parties P and V share common data x in a protocol. This process can be defined as follows:

Proof(CSP, TPA): is an interactive proof protocol between CSP and TPA, that is, ⟨CSP(F, σ), TPA⟩(pk, ψ), where a public key pk and a set of public parameters ψ are the common inputs of TPA and CSP, and CSP takes as inputs a file F and a set of tags σ. At the end of the protocol, TPA returns {0|1}, where 1 means the file is correctly stored on the server.

An audit service executes the verification process periodically by using the above-mentioned protocol. Figure 4(b) shows such a two-party protocol between TPA and CSP, i.e., Proof(CSP, TPA), without the involvement of a client (DO or AA); two verification processes are depicted. To improve the efficiency of the verification process, TPA should perform audit tasks based on probabilistic sampling.

4.2 Supporting Dynamic Data Operations

In order to meet the requirements of dynamic scenarios, we introduce the following definitions for our dynamic algorithms:
Update(sk, ψ, m′i): is an algorithm run by AA to update the block m′i of a file at index i by using sk, and it returns a new verification metadata (ψ′, σ′);

Delete(sk, ψ, mi): is an algorithm run by AA to delete the block mi of a file at index i by using sk, and it returns a new verification metadata (ψ′); and

Insert(sk, ψ, mi): is an algorithm run by AA to insert the block mi of a file at index i by using sk, and it returns a new verification metadata (ψ′, σ′).

To ensure the security, dynamic data operations are only available to data owners or authorized applications, who hold the secret key sk. Here, all operations are based on data blocks. Moreover, in order to implement audit services, applications need to update the index-hash table. It is necessary for TPA and CSP to check the validity of the updated data. In Figure 4(c), we describe the process of dynamic data operations and audit. First, the authorized application obtains the public verification information ψ from TPA. Second, the application invokes the Update, Delete, and Insert algorithms, and then sends the new ψ′ and σ′ to TPA and CSP, respectively. Finally, the CSP makes use of an efficient algorithm Check to verify the validity of the updated data. Note that the Check algorithm is important for ensuring the effectiveness of the audit.
5. PERFORMANCE AND EVALUATION

It is obvious that audit activities would increase the computation and communication overheads of audit services. However, less frequent activities may not detect abnormality in a timely manner. Hence, the scheduling of audit activities is significant for improving the quality of audit services. In order to detect abnormality in a low-overhead and timely manner, we attempt to optimize the audit performance from two aspects: performance evaluation of probabilistic queries and scheduling of periodic verification. Our basic idea is to maintain an overhead balance, which helps us improve the performance of audit systems.

5.1 Probabilistic Queries Evaluation

The audit service achieves the detection of CSP misbehavior in a random sampling mode in order to reduce the workload on the server. The detection probability P of disrupted blocks is an important parameter to guarantee that these blocks can be detected in a timely manner. Assume that e blocks out of the n-block file have been disrupted, so that the probability of disrupted blocks is ρb = e/n. Let t be the number of queried blocks for a challenge in the proof protocol. We have the detection probability P = 1 − ((n−e)/n)^t = 1 − (1 − ρb)^t. Hence, the number of queried blocks is t = log(1−P)/log(1−ρb) ≈ P·n/e for a sufficiently large n.² This means that the number of queried blocks t is directly proportional to the total number of file blocks n for constant P and e. In Figure 5, we show the number of queried blocks under different detection probabilities (from 0.5 to 0.99), different numbers of file blocks (from 10 to 10,000), and a constant number of disrupted blocks (100).

²Since (1 − e/n)^t ≈ 1 − e·t/n for small e/n, we have P ≈ e·t/n.

Figure 5: The number of queried blocks under different detection probabilities and different numbers of file blocks.

We observe the ratio of queried blocks to the total file blocks, w = t/n, under different detection probabilities. Based on our analysis, it is easy to determine that this ratio is essentially constant, since w = t/n = log(1−P)/(n·log(1−ρb)) ≈ P/e. However, this estimation of w is not an accurate measurement. To clearly represent the ratio, Figure 6 plots w for different values of n, e and P. It is obvious that the ratio of queried blocks tends to a constant value for a sufficiently large n. For instance, in Figure 6 (left), if there exist 100 disrupted blocks, the TPA asks for w = 4.5% and 2.3% of n (n > 1,000) in order to achieve a P of at least 99% and 90%, respectively. However, this ratio w is also inversely proportional to the number of disrupted blocks e. For example, in Figure 6 (right), if there exist 10 disrupted blocks, the TPA needs to ask for w = 45% and 23% of n (n > 1,000) in order to achieve the same P, respectively. This demonstrates that our audit scheme is very effective for a higher probability of disrupted blocks.

Figure 6: The ratio of queried blocks in total file blocks under different detection probabilities and different numbers of disrupted blocks.

5.2 Schedule of Periodic Verification

The sampling-based audit has the potential to significantly reduce the workload on the servers and increase the audit efficiency. Firstly, we assume that each audited file has an audit period T, which depends on how important the file is for the owner. For example, a common audit period may be assigned as one week or one month, and the audit period for important files may be set as one day. Of course, these audit activities should be carried out at night or on weekends.

Assume we use the audit frequency f to denote the number of occurrences of an audit event per unit time. This means that the number of TPA's queries is T·f in an audit period T. According to the above analysis, we have the detection probability P = 1 − (1 − ρb)^{n·w} in each audit event. Let PT denote the detection probability in an audit period T. Hence, we have the equation PT = 1 − (1 − P)^{T·f}. In terms of 1 − P = (1 − ρb)^{n·w}, the detection probability PT can be written as PT = 1 − (1 − ρb)^{n·w·T·f}. In this equation, TPA can obtain the probability ρb from its prior knowledge about the cloud storage provider. Moreover, the audit period T can be predefined by a data owner in advance. Hence, the above equation can be used to analyze the parameter values w and f. It is easy to obtain the equation f = log(1−PT)/(w·n·T·log(1−ρb)).

This means that the audit frequency f is inversely proportional to the ratio of queried blocks w. That is, with the increase of verification frequency, the number of queried blocks decreases at each verification process. In Figure 7, we show the relationship between f and w under 10 disrupted blocks for 10,000 file blocks. We can observe a marked drop of w along with the increase of frequency.

Figure 7: The ratio of queried blocks in total file blocks under different audit frequencies for 10 disrupted blocks and 10,000 file blocks.

In fact, the relationship between f and w is comparatively stable for given PT, ρb, and n, due to f·w = log(1−PT)/(n·T·log(1−ρb)). TPA should choose an appropriate frequency to balance the overhead, according to the above equation. For example, if e = 10 blocks out of 10,000 blocks (ρb = 0.1%), then TPA asks for 658 blocks and 460 blocks per audit for f = 7 and 10, respectively, in order to achieve a PT of at least 99%. Hence, an appropriate audit frequency greatly reduces the sampling numbers, as well as the computation and communication overheads of an audit service.
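The following C sketch (ours, not from the paper) evaluates the two formulas above; for n = 10,000 it reproduces both the single-audit ratios read off Figure 6 (about 4.6% and 2.3% of the blocks for e = 100) and the 658- and 460-block periodic figures quoted above.

```c
#include <math.h>
#include <stdio.h>

/* Single audit: blocks t to query so that at least one of e disrupted
 * blocks (out of n) is hit with probability P:
 *   P = 1 - (1 - rho_b)^t  =>  t = log(1-P) / log(1-rho_b)            */
static double single_audit_t(double n, double e, double P) {
    return log(1.0 - P) / log(1.0 - e / n);
}

/* Periodic audit: blocks per audit event when an audit period T with
 * frequency f (i.e., T*f events) must reach detection probability PT:
 *   PT = 1 - (1 - rho_b)^(t*T*f) => t = log(1-PT)/(T*f*log(1-rho_b))  */
static double periodic_audit_t(double n, double e, double PT, double Tf) {
    return log(1.0 - PT) / (Tf * log(1.0 - e / n));
}

int main(void) {
    double n = 10000.0;
    /* e = 100: w = t/n is about 4.6% for P = 99%, 2.3% for P = 90% */
    printf("P=0.99: w=%.1f%%\n", 100.0 * single_audit_t(n, 100, 0.99) / n);
    printf("P=0.90: w=%.1f%%\n", 100.0 * single_audit_t(n, 100, 0.90) / n);
    /* e = 10, PT = 99%: ~658 blocks per audit for T*f = 7, ~460 for 10 */
    printf("T*f=7 : %.0f blocks\n", periodic_audit_t(n, 10, 0.99, 7.0));
    printf("T*f=10: %.0f blocks\n", periodic_audit_t(n, 10, 0.99, 10.0));
    return 0;
}
```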
5.3 Implementation and Experimental Results

To validate our approaches, we have implemented a prototype of a public audit service. Our prototype utilizes three existing services/applications: Amazon Simple Storage Service (S3) serves as an untrusted data storage server; a local application server provides our audit service; and the prototype is built on top of an existing open-source project, the Pairing-Based Cryptography (PBC) library. We present some details about these three components as follows:

Storage service: Amazon Simple Storage Service (S3) is a scalable, pay-per-use online storage service. Clients can store a virtually unlimited amount of data, paying only for the storage space and bandwidth that they are using, without an initial start-up fee. The basic data unit in S3 is an object, and the basic container for objects in S3 is called a bucket. In our example, objects contain both data and metadata (tags). A single object has a size limit of 5 GB, but there is no limit on the number of objects per bucket. Moreover, a small script on Amazon Elastic Compute Cloud (EC2) is used to provide support for the verification protocol and dynamic data operations.

Audit service: We used a local IBM server with two Intel Core 2 processors at 2.16 GHz running Windows Server 2003. Our scheme was deployed on this server, which then performed the integrity checking of the S3 storage according to the assigned schedule via 250 MB/sec of network bandwidth. A socket port was also opened to support the applications' accesses and queries for the audit service.

Prototype software: Using the GMP and PBC libraries, we have implemented a cryptographic library upon which temporal attribute systems can be constructed. These C libraries contain approximately 5,200 lines of code and have been tested on both Windows and Linux platforms. The elliptic curve utilized in our experiments is an MNT curve, with a base field size of 159 bits and embedding degree 6. The security level is chosen to be 80 bits, which means |p| = 160.

Firstly, we quantify the performance of our audit scheme under different parameters, such as file size sz, sampling ratio w, sector number per block s, and so on. Our analysis shows that the value of s should grow with the increase of sz in order to reduce computation and communication costs. Thus, experiments were carried out as follows: the stored files were chosen from 10KB to 10MB, the sector numbers were varied from 20 to 250 in accordance with the file sizes, and the sampling ratios were varied from 10% to 50%. The experimental results are shown in Figure 8. These results indicate that computation and communication costs (including I/O costs) grow with the increase of file size and sampling ratio.

Next, we compare the performance of each activity in our verification protocol. It is easy to derive theoretically that the overheads of "commitment" and "challenge" resemble one another, and that the overheads of "response" and "verification" also resemble one another. To validate such theoretical results, we changed the sampling ratio w from 10% to 50% for a 10MB file and 250 sectors per block. In Figure 8, we show the experimental results, in which the computation and communication costs of "commitment" and "challenge" change only slightly with the sampling ratio, while those of "response" and "verification" grow with the increase of the sampling ratio.

Then, in the Amazon S3 service, we set the block size to 4KB and the value of s to 200. Our experiments also show that, in the TagGen phase, the time overhead is directly proportional to the number of blocks.
Figure 8: The experiment results under different file sizes, sampling ratios, and sector numbers.

Ideally, this process is only executed when the file is uploaded into the S3 service. The verification protocol can be run in approximately constant time. Similarly, the three dynamic data operations can be performed in approximately constant time for any block.

Finally, reducing the communication overheads and average workloads is critical for an efficient audit schedule. With a probabilistic algorithm, our scheme is able to realize a uniform distribution of verified sampling blocks according to the security requirements of clients, as well as the dependability of storage services and running environments. In our experiments, we used a simple schedule to periodically manage all audit tasks. The results show that audit services based on our scheme can support a great number of audit tasks, and that the performance of scheduled audits is preferable to straightforward individual audits.

6. CONCLUSIONS

In this paper, we presented a construction of dynamic audit services for untrusted and outsourced storages. We also presented an efficient method for periodic sampling audit to enhance the performance of third party auditors and storage service providers. Our experiments showed that our solution has a small, constant amount of overhead, which minimizes computation and communication costs.

7. ACKNOWLEDGMENTS

The work of Y. Zhu, H. Wang, and Z. Hu was partially supported by grants from the National Natural Science Foundation of China (No. 61003216). The work of Gail-J. Ahn and Hongxin Hu was partially supported by grants from the US National Science Foundation (NSF-IIS-0900970 and NSF-CNS-0831360).

8. REFERENCES

[1] G. Ateniese, R. C. Burns, R. Curtmola, J. Herring, L. Kissner, Z. N. J. Peterson, and D. X. Song. Provable data possession at untrusted stores. In Proceedings of the 2007 ACM Conference on Computer and Communications Security, CCS 2007, pages 598–609, 2007.
[2] D. Boneh, X. Boyen, and H. Shacham. Short group signatures. In Proceedings of CRYPTO '04, volume 3152 of LNCS, pages 41–55. Springer-Verlag, 2004.
[3] D. Boneh and M. Franklin. Identity-based encryption from the Weil pairing. In Advances in Cryptology (CRYPTO '01), volume 2139 of LNCS, pages 213–229, 2001.
[4] H.-C. Hsiao, Y.-H. Lin, A. Studer, C. Studer, K.-H. Wang, H. Kikuchi, A. Perrig, H.-M. Sun, and B.-Y. Yang. A study of user-friendly hash comparison schemes. In ACSAC, pages 105–114, 2009.
[5] A. Juels and B. S. Kaliski Jr. PORs: proofs of retrievability for large files. In Proceedings of the 2007 ACM Conference on Computer and Communications Security, CCS 2007, pages 584–597, 2007.
[6] C.-P. Schnorr. Efficient signature generation by smart cards. J. Cryptology, 4(3):161–174, 1991.
[7] H. Shacham and B. Waters. Compact proofs of retrievability. In Advances in Cryptology - ASIACRYPT 2008, 14th International Conference on the Theory and Application of Cryptology and Information Security, pages 90–107, 2008.
[8] C. Wang, Q. Wang, K. Ren, and W. Lou. Privacy-preserving public auditing for data storage security in cloud computing. In INFOCOM, 2010 Proceedings IEEE, pages 1–9, 2010.
[9] M. Xie, H. Wang, J. Yin, and X. Meng. Integrity auditing of outsourced data. In VLDB, pages 782–793. ACM, 2007.
[10] A. A. Yavuz and P. Ning. BAF: An efficient publicly verifiable secure audit logging scheme for distributed systems. In ACSAC, pages 219–228, 2009.
[11] A. R. Yumerefendi and J. S. Chase. Strong accountability for network storage. In FAST, pages 77–92. USENIX, 2007.
[12] Y. Zhu, H. Wang, Z. Hu, G.-J. Ahn, H. Hu, and S. S. Yau. Cooperative provable data possession. Technical Report PKU-CSE-10-04, http://eprint.iacr.org/2010/234.pdf, Peking University and Arizona State University, April 2010.
[13] Y. Zhu, H. Wang, Z. Hu, G.-J. Ahn, H. Hu, and S. S. Yau. Efficient provable data possession for hybrid clouds. In Proceedings of the 17th ACM Conference on Computer and Communications Security, pages 756–758, 2010.

APPENDIX

A. CONSTRUCTION FOR OUR SCHEME

Let H = {Hk} be a collision-resistant hash family of functions Hk: {0,1}* → {0,1}^n indexed by k ∈ K. This hash function can be obtained from the hash function of BLS signatures [2]. Further, we set up our system using the bilinear map group system S = ⟨p, G, G_T, e⟩ proposed in [3].

A.1 Proposed Construction

We present our IPOR construction in Figure 9. In our scheme, each client holds a secret key sk, which can be used to generate the tags of many files. Each processed file produces a public verification parameter ψ = (u, χ), where u = (ξ^(1), u_1, ..., u_s) and χ = {χ_i}_{i∈[1,n]} is the index-hash table.
We define χ_i = (B_i||V_i||R_i), where B_i is the sequence number of the block, V_i is the version number of updates for this block, and R_i is a random integer to avoid collision. The value ξ^(1) can be considered as the signature of the secrets τ_1, ..., τ_s. Note that it must be assured that ψ differs for all processed files. Moreover, it is clear that our scheme admits short responses in the verification protocol.

In our construction, the verification protocol has a 3-move structure: commitment, challenge and response. This protocol is similar to Schnorr's Σ protocol [6], which is a zero-knowledge proof system (due to the space limitation, the security analysis is omitted but can be found in [12]). By using this property, we ensure that the verification process does not reveal anything.

KeyGen(1^κ): Given a bilinear map group system S = (p, G, G_T, e) and a collision-resistant hash function H_k(·), choose random α, β ∈_R Z_p and compute H_1 = h^α and H_2 = h^β ∈ G. Thus, the secret key is sk = (α, β) and the public key is pk = (g, h, H_1, H_2).

TagGen(sk, F): Split the file F into n × s sectors F = {m_{i,j}} ∈ Z_p^{n×s}. Choose s random τ_1, ..., τ_s ∈ Z_p as the secrets of this file and compute u_i = g^{τ_i} ∈ G for i ∈ [1, s] and ξ^(1) = H_ξ("Fn"), where ξ = Σ_{i=1}^{s} τ_i and Fn is the file name. Build an index-hash table χ = {χ_i}_{i=1}^{n}, fill out the item χ_i = (B_i = i, V_i = 1, R_i ∈_R {0,1}*) in χ for i ∈ [1, n], and then calculate each tag as σ_i ← (ξ_i^(2))^α · g^{β·Σ_{j=1}^{s} τ_j·m_{i,j}} ∈ G, where ξ_i^(2) = H_{ξ^(1)}(χ_i) and i ∈ [1, n]. Finally, set u = (ξ^(1), u_1, ..., u_s) and output ψ = (u, χ) to TPA, and σ = (σ_1, ..., σ_n) to CSP.

Proof(CSP, TPA): This is a 3-move protocol between the Prover (CSP) and the Verifier (TPA), as follows:

• Commitment (CSP → TPA): CSP chooses a random γ ∈ Z_p and s random λ_j ∈_R Z_p for j ∈ [1, s], and sends its commitment C = (H_1', π) to TPA, where H_1' = H_1^γ and π ← e(∏_{j=1}^{s} u_j^{λ_j}, H_2);

• Challenge (CSP ← TPA): TPA chooses a random challenge set I of t indexes along with t random coefficients v_i ∈ Z_p. Let Q be the set {(i, v_i)}_{i∈I} of challenge index-coefficient pairs. TPA sends Q to CSP;

• Response (CSP → TPA): CSP calculates the response (σ', µ) as σ' ← ∏_{(i,v_i)∈Q} σ_i^{γ·v_i} and µ_j ← λ_j + γ·Σ_{(i,v_i)∈Q} v_i·m_{i,j}, where µ = {µ_j}_{j∈[1,s]}. CSP sends θ = (σ', µ) to TPA;

• Check: The verifier TPA checks whether the response is correct by testing π · e(σ', h) = e(∏_{(i,v_i)∈Q} (ξ_i^(2))^{v_i}, H_1') · e(∏_{j=1}^{s} u_j^{µ_j}, H_2).

Figure 9: The proposed IPOR scheme.
terms of e(σi′ , h) = e(ξi , H1 ) · e( sj=1 uj i,j , H2 ). For Delete
A.2 Implementation of Dynamic Operations operation, P must check whether σi is equal to the stored σi
?
To support dynamic data operations, it is necessary for and e(σi′ , h) = e(Hξ(1) (“Bi ||0||Ri ”), H1 ). Further, T P A must
TPA to employ an index-hash table χ to record the realtime replace ψ by the new ψ′ and check the completeness of χ ∈ ψ.
status of the stored files. Some existing index schemes in a
Figure 10: The algorithms for dynamic operations.
dynamic scenario are insecure due to replay attack on the
same Hash values. To solve this problem, a simple index- According to the construction of index-hash tables, we
hash table χ = {χi } used in the above-mentioned construc- propose a simple method to provide dynamic data modifica-
tion (see Figure 9) is described in Table 1, which includes tion in Figure 10. All tags and the index-hash table should
four columns: No. denotes the real number i of data block be renewed and reorganized periodically to improve the per-
mi , Bi is the original number of block, Vi stores the version formance. Of course, we can replace the sequent lists by the
number of updates for this block, and Ri is a random integer dynamically linked lists to improve the efficiency of updating
to avoid collision. index-hash table. Further, we omit the discuss of the head
In order to ensure the security, we require that each χi = and tail index items in χ, and they are easy to implement.

