Information Sciences 263 (2014) 198–210

Dividing secrets to secure data outsourcing

Fatih Emekci a,⇑, Ahmed Methwally b, Divyakant Agrawal b, Amr El Abbadi b
a Department of Computer Science, Turgut Ozal University, 06310 Ankara, Turkey
b Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA 93106, USA

Article info

Article history:
Received 31 January 2013
Received in revised form 17 July 2013
Accepted 1 October 2013
Available online 16 October 2013

Keywords:
Data outsourcing
Query processing
Data privacy and security

Abstract

Data outsourcing or database as a service is a new paradigm for data management. The third party service provider hosts databases as a service. These parties provide efficient and cheap data management by obviating the need to purchase expensive hardware and software, deal with software upgrades and hire professionals for administrative and maintenance tasks. However, due to recent governmental legislations, competition among companies and database thefts, companies cannot use database service providers directly. They need secure and privacy preserving data management techniques to be able to use them in practice. Since data is remotely stored in a privacy preserving manner, there are efficiency related problems such as poor query response time. We propose a new framework that provides efficient and scalable query response times by reducing the computation and communication costs. Furthermore, the proposed technique uses several service providers to guarantee the availability of the services while detecting the dishonest or faulty service providers without introducing additional overhead on the query response time. The evaluations demonstrate that our data outsourcing framework is scalable and practical.
© 2013 Elsevier Inc. All rights reserved.

1. Introduction

Data outsourcing or database as a service is a new paradigm for data management in which a third party service provider hosts databases as a service. The service provides data management for its customers and thus obviates the need for the service user to purchase expensive hardware and software, deal with software upgrades and hire professionals for administrative and maintenance tasks. Since using an external database service promises reliable data storage at a low cost by eliminating the need for expensive in-house data-management infrastructure, it is very attractive for companies. However, recent governmental legislations, competition among companies and database thefts have pushed companies to use secure and privacy preserving data management techniques. Using an external database service is a straightforward server–client application in an environment where service providers and clients are honest and clients do not hesitate to share their data with database service providers. However, this is usually not the case, and thus the research challenge here is to build a robust and efficient service to manage data in a secure and privacy preserving manner.
Current research has focused only on how to index and query encrypted data [20,21,9]. Although querying encrypted data efficiently is one of the main problems, it is not the only problem in data outsourcing. Since thousands of clients per database service provider are expected, the scalability of the proposed techniques and the availability of the services are very important problems. However, current proposals do not consider these issues and assume a simple scenario consisting of an always available database service provider and a simple service user. Furthermore, they assume that both parties are honest and trust each other. In practice, however, the service provider may corrupt the data, and it would then be impossible for the service user to recover it. To be able to use external database service providers in real life, there should be a mechanism to recover the data and also to prove that data has been corrupted. Providing a trust mechanism to push both database service providers and clients to behave honestly is another important problem.

⇑ Corresponding author. Tel.: +90 312 5515000.
E-mail address: [email protected] (F. Emekci).
0020-0255/$ - see front matter © 2013 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ins.2013.10.006

We propose a new data outsourcing framework providing efficient and scalable query response times. In addition to this, the proposed technique uses multiple service providers to guarantee the availability of the services and to be able to recover from hardware failures. Furthermore, we propose a technique to identify the dishonest or faulty service providers.
Current proposals use encryption to hide the content from service providers [20,9]. However, the computational complexity of encrypting and decrypting data to execute a query increases the query response time. Therefore, this complexity is one of the bottlenecks in current solutions [3]. The proposed solution in this paper uses information theoretically secure techniques similar to Shamir's secret sharing mechanism [29] instead of computationally secure techniques such as encryption. Therefore, the computational complexity of our solution is much less than that of the current proposals using encryption. Furthermore, current proposals use label-based filtration to execute range queries [20,22]. However, a data provider reveals some information about the underlying data by labeling a row. Therefore, there is a privacy-performance tradeoff in these solutions. Our technique does not reveal any information about the content of the data, and only the required data is retrieved from the service providers.
In this paper, we use multiple service providers for fault tolerance. Fault tolerance in this context means the availability of service providers and the ability to recover from data corruption. Data corruption may happen due to either disk failures or malicious service providers. Our solution deals with both of these faults without incurring any additional overhead on the query response time.
The rest of the paper is organized as follows: The model and the types of queries are introduced and related work is reviewed in Section 2. The basic attempts to solve the problem are discussed in Section 3. Section 4 presents the data distribution technique. The query processing methods for our data distribution technique are studied in Section 5. Section 6 discusses the fault tolerance of the proposed technique. The query response time of the technique is analyzed in Section 7. The last section discusses future work and concludes the paper.

2. Solution overview and background

In this section, we define the problem and introduce the model. Then, we briefly discuss our solution and finally we review the related work.

2.1. Model and problem formulation

Assume data source D wants to outsource its data to eliminate its database maintenance cost by using the services provided by database service providers DAS1, ..., DASn. D needs to store and access its data remotely without revealing the content of the database to any of the database services. For the sake of this discussion, assume D has a single table Employees(EID, name, lastname, department, salary) in its database and stores Employees using the services provided by DAS1, ..., DASn. After storing Employees, D needs to query Employees without revealing any information about either the content of the table or the queries. Basically, D can pose any of the following queries over time:

1. Exact match queries such as "Retrieve all employees whose name is 'John'".
2. Range queries such as "Retrieve all employees whose salary is between 10 K and 40 K".
3. Aggregate queries such as MIN/MAX, MEDIAN, SUM and AVERAGE (including aggregate queries over ranges).

There are several proposals addressing exact match queries and range queries [20,9,3]. However, these proposals are not complete and do reveal some information about the underlying data (e.g. the range of salaries of employees). In this paper, we will propose a complete approach to execute exact match, range and aggregation queries in a privacy preserving manner. Throughout the paper, we will assume that there are two kinds of attributes in tables, namely numeric attributes (e.g. salary) and non-numeric attributes (e.g. name). The solution will first be presented for numeric attributes, and then we will show how to extend it to non-numeric attributes. Throughout this paper, we will develop the work in [20,21], referred to as data encryption, in parallel with our proposed technique, referred to as secret dividing, so as to show the differences and compare them.
In our solution, data is divided into n shares and each share is stored at a different service provider. When a query is generated at a data source, it is rewritten, the relevant shares are retrieved from the service providers, and the query answer is reconstructed at the data source. In order to answer queries, any k of the service providers must be available. n and k are the system parameters and will be discussed later.

2.2. Other related work

Hacigumus et al. [20], Hore et al. [21] and Aggarwal et al. [3] propose using third parties as database service providers. The differences between our work and these works are discussed throughout the paper.
The authors of [19] extended the work in [3] by splitting the columns. However, instead of splitting the information of each column among several DAS providers, they split the columns themselves among the service providers. To preserve privacy, the data source enforces privacy constraints expressible as combinations of columns that have to be split among multiple service providers. The goal of using privacy constraints is to reduce the extent of encrypted columns. On the other hand, coming up with the privacy constraints is a problem on its own, and the constraints are not easily understandable by the end users who need privacy guarantees on their personal data. In the most conservative case, the system degenerates to the case where all the columns are encrypted. In addition, partitioning columns among servers and identifying which columns to encrypt (in order to cater for the workload) is a provably intractable problem. Most importantly, the scheme does not handle the case of data corruption or malicious/curious service providers.
Storing and querying public health information, where privacy is an important aspect, is studied in [24]. The work focuses on the case where external users can issue queries on the health records without revealing the identity of the patients and the owners of the data. The authors make use of multiple trusted authorities to achieve scalability as well as privacy for both the queries' keywords and the results, even in the presence of "honest-but-curious" service providers. The authors' main focus is multi-dimensional authorized private keyword searches supporting a subset of conjunctive formulas with equality, subset and a class of simple range queries. The authors enhance the efficiency of the query execution via hierarchical attributes. Our work is different from this as we focus on relational data and propose a framework to execute SQL-like queries (including select, join, production and aggregate queries).
The authors of [23] provided a general encryption-based architecture for cloud storage that lets data owners store data on a cloud and share it with other users. They employ a searchable encryption scheme that provides a way to encrypt a search index so that its contents are hidden (except to a party that is given appropriate tokens). Given a token for a keyword, one can retrieve pointers to the encrypted files that contain the keyword. The approach employs the efficient asymmetric searchable encryption of [1] to be able to support range queries, which makes the data vulnerable to dictionary attacks.
The authors of [26] proposed a distributed storage system called Secret Sharing Storage System (SSSS), which uses the (k, n) secret sharing scheme while also encrypting the file blocks. That is, in the SSSS, files are secret data, and shares of files are stored on storage nodes that are distributed on an ultra-fast network. The authors focus on agents for implementing the basic functions for realizing the distributed storage system. A client agent receives a user request, transfers it to a server agent, and returns its result to the user. Client agents, which can communicate with arbitrary server agents and can switch server agents according to server agent load conditions or the network state, provide a nonstop storage system to users. Whenever there is a file fetch request from a client, a server agent collects a total of k shares to decrypt the file, performs the decryption, and returns the file to the client.
The work in [27] employs Shamir's secret sharing to propose a multisecret sharing technique. The proposed technique recursively constructs the shares in order to hide multiple secrets in the n shares, such that any k of the n shares suffice to recreate the secrets. While the algorithm is very efficient in terms of communication cost, we could not find a straightforward way to incorporate it and still choose the polynomials in a way that allows for range queries.
There are several other research topics in this area other than secure data outsourcing, such as privacy preserving data sharing and privacy preserving data mining. Agrawal et al. [5] and Emekci et al. [17,16] proposed techniques to share data across private databases. Emekci et al. [17,16] used secret sharing schemes in privacy preserving data sharing. In addition to this, Aggarwal et al. [4] show the challenges in finding the kth element in the union of more than two databases while preserving privacy and propose an approximate solution. Furthermore, several high level design efforts and requirement specifications have been made to support the privacy of individual information while still supporting some degree of sharing [2,6–8,12]. Although related, our work is orthogonal or complementary to privacy preserving data management in data mining and information retrieval. In data mining, several efforts have been made to either preserve the privacy of individuals using randomized techniques [10,11,18,28] or to preserve the privacy of the database while running data mining algorithms over multiple databases [15,25] using cryptographic techniques such as secure multi-party computation and encryption. On the other hand, in privacy preserving information retrieval, the privacy of the query poser is preserved by hiding the record he/she queried from the data source [13,14].

3. Simple solutions for outsourcing numeric attributes

Data source D divides each numeric value of the numeric attribute into n shares and stores them at service providers DAS1, DAS2, ..., DASn (one share for each of the service providers). The goal here is to divide a secret value into n shares to be stored at n service providers such that they cannot figure out the secret even if they combine their shares. The solution is based on, but slightly different from, Shamir's secret sharing method [29].
Our scheme allows data source D to distribute a secret value vs among n data service providers {DAS1, DAS2, ..., DASn}, such that the shares of any k (k ≤ n) service providers, together with some secret information X known only by data source D, are required to reconstruct the secret. Since even complete knowledge of k - 1 shares cannot reveal any information about the secret, even to peers that know the secret information X, this method is information theoretically secure [29]. Data source D chooses a random polynomial q(x) of degree k - 1 whose constant term is the secret value vs, and secret information X, which is a set of n random points. Then, data source D computes the share of each service provider as q(xi) and sends it to data service provider DASi. The method is summarized in Algorithm 1.

Algorithm 1. Secret Dividing Algorithm

1: Input:
2: vs: Secret value;
3: D: Data source of secret vs;
4: DAS: Set of service providers DAS1, . . . , DASn to distribute secret;
5: Output:
6: share1, . . . , sharen: Shares of secret, vs, for each service provider DASi;
7: Procedure:
8: D creates a random polynomial $q(x) = a_{k-1}x^{k-1} + \cdots + a_1 x + a_0$ with degree k - 1 and a constant term $a_0 = v_s$.
9: D chooses secret information X which is n random points, x1, ..., xn, such that xi ≠ 0.
10: D computes share coming from vs for each service provider DASi, share(vs, i), where share(vs, i) = q(xi).
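
The steps of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `divide_secret`, the integer coefficient range, and the extra check that the points are distinct are all assumptions made for the example.

```python
import random

def divide_secret(secret, xs, k, coeff_range=(1, 1000), rng=random):
    """Return one share per secret point in xs (hypothetical helper).

    A random polynomial q of degree k - 1 is drawn whose constant term
    is the secret; the share for point x is q(x).
    """
    if k < 1 or k > len(xs):
        raise ValueError("need 1 <= k <= n")
    # Algorithm 1 requires x_i != 0; distinctness is an added assumption
    # so that any k shares determine q uniquely.
    if any(x == 0 for x in xs) or len(set(xs)) != len(xs):
        raise ValueError("points must be distinct and non-zero")
    # coeffs[j] is the coefficient of x**j, so coeffs[0] is the secret.
    coeffs = [secret] + [rng.randint(*coeff_range) for _ in range(k - 1)]
    return [sum(c * x**j for j, c in enumerate(coeffs)) for x in xs]

# One share per provider for secret 40, with n = 3 and k = 2.
shares = divide_secret(40, [2, 4, 1], k=2, rng=random.Random(0))
```

Each entry of `shares` would be sent to one provider, while the points `[2, 4, 1]` stay at the data source.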

Data source D divides each secret value in its table using Algorithm 1 and stores the shares at different data service providers. Since service providers know neither each other nor the secret information X, they cannot find out the secret values (even if they combine their shares). In order to reconstruct the secret value vs, any set of k peers would need to share the information they have received, and they would also need to know the set of secret points, X, used by D. Since only data source D knows X, only it can reconstruct the secret after getting at least k shares from any k of the service providers. The shares coming from service providers can be written as follows at the data source:

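\[
\begin{aligned}
\mathrm{share}(v_s, 1) &= q(x_1) = a x_1^{k-1} + b x_1^{k-2} + \cdots + v_s \\
\mathrm{share}(v_s, 2) &= q(x_2) = a x_2^{k-1} + b x_2^{k-2} + \cdots + v_s \\
&\;\;\vdots \\
\mathrm{share}(v_s, n) &= q(x_n) = a x_n^{k-1} + b x_n^{k-2} + \cdots + v_s
\end{aligned}
\]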
The secret value can be reconstructed using any k of the above equations since there are k unknowns including the secret value vs. The key observation is that at least k points and the corresponding shares are required in order to determine a unique polynomial q(x) of degree k - 1; that is, the shares are needed along with the secret information X.
After storing data with this method, when a query is posed, data source D collects all relevant shares from all service providers and then calculates the corresponding secret values. Then, it executes the query using these secret values.
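
One way to carry out the reconstruction step is Lagrange interpolation evaluated at x = 0, which returns the constant term of q directly. The paper only states that the k equations are solved; the interpolation route and the function name `reconstruct_secret` below are assumptions of this sketch.

```python
from fractions import Fraction

def reconstruct_secret(points, shares):
    """Recover q(0) from k pairs (x_i, q(x_i)) of a degree-(k-1) polynomial."""
    if len(points) != len(shares) or len(set(points)) != len(points):
        raise ValueError("need exactly one share per distinct point")
    secret = Fraction(0)
    for i, (xi, yi) in enumerate(zip(points, shares)):
        # Lagrange basis polynomial for x_i, evaluated at x = 0.
        weight = Fraction(1)
        for j, xj in enumerate(points):
            if j != i:
                weight *= Fraction(-xj, xi - xj)
        secret += yi * weight
    return secret

# q(x) = 5x + 20 evaluated at the secret points 2 and 4 gives shares 30 and 40.
print(reconstruct_secret([2, 4], [30, 40]))   # 20
```

Exact rational arithmetic is used so that the recovered value is not affected by floating-point rounding.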

Example 1. Assume that data source D needs to outsource the salary attribute of the Employees table using 3 data service providers, DAS1, DAS2 and DAS3. In order to do this, it chooses 5 random polynomials of degree 1, one for each salary in the table, whose constant term is the salary (n = 3 and k = 2). In addition, secret information X = {x1 = 2, x2 = 4, x3 = 1} is chosen, with one point for each data service provider. The polynomials would then be, for instance, q10(x) = 100x + 10, q20(x) = 5x + 20, q40(x) = x + 40, q60(x) = 2x + 60 and q80(x) = 4x + 80 for salaries {10, 20, 40, 60, 80} respectively. Then, D sends {q10(xi), q20(xi), q40(xi), q60(xi), q80(xi)} to service provider DASi to store them. This is summarized in Fig. 1. Note that neither the polynomials nor the salaries are stored at the service providers; Fig. 1 shows them only for the sake of illustration. The service providers, on the other hand, store the shares derived from the salaries. When a query arrives, D needs to retrieve all shares from the service providers, i.e., {q10(xi), q20(xi), q40(xi), q60(xi), q80(xi)} from DASi. After this, it needs to find the coefficients of each polynomial q and thus all secret salaries (note that receiving any k shares is enough for this since the polynomials are of degree k - 1). In our example, data source D needs to receive shares from any 2 of the service providers; it then computes the coefficients of polynomials q10, q20, q40, q60 and q80 and thus all salaries, 10, 20, 40, 60 and 80, to answer a query asking for salaries more than 40.
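
The numbers of Example 1 can be reproduced with a short script. The dictionary layout and the helper `recover` are illustrative choices; only the points, polynomials and salaries come from the example.

```python
# Secret point of each provider and the slope of each degree-1 polynomial
# q_v(x) = slope * x + v, exactly as listed in Example 1.
X = {"DAS1": 2, "DAS2": 4, "DAS3": 1}
slopes = {10: 100, 20: 5, 40: 1, 60: 2, 80: 4}

# What each provider stores, listed in increasing order of salary.
shares = {das: [slopes[v] * x + v for v in sorted(slopes)]
          for das, x in X.items()}

def recover(das_a, das_b):
    """Solve the two-equation system per salary using two providers (k = 2)."""
    xa, xb = X[das_a], X[das_b]
    salaries = []
    for sa, sb in zip(shares[das_a], shares[das_b]):
        slope = (sa - sb) // (xa - xb)   # exact, since sa - sb = slope * (xa - xb)
        salaries.append(sa - slope * xa)
    return salaries

print(shares["DAS1"])            # [210, 30, 42, 64, 88]
print(recover("DAS1", "DAS3"))   # [10, 20, 40, 60, 80]
```

The shares at DAS1 are not in the same order as the salaries (210 is stored for the smallest salary), so a provider cannot filter a range query on its own and the data source has to fetch every share.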

4. Practical solutions for secure data outsourcing

The solutions proposed in Section 3 are impractical since the data source needs to retrieve all the information from the service providers to execute a query. The communication and computation costs paid for query processing make them impractical. In this section, we will extend the techniques in Section 3 to be able to retrieve only the required data from service providers.

Fig. 1. Demonstration of Example 1.

The key observation to achieve this is that the order of the values in the domain DOM = {v1, v2, ..., vn} needs to remain the same in the shares of the service providers. In other words, if data source D needs to outsource secret values from domain DOM and v1 < v2 < ... < vn, the shares of a service provider DASi, share(v1, i), share(v2, i), ..., share(vn, i), derived from v1, v2, ..., vn respectively, need to preserve the order (i.e., share(v1, i) < share(v2, i) < ... < share(vn, i)). Since the order of the shares at the service provider is not preserved in the solution in Section 3, data service providers cannot filter data. However, if we had a mechanism to construct the polynomials used in Section 3 so that the shares are calculated in an order preserving manner for a specific domain, then data source D could retrieve only the required tuples instead of a superset to answer a query. In this section, we propose an order preserving polynomial building technique to achieve this goal. For the sake of this discussion and without loss of generality, we will assume that polynomials are of degree 3 and of the form $ax^3 + bx^2 + cx + d$ (i.e., k = 4). Given any two secret values v1 and v2 from a domain DOM, we need to construct two polynomials $p_{v_1}(x) = a_1 x^3 + b_1 x^2 + c_1 x + v_1$ and $p_{v_2}(x) = a_2 x^3 + b_2 x^2 + c_2 x + v_2$ for these values such that $p_{v_1}(x) < p_{v_2}(x)$ at all points x used for sharing if v1 < v2. The key observation for our solution is that $p_{v_1}(x) < p_{v_2}(x)$ for all positive x values if a1 < a2, b1 < b2, c1 < c2 and v1 < v2. We first present a simple approach to construct a set of order preserving polynomials and show why it is not secure in Section 4.1. Then, we will present a secure way of constructing order preserving polynomials (Section 4.2).
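
The key observation of this section follows from a one-line calculation. As a short sketch using only the symbols defined above, subtracting the two polynomials gives

```latex
\[
p_{v_2}(x) - p_{v_1}(x)
  = (a_2 - a_1)\,x^3 + (b_2 - b_1)\,x^2 + (c_2 - c_1)\,x + (v_2 - v_1) > 0
  \quad \text{for all } x > 0,
\]
```

because every difference in parentheses is positive by assumption and $x^3$, $x^2$ and $x$ are positive whenever $x > 0$. For negative points the odd powers change sign, so the argument relies on the secret points being positive.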

4.1. Simple order preserving polynomials

A straightforward method to form a set of order preserving polynomials for a specific domain is to use monotonically increasing functions of the secret values to determine the coefficients of the polynomials. In this scheme, we need three monotonically increasing functions fa, fb and fc to find the coefficients of the polynomial $p_{v_s}(x) = ax^3 + bx^2 + cx + v_s$, which is used to divide the secret value vs. The coefficients of the polynomial $p_{v_s}$ are the values of the monotonically increasing functions at the secret value vs, i.e., a = fa(vs), b = fb(vs) and c = fc(vs). Therefore, for two secret values v1 and v2 (v1 < v2) and their respective polynomials $p_{v_1}(x) = f_a(v_1)x^3 + f_b(v_1)x^2 + f_c(v_1)x + v_1$ and $p_{v_2}(x) = f_a(v_2)x^3 + f_b(v_2)x^2 + f_c(v_2)x + v_2$, the value of $p_{v_1}(x)$ is always less than the value of $p_{v_2}(x)$ for all positive x values. Since any service provider DASi gets the value of the polynomials at point xi, the share coming from secret value v1, share(v1, i), would always be less than the share coming from the secret value v2, share(v2, i) (i.e., $p_{v_1}(x_i) < p_{v_2}(x_i)$).
However, this solution is not secure enough to protect secret values from the service providers. For example, assume the following monotonic functions are used: fa(vs) = 3vs + 10, fb(vs) = vs + 27 and fc(vs) = 5vs + 1. Then, the share of service provider DASi from secret value v1 would be
\[
p_{v_1}(x_i) = (3v_1 + 10)\,x_i^3 + (v_1 + 27)\,x_i^2 + (5v_1 + 1)\,x_i + v_1,
\]
which is
\[
p_{v_1}(x_i) = (3x_i^3 + x_i^2 + 5x_i + 1)\,v_1 + 10x_i^3 + 27x_i^2 + x_i.
\]
Basically, to compute the share of a service provider, every secret value is multiplied by the same constant and the same second constant is added. Therefore, a service provider that breaks this method for only one secret item can figure out all of the secret values, and thus this method is easy to break. Instead of simple monotonic functions, more complex monotonic functions can be used. However, an adversary who breaks the method for a single secret can again figure out all the secret items.
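
The weakness can be illustrated numerically: with the example functions, every share at a provider has the form A·v + B for the same two constants A and B. The sketch below assumes, purely for illustration, that the adversary learns two (secret, share) pairs and fits these constants; the leak model and the variable names are assumptions of the sketch and are not taken from the paper.

```python
def simple_share(v, x):
    """Share under the example functions f_a = 3v + 10, f_b = v + 27, f_c = 5v + 1."""
    return (3 * v + 10) * x**3 + (v + 27) * x**2 + (5 * v + 1) * x + v

x_i = 2                                   # secret point, unknown to the provider
secrets = [10, 20, 40, 60, 80]
stored = [simple_share(v, x_i) for v in secrets]

# Adversary side: two leaked pairs are enough to fit share = A * v + B.
(v1, s1), (v2, s2) = (secrets[0], stored[0]), (secrets[1], stored[1])
A = (s2 - s1) // (v2 - v1)
B = s1 - A * v1

# Every other stored share can now be inverted without knowing x_i.
recovered = [(s - B) // A for s in stored]
print(recovered)   # [10, 20, 40, 60, 80]
```

Here A equals 3x³ + x² + 5x + 1 and B equals 10x³ + 27x² + x at x = 2, which matches the rearranged expression above.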

4.2. Order preserving polynomial construction

Since the method used in Section 4.1 to construct an order preserving polynomial is not secure enough, we will propose
another scheme to build order preserving polynomials for values from a specific domain.
In particular, we propose a secure method using different coefficients for each secret value so that service providers
cannot know the relation between secret values except the order.
In polynomial construction, the coefficients a, b and c are chosen from the domains Doma, Domb and Domc. Since the coefficients can be real numbers, the sizes of the coefficient domains are independent of the data domain size. For a finite domain DOM = {v1, v2, ..., vn}, the domains Doma, Domb and Domc are divided into n equal sections. For example, Doma is divided into n slots: $\left[1, \frac{|Dom_a|}{n}\right]$ for v1, $\left[\frac{|Dom_a|}{n} + 1, 2\frac{|Dom_a|}{n}\right]$ for v2, ..., $\left[(n-1)\frac{|Dom_a|}{n} + 1, |Dom_a|\right]$ for vn. After this, coefficient $a_{v_i}$ for value vi is selected from the slot $\left[(i-1)\frac{|Dom_a|}{n} + 1, i\frac{|Dom_a|}{n}\right]$ with the help of a hash function ha, which maps vi to a value from that slot. The other coefficients $b_{v_i}$ and $c_{v_i}$ are computed similarly with hash functions hb and hc from domains Domb and Domc. Finally, the polynomial used to divide the secret value vi into shares would be $p_{v_i}(x) = a_{v_i}x^3 + b_{v_i}x^2 + c_{v_i}x + v_i$.

Example 2. Assume the data domain is DOM = {1, 2, 3, 4, 5}, and we want to construct order preserving polynomials of degree 3 for this domain. In order to do this, we need to find 3 coefficients a, b and c for each value in DOM. Furthermore, assume coefficients a, b and c are chosen from the domains Doma = [1, 25], Domb = [1, 15] and Domc = [1, 50] respectively. The domain of each coefficient is divided into 5 equal pieces since we have 5 elements in domain DOM. For example, Doma is divided into 5 pieces: [1, 5], [6, 10], [11, 15], [16, 20] and [21, 25]. The other domains are divided into similar slots as shown in Fig. 2. Coefficients a, b and c for secret item 3 are selected from the third slots in domains Doma, Domb and Domc respectively with the help of the hash functions ha, hb and hc. Assume for the sake of this example that hash function ha maps secret value 3 to 13, which is in the third slot of Doma, hb maps it to 7 in the third slot of Domb, and hc maps it to 23 in the third slot of Domc. Then, the resulting polynomial for secret value 3 would be $p_3(x) = 13x^3 + 7x^2 + 23x + 3$. Similarly, the polynomial for secret value 5, $p_5(x) = 24x^3 + 14x^2 + 44x + 5$, can be constructed with the same method using values from the 5th slot of each domain. The main observation here is that the value of polynomial p5(x) is always greater than the value of polynomial p3(x) for all positive x values, since the values in the 5th slot are bigger than the values in the 3rd slot.
After constructing the polynomial $p_{v_i}$ for the secret value vi, data source D divides the secret value vi into n pieces to be sent to each of the service providers. In other words, D stores $p_{v_i}(x_1)$ at DAS1, $p_{v_i}(x_2)$ at DAS2, ..., $p_{v_i}(x_n)$ at DASn. The secret value vi is reconstructed as described in Section 3 after getting these shares from the service providers. The service provider DASi storing $p_{v_i}(x_i)$ for the secret value vi cannot know the secret value vi, because it knows neither xi nor anything about the domains of the coefficients Doma, Domb and Domc.
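
The slot-based construction can be sketched as follows. The paper does not fix the hash functions ha, hb and hc; using HMAC-SHA256 under a key held by the data source is an assumption of this sketch, as are the function names, the placeholder key, and the requirement that each coefficient domain size is divisible by n.

```python
import hashlib
import hmac

def slot_bounds(i, n, dom_size):
    """Bounds of the i-th slot (1-based) of [1, dom_size] cut into n equal slots."""
    width = dom_size // n            # assumes dom_size is divisible by n
    return (i - 1) * width + 1, i * width

def pick_coefficient(key, label, value, i, n, dom_size):
    """Map a value into slot i with a keyed hash (HMAC-SHA256, an assumed choice)."""
    lo, hi = slot_bounds(i, n, dom_size)
    digest = hmac.new(key, f"{label}:{value}".encode(), hashlib.sha256).digest()
    return lo + int.from_bytes(digest, "big") % (hi - lo + 1)

def build_polynomial(key, value, domain, dom_sizes):
    """Return (a, b, c, value) for p(x) = a*x^3 + b*x^2 + c*x + value."""
    i = sorted(domain).index(value) + 1      # rank of the value within DOM
    n = len(domain)
    a, b, c = (pick_coefficient(key, label, value, i, n, size)
               for label, size in zip("abc", dom_sizes))
    return a, b, c, value

def share(poly, x):
    a, b, c, d = poly
    return a * x**3 + b * x**2 + c * x + d

DOM = [1, 2, 3, 4, 5]
SIZES = (25, 15, 50)              # |Dom_a|, |Dom_b|, |Dom_c| from Example 2
KEY = b"hypothetical-key"         # placeholder secret kept by the data source
polys = [build_polynomial(KEY, v, DOM, SIZES) for v in DOM]
shares_at_2 = [share(p, 2) for p in polys]   # stored by a provider with x_i = 2
```

Because every coefficient of a larger value comes from a strictly higher slot, `shares_at_2` is strictly increasing for any key. With the polynomials of Example 2, `share((13, 7, 23, 3), 2)` evaluates to 181 and `share((24, 14, 44, 5), 2)` to 341.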

4.3. Properties of the proposed polynomial construction

We will now discuss the security of the proposed polynomial construction technique. Basically, we will discuss what a service provider can infer from the stored data and then show that it cannot know the content of the data with the inferred information.
From the stored data, service provider DASi can know an upper bound on the sum of the domain sizes (i.e., |DOM| + |Doma| + |Domb| + |Domc|). This can only happen when it stores the last secret value from DOM and the coefficients are mapped to the last values of the last slots of the domains for the last secret value vn in the domain. Let us assume for now that this worst case happened. Then, the polynomial for secret value vn would be $p_{v_n}(x) = |Dom_a|x^3 + |Dom_b|x^2 + |Dom_c|x + v_n$ and the share of DASi would be $\mathrm{share}(v_n, i) = p_{v_n}(x_i) = |Dom_a|x_i^3 + |Dom_b|x_i^2 + |Dom_c|x_i + v_n$. From this share, DASi can only know an upper bound on the sum of the sizes of the domains, and that upper bound is too loose to infer anything about the content of the data. Therefore, we can derive the following lemma.

Lemma 1. Data service providers can only know an upper bound on the sum of the domains, the data domain and the coefficient
domains, from the stored information.
Furthermore, data service provider DASi cannot know each domain size or the exact value of the sum of the coefficient domain sizes even if it knows the secret point xi in the worst case scenario described above, because there are 4 unknowns, |Doma|, |Domb|, |Domc| and vn, in the share of DASi, $\mathrm{share}(v_n, i) = p_{v_n}(x_i) = |Dom_a|x_i^3 + |Dom_b|x_i^2 + |Dom_c|x_i + v_n$ (assuming xi is known). Thus, these unknowns cannot be found.

Fig. 2. Demonstration of order preserving polynomial construction.

After the worst case scenario, we now discuss the general case. If data service provider DASi knows the secret point xi and the sum of the coefficient domain sizes |Doma| + |Domb| + |Domc|, it cannot infer anything about the secret items, even if simple hash functions mapping the secret values to the first values in the slots are used. With these simple hash functions, the coefficients of polynomial $p_{v_i}(x)$ would be $a = v_i\frac{|Dom_a|}{n}$, $b = v_i\frac{|Dom_b|}{n}$ and $c = v_i\frac{|Dom_c|}{n}$ (hash functions ha, hb and hc always map secret values to the first value in each slot). Then, the share of DASi would be:
\[
p_{v_i}(x_i) = v_i\frac{|Dom_a|}{n}x_i^3 + v_i\frac{|Dom_b|}{n}x_i^2 + v_i\frac{|Dom_c|}{n}x_i + v_i
\]
In addition to its share, if DASi knows the sum of the sizes of the coefficient domains, |Doma| + |Domb| + |Domc|, there are 5 unknowns (|Doma|, |Domb|, |Domc|, |DOM| and vi, where n = |DOM|) and 2 equations. Therefore, the unknowns, and thus the secret value, are not revealed to data service provider DASi even with these simple hash functions. In our scheme, a service provider can only derive an upper bound on the sum of the domain sizes |Doma| + |Domb| + |Domc| + |DOM| but not the secret point xi. In addition, the hash functions map secret values to any value in the slot, not only the first value. The following lemma can be concluded from this discussion.

Lemma 2. Data service provider DASi cannot know the secret value vi even if it knows the secret point xi.
The service provider DASi storing $\{p_{v_1}(x_i), p_{v_2}(x_i), \ldots, p_{v_n}(x_i)\}$ cannot know the secret values {v1, v2, . . . , vn}. From this information, service provider DASi may learn an upper bound (not tight) for the sum of the sizes of the domains (i.e., $|DOM| + |Dom_a| + |Dom_b| + |Dom_c|$). In order to find the secret values, DASi needs more information such as xi and the size of each domain (the sizes of domains DOM, Doma, Domb and Domc). Thus, the following lemma concludes the security of the proposed scheme.

Lemma 3. Data service provider DASi storing $p_{v_1}(x_i), p_{v_2}(x_i), \ldots, p_{v_n}(x_i)$ cannot know the secret values v1, v2, . . . , vn.
In addition to the security guarantees, for two secret values vi and vj from the same domain, service provider DASi will get the shares $share(v_i, i) = p_{v_i}(x_i)$ (the share of DASi from vi) and $share(v_j, i) = p_{v_j}(x_i)$. If vi < vj then share(vi, i) < share(vj, i). This follows from how the polynomials are constructed. The following lemma formalizes this discussion.

Lemma 4. For any two secret values vi and vj from the same domain, the shares of service provider DASi, $share(v_i, i) = p_{v_i}(x_i)$ (the share of DASi from vi) and $share(v_j, i) = p_{v_j}(x_i)$, preserve the order (i.e., if vi < vj then share(vi, i) < share(vj, i)).
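
The order preservation stated in Lemma 4 can be illustrated with a minimal sketch. It assumes the simplified hash functions from the discussion above (each coefficient is the secret value times a fixed positive factor) and a positive secret point. The domain sizes, the divisor `N_SLOTS` and the function names are hypothetical values chosen only for illustration, not parameters from the paper.

```python
from fractions import Fraction

# Hypothetical parameters, not taken from the paper.
DOM_A, DOM_B, DOM_C = 1000, 2000, 3000   # coefficient domain sizes
N_SLOTS = 100                            # divisor n used by the simple hash functions


def share(v, x_i):
    """Share of secret v at secret point x_i under the simplified hash functions."""
    a = Fraction(v * DOM_A, N_SLOTS)
    b = Fraction(v * DOM_B, N_SLOTS)
    c = Fraction(v * DOM_C, N_SLOTS)
    return a * x_i**3 + b * x_i**2 + c * x_i + v


def is_order_preserved(values, x_i):
    """Check that sorting the distinct secrets also sorts their shares."""
    ordered = sorted(set(values))
    shares = [share(v, x_i) for v in ordered]
    return all(s < t for s, t in zip(shares, shares[1:]))
```

Every coefficient grows with the secret value, so for a positive secret point a larger secret always yields a larger share. This is the property the range and Min/Max/Median queries rely on.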

5. Query processing
In this section, we will discuss how to process queries in the Encryption with Labeling (EL) [20,21] and Secret Dividing (SD)
techniques discussed in Section 4. The queries are Exact Match Queries, Range Queries and Aggregation Queries.

5.1. Exact Match Queries

An example of an Exact Match Query would be ‘‘retrieve the employees whose salary is 20’’. In EL, the execution of the
query is straightforward and it can be executed by simple encryption/decryption.
In SD, data source D needs to retrieve shares from all service providers, DAS1, DAS2, . . . , DASn. Therefore, it rewrites the query into n queries, one for each service provider. For example, the rewritten query for DASi would be: Retrieve all employees whose salary is share(20, i), where share(20, i) is the share of service provider DASi for the secret value 20. In order to find share(20, i), data source D first constructs the polynomial for secret item 20, p20(x), and then computes the share, share(20, i) = p20(xi). After getting the answers from all of the service providers, data source D computes the secret values as described in Section 3.2. As in EL, computation and communication are performed only for those tuples that are needed to answer the query. The communication cost is n times greater than in the EL method. However, solving a polynomial is less computationally expensive than decrypting an item.
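
The exact match flow can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the secret points `X`, the coefficient rule in `make_poly` and the sample salaries are hypothetical, and the recovery step uses Lagrange interpolation at x = 0, which is one standard way to solve for the constant term (Section 3.2 may describe the computation differently).

```python
from fractions import Fraction

# Hypothetical secret points, one per provider, known only to the data source.
X = [2, 3, 5, 7]


def make_poly(v):
    """Illustrative degree-3 polynomial for secret v, constant term first.

    It is deterministic, so equal secrets always produce equal shares.
    """
    return [v, 3 * v + 1, 5 * v + 2, 7 * v + 3]


def share(v, i):
    """Share of provider i for secret v: the polynomial evaluated at X[i]."""
    return sum(c * X[i] ** e for e, c in enumerate(make_poly(v)))


def recover(points):
    """Lagrange interpolation at x = 0 returns the constant term, i.e. the secret."""
    total = Fraction(0)
    for j, (xj, yj) in enumerate(points):
        term = Fraction(yj)
        for m, (xm, _) in enumerate(points):
            if m != j:
                term *= Fraction(-xm, xj - xm)
        total += term
    return total


# Each provider stores only its own column of shares.
salaries = [20, 35, 20, 50]
tables = [[share(v, i) for v in salaries] for i in range(len(X))]


def exact_match(value):
    n = len(X)
    # Rewritten query for provider i: rows whose stored share equals share(value, i).
    hits = [{row for row, s in enumerate(tables[i]) if s == share(value, i)}
            for i in range(n)]
    rows = sorted(set.intersection(*hits))
    # The data source solves for the secret of every returned row.
    return [(row, recover([(X[i], tables[i][row]) for i in range(n)]))
            for row in rows]
```

Only the rows that match are transferred and solved, which mirrors the claim that computation and communication are limited to the tuples needed for the answer.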

5.2. Range Queries

In order to process range queries, the labels are used to find a superset of the answer in the EL method. For example, to execute the query asking for the employees whose salaries are between 20 and 50, the service provider uses the labels to send all the tuples which may be needed to answer the query. The efficiency of the EL method depends on how much privacy is preserved with labeling. Therefore, there is a tradeoff between privacy and performance in this method.
In order to answer the same query in the SD method, data source D rewrites the query into n queries (one for each of the service providers). For example, the query sent to service provider DASi is: All employees whose salaries are between share(20, i) and share(50, i). In order to compute the shares share(20, i) and share(50, i), two order preserving polynomials, p20(x) and p50(x), are constructed (share(20, i) = p20(xi) and share(50, i) = p50(xi)). Service provider DASi then sends all employees whose salaries are between share(20, i) and share(50, i). Since we have an order preserving polynomial construction technique for the domain, DASi can send only the required tuples. After getting this information from the service providers, data source D executes the query by solving the polynomials. Therefore, computation and communication are performed only for those tuples which are required to answer the query. In addition, no information about the underlying data is revealed to the service providers, as opposed to the EL method.
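
A minimal sketch of the provider-side range filter follows. The secret point `X_I`, the polynomial in `share` and the sample salaries are hypothetical; the only property used is that shares are order preserving, so an ordinary sorted index over the shares suffices.

```python
from bisect import bisect_left, bisect_right

X_I = 5  # hypothetical secret point of provider DAS_i


def share(v, x=X_I):
    """Illustrative order preserving polynomial: every coefficient grows with v."""
    return (7 * v + 3) * x**3 + (5 * v + 2) * x**2 + (3 * v + 1) * x + v


# Provider side: only shares are stored, kept sorted as in an ordinary index.
salaries = [10, 20, 25, 40, 50, 65]
stored = sorted(share(v) for v in salaries)


def provider_range(lo_share, hi_share):
    """Return the stored shares in [lo_share, hi_share] without seeing any salary."""
    return stored[bisect_left(stored, lo_share):bisect_right(stored, hi_share)]


# Data source side: the query "salary between 20 and 50" is rewritten with shares.
result = provider_range(share(20), share(50))
```

The provider returns exactly the shares of the salaries 20, 25, 40 and 50, so no superset has to be shipped and filtered at the data source.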

5.3. Aggregation Queries

We consider Sum/Average and Min/Max/Median aggregation queries and how to process them in the EL and SD methods. We classify aggregation queries into two classes: (1) aggregations over exact matches and (2) aggregations over ranges. We will present aggregation query processing techniques with the following example queries:

QUERY-I: Sum/Average of the salaries of the employees whose name is 'John' (Sum/Average over Exact Match).
QUERY-II: Sum/Average of the salaries of the employees whose salary is between 20 and 40 (Sum/Average over Ranges).
QUERY-III: Min/Max/Median of the salaries of the employees whose name is 'John' (Min/Max/Median over Exact Match).
QUERY-IV: Min/Max/Median of all salaries of the employees whose salary is between 20 and 40 (Min/Max/Median over Ranges).

5.3.1. Aggregation query processing with EL

Processing aggregation queries in the EL method is straightforward but inefficient. To process QUERY-III and QUERY-IV, labels are used to find the candidate tuples that can be the Min/Max/Median, and these are sent to the data source. Then, the data source decrypts the incoming tuples and finds the answer. Therefore, the amount of computation and communication depends on the size of the superset, which is usually much bigger than necessary (this is needed to protect privacy).
In order to execute queries QUERY-I and QUERY-II, the service provider sends all the tuples satisfying the conditions of
the queries such as name is John and salary is between 20 and 40. It uses labels while finding the salaries between 20 and 40.
After getting these tuples, the data source decrypts them and performs the summation after that. The query response time
strictly depends on the computation and communication performed and thus the number of tuples needed to execute the
query.

5.3.2. Aggregation query processing with SD

In the SD method, the query execution consists of two steps. In the first step, service providers receive the rewritten queries from the data source and perform an intermediate computation. In the second step, the data source receives the intermediate results from all of the service providers and computes the final answer. The above queries are rewritten as follows and sent to the service provider DASi:

QUERY-I: Sum/Average of the salaries of the employees whose name is share('John', i).
QUERY-II: Sum/Average of the salaries of the employees whose salary is between share(20, i) and share(40, i).
QUERY-III: Min/Max/Median of the salaries of the employees whose name is share('John', i).
QUERY-IV: Min/Max/Median of all salaries of the employees whose salary is between share(20, i) and share(40, i).

Then, DASi finds the tuples needed to answer these queries and performs an intermediate computation over them, which will be discussed later. These intermediate results are then sent to the data source D. After getting all of these intermediate results, data source D computes the final answer. In this scheme, only the intermediate results need to be sent by service providers, while a superset of the required tuples needs to be sent in the EL method. Therefore, the communication cost is negligible (e.g., each provider sends a single value for a sum query), and the query response time is much lower in this scheme.
Assume data source D has secret values (e.g., salaries) V = {v1, v2, . . . , vn}. Recall that, in order to store them, the data source D constructs a set of order preserving polynomials ($a_{v_j} x^{k-1} + b_{v_j} x^{k-2} + \ldots + v_j$ to hide each secret value vj). After generating these polynomials, it computes the share of DASi as $a_{v_j} x_i^{k-1} + b_{v_j} x_i^{k-2} + \ldots + v_j$ for each secret vj and stores these shares at DASi.
The Execution of QUERY-I and QUERY-II: To answer QUERY-I asking for the sum of the l secret values {v1, v2, . . . , vl} from V, DASi computes the intermediate result, $INTRES_i = \sum_{m=1}^{l} share(v_m, i)$. Hence INTRESi can be written as follows:

\[
\begin{aligned}
INTRES_i ={} & a_1 x_i^{k-1} + b_1 x_i^{k-2} + \ldots + v_1 \\
 & + a_2 x_i^{k-1} + b_2 x_i^{k-2} + \ldots + v_2 \\
 & + \cdots \\
 & + a_l x_i^{k-1} + b_l x_i^{k-2} + \ldots + v_l \\
={} & (a_1 + \ldots + a_l) x_i^{k-1} + (b_1 + \ldots + b_l) x_i^{k-2} + \ldots + SUM
\end{aligned}
\]

Thus, service provider DASi sends the intermediate result, INTRESi.


Data source D receives n intermediate results from the service providers and writes the following equations for the intermediate results:

\[
\begin{aligned}
INTRES_1 &= (a_1 + a_2 + \ldots + a_l) x_1^{k-1} + \ldots + SUM \\
INTRES_2 &= (a_1 + a_2 + \ldots + a_l) x_2^{k-1} + \ldots + SUM \\
 &\;\;\vdots \\
INTRES_n &= (a_1 + a_2 + \ldots + a_l) x_n^{k-1} + \ldots + SUM
\end{aligned}
\]

Since X = {x1, x2, . . . , xn} is known by the data source, there are a total of k unknown coefficients including SUM and n ≥ k equations. Therefore, SUM can be found by solving any k of the above equations.
For the average query, DASi sends $INTRES_i = \frac{1}{l} \sum_{m=1}^{l} share(v_m, i)$. Then the data source formulates the equation $INTRES_i = \frac{a_1 + a_2 + \ldots + a_l}{l} x_i^{k-1} + \ldots + AVG$, where $AVG = \frac{v_1 + v_2 + \ldots + v_l}{l}$. Therefore, data source D receives n results from the service providers:

\[
\begin{aligned}
INTRES_1 &= \frac{a_1 + a_2 + \ldots + a_l}{l} x_1^{k-1} + \ldots + AVG \\
INTRES_2 &= \frac{a_1 + a_2 + \ldots + a_l}{l} x_2^{k-1} + \ldots + AVG \\
 &\;\;\vdots \\
INTRES_n &= \frac{a_1 + a_2 + \ldots + a_l}{l} x_n^{k-1} + \ldots + AVG
\end{aligned}
\]

Again, since X = {x1, x2, . . . , xn} is known by the data source, there are k unknown coefficients including AVG and n ≥ k equations. Therefore, AVG can be found by using any k of the above equations.
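
The sum and average computation can be sketched as follows. The secret points, the polynomial used in `share` and the helper names are hypothetical. The k equations are solved with Lagrange interpolation at x = 0, which yields the constant term of the summed polynomial; any other linear solver would serve equally well.

```python
from fractions import Fraction

X = [2, 3, 5, 7]  # hypothetical secret points; degree-3 polynomials, so k = 4


def share(v, x):
    """Illustrative polynomial whose constant term is the secret v."""
    return (7 * v + 3) * x**3 + (5 * v + 2) * x**2 + (3 * v + 1) * x + v


def constant_term(points):
    """Solve the k equations for the constant term (interpolation at x = 0)."""
    total = Fraction(0)
    for j, (xj, yj) in enumerate(points):
        term = Fraction(yj)
        for m, (xm, _) in enumerate(points):
            if m != j:
                term *= Fraction(-xm, xj - xm)
        total += term
    return total


def sd_sum(values):
    # Provider side: each provider adds up its own shares (INTRES_i).
    intres = [sum(share(v, x) for v in values) for x in X]
    # Data source side: the constant term of the summed polynomial is SUM.
    return constant_term(list(zip(X, intres)))


def sd_avg(values):
    # Provider side: the sum of shares divided by the number of tuples l.
    intres = [Fraction(sum(share(v, x) for v in values), len(values)) for x in X]
    return constant_term(list(zip(X, intres)))
```

Each provider sends one number regardless of how many tuples qualify, which is the source of the low communication cost for aggregation queries.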
In order to answer QUERY-II, service provider DASi first finds the shares it stores between share(20, i) and share(40, i). Since
the polynomials are order preserving, DASi can find those shares in this range. Then, the operation performed for QUERY-I is
performed for these tuples to compute the sum of them.
The Execution of QUERY-III and QUERY-IV: The key observation for computing the answers to this set of queries is that if v1 < v2 < . . . < vn, the shares of service provider DASi, share(v1, i), share(v2, i), . . . , share(vn, i), coming from v1, v2, . . . , vn respectively, preserve the order (share(v1, i) < share(v2, i) < . . . < share(vn, i)). This result follows from the fact that order preserving polynomials are used to compute the shares.
Assume l of the secret values, v1, v2, . . . , vl, satisfy the condition of QUERY-III (employees whose name is John). Depending on the query, service provider DASi returns the minimum/maximum/median of its shares, share(v1, i), share(v2, i), . . . , share(vl, i). Without loss of generality, we can assume that the query asks for the minimum. Then, service providers send the minimum of their shares to the data source D. Because the shares preserve the order, the minimum share at every provider comes from the same secret value.
After the service providers send their results back, data source D computes the value of the minimum by using the results of the service providers. The intermediate result of the service provider DASi is in the following form:

\[
INTRES_i = a x_i^{k-1} + \ldots + MIN.
\]

Thus, MIN can be found in the same way as for sum/average queries. Data source D receives n intermediate results from the service providers:

\[
\begin{aligned}
INTRES_1 &= a x_1^{k-1} + \ldots + MIN \\
INTRES_2 &= a x_2^{k-1} + \ldots + MIN \\
 &\;\;\vdots \\
INTRES_n &= a x_n^{k-1} + \ldots + MIN
\end{aligned}
\]

Since X = {x1, x2, . . . , xn} is known by the data source D, the minimum value can be computed by solving any k of the above equations.
In order to answer QUERY-IV, service provider DASi first finds the shares between share(20, i) and share(40, i). Since the
polynomials are order preserving, DASi can find those shares in this range. Then, the operation performed for QUERY-III is
applied for these tuples to compute the answer.
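
A minimal sketch of the Min/Max/Median case, under the same hypothetical secret points and illustrative polynomial as before: each provider selects one of its own shares, and because the shares are order preserving, every provider selects the share of the same secret. The median is taken with `statistics.median_low` so that the result is always an actual stored share; how the paper handles an even number of tuples is not stated, so that choice is an assumption.

```python
from fractions import Fraction
from statistics import median_low

X = [2, 3, 5, 7]  # hypothetical secret points; degree-3 polynomials, so k = 4


def share(v, x):
    """Illustrative order preserving polynomial with constant term v."""
    return (7 * v + 3) * x**3 + (5 * v + 2) * x**2 + (3 * v + 1) * x + v


def constant_term(points):
    """Solve the k equations for the constant term (interpolation at x = 0)."""
    total = Fraction(0)
    for j, (xj, yj) in enumerate(points):
        term = Fraction(yj)
        for m, (xm, _) in enumerate(points):
            if m != j:
                term *= Fraction(-xm, xj - xm)
        total += term
    return total


def sd_select(values, pick):
    # Provider side: pick is min, max or median_low applied to the provider's shares.
    intres = [pick([share(v, x) for v in values]) for x in X]
    # Data source side: all picked shares lie on one polynomial; solve for its secret.
    return constant_term(list(zip(X, intres)))
```

For example, `sd_select(salaries, min)` recovers the smallest salary while each provider only ever compares shares.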

5.4. Discussion

We have considered only numeric attributes so far and the proposed technique is for numeric attributes. In order to apply our scheme to non-numeric attributes, we need to convert them to numeric attributes. This conversion is straightforward. For example, the attribute name, with a length of 5 characters (i.e., VARCHAR(5)), can be represented as a numeric attribute although it is in fact a non-numeric attribute. For the sake of this discussion, assume the characters in names can be one of the letters in the English alphabet and names can be shorter than 5 characters. Thus, the regular expression for this attribute is (A|B| . . . |Z|*)^5 where * represents blank. The name attribute consists of a combination of 27 possible characters which are enumerated (* = 0, A = 1, B = 2, C = 3, . . . , Z = 26), and thus each name can represent a number in a number system of base 27. For example, name "ABC**" can be rewritten as (12300)_27, which is equal to 572994 in decimal. With this simple enumeration technique, non-numeric attributes can be converted into numeric attributes and then the proposed outsourcing technique can be applied directly. With the proposed enumeration technique, widely used queries over non-numeric attributes can be handled easily. For example, a query asking for employees whose name starts with "AB" or a query asking for employees whose name is between "Albert" and "Jack" can be converted into range queries and executed with the range query processing technique in this paper.
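
The enumeration can be sketched as below, assuming the 26 English letters plus the blank (27 symbols, hence base 27) and a fixed width of 5 characters. The function names are hypothetical. A prefix query becomes a range whose lower bound pads the prefix with blanks and whose upper bound pads it with the largest symbol.

```python
ALPHABET = "*ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # '*' is the blank, so the base is 27
WIDTH = 5


def encode(name):
    """Map a name of at most WIDTH letters to its base-27 number."""
    if len(name) > WIDTH:
        raise ValueError("name longer than the attribute width")
    value = 0
    for ch in name.upper().ljust(WIDTH, "*"):
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    return value


def prefix_range(prefix):
    """Numeric range covering every name that starts with the given prefix."""
    low = encode(prefix)                     # prefix followed by blanks
    high = encode(prefix.ljust(WIDTH, "Z"))  # prefix followed by the largest symbol
    return low, high
```

The two bounds returned by `prefix_range("AB")` can then be turned into shares and sent as an ordinary range query.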
Moreover, we assumed data sources have only one table for the sake of the presentation and thus did not consider joins of tables. If they had more than one table in their schemas, they might need to join these tables. Our technique can be applied if these tables are related to each other through referential keys and the join is based on these keys. Consider a simple schema consisting of two tables:
Employees (EID, Name, Lastname, Department, Salary).
Managers (EID, ManagerID, ManagerUserName, Password).
A possible query may ask for the salaries of all managers. To execute this query, these two tables should be joined using the attribute EID. Our scheme can be directly applied to execute this query since the join is based on two attributes which are from the same domain, and our polynomials are constructed for each domain, not for each attribute. Therefore, this join can be done by the service provider at the service provider site. However, if a join is based on two attributes from different domains, such as Name and ManagerUserName, then the approach in this paper cannot be used. Thus, the query asking for the salaries of the managers whose name is the same as the ManagerUserName cannot be answered with the proposed scheme.
Finally, if we need to compare two attributes from different domains to execute a query, the proposed technique cannot be applied. For example, a query asking for employees whose salary is 10 times their age cannot be answered efficiently with our technique. On the other hand, a query asking for the employees whose salary is more than the salary of their managers can be executed efficiently. Furthermore, a query asking for the employees whose salary is 2 times the salary of their managers can be executed efficiently too. The execution of these queries is straightforward with the basic methods in this section. In order to answer all kinds of queries efficiently with the proposed technique, we need to represent all attributes with a universal domain. If we had such a domain, we could compare all attributes with each other and join tables based on any subset of the attributes. We leave forming a universal domain as future work.

6. Fault tolerance

There are two issues related to the fault tolerance: (1) Service availability and (2) Malicious service providers. Both of
these issues are very important in using database services.
Data sources always need to answer their queries. In our scheme, a polynomial of degree k − 1 is used to divide the secret, and thus k shares from k parties are needed to compute the secret. Therefore, in the secret dividing scheme, if k of the n service providers are available, the queries can be answered using the shares coming from these service providers.
Another important problem is dealing with malicious service providers. These malicious service providers may corrupt the shares they store (intentionally or unintentionally). Therefore, there must be a mechanism to detect malicious behavior and to execute queries correctly in spite of it. In this section, we explore the fault tolerance of the proposed data outsourcing technique. In response to a query, each of the n service providers sends its shares or intermediate results to the data source. Then the data source solves the linear system and computes the secret values which are the answer to the query. Since the results retrieved from service providers are k consistent (i.e., any k of the n equations give the same value for the answer of the query), solving the linear system for any k of them is sufficient if all service providers are honest. However, some of the service providers may be malicious and may send incorrect values, so solving one linear system of k equations may not be sufficient in this case. There are $\binom{n}{k}$ different possible linear systems that could be used to find the answer. If there are n − t honest service providers, then the solutions of $\binom{n-t}{k}$ linear systems would give the same value, which is the answer of the query. However, any other linear system with at least one malicious third party would, with high probability, give a different value. This follows from Lemma 5.

Lemma 5. Two different equation sets (i.e., two different linear systems) each with at least one malicious service provider produce
the same value as the result with a very low probability.

Proof. Two different linear systems $A x_1 = a$ and $B x_2 = b$ produce the same solution, i.e., $x_1 = x_2$, if $b = B A^{-1} a$. The probability that the solutions of the two linear systems are the same is equal to the probability of receiving the same b with randomly chosen numbers, which is $1/|Dom|^k$. This is because providers do not know the matrices A and B. The probability of two sets of results coming from service providers giving the same incorrect value is, therefore, negligibly small for large domains. □
Therefore, if k is chosen to be smaller than the number of honest service providers, n − t, then $\binom{n-t}{k}$ results would be the same, while the rest of the results could be different if at least one malicious provider is involved. Having such a mechanism would push service providers to behave honestly since a data source can prove the dishonesty of the malicious service providers.
It is possible to optimize this scheme using the result of Lemma 5. Obtaining the same value from two different linear systems is sufficient to conclude that the value is correct. Throughout query processing, the data source can determine the possibly trustworthy service providers and can use two such sets to execute the queries. Whenever there is a conflict between the two sets, the query poser would use the other sets to compute the final result.
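
The consistency check can be sketched as follows. This is the unoptimized variant that solves every k-subset, written only to illustrate the idea; the function names and the sample polynomial are hypothetical, and the k equations are solved with Lagrange interpolation at x = 0.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def constant_term(points):
    """Solve k equations for the constant term (interpolation at x = 0)."""
    total = Fraction(0)
    for j, (xj, yj) in enumerate(points):
        term = Fraction(yj)
        for m, (xm, _) in enumerate(points):
            if m != j:
                term *= Fraction(-xm, xj - xm)
        total += term
    return total


def robust_answer(points, k):
    """Solve every k-subset; return the most frequent answer, its count and suspects."""
    results = {subset: constant_term(subset) for subset in combinations(points, k)}
    answer, count = Counter(results.values()).most_common(1)[0]
    # Providers that appear in at least one agreeing subset are treated as trusted.
    trusted = {p for subset, value in results.items() if value == answer for p in subset}
    suspects = [p for p in points if p not in trusted]
    return answer, count, suspects


# Hypothetical degree-2 polynomial (k = 3) hiding the secret 42, five providers.
def p(x):
    return 4 * x**2 + 9 * x + 42


points = [(x, p(x)) for x in range(1, 6)]
points[2] = (3, p(3) + 100)  # the provider at x = 3 returns a corrupted share
```

With one corrupted share out of five and k = 3, the four subsets made of honest providers agree on the secret, while each subset containing the corrupted share gives a different value in this example.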

7. Evaluation

In this section, we will compute the query response time of the two techniques EL and SD for exact match, range and
aggregation queries such as sum and average.
Let Cd be the cost of decryption, B be the bandwidth, T be the number of tuples required to answer the query, and S be the selectivity of the filtration. To answer the query in the EL method, data source D retrieves S · T tuples and decrypts all of them. If the size of each tuple is b, then the communication cost is $\frac{S \cdot T \cdot b}{B}$ and the cost of computation is $S \cdot T \cdot C_d$. Hence the query response time for all queries is

\[
\frac{S \cdot T \cdot b}{B} + S \cdot T \cdot C_d.
\]

The selectivity ratio S is equal to 1 in exact match queries and thus the query response time for exact match queries is:

\[
\frac{T \cdot b}{B} + T \cdot C_d.
\]

For the SD method, data source D needs to retrieve T tuples (i.e., shares) from n service providers for the exact match and range queries. Let Cp be the cost of computing the coefficients of the polynomial. Then the query response time for exact match and range queries in the SD method is

\[
\frac{n \cdot T \cdot b}{B} + T \cdot C_p,
\]

where $\frac{n \cdot T \cdot b}{B}$ is the communication cost and $T \cdot C_p$ is the computation cost. The query response time for aggregation queries is quite different in the SD method. For these queries, service providers perform an intermediate computation which consists of T additions. If the cost of an addition is Ca and the size of the intermediate result is b, the query response time for aggregation queries is

\[
\frac{n \cdot b}{B} + T \cdot C_a + C_p,
\]

where $\frac{n \cdot b}{B}$ is the cost of retrieving intermediate results and Cp is the cost of the final computation to find the query result from the intermediate results.
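
The cost formulas above translate directly into a small model. The function names are hypothetical, and the per-operation costs Cd, Cp and Ca must be supplied by the caller because they are not given numerically here; the bandwidth constant reads 60 Kb/s as 60,000 bits per second, which is an assumption.

```python
TUPLE_BITS = 1024         # b, size of a tuple or intermediate result in bits
BANDWIDTH_BPS = 60_000    # B, assuming 60 Kb/s means 60,000 bits per second


def el_time(T, S, C_d, b=TUPLE_BITS, B=BANDWIDTH_BPS):
    """EL: transfer and decrypt S*T tuples (S = 1 for exact match queries)."""
    return S * T * b / B + S * T * C_d


def sd_time(T, n, C_p, b=TUPLE_BITS, B=BANDWIDTH_BPS):
    """SD exact match and range queries: T shares from each of n providers."""
    return n * T * b / B + T * C_p


def sd_aggregate_time(T, n, C_a, C_p, b=TUPLE_BITS, B=BANDWIDTH_BPS):
    """SD aggregation: one intermediate result per provider plus T additions."""
    return n * b / B + T * C_a + C_p
```

The model makes the structural difference visible: the communication term of `sd_aggregate_time` does not depend on T, whereas in `el_time` and `sd_time` it grows linearly with the number of tuples.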
We simulate the two systems and compare the query response times with the following parameters: b = 1024 bits and B = 60 Kb/s. For the SD method, we used 3 service providers and polynomials of degree 3; symmetric encryption is used for the EL method. Fig. 3 shows the query response time for exact match queries. The query response time varies with the number of tuples in the query answer. We varied the number of tuples from 100 to 1000. The query response times of the two methods are close to each other, as shown in Fig. 3.

[Figure: query response time versus the number of tuples (100 to 1000); series: EL and SD.]
Fig. 3. The query response time (in s) for exact match queries.

[Figure: query response time versus the number of tuples (100 to 1000); series: EL with S = 1.25, 1.5, 1.75 and 2, and SD.]
Fig. 4. The query response time (in s) for range queries.

[Figure: query response time versus the number of tuples (100 to 1000); series: EL with S = 1.25, 1.5, 1.75 and 2, and SD.]
Fig. 5. The query response time (in s) for aggregate queries.

The query response times for range queries are shown in Fig. 4. We varied the number of tuples in the query result from 100 to 1000 and the selectivity ratio S from 1.25 to 2 for the EL method. Again, the query response times of the EL method are very close to those of the SD method. However, the privacy leakage is greater in the EL method due to labeling.
The query response times for aggregate queries are shown in Fig. 5. We varied the number of aggregated tuples in the query from 100 to 1000 and the selectivity ratio S from 1.25 to 2 for the EL method. The SD method is much faster than the EL method since the cost of communication is lower in the SD method (only the intermediate results are sent instead of all tuples).
The query response time for our proposal SD is comparable to the EL method for exact match queries and slightly better than the EL method for range queries while preserving more privacy than the EL method. Note that we assume there is a slow communication link between the service providers and the data source (B = 60 Kbits/s). Since our proposal is less computation intensive, increasing the bandwidth improves its query response time more than that of the EL method. The SD method gives very efficient query response times for the aggregation queries compared to the EL method since the cost of computation is almost zero.

8. Conclusion

We proposed a novel privacy preserving data outsourcing framework in this paper. The proposed data outsourcing framework provides efficient and scalable query response times by introducing new efficient methods to store data at several service providers and to query them in a privacy preserving manner. Since the proposed technique uses several service providers, it guarantees the availability of the services. Furthermore, dishonest or faulty service providers can be detected without overhead on the query response time. However, there are several issues left as future work, such as dealing with infinite data domains, forming a universal data domain and transaction management.

References

[1] Advances in cryptology – crypto 2007, in: A. Menezes, (Ed.), 27th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 19–23,
2007, Proceedings, CRYPTO, Volume 4622 of Lecture Notes in Computer Science, Springer, 2007.
[2] G. Aggarwal, M. Bawa, P. Ganesan, H. Garcia-Molina, K. Kenthapadi, N. Mishra, R. Motwani, U. Srivastava, D. Thomas, J. Widom, Y. Xu, Enabling privacy
for the paranoids, in: Proc. of the 30th Int’l Conference on Very Large Databases VLDB, August 2004, pp. 708–719.
[3] G. Aggarwal, M. Bawa, P. Ganesan, H. Garcia-Molina, K. Kenthapadi, R. Motwani, U. Srivastava, D. Thomas, Y. Xu, Two can keep a secret: a distributed
architecture for secure database services, in: CIDR, 2005, pp. 186–199.
[4] G. Aggarwal, N. Mishra, B. Pinkas, Privacy-preserving computation of the k’th-ranked element, in: Proc. of IACR Eurocrypt, 2004, pp. 40–55.
[5] R. Agrawal, A. Evfimievski, R. Srikant, Information sharing across private databases, in: Proc. of the 2003 ACM SIGMOD International Conference on Management of Data, 2003, pp. 86–97.
[6] R. Agrawal, P.J. Haas, J. Kiernan, A system for watermarking relational databases, in: Proc. of the 2003 ACM SIGMOD International Conference on Management of Data, ACM Press, 2003, p. 674.
[7] R. Agrawal, J. Kiernan, R. Srikant, Y. Xu, Hippocratic databases, in: 28th Int’l Conf. on Very Large Databases (VLDB), Hong Kong, August 2002.
[8] R. Agrawal, J. Kiernan, R. Srikant, Y. Xu, Implementing p3p using database technology, in: Proc. of the 19th Int’l Conference on Data Engineering,
Bangalore, India, March 2003.
[9] R. Agrawal, J. Kiernan, R. Srikant, Y. Xu, Order preserving encryption for numeric data, in: SIGMOD ’04: Proceedings of the 2004 ACM SIGMOD
International Conference on Management of Data, ACM Press, New York, NY, USA, 2004, pp. 563–574.
[10] R. Agrawal, R. Srikant, Privacy-preserving data mining, in: Proc. of the 2000 ACM SIGMOD International Conference on Management of Data, ACM
Press, 2000, pp. 439–450.
[11] S. Agrawal, J.R. Haritsa, A framework for high-accuracy privacy-preserving mining, in: ICDE, 2005, pp. 193–204.
[12] E. Bertino, B.C. Ooi, Y. Yang, R.H. Deng, Privacy and ownership preserving of outsourced medical data, in: ICDE, 2005.
[13] C. Cachin, S. Micali, M. Stadler, Computationally private information retrieval with polylogarithmic communication, Lecture Notes in Computer Science
1592 (1999) 402–414.
[14] B. Chor, N. Gilboa, Computationally private information retrieval (extended abstract), in: Proc. of the Twenty-Ninth Annual ACM Symposium on Theory
of Computing, ACM Press, 1997, pp. 304–313.
[15] C. Clifton, M. Kantarcioglu, J. Vaidya, X. Lin, M.Y. Zhu, Tools for privacy preserving distributed data mining, SIGKDD Exploration Newsletter 4 (2) (2002)
28–34.
[16] F. Emekci, D. Agrawal, A.E. Abbadi, Abacus: A distributed middleware for privacy preserving data sharing across private data warehouses, in: ACM/IFIP/
USENIX 6th International Middleware Conference, 2005.
[17] F. Emekci, D. Agrawal, A.E. Abbadi, A. Gulbeden, Privacy preserving query processing using third parties, in: ICDE, 2006.
[18] A. Evfimievski, R. Srikant, R. Agrawal, J. Gehrke, Privacy preserving mining of association rules, in: Proc. of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM Press, 2002, pp. 217–228.
[19] V. Ganapathy, D. Thomas, T. Feder, H. Garcia-Molina, R. Motwani, Distributing data for secure database services, Transactions on Data Privacy 5 (1)
(2012) 253–272.
[20] H. Hacigumus, B.R. Iyer, C. Li, S. Mehrotra, Executing SQL over encrypted data in the database service provider model, in: SIGMOD Conference, 2002.
[21] B. Hore, S. Mehrotra, G. Tsudik, A privacy-preserving index for range queries, in: Proc. of the 30th Int’l Conference on Very Large Databases VLDB, 2004,
pp. 720–731.
[22] B. Hore, S. Mehrotra, G. Tsudik, A privacy-preserving index for range queries, in: VLDB, 2004, pp. 720–731.
[23] S. Kamara, K. Lauter, Cryptographic cloud storage, in: Financial Cryptography Workshops, 2010, pp. 136–149.
[24] M. Li, S. Yu, N. Cao, W. Lou, Authorized private keyword search over encrypted data in cloud computing, in: ICDCS, 2011, pp. 383–392.
[25] Y. Lindell, B. Pinkas, Privacy preserving data mining, in: Proc. of the 20th Annual International Cryptology Conference on Advances in Cryptology,
Springer-Verlag, 2000, pp. 36–54.
[26] T. Miyamoto, S. Doi, H. Nogawa, S. Kumagai, Autonomous distributed secret sharing storage system, Systems and Computers in Japan 37 (6) (2006) 55–
63.
[27] A. Parakh, S. Kak, Recursive secret sharing for distributed storage and information hiding, CoRR, abs/1001.3331 (2010).
[28] S. Rizvi, J.R. Haritsa, Maintaining data privacy in association rule mining, in: Proc. of the 28th Int’l Conference on Very Large Databases, August 2002,
pp. 682–693.
[29] A. Shamir, How to share a secret, Communications of the ACM 22 (11) (1979) 612–613.

30 pages
6 BSTs and AVL Trees
No ratings yet
6 BSTs and AVL Trees
12 pages
1 s2.0 S1877050919317065 Main
No ratings yet
1 s2.0 S1877050919317065 Main
8 pages
7 Query Authentication
No ratings yet
7 Query Authentication
66 pages
33 July2021
No ratings yet
33 July2021
12 pages
Unit 5 File Management PDF
No ratings yet
Unit 5 File Management PDF
40 pages
DDMS
No ratings yet
DDMS
7 pages
Codaspy 2010
No ratings yet
Codaspy 2010
12 pages
Picking A Vector Database - A Comparison and Guide For 2023
No ratings yet
Picking A Vector Database - A Comparison and Guide For 2023
3 pages
My Datastage Notes - SCD
No ratings yet
My Datastage Notes - SCD
4 pages
Bank Management System
No ratings yet
Bank Management System
26 pages
PPQP Ieee 04092014
No ratings yet
PPQP Ieee 04092014
6 pages
10 A Privacy-Chain Based Homomorphic Encryption Scheme and Statistical Method
No ratings yet
10 A Privacy-Chain Based Homomorphic Encryption Scheme and Statistical Method
21 pages
Chand Mohammad Resume
No ratings yet
Chand Mohammad Resume
2 pages
Securing Privacy and Maintaining Data Confidentiality Are Fundamental Principles For Establishing A Trustworthy Database IJERTV12IS120090
No ratings yet
Securing Privacy and Maintaining Data Confidentiality Are Fundamental Principles For Establishing A Trustworthy Database IJERTV12IS120090
4 pages
AZ-900 Microsoft Azure Fundamentals: Exam Prep Question Bank
From Everand
AZ-900 Microsoft Azure Fundamentals: Exam Prep Question Bank
Krumu Publisher
No ratings yet
Consise Cloud Compute: It Professionals’ Handbook
From Everand
Consise Cloud Compute: It Professionals’ Handbook
Vijay
No ratings yet
Cloud computing: Moving IT out of the office
From Everand
Cloud computing: Moving IT out of the office
BCS, The Chartered Institute for IT
No ratings yet
Automated Network Technology: The Changing Boundaries of Expert Systems
From Everand
Automated Network Technology: The Changing Boundaries of Expert Systems
Carl P. Catalano Ph.D.
No ratings yet
Implementation of a Central Electronic Mail & Filing Structure
From Everand
Implementation of a Central Electronic Mail & Filing Structure
Patapios Tranakas
No ratings yet
The Pandemic: Driven New Age of Cloud Computing
From Everand
The Pandemic: Driven New Age of Cloud Computing
VNS Surendra Chimakurthi
No ratings yet
Comptia Network+ V6 Study Guide - Indie Copy
From Everand
Comptia Network+ V6 Study Guide - Indie Copy
Matthew Bennett
5/5 (1)
Cloud Computing: The Untold Origins of Cloud Computing (Manipulation, Configuring and Accessing the Applications Online)
From Everand
Cloud Computing: The Untold Origins of Cloud Computing (Manipulation, Configuring and Accessing the Applications Online)
William Cormier
No ratings yet
Network Coding and Signcryption for Cloud Data Integrity
From Everand
Network Coding and Signcryption for Cloud Data Integrity
Noah Joan
No ratings yet
AZURE AZ 500 STUDY GUIDE-1: Microsoft Certified Associate Azure Security Engineer: Exam-AZ 500
From Everand
AZURE AZ 500 STUDY GUIDE-1: Microsoft Certified Associate Azure Security Engineer: Exam-AZ 500
Mamta Devi
No ratings yet
Google Cloud Platform for Data Engineering: From Beginner to Data Engineer using Google Cloud Platform
From Everand
Google Cloud Platform for Data Engineering: From Beginner to Data Engineer using Google Cloud Platform
alasdair gilchrist
5/5 (1)
Edge Computing Applications in Supply Chain Management
From Everand
Edge Computing Applications in Supply Chain Management
Bo Li
No ratings yet
Azure Fundamentals Exam Insights
From Everand
Azure Fundamentals Exam Insights
Priyanka Banerjee
No ratings yet
Cybersecurity in Cloud Computing
From Everand
Cybersecurity in Cloud Computing
Akula Achari
No ratings yet
Engineering Data Mesh in Azure Cloud: Implement data mesh using Microsoft Azure's Cloud Adoption Framework
From Everand
Engineering Data Mesh in Azure Cloud: Implement data mesh using Microsoft Azure's Cloud Adoption Framework
Aniruddha Deswandikar
No ratings yet
Cloud: Get All The Support And Guidance You Need To Be A Success At Using The CLOUD
From Everand
Cloud: Get All The Support And Guidance You Need To Be A Success At Using The CLOUD
John Hawkins
No ratings yet
Cloud-Based Multi-Modal Information Analytics
From Everand
Cloud-Based Multi-Modal Information Analytics
Tanushri Kaniyar
No ratings yet
Azure Fundamentals Success Kit
From Everand
Azure Fundamentals Success Kit
PRIYANKA
No ratings yet
The Ultimate Guide to Unlocking the Full Potential of Cloud Services: Tips, Recommendations, and Strategies for Success
From Everand
The Ultimate Guide to Unlocking the Full Potential of Cloud Services: Tips, Recommendations, and Strategies for Success
Rick Spair
No ratings yet
The Cloud Computing Revolution: From Virtualization to Automation: Unveiling the Cloud Computing Revolution
From Everand
The Cloud Computing Revolution: From Virtualization to Automation: Unveiling the Cloud Computing Revolution
Lisa Carter
No ratings yet
Data Mining 101: Core Concepts and Algorithms
From Everand
Data Mining 101: Core Concepts and Algorithms
Swarnalata Verma
No ratings yet
AWS Certified Solutions Architect #1 Audio Crash Course Guide To Master Exams, Practice Test Questions, Cloud Practitioner and Security
From Everand
AWS Certified Solutions Architect #1 Audio Crash Course Guide To Master Exams, Practice Test Questions, Cloud Practitioner and Security
Jamie Murphy
No ratings yet
Comptia Cloud+ CV0 - 004: 715 Questions and Explanation
From Everand
Comptia Cloud+ CV0 - 004: 715 Questions and Explanation
Arabella Kushner
No ratings yet
Data Entry Operator: Skills, Software, Career Tips, and Interview Q&A
From Everand
Data Entry Operator: Skills, Software, Career Tips, and Interview Q&A
Sumitra Kumari
No ratings yet
AZ-900 Azure Fundamentals Practice Paper 5: AZ-900 Azure Fundamentals, #5
From Everand
AZ-900 Azure Fundamentals Practice Paper 5: AZ-900 Azure Fundamentals, #5
Tech Interviews
No ratings yet
Cloud Computing
From Everand
Cloud Computing
Dr. Nirvikar Katiyar
No ratings yet
AZ-900 Azure Fundamentals Practice Paper 4: AZ-900 Azure Fundamentals, #4
From Everand
AZ-900 Azure Fundamentals Practice Paper 4: AZ-900 Azure Fundamentals, #4
Tech Interviews
No ratings yet
Computer Science Self Management: Fundamentals and Applications
From Everand
Computer Science Self Management: Fundamentals and Applications
Fouad Sabry
No ratings yet
Cloud Computing Essentials: A Practical Guide with Examples
From Everand
Cloud Computing Essentials: A Practical Guide with Examples
William E. Clark
No ratings yet
The Power of Big Data: Transforming Industries and Shaping the Future
From Everand
The Power of Big Data: Transforming Industries and Shaping the Future
Tom Henricksen
No ratings yet
Enterprise Data Protection with Rubrik: Definitive Reference for Developers and Engineers
From Everand
Enterprise Data Protection with Rubrik: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Enterprise Data Protection with Veritas Technologies: Definitive Reference for Developers and Engineers
From Everand
Enterprise Data Protection with Veritas Technologies: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
InfluxDB Essentials: Definitive Reference for Developers and Engineers
From Everand
InfluxDB Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Efficient Data Preparation with AWS Glue DataBrew: Definitive Reference for Developers and Engineers
From Everand
Efficient Data Preparation with AWS Glue DataBrew: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Data Integration with Blendo: Definitive Reference for Developers and Engineers
From Everand
Data Integration with Blendo: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
IGNOU MCS 227 Cloud Computing and IoT Previous Years Solved Papers
From Everand
IGNOU MCS 227 Cloud Computing and IoT Previous Years Solved Papers
Manish Soni
No ratings yet
Microsoft Azure Text Book
From Everand
Microsoft Azure Text Book
Manish Soni
No ratings yet
Deepset Cloud for Intelligent Search and Question Answering: The Complete Guide for Developers and Engineers
From Everand
Deepset Cloud for Intelligent Search and Question Answering: The Complete Guide for Developers and Engineers
William Smith
No ratings yet
Metaplane for Data Reliability Engineering: The Complete Guide for Developers and Engineers
From Everand
Metaplane for Data Reliability Engineering: The Complete Guide for Developers and Engineers
William Smith
No ratings yet

Aaemw 2

Uploaded by

Aaemw 2

Uploaded by

Information Sciences 263 (2014) 198–210

Contents lists available at ScienceDirect

Dividing secrets to secure data outsourcing

⇑ Corresponding author. Tel.: +90 312 5515000.

2. Solution overview and background

2.1. Model and problem formulation

2.2. Other related work

3. Simple solutions for outsourcing numeric attributes

Algorithm 1. Secret Dividing Algorithm

4. Practical solutions for secure data outsourcing

Fig. 1. Demonstration of Example 1.

4.1. Simple order preserving polynomials

4.2. Order preserving polynomial construction

4.3. Properties of the proposed polynomial construction

Fig. 2. Demonstration of order preserving polynomial construction.

5.1. Exact Match Queries

5.2. Range Queries

5.3. Aggregation Queries

5.3.1. Aggregation query processing with EL

5.3.2. Aggregation query processing with SD

Thus, service provider DASi sends the intermediate result, INTRESi.

INTRES1 = (a1 + a2 + ... + al) xk1

The query response time

Fig. 4. The query response time (in s) for range queries.

Fig. 5. The query response time (in s) for aggregate queries.
