Secure Data Transfer and Deletion From Counting Bloom Filter
A Project Report submitted in partial fulfillment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE & ENGINEERING
by
Y. SHRAVANI (17271A05A2)
G. SOUMYA (17271A05A6)
V. SWAPOORVA (17271A05B2)
G. SWATHI (17271A05B3)
Under the Esteemed guidance of
Mr. P. BALAKISHAN
Associate Professor (CSE Dept.)
CERTIFICATE
This is to certify that the Project Report entitled “SECURE DATA TRANSFER AND DELETION FROM COUNTING BLOOM FILTER”, being submitted by Y. SHRAVANI (17271A05A2), G. SOUMYA (17271A05A6), V. SWAPOORVA (17271A05B2) and G. SWATHI (17271A05B3) in partial fulfillment of the requirements for the award of the Degree of Bachelor of Technology in Computer Science & Engineering to the Jyothishmathi Institute of Technology & Science, Karimnagar, during the academic year 2020-21, is a bonafide work carried out by them under my guidance and supervision. The results presented in this Project Work have been verified and are found to be satisfactory. The results embodied in this Project Work have not been submitted to any other University for the award of any other degree or diploma.
External Examiner
ACKNOWLEDGEMENT
We would like to express our sincere gratitude to our advisor, Mr. P. Balakishan, Associate Professor, CSE Dept., whose knowledge and guidance have motivated us to achieve goals we never thought possible. The time we have spent working under his supervision has truly been a pleasure.
The experience gained from this kind of work is great and will be useful to us in the future. We thank Dr. R. Jegadeesan, Professor & HOD, CSE Dept., for his effort, kind cooperation, guidance and encouragement to do this work, and also for providing the facilities to carry it out.
It is a great pleasure to convey our thanks to Dr. G. Lakshmi Narayana Rao, Principal, Jyothishmathi Institute of Technology & Science, and to the College Management for permitting us to undertake this project and providing excellent facilities to carry out our project work.
We thank all the faculty members of the Department of Computer Science & Engineering for sharing their valuable knowledge with us. We extend our thanks to the technical staff of the department for their valuable suggestions on technical problems.
Finally, special thanks to our parents for their support and encouragement throughout our life and this course. Thanks to all our friends and well-wishers for their constant support.
DECLARATION
We hereby declare that the work which is being presented in this dissertation entitled “SECURE DATA TRANSFER AND DELETION FROM COUNTING BLOOM FILTER” is our own work, carried out under the guidance of Mr. P. Balakishan, Associate Professor, CSE Dept.
To the best of our knowledge and belief, this project bears no resemblance with any report submitted
to JNTUH or any other University for the award of any degree or diploma.
Y. SHRAVANI (17271A05A2)
G. SOUMYA (17271A05A6)
V. SWAPOORVA (17271A05B2)
G. SWATHI (17271A05B3)
Date:
Place: Karimnagar
ABSTRACT
With the rapid development of cloud storage, an increasing number of data owners prefer to outsource
their data to the cloud server, which can greatly reduce the local storage overhead. Because different cloud service providers offer data storage services of distinct quality, e.g., in security, reliability, access speed and price, data owners may change their cloud service provider, making cloud data transfer a fundamental requirement. Hence, how to securely migrate the data from one cloud to another and permanently delete the
transferred data from the original cloud becomes a primary concern of data owners. To solve this problem, we
construct a new counting Bloom filter-based scheme in this project. The proposed scheme not only can achieve
secure data transfer but also can realize permanent data deletion. Additionally, the proposed scheme can satisfy
the public verifiability without requiring any trusted third party. Finally, we also develop a simulation
implementation that demonstrates the practicality and efficiency of our proposal.
Table of Contents
1. INTRODUCTION
1.1 Introduction
1.2 Existing System
1.3 Problem Statement
1.3.1 System Framework
1.3.2 Design Goals
1.4 Proposed System
1.4.1 Proposed System
1.4.2 Objectives
REFERENCES
CHAPTER-1
INTRODUCTION
1.1. Introduction
Cloud computing, an emerging and very promising computing paradigm, connects large-scale
distributed storage resources, computing resources and network bandwidths together. By using these
resources, it can provide tenants with plenty of high-quality cloud services. Due to these attractive advantages, the services (especially the cloud storage service) have been widely adopted, by which resource-constrained data owners can outsource their data to the cloud server, greatly reducing their local storage overhead. According to the report of Cisco, the number of Internet consumers will reach about 3.6 billion in
2019, and about 55 percent of them will employ cloud storage service. Because of the promising market
prospect, an increasing number of companies (e.g., Microsoft, Amazon, Alibaba) offer data owners cloud
storage service with different prices, security, access speed, etc. To enjoy more suitable cloud storage service,
the data owners might change the cloud storage service providers. Hence, they might migrate their outsourced
data from one cloud to another, and then delete the transferred data from the original cloud. According to
Cisco, the cloud traffic is expected to be 95% of the total traffic by the end of 2021, and almost 14% of the
total cloud traffic will be the traffic between different cloud data centers. Foreseeably, the outsourced data
transfer will become a fundamental requirement from the data owners’ point of view.
To realize secure data migration, an outsourced data transfer app, Cloudsfer, has been designed utilizing a cryptographic algorithm to protect the data from privacy disclosure in the transfer phase. But there
are still some security problems in processing the cloud data migration and deletion. Firstly, for saving network
bandwidth, the cloud server might merely migrate part of the data, or even deliver some unrelated data to cheat
the data owner. Secondly, because of network instability, some data blocks may be lost during the transfer process. Meanwhile, the adversary may destroy the transferred data blocks. Hence, the transferred data may be polluted during the migration process. Last but not least, the original cloud server might maliciously retain the transferred data to mine implicit benefits. Such data retention is undesirable from the data owners’ point of view. In short, the cloud storage service is economically attractive, but it inevitably suffers from some serious security challenges, specifically secure data transfer, integrity verification and verifiable deletion.
These challenges, if not solved suitably, might prevent the public from accepting and employing cloud storage
service.
Contributions: In this work, we study the problems of secure data transfer and deletion in cloud storage, and focus on realizing public verifiability. Then we propose a counting Bloom filter-based scheme, which
not only can realize provable data transfer between two different clouds but also can achieve publicly verifiable
data deletion. If the original cloud server does not migrate or remove the data honestly, the verifier (the data
owner and the target cloud server) can detect these malicious operations by verifying the returned transfer and
deletion evidences. Moreover, our proposed scheme does not need any trusted third party (TTP), which is
different from the existing solutions. Furthermore, we prove that our new proposal can satisfy the desired
design goals through security analysis. Finally, the simulation experiments show that our new proposal is
efficient and practical.
In the following, we briefly introduce the system framework and security goals.
Fig-1.3.1: The System framework
In our scenario, the resource-constrained data owner might outsource his large-scale data to the cloud
server A to greatly reduce the local storage overhead. Besides, the data owner might require the cloud A to
move some data to the cloud B, or delete some data from the storage medium. The cloud A and cloud B
provide the data owner with cloud storage service. We assume that the cloud A is the original cloud, which
will be required to migrate some data to the target cloud B, and remove the transferred data. However, the
cloud A might not execute these operations sincerely for economic reasons. Moreover, we assume that the
cloud A and cloud B will not collude together to mislead the data owner because they belong to two different
companies. Hence, the two clouds will independently follow the protocol. Furthermore, we assume that the
target cloud B will not maliciously slander the original cloud A.
Overview:
In the proposed work, the system studies the problems of secure data transfer and deletion in cloud storage, and focuses on realizing public verifiability. Then the system proposes a counting Bloom filter-
based scheme, which not only can realize provable data transfer between two different clouds but also can
achieve publicly verifiable data deletion. If the original cloud server does not migrate or remove the data
honestly, the verifier (the data owner and the target cloud server) can detect these malicious operations by
verifying the returned transfer and deletion evidences.
Firstly, the data owner encrypts the data and outsources the ciphertext to the cloud A. Then he checks
the storage result and deletes the local backup. Later, the data owner may change the cloud storage service
provider and migrate some data from cloud A to cloud B. After that the data owner wants to check the transfer
result. Finally, when the data transfer is successful, the data owner requires the cloud A to remove the
transferred data and check the deletion result.
3) Data outsourcing:
The cloud A stores D and generates storage proof. Then the data owner checks the storage result and deletes the local backup.
4) Data transfer:
When the data owner wants to change the service provider, he migrates some data blocks, or even the whole file, from the cloud A to the cloud B.
5) Transfer check:
The cloud B wants to check the correctness of the transfer and returns the transfer result to the data
owner.
6) Data deletion:
The data owner might require the cloud A to delete some data blocks when they have been transferred
to the cloud B successfully.
To solve this problem, we propose a new counting Bloom filter-based scheme in this project.
The proposed scheme not only can achieve secure data transfer but also can realize permanent data deletion.
Additionally, the proposed scheme can satisfy public verifiability without requiring any trusted third party.
The cloud storage service provider must also authenticate the data owner.
CHAPTER-2
LITERATURE SURVEY
Verifiable data deletion has been well studied for a long time, resulting in many solutions. Xue et al. studied the goal of secure data deletion and put forward a key-policy attribute-based encryption scheme, which can achieve fine-grained data access control and assured deletion. They achieve data deletion by removing the attribute and use a Merkle hash tree (MHT) to achieve verifiability, but their scheme requires a trusted authority. Du et al. designed a scheme called Associated deletion scheme for multi-copy (ADM), which uses a pre-deleting sequence and MHT to achieve data integrity verification and provable deletion. However, their scheme also requires a TTP to manage the data keys. In 2018, Yang et al. presented a blockchain-based cloud data deletion scheme, in which the cloud executes the deletion operation and publishes the corresponding deletion evidence on the blockchain. Then any verifier can check the deletion result by verifying the deletion proof. Besides, they remove the bottleneck of requiring a TTP.
Although these schemes all can achieve verifiable data deletion, they cannot realize secure data
transfer. To migrate the data from one cloud to another and delete the transferred data from the original cloud,
many methods have been proposed. In 2015, Yu et al. presented a Provable data possession (PDP) scheme
that can also support secure data migration. To the best of our knowledge, their scheme is the first one to solve
the data transfer between two clouds efficiently, but it is inefficient in the data deletion process since they achieve deletion by re-encrypting the transferred data, which requires the data owner to provide a large amount of information. Xue et al. designed a provable data migration scheme, which is characterized by PDP and verifiable deletion. The data owner can check the data integrity through the PDP protocol and verify the deletion result with a Rank-based Merkle hash tree (RMHT). However, Liu et al. pointed out that there exists a security flaw in the scheme, and they designed an improved scheme that fixes it. In 2018, Yang et al. adopted vector
commitment to design a new data transfer and deletion scheme, which offers the data owner the ability to
verify the transfer and deletion results without any TTP. Moreover, their scheme can realize data integrity
verification on the target cloud.
1. B. Varghese and R. Buyya, “Next generation cloud computing: New trends and research directions”:
The landscape of computing has significantly changed over the last decade. Not only have more providers and service offerings crowded the space, but cloud infrastructure that was traditionally limited to single-provider data centers is now evolving. In this work, the authors first discuss the changing cloud infrastructure and consider the use of infrastructure away from data centers. These trends have resulted in the need for a
variety of new computing architectures that will be offered by future cloud infrastructure.
2. W. Shen, J. Qin, J. Yu, et al, “Enabling identity-based integrity auditing and data sharing with
sensitive information hiding for secure cloud storage”:
With cloud storage services, users can remotely store their data to the cloud and realize the data sharing
with others. Remote data integrity auditing is proposed to guarantee the integrity of the data stored in the
cloud. In some common cloud storage systems such as the electronic health records system, the cloud file
might contain some sensitive information. The sensitive information should not be exposed to others when the
cloud file is shared. Encrypting the whole shared file can realize the sensitive information hiding, but will
make this shared file unable to be used by others.
3. R. Kaur, I. Chana and J. Bhattacharya, “Data deduplication techniques for efficient cloud storage
management: A systematic review”:
The exponential growth of digital data in cloud storage systems is presently a critical issue, as the large amount of duplicate data in the storage systems exerts an extra load on them. Deduplication is an efficient technique that has gained attention in large-scale storage systems. Deduplication eliminates redundant data, improves storage utilization and reduces storage cost. This paper presents a broad methodical literature review
of existing data deduplication techniques along with various existing taxonomies of deduplication that have
been based on cloud storage.
4. K. Ren, C. Wang, and Q. Wang, “Security challenges for the public cloud”:
Cloud computing represents today's most exciting computing paradigm shift in
information technology. However, security and privacy are perceived as primary obstacles to its wide
adoption. Here, the authors outline several critical security challenges and motivate further
investigation of security solutions for a trustworthy public cloud environment.
5. U. Adhikari, T. H. Morris, and S. Pan, “Applying non-nested generalized exemplars classification for
cyber-power event and intrusion detection”:
Non-nested generalized exemplars (NNGEs) are a state-of-the-art data mining algorithm which uses
distance between a new example and a set of exemplars for classification. The state extraction method (STEM)
preprocesses power system wide area measurement system data to reduce data size while maintaining critical
patterns. Together NNGE+STEM make an effective event and intrusion detection system which can
effectively classify power system events and cyber-attacks in real time. This paper documents the results of
two experiments in which NNGE+STEM was used to classify cyber power contingency, control action, and
cyber-attack events.
Searchable symmetric encryption (SSE) allows a party to outsource the storage of his data to another
party in a private manner, while maintaining the ability to selectively search over it. This problem has been
the focus of active research and several security definitions and constructions have been
proposed. In this paper we begin by reviewing existing notions of security and propose new and
stronger security definitions. We then present two constructions that we show secure under our new
definitions. Interestingly, in addition to satisfying stronger security guarantees, our constructions are more
efficient than all previous constructions. Further, prior work on SSE only considered the setting where only
the owner of the data is capable of submitting search queries. We consider the natural extension
where an arbitrary group of parties other than the owner can submit search queries. We formally define
SSE in this multi-user setting, and present an efficient construction.
CHAPTER-3
REQUIREMENTS & DOMAIN INFORMATION
Large clouds, predominant today, often have functions distributed over multiple locations from central servers. If the
connection to the user is relatively close, it may be designated an edge server.
Cloud computing is the delivery of computing services including servers, storage, databases,
networking, software, analytics, and intelligence over the Internet (“the cloud”) to offer faster innovation,
flexible resources, and economies of scale. You typically pay only for cloud services you use, helping lower
your operating costs, run your infrastructure more efficiently and scale as your business needs change. Cloud
computing is named as such because the information being accessed is found remotely in the cloud or a virtual
space. Companies that provide cloud services enable users to store files and applications on remote servers
and then access all the data via the Internet. This means the user is not required to be in a specific place to gain
access to it, allowing the user to work remotely.
Client Server
Of the varied topics in existence in the field of computers, client/server is one that has generated more heat than light, and more hype than reality. This technology has acquired critical-mass attention with its dedicated conferences and magazines. Major computer vendors such as IBM and DEC have declared that client/server is their main future market. A survey of DBMS magazine revealed that 76% of its readers were actively looking at client/server solutions. The client/server development tools market grew from $200 million in 1992 to more than $1.2 billion in 1996.
Client server implementations are complex but the underlying concept is simple and powerful. A client
is an application running with local resources but able to request database and related services from a separate remote server. The software mediating this client/server interaction is often referred to as
MIDDLEWARE.
The typical client is either a PC or a workstation connected through a network to a more powerful PC, workstation, midrange or mainframe server, usually capable of handling requests from more than one client. However, in some configurations a server may also act as a client. A server may need to access other servers in order to process the original client request.
The key client/server idea is that the client, as user, is essentially insulated from the physical location and formats of the data needed for the application. With the proper middleware, a client input form or report can transparently access and manipulate both local databases on the client machine and remote databases on one or more servers. An added bonus is that client/server opens the door to multi-vendor database access, including heterogeneous table joins.
Time-sharing changed the picture. Remote terminals could view and even change the central data, subject to access permissions. As the central data banks evolved into sophisticated relational databases with non-programmer query languages, online users could formulate ad hoc queries and produce local reports without adding to the MIS application software backlog. However, remote access was through dumb terminals, and the relationship remained a master/slave one.
The entire user interface is planned to be developed in a browser-specific environment with a touch of intranet-based architecture for achieving the distributed concept. The browser-specific components are designed using HTML standards, and their dynamism is achieved by concentrating on the constructs of Java Server Pages.
About Java(J2EE)
Initially the language was called “Oak”, but it was renamed “Java” in 1995. The primary motivation for this language was the need for a platform-independent (i.e., architecture-neutral) language that could be used to create software to be embedded in various consumer electronic devices.
Except for those constraints imposed by the Internet environment, Java gives the programmer full control.
Finally, Java is to Internet programming what C was to systems programming.
Java has had a profound effect on the Internet. This is because Java expands the universe of objects that can move about freely in cyberspace. In a network, two categories of objects are transmitted between the server and the personal computer: passive information and dynamic, active programs. Dynamic, self-executing programs cause serious problems in the areas of security and portability. But Java addresses those concerns and, by doing so, has opened the door to an exciting new form of program called the applet.
An application is a program that runs on our computer under the operating system of that computer. It is more or less like one created using C or C++. Java’s ability to create applets makes it important. An applet is an application designed to be transmitted over the Internet and executed by a Java-compatible web browser. An applet is actually a tiny Java program, dynamically downloaded across the network, just like an image. But the difference is that it is an intelligent program, not just a media file. It can react to user input and dynamically change.
Java Script
JavaScript is a script-based programming language that was developed by Netscape Communications Corporation. JavaScript was originally called LiveScript and was renamed JavaScript to indicate its relationship
with Java. JavaScript supports the development of both client and server components of Web-based
applications. On the client side, it can be used to write programs that are executed by a Web browser within
the context of a Web page. On the server side, it can be used to write Web server programs that can process
information submitted by a Web browser and then update the browser’s display accordingly. Even though JavaScript supports both client and server Web programming, we prefer JavaScript for client-side programming since most browsers support it. JavaScript is almost as easy to learn as HTML, and
JavaScript statements can be included in HTML documents by enclosing the statements between a pair of
scripting tags.
<SCRIPT>
JavaScript statements...
</SCRIPT>
Here are a few things we can do with JavaScript:
Validate the contents of a form and make calculations.
Add scrolling or changing messages to the Browser’s status line.
Animate images or rotate images that change when we move the mouse over them.
Detect the browser in use and display different content for different browsers.
Detect installed plug-ins and notify the user if a plug-in is required.
We can do much more with JavaScript, including creating entire applications.
Hypertext Markup Language (HTML), the language of the World Wide Web (WWW), allows users to produce Web pages that include text, graphics and pointers to other Web pages (hyperlinks). HTML is not a programming language but an application of ISO Standard 8879, SGML (Standard Generalized Markup Language), specialized to hypertext and adapted to the Web. The idea behind hypertext is that instead of reading text in a rigid linear structure, we can easily jump from one point to another. We can navigate through the information based on our interest and preference. A markup language is simply a series of elements, each delimited with special characters, that define how text or other items enclosed within the elements should be displayed. Hyperlinks are underlined or emphasized words that lead to other documents or some portions of the same document.
HTML can be used to display any type of document on the host computer, which can be geographically
at a different location. It is a versatile language and can be used on any platform or desktop.
HTML provides tags (special codes) to make the document look attractive. HTML tags are not case-
sensitive. Using graphics, fonts, different sizes, color, etc., can enhance the presentation of the document.
Anything that is not a tag is part of the document itself.
JDBC is a Java API for executing SQL statements. (As a point of interest, JDBC is a trademarked name and is not an acronym; nevertheless, JDBC is often thought of as standing for Java Database Connectivity.) It consists of a set of classes and interfaces written in the Java programming language. JDBC
provides a standard API for tool/database developers and makes it possible to write database applications using
a pure Java API.
Using JDBC, it is easy to send SQL statements to virtually any relational database. One can write a
single program using the JDBC API, and the program will be able to send SQL statements to the appropriate
database. The combination of Java and JDBC lets a programmer write it once and run it anywhere.
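As a small illustration of the API described above, the following sketch opens a connection and runs a parameterized query. The JDBC calls themselves (DriverManager, PreparedStatement, ResultSet) are standard, but the connection URL, credentials and table name are placeholders invented for this example, and a suitable driver (here MySQL) must be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class JdbcDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection URL, credentials and table; adjust for a real database.
        String url = "jdbc:mysql://localhost:3306/clouddb";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "SELECT file_name, upload_date FROM uploaded_files WHERE owner = ?")) {
            ps.setString(1, "owner1");                  // bind the query parameter safely
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("file_name") + "  " + rs.getString("upload_date"));
                }
            }
        }
    }
}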
What Does JDBC Do?
At this point, Microsoft's ODBC (Open Database Connectivity) API is probably the most widely used programming interface for accessing relational databases. It offers the ability to connect to almost all
databases on almost all platforms.
1. ODBC is not appropriate for direct use from Java because it uses a C interface. Calls from Java to
native C code have a number of drawbacks in the security, implementation, robustness, and automatic
portability of applications.
2. A literal translation of the ODBC C API into a Java API would not be desirable. For example, Java has
no pointers, and ODBC makes copious use of them, including the notoriously error-prone generic
pointer "void *". You can think of JDBC as ODBC translated into an object-oriented interface that is
natural for Java programmers.
3. ODBC is hard to learn. It mixes simple and advanced features together, and it has complex options
even for simple queries. JDBC, on the other hand, was designed to keep simple things simple while
allowing more advanced capabilities where required.
4. A Java API like JDBC is needed in order to enable a "pure Java" solution. When ODBC is used, the
ODBC driver manager and drivers must be manually installed on every client machine. When the
JDBC driver is written completely in Java, however, JDBC code is automatically installable, portable,
and secure on all Java platforms from network computers to mainframes.
CHAPTER-4
SYSTEM METHODOLOGY
4.2 . MODULES
4.2.1. Multi-cloud:
Lots of data centers are distributed around the world, and one region, such as America or Asia, usually has several data centers belonging to the same or different cloud providers. So technically all the data centers can be accessed by a user in a certain region, but the user would experience different performance. The latency of some data centers is very low, while that of others may be intolerably high. The system chooses clouds for storing data from all the available clouds that meet the performance requirement, that is, they can offer acceptable throughput and latency when they are not in outage. The storage mode transition does not impact the performance of the service. Since it is not a latency-sensitive process, we can decrease the priority of transition operations and implement the transition in batches when the proxy has a low workload.
Coercers may force cloud storage providers to reveal user secrets or confidential data on the cloud, thus altogether circumventing storage encryption schemes. We present our design for a new cloud
storage encryption scheme that enables cloud storage providers to create convincing fake user secrets to protect
user privacy. Since coercers cannot tell if obtained secrets are true or not, the cloud storage providers ensure
that user privacy is still securely protected. Most of the proposed schemes assume cloud storage service
providers or trusted third parties handling key management are trusted and cannot be hacked; however, in
practice, some entities may intercept communications between users and cloud storage providers and then
compel storage providers to release user secrets by using government power or other means. In this case,
encrypted data are assumed to be known and storage providers are requested to release user secrets. we aimed
to build an encryption scheme that could help cloud storage providers avoid this predicament. In our approach,
we offer cloud storage providers means to create fake user secrets. Given such fake user secrets, outside
coercers can only obtained forged data from a user’s stored ciphertext. Once coercers think the received secrets
are real, they will be satisfied and more importantly cloud storage providers will not have revealed any real
secrets. Therefore, user privacy is still protected. This concept comes from a special kind of encryption scheme
called deniable encryption.
4.3. System Design
Level-0:
In Level 0 of the data flow diagram, the data owner uploads files to the cloud servers and sends transaction details, such as logs for accessing and uploading the data, to the proxy server.
Level-1:
In Level 1, the receiver requests a file from the cloud servers. The cloud server checks the file name and the secret key of the file. If they are entered correctly, the authorized file is sent to the receiver; if they are wrong, the receiver is asked to enter the correct file name and secret key.
4.3.2.UML Diagrams
A UML diagram is a diagram based on the UML (Unified Modeling Language) with the purpose
of visually representing a system along with its main actors, roles, actions, artifacts or classes, in order to better
understand, alter, maintain, or document information about the system. UML is a modern approach to
modelling and documenting software. In fact, it’s one of the most popular business process modelling
techniques. It is based on diagrammatic representations of software components.
The owner module uploads files under an access policy and performs the following operations: view the owner’s VM details and purchase VMs, browse, encrypt and upload files, transfer data from one cloud to another based on the price, and check all cloud VM details and price lists. The cloud servers can authorize and store files, and can also show the owner’s files and the registered users. The end user can request files from the cloud server and receive them. The attacker attempts to modify a file without the cloud server responding.
The class diagram describes the attributes and operations of a class and also the constraints imposed on the system. Class diagrams are widely used in the modelling of object-oriented systems because they are the only UML diagrams that can be mapped directly to object-oriented languages.
Class diagrams are the main building blocks in object-oriented modelling; they show the different attributes and methods (operations) and the relationships among the data owner, cloud servers, proxy server and end users according to their functionalities. The attacker can attack the cloud server to view and modify data, i.e., hack the server and misuse or steal the data.
The data owner registers with the cloud and, once the registration is successful, logs in to the cloud. The data owner requests a VM from the cloud and can then upload files. The cloud servers can view the user files in the cloud. The data owner can verify the data integrity and check the file storage confirmation. The end user registers with the proxy server and, if the registration is successfully completed, logs in.
The cloud servers may authorize files, view the user files and view the users in the cloud. The end user can send a file-open request to the cloud, and the cloud servers respond to the request. Here the data owner checks the file integrity, and the proxy server automatically checks the MAC value. The data owner can transfer a file from one cloud to another; after the file is successfully transferred to the other cloud, it is deleted from the existing cloud VM. The end user can view blocked users and can also unblock them.
The flowchart shows the steps as boxes of various kinds, and their order by connecting the boxes with
arrows. This diagrammatic representation illustrates a solution model to a given problem. Flowcharts are used
in analysing, designing, documenting or managing a process or program in various fields.
Firstly, the data owner registers with the cloud. If registration is successful, the data owner logs in, assigns memory and a threshold to the VM, and browses and uploads the files. The proxy server may check the number of files in the cloud. Here, the data owner verifies the data integrity on the cloud servers. The end user can send a file-open request to the data owner, and the owner responds to that request. Then the end user can open the file and download it. The data owner can transfer a file from one cloud to another; after a successful transfer, the existing file can be deleted from the original cloud.
CHAPTER-5
EXPERIMENT ANALYSIS
5.1. Experimentation
2. Data encryption:
i) Firstly, the data owner computes the encryption key k = H(tagf || SKO), and then uses k to encrypt the file, C = Enck(F), where Enc is an IND-CPA secure encryption algorithm. After that, the data owner divides the ciphertext C into n′ blocks; meanwhile, he inserts n − n′ random blocks into the n′ blocks at random positions, which can guarantee that the CBF will not be null after data transfer and deletion. Then the data owner records these random positions in a table PF.
ii) For every data block Ci, the data owner randomly chooses a distinct integer ai as the index of Ci, and computes the hash value Hi = H(tagf || ai || Ci). Thus, the outsourced data set can be denoted as D = {(a1, C1), · · ·, (an, Cn)}. Finally, the data owner sends D to the cloud A, along with the file tag tagf.
3. Data outsourcing:
The cloud A stores D and generates storage proof. Then the data owner checks the storage result and
deletes the local backup.
i) Upon receiving the data set D and file tag tagf, the cloud A stores D, and uses the indexes (a1, a2, · · ·, an) to construct a counting Bloom filter CBFs. Meanwhile, the cloud A stores tagf as the index of D. Finally, the cloud A computes a signature sigs = SignSKA(storage || tagf || CBFs || Ts), and sends the proof λ = (CBFs, Ts, sigs) to the data owner, where Sign is an ECDSA signature algorithm and Ts is a timestamp.
ii) On receipt of the storage proof λ, the data owner checks its validity. Specifically, the data owner first checks the validity of the signature sigs. If sigs is invalid, the data owner quits and outputs failure; otherwise, the data owner randomly chooses half of the indexes from (a1, a2, · · ·, an) to check the correctness of CBFs. If CBFs is not correct, the data owner quits and outputs failure; otherwise, the data owner deletes the local backup.
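The storage proof above relies on an ECDSA signature over the concatenation storage || tagf || CBFs || Ts. A hedged sketch of that signing and verification step with the JDK's built-in elliptic-curve support is shown below; the tag, timestamp encoding and the serialized CBF bytes are placeholders, not values fixed by the scheme.

import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class StorageProof {
    public static void main(String[] args) throws Exception {
        KeyPair cloudAKeys = KeyPairGenerator.getInstance("EC").generateKeyPair();   // SK_A / PK_A

        byte[] tagF = "file-tag".getBytes(StandardCharsets.UTF_8);                   // placeholder tag_f
        byte[] cbfS = new byte[64];                                                  // serialized CBF_s (placeholder bytes)
        byte[] ts = Long.toString(System.currentTimeMillis()).getBytes(StandardCharsets.UTF_8);

        // sig_s = Sign_SKA(storage || tag_f || CBF_s || T_s)
        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(cloudAKeys.getPrivate());
        signer.update("storage".getBytes(StandardCharsets.UTF_8));
        signer.update(tagF);
        signer.update(cbfS);
        signer.update(ts);
        byte[] sigS = signer.sign();

        // The data owner verifies sig_s with cloud A's public key before trusting the proof.
        Signature verifier = Signature.getInstance("SHA256withECDSA");
        verifier.initVerify(cloudAKeys.getPublic());
        verifier.update("storage".getBytes(StandardCharsets.UTF_8));
        verifier.update(tagF);
        verifier.update(cbfS);
        verifier.update(ts);
        System.out.println("sig_s valid: " + verifier.verify(sigS));
    }
}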
4. Data transfer:
When the data owner wants to change the service provider, he migrates some data blocks, or even the whole file, from the cloud A to the cloud B.
i) Firstly, the data owner generates the set ϕ of block indices, which identifies the data blocks that need to be migrated. Then the data owner computes a signature sigt = SignSKO
(transfer||tagf ||ϕ||Tt), where Tt is a timestamp. After that the data owner generates a transfer
request Rt = (transfer, tagf, ϕ, Tt, sigt), and then sends it to the cloud A. Meanwhile, the data
owner sends the hash values {Hi}i∈ϕ to the cloud B.
ii) On receipt of the transfer request Rt, the cloud A checks the validity of Rt. If Rt is not valid,
the cloud A quits and outputs failure; otherwise, the cloud A computes a signature sigta =
SignSKA (Rt||Tt), and sends the data blocks {(ai, Ci)} i∈ϕ to the cloud B, along with the
signature sigta and the transfer request Rt.
5. Transfer check:
The cloud B wants to check the correctness of the transfer and returns the transfer result to the data
owner.
i) Firstly, the cloud B checks the validity of the transfer request Rt and the signature sigta. If either of them is invalid, the cloud B quits and outputs failure; otherwise, the cloud B checks whether the equation Hi = H(tagf || ai || Ci) holds for every i ∈ ϕ. If Hi ≠ H(tagf || ai || Ci), the cloud B requires the cloud A to send (ai, Ci) again; otherwise, the cloud B goes to Step ii).
ii) The cloud B stores the blocks {(ai, Ci)} i∈ϕ, and uses the indexes {ai}i∈ϕ to construct a new
counting Bloom filter CBFb. Then the cloud B computes a signature sigtb = SignSKB
(success||tagf ||ϕ||Tt||CBFb). Finally, the cloud B returns the transfer proof π = (sigta, sigtb,
CBFb) to the data owner.
iii) Upon receipt of π, the data owner checks the transfer result. To be specific, the data owner
checks the validity of the signature sigtb. Meanwhile, the data owner randomly chooses half of
the indexes from set ϕ to verify the correctness of the counting Bloom filter CBFb. If and only
if all the verifications pass, the data owner trusts that the transfer proof is valid and that the cloud B stores the transferred data honestly.
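A small sketch of the per-block integrity check Hi = H(tagf || ai || Ci) performed by the cloud B in Step i) might look as follows, again with SHA-256 standing in for H and with placeholder block data.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;

public class TransferCheck {
    // H_i = H(tag_f || a_i || C_i), with SHA-256 standing in for H
    static byte[] blockHash(byte[] tagF, int ai, byte[] ci) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(tagF);
        md.update(Integer.toString(ai).getBytes(StandardCharsets.UTF_8));
        md.update(ci);
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] tagF = "file-tag".getBytes(StandardCharsets.UTF_8);
        int ai = 17;                                                          // index a_i received from cloud A
        byte[] ci = "ciphertext block C_i".getBytes(StandardCharsets.UTF_8);  // block received from cloud A
        byte[] hiFromOwner = blockHash(tagF, ai, ci);                         // H_i previously sent by the data owner

        // Cloud B recomputes the hash over the received block and compares it with H_i;
        // on a mismatch it asks cloud A to resend (a_i, C_i).
        boolean ok = Arrays.equals(hiFromOwner, blockHash(tagF, ai, ci));
        System.out.println(ok ? "block accepted" : "block corrupted: request (a_i, C_i) again");
    }
}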
6. Data deletion:
The data owner might require the cloud A to delete some data blocks when they have been transferred
to the cloud B successfully.
i) Firstly, the data owner computes a signature sigd = SignSKO(delete || tagf || ϕ || Td), where Td is a timestamp. Then the data owner generates a data deletion request Rd = (delete, tagf, ϕ, Td, sigd) and sends it to the cloud A.
ii) Upon receiving Rd, the cloud A checks Rd. If Rd is invalid, the cloud A quits and outputs failure; otherwise, the cloud A deletes the data blocks {(ai, Ci)}i∈ϕ by overwriting. Meanwhile, the cloud A removes the indexes {aq}q∈ϕ from CBFs and obtains a new counting Bloom filter CBFd. Finally, the cloud A computes a signature sigda = SignSKA(delete || Rd || CBFd), and returns the data deletion evidence τ = (sigda, CBFd) to the data owner.
iii) After receiving τ, the data owner checks the signature sigda. If sigda is invalid, the data owner quits and outputs failure; otherwise, the data owner randomly chooses half of the indexes from ϕ and checks whether CBFd(aq) = 0, that is, whether aq no longer belongs to CBFd. If these equations hold, the data owner trusts that τ is valid.
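The owner-side deletion check can be pictured as follows: for each sampled index aq, the data owner recomputes its k counter positions and accepts the evidence only if aq no longer appears in CBFd (see the CBF description in Section 5.2.1). The hash family and parameters below are simple illustrative stand-ins, not the ones mandated by the scheme.

import java.util.List;

public class DeletionCheck {
    static final int M = 1024;   // filter length (illustrative)
    static final int K = 3;      // number of hash functions (illustrative)

    // k counter positions for an index a_q; a stand-in hash family, not the scheme's real one
    static int[] positions(int aq) {
        int[] pos = new int[K];
        for (int j = 0; j < K; j++) {
            pos[j] = Math.floorMod(31 * aq + 17 * j + 7, M);
        }
        return pos;
    }

    // a_q no longer belongs to CBF_d when at least one of its k counters is zero
    static boolean removedFrom(int[] cbfD, int aq) {
        for (int p : positions(aq)) {
            if (cbfD[p] == 0) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] cbfD = new int[M];                       // deletion evidence CBF_d returned by cloud A (all zero here)
        List<Integer> sampled = List.of(7, 42, 99);    // half of the indexes in phi, chosen at random by the owner
        boolean accepted = sampled.stream().allMatch(aq -> removedFrom(cbfD, aq));
        System.out.println(accepted ? "deletion evidence accepted" : "possible malicious data reservation detected");
    }
}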
5.2. Algorithms
5.2.1. Counting Bloom Filter (CBF):
A Bloom filter (BF) is a space-efficient data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether a set contains a specified element. It is designed to tell, rapidly and memory-efficiently, whether an element is present in a set. A BF costs constant time to insert an element or to verify whether an element belongs to the set, no matter how many elements the set and the BF contain.
A BF is initially represented as a bit array of m bits, all set to 0. Insertion takes an element and inputs it to k different hash functions, each mapping the element to one of the m array positions, which are then set to 1. When querying the BF on an element, it is considered to be in the BF if all positions obtained from the k hash evaluations are set to 1. The initial secret key sk output by the generation algorithm of a BFE scheme corresponds to an empty BF. Encryption takes a message M and the public key pk, samples a random element s (acting as a tag for the ciphertext) from the universe U of the BF, and encrypts the message using pk with respect to the k positions set in the BF by s.
A BF can be viewed as an m-length bit array with k hash functions hi(·): {0, 1}* → {1, 2, · · ·, m}. To insert an element x, we set the corresponding group of k bits to 1; the positions of these bits are determined by the hash values h1(x), · · ·, hk(x). Membership tests are implemented by executing the same hash calculations and outputting success if all of the corresponding positions are one, as shown in Fig 5.2.1.2.
Note that the BF has a false positive: even if all the k bits related to an element w are one, w may not belong to the set, with a small probability. However, we can choose appropriate parameters to reduce this probability, e.g., the number of hash functions k, the length of the BF m and the number of elements n. Furthermore, the probability will be so small that it can be negligible if the parameters are suitable. Besides, a BF cannot delete an element from the data set. As a variant of the BF, the CBF uses a counter cell to replace every bit position, as illustrated in Fig 5.2.1.2. To insert an element y, we increase the k related counters by one; the indexes of the counters are also determined by the hash values h1(y), h2(y), · · ·, hk(y). Conversely, the element deletion operation simply decreases the k corresponding counters by one.
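A compact counting Bloom filter sketch matching this description is given below: insertion increments the k counters, deletion decrements them, and a membership query succeeds only when all k counters are positive. The salted-SHA-256 hash family is one possible choice, not one prescribed by the report.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class CountingBloomFilter {
    private final int[] counters;
    private final int k;

    public CountingBloomFilter(int m, int k) {
        this.counters = new int[m];
        this.k = k;
    }

    // positions h_1(x), ..., h_k(x) derived from salted SHA-256 digests
    private int[] positions(String element) {
        try {
            int[] pos = new int[k];
            for (int j = 0; j < k; j++) {
                MessageDigest md = MessageDigest.getInstance("SHA-256");
                byte[] d = md.digest((j + ":" + element).getBytes(StandardCharsets.UTF_8));
                int h = ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16) | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
                pos[j] = Math.floorMod(h, counters.length);
            }
            return pos;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public void insert(String element) {            // increase the k related counters by one
        for (int p : positions(element)) counters[p]++;
    }

    public void delete(String element) {            // decrease the k related counters by one
        for (int p : positions(element)) {
            if (counters[p] > 0) counters[p]--;
        }
    }

    public boolean mightContain(String element) {   // membership test (subject to false positives)
        for (int p : positions(element)) {
            if (counters[p] == 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        CountingBloomFilter cbf = new CountingBloomFilter(1024, 3);
        cbf.insert("block-17");
        System.out.println(cbf.mightContain("block-17")); // true
        cbf.delete("block-17");
        System.out.println(cbf.mightContain("block-17")); // false (with high probability)
    }
}

As discussed above, a query can still return a false positive with small probability; the parameters m and k control that probability.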
The data owner uses the counting Bloom filter when changing the service provider: he migrates some data blocks, or even the whole file, from one cloud to the other based on the services provided, such as resources, threshold VMs, prices and memory. On uploading the data, the data is encrypted, i.e., ciphertext is generated along with a secret key. To transfer or access the data, the secret key is required.
If a user wants to transfer data, a request is sent by the user to the data owner. The data owner checks the request and responds to it by providing the secret key needed to transfer the data to the other cloud. The other cloud then checks the correctness of the transfer and returns the transfer result to the data owner. If and only if all the verifications pass, the data owner trusts that the transfer proof is valid and that the other cloud stores the transferred data honestly. On transfer, the file is downloaded. The data owner then requires the previous cloud to delete the data blocks that have been transferred to the other cloud successfully.
5.2.2. AES Algorithm:
The Advanced Encryption Standard (AES) is a symmetric-key algorithm, and the MAC uses a block cipher algorithm. A block cipher is an algorithm that encrypts and decrypts data in 128-bit blocks using 128/192/256-bit keys. Symmetric-key algorithms are sometimes referred to as secret-key algorithms because these types of algorithms generally use one key that is kept secret by the systems engaged in the encryption and decryption processes. Symmetric-key algorithms are algorithms for cryptography that use the same cryptographic keys for both the encryption of plaintext and the decryption of ciphertext. The keys may be identical, or there may be a simple transformation to go between the two keys. The keys, in practice, represent a shared secret between two or more parties that can be used to maintain a private information link. The requirement that both parties have access to the secret key is one of the main drawbacks of symmetric-key encryption, in comparison to public-key encryption (also known as asymmetric-key encryption).
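A short JDK-based sketch of symmetric encryption and decryption with AES is shown below. AES-128 in CBC mode with a random IV is one reasonable instantiation; the report itself only fixes AES as the cipher, so the mode, key size and sample plaintext are assumptions of this example.

import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

public class AesDemo {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);                                   // 128-bit secret key shared by both parties
        SecretKey key = kg.generateKey();

        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);               // random IV, sent along with the ciphertext

        // Encryption with the shared secret key
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal("outsourced file block".getBytes(StandardCharsets.UTF_8));

        // Decryption with the same key recovers the plaintext
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        System.out.println(new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8));
    }
}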
5.3. Testing
The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of tests, and each test type addresses a specific testing requirement.
5.3.1.2. Integration testing:
Integration tests are designed to test integrated software components to determine if they actually run
as one program. Testing is event driven and is more concerned with the basic outcome of screens or fields.
Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and consistent. Integration testing is
specifically aimed at exposing the problems that arise from the combination of components.
Unit Testing:
Unit testing is usually conducted as part of a combined code and unit test phase of the software lifecycle, although it is not uncommon for coding and unit testing to be conducted as two distinct phases.
Test strategy and approach
Field testing will be performed manually and functional tests will be written in detail.
Test objectives
All field entries must work properly.
Pages must be activated from the identified link.
The entry screen, messages and responses must not be delayed.
Features to be tested
Verify that the entries are of the correct format
No duplicate entries should be allowed
All links should take the user to the correct page.
Integration Testing
Software integration testing is the incremental integration testing of two or more integrated software
components on a single platform to produce failures caused by interface defects.
The task of the integration test is to check that components or software applications, e.g. components in a software system or, one step up, software applications at the company level, interact without error.
Test Results:
All the test cases mentioned above passed successfully. No defects encountered.
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant participation by the
end user. It also ensures that the system meets the functional requirements.
Test Results:
All the test cases mentioned above passed successfully. No defects encountered.
5.3.2. Other Testing Methodologies
Validation Checking
Validation checks are performed on the following fields.
Text Field:
The text field can contain only a number of characters less than or equal to its size. The text fields are alphanumeric in some tables and alphabetic in other tables. An incorrect entry always flashes an error message.
Numeric Field:
The numeric field can contain only numbers from 0 to 9. An entry of any other character flashes an error message. The individual modules are checked for accuracy and for what they have to perform. Each module is subjected to a test run with sample data. The individually tested modules are then integrated into a single system. Testing involves executing the program with real data; the existence of any program defect is inferred from the output. The testing should be planned so that all the requirements are individually tested.
A successful test is one that brings out the defects for inappropriate data and produces an output revealing the errors in the system.
Artificial test data are created solely for test purposes, since they can be generated to test all combinations of formats and values. In other words, the artificial data, which can quickly be prepared by a data-generating utility program in the information systems department, make possible the testing of all logic and control paths through the program.
The most effective test programs use artificial test data generated by persons other than those who wrote the
programs. Often, an independent team of testers formulates a testing plan, using the systems specifications.
The package “Secure Data Transfer and Deletion from Counting Bloom Filter” has satisfied all the requirements specified in the software requirement specification and was accepted.
5.5. Results
5.5.1. Results:
The time cost of data encryption:
In the encryption phase, we increase the file size from 1 MB to 8 MB with a step of 1 MB, and the number of data blocks is fixed at 8000; the time cost comparison is shown in Fig. 5.5.1.1. We can find that the time cost of the three schemes increases with the size of the encrypted data. However, the growth rate of our scheme is relatively lower than that of one compared scheme and almost the same as that of the other. Note that the time cost of our scheme is less than that of the other two schemes because one compared scheme needs many more hash computations to generate encryption keys, and the other needs more encryption operations to generate the MAC. Hence, we think our scheme is more efficient at encrypting the file.
5.5.2. Screenshots:
Home Page
Cloud Server (Fig-5.5.2.2.2: Migration details in Cloud)
Proxy Server
Data Owner
End User (Fig-5.5.2.5.1: End User menu page)
Searching a File
Downloading a File
Verifying a File
CHAPTER-6
CONCLUSION & FUTURE SCOPE
6.1. Conclusion:
In cloud storage, the data owner cannot be sure that the cloud server will execute the data transfer and deletion operations honestly. To solve this problem, we propose a CBF-based secure data transfer scheme, which can also realize verifiable data deletion. In our scheme, the cloud B can check the integrity of the transferred data, which guarantees that the data is migrated in its entirety. Moreover, the cloud A adopts the CBF to generate deletion evidence after deletion, which is used by the data owner to verify the deletion result. Hence, the cloud A cannot behave maliciously and cheat the data owner successfully.
In the proposed scheme, the user can flexibly delete the unnecessary data blocks, while the useful data
blocks still remain on the physical medium. Meanwhile, the proposed scheme can achieve (public and private)
verifiability of data deletion result. That is, any verifier who owns the data deletion evidence can verify the
data deletion result. If the cloud server does not honestly execute the data deletion command and generate the
deletion evidence, the verifier can easily detect the malicious data reservation with an overwhelming
probability. Finally, the security analysis and simulation results validate the security and practicability of our proposal, respectively.
Similar to all the existing solutions, our scheme considers the data transfer between two different cloud
servers. However, with the development of cloud storage, the data owner might want to simultaneously
migrate the outsourced data from one cloud to two or more target clouds. Moreover, the multi-target clouds might collude together to cheat the data owner maliciously. Hence, provable data migration among three or more clouds requires our further exploration.
REFERENCES
[1] C. Yang and J. Ye, “Secure and efficient fine-grained data access control scheme in cloud computing”,
Journal of High-Speed Networks, Vol.21, No.4, pp.259–271, 2020.
[2] X. Chen, J. Li, J. Ma, et al., “New algorithms for secure outsourcing of modular exponentiations”, IEEE
Transactions on Parallel and Distributed Systems, Vol.25, No.9, pp.2386–2396, 2018.
[3] P. Li, J. Li, Z. Huang, et al., “Privacy-preserving outsourced classification in cloud computing”, Cluster
Computing, Vol.21, No.1, pp.277–286, 2018.
[4] B. Varghese and R. Buyya, “Next generation cloud computing: New trends and research directions”, Future
Generation Computer Systems, Vol.79, pp.849–861, 2019.
[5] W. Shen, J. Qin, J. Yu, et al., “Enabling identity-based integrity auditing and data sharing with sensitive
information hiding for secure cloud storage”, IEEE Transactions on Information
Forensics and Security, Vol.14, No.2, pp.331–346, 2019.
[6] R. Kaur, I. Chana and J. Bhattacharya, “Data deduplication techniques for efficient cloud storage management: A systematic review”, The Journal of Supercomputing, Vol.74, No.5, pp.2035–2085, 2018.
[7] Cisco, “Cisco global cloud index: Forecast and methodology, 2014–2019”, available at:
https://www.cisco.com/c/en/us/solutions/collateral/service-provider/global-cloud-index-gci/white-paper-c11-738085.pdf, 2019-5-5.
[8] Cloudsfer, “Migrate & backup your files from any cloud to any cloud”, available at:
https://www.cloudsfer.com/, 2019-5-5.
[9] Y. Liu, S. Xiao, H. Wang, et al., “New provable data transfer from provable data possession and deletion
for secure cloud storage”, International Journal of Distributed Sensor Networks, Vol.15, No.4, pp.1–12, 2019.
[10] Y. Wang, X. Tao, J. Ni, et al., “Data integrity checking with reliable data transfer for secure cloud
storage”, International Journal of Web and Grid Services, Vol.14, No.1, pp.106–121, 2018.
[11] L. Xue, Y. Yu, Y. Li, et al., “Efficient attribute based encryption with attribute revocation for assured
data deletion”, Information Sciences, Vol.479, pp.640–650, 2019.
[12] L. Du, Z. Zhang, S. Tan, et al., “An Associated Deletion Scheme for Multi-copy in Cloud Storage”, Proc.
of the 18th International Conference on Algorithms and Architectures for Parallel Processing, Guangzhou,
China, pp.511–526, 2018.
[13] C. Yang, X. Chen and Y. Xiang, “Blockchain-based publicly verifiable data deletion scheme for cloud
storage”, Journal of Network and Computer Applications, Vol.103, pp.185–193, 2018.
[14] C. Yang, J. Wang, X. Tao, et al., “Publicly verifiable data transfer and deletion scheme for cloud storage”,
Proc. of the 20th International Conference on Information and Communications Security (ICICS 2018), Lille,
France, pp.445–458, 2018.
[15] F. Hao, D. Clarke and A. F. Zorzo, “Deleting secret data with public verifiability”, IEEE Transactions on Dependable and Secure Computing, Vol.13, No.6, pp.617–629, 2015.