Secure Data Objects Replication in Data Grid

This document discusses secure data replication in data grids. It proposes using secret sharing and erasure coding approaches to partition data and combine them with dynamic replication for confidentiality, integrity and availability while achieving performance goals. Two heuristic algorithms are developed - one to determine which clusters should maintain share replicas (OIRSP) and another to determine the number and placement of replicas within a cluster (OISAP). Experimental results show the heuristic algorithms reduce communication costs and find near-optimal solutions.

Uploaded by

Nandha Kumar
Copyright
© Attribution Non-Commercial (BY-NC)

Secure Data Objects Replication in Data Grid

ABSTRACT:
Secret sharing and erasure coding-based approaches have been used in distributed storage systems to ensure the confidentiality, integrity, and availability of critical information. To achieve performance goals in data accesses, these data fragmentation approaches can be combined with dynamic replication. In this paper, we consider data partitioning (both secret sharing and erasure coding) and dynamic replication in data grids, in which security and data access performance are critical issues. More specifically, we investigate the problem of optimal allocation of sensitive data objects that are partitioned by using a secret sharing scheme or an erasure coding scheme and/or replicated. The grid topology we consider consists of two layers. In the upper layer, multiple clusters form a network topology that can be represented by a general graph. The topology within each cluster is represented by a tree graph. We decompose the share replica allocation problem into two subproblems: the Optimal Intercluster Resident Set Problem (OIRSP) that determines which clusters need share replicas and the Optimal Intracluster Share Allocation Problem (OISAP) that determines the number of share replicas needed in a cluster and their placements. We develop two heuristic algorithms for the two subproblems. Experimental studies show that the heuristic algorithms achieve good performance in reducing communication cost and are close to optimal solutions.

Index Terms:
Secure data, secret sharing, erasure coding, replication, data grids

OBJECTIVES:
Secret sharing and erasure coding-based approaches have been used in distributed storage systems to ensure the confidentiality, integrity, and availability of critical information. To achieve performance goals in data accesses, these data fragmentation approaches can be combined with dynamic replication.
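To make the erasure coding idea concrete, the sketch below uses the simplest possible code: k data fragments plus one XOR parity fragment, which can rebuild any single lost fragment. This is an illustration only. The text does not say which erasure code is used, and the function names encode and decode are hypothetical.

```python
def encode(data: bytes, k: int):
    """Split data into k equal-size fragments plus one XOR parity fragment."""
    if k < 1:
        raise ValueError("k must be at least 1")
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")  # zero-pad to a multiple of k
    frags = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytearray(size)
    for frag in frags:
        for i, byte in enumerate(frag):
            parity[i] ^= byte
    # The original length is returned so the padding can be stripped later.
    return frags + [bytes(parity)], len(data)


def decode(frags, length):
    """Rebuild the data when at most one fragment (given as None) is missing."""
    missing = [i for i, frag in enumerate(frags) if frag is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one missing fragment")
    frags = list(frags)
    if missing:
        size = len(next(frag for frag in frags if frag is not None))
        rebuilt = bytearray(size)
        # XOR of all surviving fragments equals the missing one.
        for frag in frags:
            if frag is not None:
                for i, byte in enumerate(frag):
                    rebuilt[i] ^= byte
        frags[missing[0]] = bytes(rebuilt)
    return b"".join(frags[:-1])[:length]
```

Unlike secret sharing, a code of this kind protects availability but not confidentiality, since each data fragment is a plain slice of the original.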

EXISTING SYSTEM:
Security and data access performance are critical issues in the existing system. The most severe problem in the existing system is the optimal allocation of sensitive data objects. The existing system does not achieve data survivability, security, and access performance. Also, the communication cost is higher.

PROPOSED SYSTEM:
In this paper, we consider data partitioning (both secret sharing and erasure coding) and dynamic replication in data grids, in which security and data access performance are critical issues. We investigate the problem of optimal allocation of sensitive data objects that are partitioned by using a secret sharing scheme or an erasure coding scheme and/or replicated. We develop two heuristic algorithms for the two subproblems.

The OIRSP determines which clusters need to maintain share replicas.

The OISAP determines the number of share replicas needed in a cluster and their placements.

Experimental studies show that the heuristic algorithms achieve good performance in reducing communication cost and are close to optimal solutions.
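As an illustration of the secret sharing side of the data partitioning, a minimal sketch of a (k, n) threshold scheme in the style of Shamir follows. The text does not name the secret sharing scheme it uses, so the scheme itself, the modulus PRIME, and the function names make_shares and recover_secret are assumptions made for this example.

```python
import random

# Assumed prime modulus for illustration; secrets must be smaller than this.
PRIME = 2**61 - 1  # a Mersenne prime


def make_shares(secret, k, n, rng=None):
    """Split `secret` into n shares so that any k of them recover it."""
    rng = rng or random.SystemRandom()
    if not (0 <= secret < PRIME) or not (1 <= k <= n):
        raise ValueError("invalid parameters")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the integers modulo PRIME."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With k = 3 and n = 5, for example, any three of the five shares rebuild the secret, so the shares can be placed on different nodes and up to two of them can be lost.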

SYSTEM SPECIFICATION

HARDWARE CONFIGURATION
Hard disk : 34 GB
RAM : 678 MB
Processor : Pentium IV
Monitor : 7:-- Color Monitor

SOFTWARE CONFIGURATION
Front End : Java
Operating System : Windows XP
Back End : MySQL

MODULES:

OIRSP Specification.
OISAP Specification.
A Heuristic Algorithm.
Performance of the OIRSP Heuristic Algorithm.
The Efficiency of the OISAP SDP-Tree Algorithm.

Heuristic Algorithm:


It refers to experience-based techniques for problem solving, learning, and discovery. Heuristic methods are used to speed up the process of finding a good enough solution, where an exhaustive search is impractical.

Heuristics are intended to gain computational performance or conceptual simplicity, potentially at the cost of accuracy or precision.

In computer science, a heuristic is a technique designed to solve a problem that ignores whether the solution can be proven to be correct, but which usually produces a good solution or solves a simpler problem that contains or intersects with the solution of the more complex problem. Most real-time, and even some on-demand, anti-virus scanners use heuristic signatures to look for specific attributes and characteristics for detecting viruses and other forms of malware.

Heuristic algorithms are often employed because they may be seen to work without having been mathematically proven to meet a given set of requirements. One common pitfall in implementing a heuristic method to meet a requirement comes when the engineer or designer fails to realize that the current data set does not necessarily represent future system states. While the existing data can be pored over and an algorithm can be devised to successfully handle the current data, it is imperative to ensure that the heuristic method employed is capable of handling future data sets. This means that the engineer or designer must fully understand the rules that generate the data and develop the algorithm to meet those requirements and not just address the current data sets.

If one seeks to use a heuristic as a means of solving a search or knapsack problem, then one must be careful to make sure that the heuristic function chosen is an admissible heuristic. Consider a heuristic function h(vi, vg) that is meant to approximate the true optimal distance d*(vi, vg) to the goal node vg in a directed graph G containing n total nodes or vertices. Admissible means that h(vi, vg) ≤ d*(vi, vg) for all pairs (vi, vg), that is, the heuristic never overestimates the true distance.

If a heuristic is not admissible, it might never find the goal, by ending up in a dead end of graph G or by skipping back and forth between two nodes vi and vj, neither of which is the goal.
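The admissibility condition can be checked mechanically on a small graph by comparing the heuristic with the true shortest distances. The sketch below is one way to do that, assuming the directed graph is given as a dictionary of non-negative edge costs; the function names true_distances_to_goal and is_admissible are hypothetical.

```python
import heapq


def true_distances_to_goal(graph, goal):
    """Dijkstra on the reversed graph: shortest distance from each node to goal.

    graph maps node -> {neighbor: non-negative edge cost} (directed edges).
    Nodes that cannot reach the goal are absent from the result.
    """
    reverse = {v: {} for v in graph}
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            reverse.setdefault(v, {})[u] = w
    dist = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in reverse.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def is_admissible(graph, goal, h):
    """True if h(v, goal) <= true distance for every node that can reach goal."""
    dist = true_distances_to_goal(graph, goal)
    return all(h(v, goal) <= d for v, d in dist.items())
```

A heuristic that always returns zero passes this check trivially; the check becomes informative for heuristics that try to estimate the remaining distance closely.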

OISAP Specification:

The OISAP (Optimal Intracluster Share Allocation Problem) determines the number of share replicas needed in a cluster and their placements. When we consider the allocation problem within a cluster Hx, we can isolate the cluster and consider the problem independently. All read requests from remote clusters can be viewed as read requests from the root node. Also, the updates in the entire system can be considered as updates done at the root node of the cluster. Thus, we can simplify the notation when discussing allocation within Hx by referring to everything in the cluster without the cluster subscript.
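Once a cluster is isolated in this way, the cost of a candidate placement can be evaluated on the tree alone. The sketch below assumes a simple hop-count cost model that the text does not spell out: each read travels to the nearest replica, and each update travels from the root to every replica, with shared tree edges counted once. The function name placement_cost and its parameters are hypothetical.

```python
from collections import deque


def placement_cost(parent, reads, updates, replicas):
    """Hop-count cost of a replica placement inside one isolated tree cluster.

    parent   : dict node -> parent node (the root maps to None)
    reads    : dict node -> read requests issued at that node; remote reads
               are assumed to be already added to the root's count
    updates  : total number of updates, all treated as issued at the root
    replicas : set of nodes holding a share replica (must be non-empty)
    """
    if not replicas:
        raise ValueError("at least one replica is required")
    # Undirected adjacency list of the tree.
    adj = {v: [] for v in parent}
    for v, p in parent.items():
        if p is not None:
            adj[v].append(p)
            adj[p].append(v)
    # Multi-source BFS: hops from every node to its nearest replica.
    dist = {r: 0 for r in replicas}
    queue = deque(replicas)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    read_cost = sum(reads.get(v, 0) * dist[v] for v in parent)
    # Updates travel from the root to every replica: count each tree edge
    # on a root-to-replica path once.
    edges = set()
    for r in replicas:
        v = r
        while parent[v] is not None and v not in edges:
            edges.add(v)  # edge (v, parent[v]) identified by its child v
            v = parent[v]
    return read_cost + updates * len(edges)
```

Comparing this value across candidate replica sets shows the trade-off the OISAP has to balance: more replicas shorten read paths but lengthen update propagation.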

OIRSP Specification:

The OIRSP (Optimal Intercluster Resident Set Problem) determines which clusters need to maintain share replicas. We define the first problem, OIRSP, as the optimal resident set problem in a general graph (the intercluster-level graph) with a MasterSlaveCluster HMSC. Our goal is to determine the optimal RC that yields minimum access cost at the cluster level.

A Heuristic Algorithm for the OIRSP:

The goal of OIRSP is to determine the optimal resident set RC in GC. GC is a general graph. Each edge in GC is considered as one hop. The optimal resident set problem in a general graph has been shown to be NP-complete. Thus, we develop a heuristic algorithm to find a near-optimal solution. Our approach is to first build a minimal spanning tree in GC with RC being the root and then identify the cluster to be added to RC based on the tree structure. The clusters in GC access data hosted in RC along the shortest paths, and these paths and the clusters form a set of shortest path trees. Since all the nodes in RC are connected, we view them as one virtual node S. Then, S, all clusters that are not in RC, and all the shortest access paths form a tree rooted at S in the graph.
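The general shape of such a heuristic can be sketched as a greedy loop that grows the resident set outward from the master cluster. The code below is not the algorithm described above; it is a simplified stand-in under an assumed cost model (each read pays the hop count to the nearest resident cluster, and each update pays one hop per additional resident cluster). All function names are hypothetical, and the cluster graph is assumed to be connected.

```python
from collections import deque


def hops_to_set(graph, sources):
    """Multi-source BFS: hop count from each cluster to the nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def access_cost(graph, reads, updates, resident):
    """Assumed cost: reads pay hops to the resident set; each update pays one
    hop per extra resident cluster (the set is kept connected)."""
    dist = hops_to_set(graph, resident)
    read_cost = sum(reads.get(c, 0) * dist[c] for c in graph)
    return read_cost + updates * (len(resident) - 1)


def greedy_resident_set(graph, reads, updates, master):
    """Grow the resident set from the master cluster while the cost drops."""
    resident = {master}
    best = access_cost(graph, reads, updates, resident)
    while True:
        # Only neighbors of the current set are candidates, so it stays connected.
        frontier = {v for u in resident for v in graph[u]} - resident
        choice = None
        for c in sorted(frontier):
            cost = access_cost(graph, reads, updates, resident | {c})
            if cost < best:
                best, choice = cost, c
        if choice is None:
            return resident, best
        resident.add(choice)
```

Restricting candidates to neighbors of the current set mirrors the observation that the resident clusters are connected and can be treated as one virtual node.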

The Efficiency of the OISAP SDP-Tree Algorithm:

The performance of the SDP-tree algorithm is compared with the optimal allocation algorithm and the randomized M-replication algorithm. In the experiments, the trees are generated randomly by using the topology generator with changing N, D, and read/update ratio, where N is the total number of nodes in the cluster, D is the maximum node degree, and read/update is the ratio of the average number of read requests in the cluster to the total number of update requests in the system.
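The topology generator itself is not described, so the following is only a sketch of how random trees with N nodes and maximum node degree D could be produced for such experiments. The function name random_tree and the attachment strategy (each new node picks a random existing node that still has spare degree) are assumptions.

```python
import random


def random_tree(n, max_degree, seed=None):
    """Return a parent map for a random tree with n nodes (node 0 is the root)
    in which no node has more than max_degree neighbors."""
    if n < 1 or (n > 2 and max_degree < 2) or (n == 2 and max_degree < 1):
        raise ValueError("cannot build such a tree")
    rng = random.Random(seed)
    parent = {0: None}
    degree = {0: 0}
    open_nodes = [0]  # nodes that can still accept a child
    for v in range(1, n):
        p = rng.choice(open_nodes)
        parent[v] = p
        degree[p] += 1
        degree[v] = 1  # the edge to its parent
        if degree[p] >= max_degree:
            open_nodes.remove(p)
        if degree[v] < max_degree:
            open_nodes.append(v)
    return parent
```

Passing a fixed seed makes a generated topology reproducible, which helps when the same tree has to be fed to several allocation algorithms for comparison.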
