IRJCS:: Information Security in Big Data Using Encryption and Decryption
ISSN: 2393-9842
www.irjcs.com
We can identify four different types of users, namely four user roles, in a typical data mining scenario (see Fig. 1.2):
FIGURE 1.2 A simple illustration of the application scenario with data mining at the core.
Data Provider: The user who owns some data that are desired by the data mining task.
Data Collector: The user who collects data from data providers and then publishes the data to the data miner.
Data Miner: The user who performs data mining tasks on the data.
Decision Maker: The user who makes decisions based on the data mining results in order to achieve certain goals.
The data provided by a user can be breached or obtained by other users of the database, because the database offers little
security: the data is not safe, and sensitive data is not fully protected. We have therefore developed an application that
encrypts the data before storing it in the database, so that unauthorized users cannot obtain the data or learn what is
hidden in the encrypted form.
2. SYSTEM ANALYSIS
Privacy-preserving data publishing (PPDP) mainly studies anonymization approaches for publishing useful data while preserving
privacy. The original data is assumed to be a private table consisting of multiple records. Each record consists of the
following four types of attributes:
Identifier (ID): Attributes that can directly and uniquely identify an individual, such as name, ID number and mobile number.
Quasi-identifier (QID): Attributes that can be linked with external data to re-identify individual records, such as gender, age
and zip code.
Sensitive Attribute (SA): Attributes that an individual wants to conceal, such as disease and salary.
Non-sensitive Attribute (NSA): Attributes other than ID, QID and SA.
Before being published to others, the table is anonymized, that is, identifiers are removed and quasi-identifiers are
modified. As a result, an individual's identity and sensitive attribute values can be hidden from adversaries.
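The anonymization step described above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the column names (`name`, `mobile`, `age`, `zip`), the ten-year age bands and the three-digit zip prefix are all assumptions chosen for the example, and other quasi-identifiers such as gender are left unchanged for brevity.

```python
from typing import Dict

# Hypothetical attribute roles for an example table (not taken from the paper).
IDENTIFIERS = {"name", "mobile"}


def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code: str, keep: int = 3) -> str:
    """Keep the first `keep` characters of a zip code and mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def anonymize(record: Dict[str, object]) -> Dict[str, object]:
    """Drop identifiers and coarsen the age and zip quasi-identifiers."""
    out: Dict[str, object] = {}
    for key, value in record.items():
        if key in IDENTIFIERS:
            continue  # identifiers are removed entirely
        if key == "age":
            out[key] = generalize_age(int(value))
        elif key == "zip":
            out[key] = generalize_zip(str(value))
        else:
            out[key] = value  # sensitive and non-sensitive attributes are kept
    return out


record = {"name": "Alice", "mobile": "5550100", "gender": "F",
          "age": 34, "zip": "47906", "disease": "flu"}
anonymized = anonymize(record)
```

The sensitive attribute (`disease`) is published unchanged; only the attributes that could identify the individual are removed or coarsened.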
The standard security techniques in a database management system, such as username and password or access control
mechanisms, do not provide full security for the data supplied by the data provider.
DISADVANTAGE
Secure computations in distributed programming frameworks
Security best practices for non-relational data stores
Secure data storage and transactions logs
End-point input validation/filtering
Real-time security monitoring
Scalable and composable privacy-preserving data mining and analytics
Granular access control
Granular audits
Data provenance
IMPLEMENTATION
The basic idea of this project is to secure the data with the help of encryption and decryption techniques. Because the data
is encrypted before it is stored in the database system, unauthorized users cannot read the data even if it is breached.
Here I have used the SHA3 algorithm to implement the encryption and decryption technique. For example, how the data table
should be anonymized depends mainly on how much privacy we want to preserve in the anonymized data. Different privacy
models have been proposed to quantify the preservation of privacy. Based on the attack model, which describes the ability of
the adversary to identify a target individual, privacy models can be roughly classified into two categories.
_________________________________________________________________________________________________________
2014-15, IRJCS- All Rights Reserved
Page -66
The first category considers that the adversary is able to identify the record of a target individual by linking the record to
data from other sources, such as linking the record to a record in a published data table (called record linkage), to a
sensitive attribute in a published data table (called attribute linkage), or to the published data table itself (called table
linkage). The second category considers that the adversary has enough background knowledge to carry out a probabilistic
attack, that is, the adversary is able to make a confident inference about whether the target's record exists in the table or
which value the target's sensitive attribute would take. Typical privacy models include k-anonymity (for preventing record
linkage), l-diversity (for preventing record linkage and attribute linkage), t-closeness (for preventing attribute linkage and
probabilistic attack), epsilon-differential privacy (for preventing table linkage and probabilistic attack), etc.
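To make two of these models concrete, the sketch below measures k-anonymity and a simple "distinct" form of l-diversity on a small published table. The table contents and column names are invented for illustration, and counting distinct sensitive values is only one of several l-diversity definitions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence

Row = Dict[str, str]


def k_anonymity(rows: List[Row], qid: Sequence[str]) -> int:
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[a] for a in qid) for row in rows)
    return min(groups.values()) if groups else 0


def distinct_l_diversity(rows: List[Row], qid: Sequence[str], sensitive: str) -> int:
    """Smallest number of distinct sensitive values found in any QID group."""
    values = defaultdict(set)
    for row in rows:
        values[tuple(row[a] for a in qid)].add(row[sensitive])
    return min(len(v) for v in values.values()) if values else 0


# Hypothetical anonymized table: ages and zip codes are already generalized.
table = [
    {"age": "30-39", "zip": "479**", "disease": "flu"},
    {"age": "30-39", "zip": "479**", "disease": "cancer"},
    {"age": "40-49", "zip": "478**", "disease": "flu"},
    {"age": "40-49", "zip": "478**", "disease": "flu"},
]

k = k_anonymity(table, ["age", "zip"])
l = distinct_l_diversity(table, ["age", "zip"], "disease")
```

In this example every quasi-identifier group holds two records, so the table is 2-anonymous. The second group contains a single disease value, however, so an adversary who places a target in that group learns the disease. That is the attribute linkage that l-diversity is meant to prevent.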
A cipher that is more cryptographically secure would display a rather flat distribution, which gives no information to a
cryptanalyst.
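One way to check for such flatness, assuming the distribution in question is the frequency of byte values in the output, is to compute its Shannon entropy. The sketch below is only illustrative: it uses SHAKE-256 output as a stand-in for the ciphertext of a strong cipher, which is an assumption and not the system described in this paper.

```python
import hashlib
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-frequency distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Plain English text: a few byte values dominate, so the distribution is peaked.
plaintext = b"the quick brown fox jumps over the lazy dog " * 1000

# Stand-in for strong ciphertext (assumption): pseudorandom bytes from SHAKE-256.
random_like = hashlib.shake_256(b"demo seed").digest(len(plaintext))

plain_entropy = byte_entropy(plaintext)
flat_entropy = byte_entropy(random_like)
```

A perfectly flat distribution over all 256 byte values has an entropy of 8 bits per byte. The plaintext here uses only 27 distinct characters, so its entropy cannot exceed log2(27), which is about 4.75 bits, and letter-frequency analysis has something to work with.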
For the system implementation I have used VB.NET and an SQL database to handle the data. The user is provided with two options
to secure the data: ENCRYPTION and DECRYPTION. The user is also provided with further operations to manipulate the data
available in the database: ADD NEW, UPDATE, DELETE, EDIT, and SEARCH. Thus this application provides more security for the
data, and if a security breach occurs the attacker will not get any readable data from the database. The application is
low-cost and makes the data easy to access.
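The encrypt-then-store flow can be sketched in a few lines. This is not the VB.NET application described here: Python's `sqlite3` stands in for the SQL database, the `students` table is hypothetical, and the cipher is an assumption. SHA-3 is a family of hash functions and is not reversible by itself, and the paper does not say how it is used for decryption. One possibility, assumed here purely for illustration, is to use the SHAKE-256 function from the SHA-3 standard as a keystream generator. The construction is unauthenticated and is meant only to show the data flow, not to be used in production.

```python
import hashlib
import os
import sqlite3

NONCE_LEN = 16  # bytes of random nonce stored in front of each ciphertext


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHAKE-256 keystream derived from key and nonce."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt(key: bytes, plaintext: str) -> bytes:
    """Return nonce || ciphertext; a fresh nonce is drawn for every value."""
    nonce = os.urandom(NONCE_LEN)
    return nonce + keystream_xor(key, nonce, plaintext.encode("utf-8"))


def decrypt(key: bytes, blob: bytes) -> str:
    """Split off the nonce and undo the XOR to recover the plaintext."""
    nonce, body = blob[:NONCE_LEN], blob[NONCE_LEN:]
    return keystream_xor(key, nonce, body).decode("utf-8")


key = os.urandom(32)  # in a real system the key must be stored outside the database

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name BLOB)")

# ADD NEW: only the encrypted form ever reaches the table.
conn.execute("INSERT INTO students (name) VALUES (?)", (encrypt(key, "Alice"),))

# SEARCH / read back: the stored value is opaque without the key.
stored = conn.execute("SELECT name FROM students WHERE id = 1").fetchone()[0]
recovered = decrypt(key, stored)
```

An attacker who copies the table sees only the nonce and the XOR-masked bytes. Because each value is encrypted under a random nonce, an equality search on the encrypted column would have to decrypt the rows first.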
DATABASE DIAGRAM
FIGURE 5.2: Login page for the student database management system