Cybersecurity in Big Data Era: From Securing Big Data to Data-Driven Security
Abstract—“Knowledge is power” is an old adage that has been found to be true in today’s information age. Knowledge is derived
from having access to information. The ability to gather information from large volumes of data has become an issue of relative
importance. Big Data Analytics (BDA) is the term coined by researchers to describe the art of processing, storing and gathering
large amounts of data for future examination. Data is being produced at an alarming rate. The rapid growth of the Internet,
Internet of Things (IoT) and other technological advances are the main culprits behind this sustained growth. The data generated
is a reflection of the environment that produces it; thus, we can use the data a system generates to figure out the inner
workings of that system. This has become an important capability in cybersecurity, where the goal is to protect assets. Furthermore,
the growing value of data has made big data a high value target. In this paper, we explore recent research works in cybersecurity
in relation to big data. We highlight how big data is protected and how big data can also be used as a tool for cybersecurity. We
summarize recent works in the form of tables and present trends, open research challenges and problems. With this
paper, readers can have a more thorough understanding of cybersecurity in the big data era, as well as research trends and open
challenges in this active research area.
Index Terms—Big data security, big data driven security, IDS/IPS, data analytics
Authorized licensed use limited to: E.G.S. Pillay Engineering College-Nagapattinam. Downloaded on September 05,2022 at 06:45:21 UTC from IEEE Xplore. Restrictions apply.
2056 IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 14, NO. 6, NOVEMBER/DECEMBER 2021
done on big data enabled security and securing big data (which are categorically presented in Fig. 3).

Fig. 3. Big data (analytics) as a security solution and security attacks that are unique to big data in a typical big data enabled system.

Although there are related survey papers [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16] on big data security (for further details, please refer to Section 4), we present more up-to-date approaches, insights, perspectives and recent trends in the rapidly advancing research field of big data in the cybersecurity domain. Our approach covers the research work done on how big data is used as a security tool, and the emergence of big data as a high value asset, which has resulted in research work on how to secure big data. Specifically, the main contributions of this paper include:
• Presenting a comprehensive study on security aspects of big data by categorizing it into two parts: securing big data and big data driven security.
• Presenting a summary of attacks and countermeasures for big data in a tabular form for a side-by-side comparison.
• Presenting a discussion of research challenges, recent trends, insights and open problems for big data in cybersecurity.

The remainder of this paper is organized as follows. We first classify our work into two major sections (Sections 2 and 3). We provide a comprehensive study of security using big data as well as securing big data. For each category, we present the related recent state-of-the-art literature for the different approaches. Section 2 focuses on the use of big data as a security mechanism. Section 3 tackles how big data is being protected. Section 4 presents relevant survey papers along the line of this paper and the distinction of this paper from the rest of the surveys. Section 5 presents some research challenges and future directions in this area. Finally, we summarize the paper in Section 6.

2 SECURITY USING BIG DATA
Top security companies joined forces to share information with each other in an attempt to gather intelligence from the shared data (SecIntel Exchange). Their goal was to provide reliable security tools for their clients, and to achieve that, they had to learn as much as possible from the evolving threats that were developed each day. They understood the power of collaboration for the greater good. This was needed because, with the rise of polymorphic malware and other evolving threats, they needed a lot of information on these threats in order to fully understand what they were dealing with and how to counteract it. The traditional approaches to classifying malware were proving to be futile. SecIntel Exchange data provided them with the opportunity to derive actionable insights from voluminous data. However, human analysis and traditional methods such as database storage could not keep up with the pace at which data was being generated [17]. There was a need to adopt modern approaches. As seen in a case study conducted by Zions Bancorporation [18], it would take their traditional Security Information and Event Management (SIEM) systems between 20 minutes and an hour to query a month’s worth of security data. However, when using tools with Hadoop technology, it would only take about one minute to achieve the same results. As such, BDA has become an important tool in cybersecurity. Several studies have shown that traditional approaches and human analysts cannot keep up with big data. BDA is one of the best solutions to combat these issues.
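
To make the contrast in the Zions Bancorporation case study more concrete, the following is a minimal sketch of the map/reduce query pattern that Hadoop-based tools rely on: the log is split into partitions, each partition is summarized independently (map), and the partial results are merged (reduce). It is an illustration under stated assumptions, not the setup used in [18]. The log format, the sample records, and the function names (`partition`, `map_failed_logins`, `reduce_counts`, `failed_logins_by_ip`) are all hypothetical, and the partitions are processed sequentially on one machine, whereas a Hadoop cluster would process them in parallel on separate nodes.

```python
from collections import Counter
from functools import reduce

# Hypothetical security log lines: "<timestamp> <source_ip> <event>"
LOG_LINES = [
    "2021-11-01T00:00:01 10.0.0.5 LOGIN_FAILED",
    "2021-11-01T00:00:02 10.0.0.5 LOGIN_FAILED",
    "2021-11-01T00:00:03 10.0.0.9 LOGIN_OK",
    "2021-11-01T00:00:04 10.0.0.7 LOGIN_FAILED",
    "2021-11-01T00:00:05 10.0.0.5 LOGIN_FAILED",
]


def partition(lines, n_parts):
    """Split the log into n_parts chunks (a simple stand-in for storage blocks)."""
    return [lines[i::n_parts] for i in range(n_parts)]


def map_failed_logins(chunk):
    """Map step: count failed logins per source IP within one partition."""
    counts = Counter()
    for line in chunk:
        _timestamp, src_ip, event = line.split()
        if event == "LOGIN_FAILED":
            counts[src_ip] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results by summing their counts."""
    return left + right


def failed_logins_by_ip(lines, n_parts=2):
    """Run the map step on every partition, then fold the partial counts together."""
    partials = [map_failed_logins(chunk) for chunk in partition(lines, n_parts)]
    return reduce(reduce_counts, partials, Counter())


result = failed_logins_by_ip(LOG_LINES)
```

Because each map call only sees its own partition and the merge step is associative, the result does not depend on how the log is split, which is the property that lets such a query be spread across many nodes.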
RAWAT ET AL.: CYBERSECURITY IN BIG DATA ERA: FROM SECURING BIG DATA TO DATA-DRIVEN SECURITY 2057
enough about our enemy, but it is definitely possible to know all that we can about ourselves and the assets we protect. To do that, we have to gather facts about the asset. This is made possible by the data it generates. This data needs to be analyzed and insights need to be drawn. BDA can help prepare, clean, and query heterogeneous data with incomplete and/or noisy records [19], something that would be hard for humans to do. Analyzing data tends to be hard when the data is heterogeneous, as [20] discovered. In their work, they presented a platform targeted at achieving real time detection and visualization of cyber threats, which they called OwlSight. The platform had several building blocks (data sources, big data analytics, web services and visualization) and had the ability to collect large amounts of information from a variety of sources, analyze the data and output the findings on insightful dashboards. They did face some issues with the heterogeneity of the data. However, for machines to do the work effectively, they need to have some form of human element. Understanding a problem is half the problem solved. The authors in [21] understood this and addressed this issue by coming up with an approach that merged big data analytics with semantic methods, with the aim of gaining further insights on the heterogeneous data by understanding it semantically. BDA can be used to gather insights, making it an essential tool in cybersecurity. However, the features of big data (the four V’s) also make deriving insights a hard task to accomplish.

In the 2017 Data Breach Investigations Report by Verizon, it was reported that attacks tend to come from different sources: 62 percent of the attacks involved hacking, 51 percent used malware, 43 percent were social attacks, and 14 percent were a result of human errors. As such, the attacker sometimes relies on the human factor in order to execute a successful attack. In such scenarios, people instead of technology become the target of an attack. Email scams and phishing are the most common form of these attacks. In a recent study [22], 52 percent of successful email attacks get their victims to click within an hour and 30 percent within 10 minutes. The authors in [23] looked into the role of big data in such attack scenarios. To gain further insights, the authors conducted two studies. The first study involved the Enron email dataset. The second study was carried out on undergraduate students to observe how email phishing broke security systems based on user behaviours. The collected data was then analyzed using Enronic software, followed by the categorization of email topics. The authors found that phishers or attackers can understand the behavior of email users using big data analytics, and are therefore able to generate phishing emails that create security threats based on the insights they gather. The authors planned on proposing a framework for addressing security threats in email communication in the future. In another work, a big data enabled framework was proposed in [24] with the aim of defending against spam and phishing emails by using a global honeynet. Their framework collected data from different sources such as pcap files, logs from a honeynet, blacklisted sites and social networks for analysis. The framework used Hadoop and Spark for the processing of the collected heterogeneous data, which was stored in the Hadoop Distributed File System (HDFS). However, this framework does not provide real-time analysis for big data.

Another form of attack is Advanced Persistent Threats (APT), which are sophisticated, well-planned attacks [25]. APTs are very hard to detect, and the challenge of detecting and preventing advanced persistent threats may be answered by using big data analysis. These techniques could play a key role in helping detect threats at an early stage, especially when they use sophisticated pattern analysis that works on different heterogeneous data sources. Given the large number of APT attacks that organizations face today, an APT security protective framework has been presented in [26]. The proposed framework integrates deep and 3D defense strategies. To protect against APT attacks, the system classifies data based on its level of confidentiality. Botnet attacks are another area where big data and machine learning techniques are deployed. The work in [27] studied techniques for mitigating botnet attacks by using big data analytics. The Advanced Cyber Defense Center (ACDC) orchestrated the sharing of gathered cybersecurity information on botnet attacks with the aim of defending against botnets. The work in [28] proposed an architecture to address the current issue of botnet detection. They explored the possibility of employing a Self Organizing Map as an unsupervised learning approach to label unknown traffic. The financial sector is another area where big data analytics is used to prevent malicious actions or cyber attacks. The work in [29] studied using data fusion and visualization techniques in Network Forensic Analysis. Also, Cybersecurity Insurance (CI) is becoming more popular because of the increasing need to mitigate losses from cyber incidents at financial firms. Big data has now been employed in cybersecurity insurance, and the work in [30] proposed a framework which uses a big data approach in CI to analyze cyber incidents and gain insights in order to make better strategic decisions based on the information gathered. The work in [31] investigated privacy and security issues associated with the sharing of financial data between institutions.

The work in [32] studied a novel Network Functions Virtualization-based (NFV) cybersecurity framework for providing Security-as-a-Service in an evolved telco environment. The framework is known as SHIELD. This framework leverages BDA for detecting and mitigating threats in real time. The work in [33] studied the construction of security monitoring systems for the Internet of Things, based on parallel processing of data using the Hadoop platform. The proposed system architecture has different components for the collection of data, storage of data, normalization and analysis, and visualization of data. Storage of data is done on Hadoop to improve the reliability and efficiency of processing data requests. The work in [34] proposed Security Information Management (SIM) enhancements using BDA. They devised a blueprint for a big data enhanced SIM, and field tested it using real network security logs. The work in [35] proposed a big data analytics model for protecting virtualized infrastructure in cloud computing. A Hadoop Distributed File System was used for the collection and storage of network logs and application logs from a guest virtual machine. Attack features were then extracted using graph-based event correlation and MapReduce parser identification of the potential paths of attack. A two-step machine learning algorithm using logistic regression and belief propagation was then applied to determine the presence of attacks. SIEM is an important tool in cybersecurity information analytics and a good source of data. The tool developed in [36] analyzes big data (obtained from SIEM) of a Fortune 500 company in order to gain insights about security threats through anomaly detection. They highlight the importance of graph analytics when it comes to intuitively understanding business needs. Based on this, they apply graph analysis in anomaly detection by adding additional important capabilities of existing tools to their new tool, and then to visualize the network ins and outs. Finally, another use case of big data for security reasons involves a method for analyzing the security of RC4 [37]. Since attacks are diverse and come in multiple forms, BDA has been used as a cybersecurity tool to mitigate those attacks.

TABLE 1
Securing Big Data

TABLE 2
Research on Access Control and Encryption Techniques on Big Data

TABLE 3
Research on Alternative Approaches to Securing Big Data

An area in cybersecurity where big data is used a lot is Intrusion Detection and Prevention Systems (IDS/IPS) research. Intrusion attempts are usually made to access information, interfere with the information, or tamper with a system, thus making it unreliable and unusable. The IDS concept has been around for two decades but has recently seen a dramatic rise in popularity and incorporation into the overall information security infrastructure [38]. IDSs are used to determine if there has been a breach or an interference in the network [39]. An IDS is often regarded as a second-line security solution after authentication, firewall, cryptography, and authorization techniques. Similarly, IPS can be classified into two categories: Network-based IPS and Host-based IPS. In network intrusion, prediction and detection are time sensitive, and need highly efficient big data technologies to deal
against different attacks. Typical techniques for securing big data are shown in Fig. 5. When data gets really big, securing it becomes really difficult. In [58], the authors studied the security issues associated with big data and cloud computing. They identified the fact that most organizations outsource databases in the form of big data to the cloud. Cloud computing, however, still has many risks associated with it. The goal in [58] was to find security vulnerabilities in the cloud in order to inform vendors about recent vulnerabilities. They noted confidentiality, integrity and availability, in that order, as the most important security issues a cloud provider faces. Confidentiality in this scenario would mean the protection of data against unauthorized interference or usage. Integrity would be the prevention of unauthorized and improper data modification. Availability would be akin to data recovery from hardware, software and system errors, and also from data access denials. However, confidentiality is the most important aspect when it comes to big data protection. Several data confidentiality techniques exist, with the most notable ones being access control and encryption.

3.1 Access Control and Encryption Techniques for Big Data
Encryption and access control are similar in the sense that they are both synonymous with privacy and prevention. A notable difference, however, is that encryption usually deals with the confidentiality of data. Data can be available to either a trusted or untrusted entity. Encryption ensures that only authorized trusted entities can view the data. Access control, however, tries to limit the access to data. The data limitations usually happen amongst trusted parties. For this reason, encryption techniques have to be stronger than access control techniques. Encryption imposes very strong limitations over data confidentiality. However, encryption is not an easy task. It tends to be computationally expensive and it has scalability issues (many users requiring access to the same data). Access control tends to be more flexible, and is easier to implement. When big data is transmitted to the cloud, a security issue emerges. Most organizations would not want their data in the hands of another organization, thus the need for encryption. A common approach is the use of data masking schemes. When the data is transmitted, it is not encrypted because the approaches used to transmit the data require that the data be decrypted. This exposes the data to attacks. Confidentiality breach is the biggest threat to big data; thus, encryption could be used as the primary big data protection technique.

In [59], the authors studied the data transmission issues. They proposed computing on masked data to solve this. They proposed an incremental work to improve upon the already existing Fully Homomorphic Encryption (FHE) and other data masking techniques by decreasing the overhead associated with other FHE techniques. The work in [60] also tried to improve on a Fully Homomorphic Encryption scheme for big data. It attempted to do this by reducing the public key size with the aim of making the scheme more efficient. The work in [61] proposed a model for the protection of data privacy using a fully homomorphic non-deterministic encryption. The proposed data protection model ensured the encryption of data before it was transmitted, thereby avoiding the loss of data. The proposed system, however, only accepted numerical data. The output from the system was a result obtained from computation on the encrypted data, which is similar to that of the plaintext. In the future, the authors will look into the improvement of this anonymized data protection approach. The work in [62] explored the use of a Format-Preserving Encryption (FPE) data masking scheme for voluminous data. The approach chooses various FPE algorithms depending on the type of data and what needs to be done. The Spark framework was used. The authors chose the FPE encryption technique because the ciphertext of FPE retains the original format of the plaintext instead of being an unreadable binary string. The ciphertext does not contain any sensitive information. This approach, however, has its drawbacks. The encryption speed is slow when compared to other traditional symmetric algorithms. In one FPE method call, the algorithm calls the block cipher many times, thereby making it inefficient. Another commonly used encryption is Attribute-Based Encryption (ABE). The work in [63] presented a framework for fine-grained data access control to Personal Health Records (PHR) in the cloud that uses Attribute-Based Encryption as an encryption method to ensure that each patient has a unique key based on his/her attributes. The data could be accessed under multi-owner settings. It was not only free of errors, but also protected the data from malicious parties aiming at deceiving the data users. In another paper involving ABE, Yang et al. [64] addressed some of the shortcomings of ABE encryption in cloud data storage services. They proposed a variant of ABE which is a novel distributed, scalable and fine-grained access control scheme based on the classification attributes of the cloud storage object. Their goal was to improve on the shortcomings of ABE by taking into account the relationships among the attributes. The work in [65] investigated a hybrid approach that combines symmetric cryptography and ABE to secure big data. They wanted to combine the flexibility of attribute-based cryptography and the efficiency of symmetric cryptography. They use Ciphertext-Policy Attribute-Based Encryption (CP-ABE) and AES encryption. In another form of big data encryption scheme, the work in [66] proposed an encrypted MongoDB which utilizes a homomorphic asymmetric cryptosystem which can be used for the encryption of user data and in achieving privacy protection. Thus, FPE, FHE and ABE are the more popular researched big data encryption techniques.

A model combining symmetric and asymmetric encryption was presented in [67], which sought to overcome the limitations of asymmetric encryption techniques, such as the key exchange problem and the limited size of data, which in turn made them irrelevant for big data applications. Their proposed technique, known as BigCrypt, uses a probabilistic approach to the Pretty Good Privacy (PGP) technique. BigCrypt encrypts the message with a symmetric key and encrypts the symmetric key using the receiver’s public key, which is then attached to the message. The message is then sent. At the receiver end, the symmetric key is extracted and then asymmetrically decrypted and used for decrypting the main message. The proposed model was tested on local, web, and cloud servers and was found to be efficient. Furthermore, a framework for securing the sharing of sensitive data on a big data platform was proposed in [68]. Sharing sensitive data securely reduces the cost of providing users with personalized services in addition to providing value-added data services. The proposed scheme secures the distributed data, securely delivers it, stores it, and ensures secure usage. Semi-trusted big data is also destroyed. The proposed scheme uses a proxy re-encryption algorithm that is based on heterogeneous cipher-text transformation. The scheme also utilizes a user process protection method based on a virtual machine monitor that supports other system functions. This framework ensures data security while ensuring it is shared safely and securely. Sharma and Sharma in [69] discussed the protection of big data using neural and quantum cryptography. Neural cryptography incorporates the concept of artificial neural networks with classical cryptographic algorithms, while quantum cryptography makes use of the phenomena of quantum physics for securing communications. The authors also provided a comparative analysis between quantum and neural cryptography based on the methodologies that both techniques employ. From the analysis, the authors showed that a quantum computer makes use of quantum mechanisms for computation, which are very powerful and can therefore crack complicated problems such as the discrete logarithm problem in a short time. The neural key exchange protocol is also shown not to depend on any number theory. The analysis also indicates that neural networks probably have higher protection. The work in [70] proposed an efficient group key transfer protocol necessary for ensuring secure group communication on big data. The proposal does not use an online key generation center (KGC) and is based on a 3-LSSS (linear secret sharing scheme) in that three modular multiplications are needed. Additionally, the protocol uses Diffie-Hellman key agreement. The proposed group key transfer scheme consists of two sections: a two-party secret establishment section and a section for the group session key transfer. The proposed group key transfer scheme was analyzed to verify its key freshness, key confidentiality, and key authentication. Furthermore, the work in [71] proposed a new encryption scheme for big data that uses double hashing instead of a single hash. Double hashing, they claim, eliminates the threat of known cryptanalysis attacks. The work in [72] primarily discussed the enhancement of the CAST block algorithm for the security of big data. Their contribution to the enhancement of the CAST block algorithm involved the use of one S-box instead of 6, and an approach to make it more dynamic. The work in [73] presented a framework, Light-weight Encryption using Scalable Sketching (LESS), for reducing and encrypting the processing of big data on low power platforms. This contains two kernels: “sketching” and “sketch-reconstruction”. The Orthogonal Matching Pursuit (OMP) algorithm is implemented on the domain-specific Power Efficient Nano Cluster platform that acts as a hardware accelerator and ARM CPU for big data processing. Finally, the work in [74] discusses the security issues of heterogeneous, multimedia big data. They tackle resource constraint issues such as limited computation and energy resources. They proposed data encryption models that deal with this issue by reducing the computation overload on weak nodes and by replacing the current encryption models with an improved version based on the SAFE encryption scheme. The work in [75] mentioned a new approach for the privacy and security protection of clinical data through the use of a state-of-the-art encryption scheme and an attribute-based authorization framework.

For the access control and privacy of big data, the work in [77] presented a hybrid approach based framework that composes and enforces privacy policies to capture privacy requirements in an access control system. Gao et al. [78] presented a cloud security control mechanism based on big data. Cloud computing was observed to have increased the amount of data in the network. Due to this, big data leaks and losses occurred. Therefore, there was a need to provide the necessary level of protection. To that end, they conducted an analysis of big data and of the current big data situation. Gupta et al. [79] proposed a security compliance model for big data systems. The model provides security and access control to big data systems at the initial stage. The proposed system has four models: the library, low critical log, high critical log, and a self-assurance system. The design of this system ensures real time analysis of big data. The initial level of security provided by the model is facilitated by its web directory and its self-assuring framework that identifies and differentiates genuine users and critical users. The relationship analysis tool of the users blocks users who are deemed not to be genuine. In [76], the authors proposed a framework for privacy policy for big data security. The proposed framework makes use of different techniques including a security policy manager, a fragmentation approach, an encryption approach, and a security manager. The characteristics of the privacy policy required flexibility, integration, customizability, and context-awareness. The framework works by receiving data from the customer and then analyzing it. This is followed by the extraction of the privacy policy and finally the identification of sensitive data. Once sensitive data has been identified, a fragmentation algorithm is executed on the sensitive data. The security modules play the role of distinguishing sensitive data from non-sensitive data and then regulating its access. The work in [80] proposed a privacy protection technology and control mechanism for medical big data. The proposed framework has four main phases: the Setup phase, Encrypt and Upload phase, Download phase, and Share File phase. The system first de-identifies the patient’s personal privacy data and encrypts it using digital signature mechanisms to protect data confidentiality and the authentication of the data. The communication security of the data in the system is protected using the Diffie-Hellman session key, while the integrity of the medical records is protected using a digital signature scheme. Access control is not as big as it used to be due to the evolving threat landscape, but it is still an important research area in big data security today.

3.2 Alternative Approaches to Securing Big Data
Encryption and access control were the mainstream approaches for big data security. However, researchers have tried other approaches that may or may not involve some form of encryption. The nature of big data makes it difficult to protect everything. Some researchers have tried to determine the important parts of big data in order to protect those parts only. The work in [81] tried to tackle the issue of securing personal health records by proposing a framework that classifies data based on a person’s societal importance and determines the sensitivity levels of the data. Furthermore, [82] tried to secure the attributes of big data that are really
important/valuable because protecting everything is a difficult task. They use data masking to protect these
high-valued attributes. To determine which attributes are of value, they use a ranking algorithm that prioritizes
attributes for big data security. The authors in [83] proposed an attribute selection method for protecting the value
of big data by determining attributes that have higher relevance using a ranking algorithm, and providing security
measures. In [84], the authors focused on the characteristics of big data and proposed protecting big data using a
security hardening methodology that makes use of attribute relationships. The relationships between the various
attributes are expressed using nodes and edges. The proposed model works by limiting the attributes to protect
value. It first extracts all the attributes of the targeted big data. The nodes are then arranged circularly, followed
by the establishment of the relationships between the nodes. The relationships are set based on either
domain-specific criteria or universal criteria. Finally, the nodes to protect are selected, followed by the
determination of how to protect them. Thus, protecting everything in big data is hard. An easier approach is to find
what is important and protect that part only.
Encryption has been used with other techniques as well. The work in [85] proposed a method to secure Multimedia Big
Data (MBD) in the healthcare cloud by using Decoy Multimedia Big Data (DMBD). The DMBD uses fog computing and
pairing-based cryptography to secure the MBD. Fog computing was utilized for the storage of the decoy files. In their
method, the decoy files are retrieved at the onset, unlike other methods that usually wait until there is an attack
before the decoy files are called. Thus, both attackers and legitimate users see a decoy file until legitimacy is
confirmed. Aynur in [86] presented a new technique for securing big data in medical applications. The methodology
combines three major techniques: data hiding, image cryptography and steganography. These techniques facilitate safe
and de-noised transmission of data. A stream cipher algorithm is used for encrypting the original image. Patient
information is then embedded in the encrypted image by means of a lossless data embedding technique together with a
data-hiding key to enhance the security of the data. Steganography is then applied to the embedded image with a
private key. When the message gets to the receiver, it is decrypted using the inverse methods in reverse order.
Efficiently securing big data continues to be a difficult challenge because of big data’s variety, volume and
veracity issues. The ability to deal with space and time issues by correlating events would play an important role in
securing big data. The work in [87] discussed the growth of social media networks such as Facebook and of cloud
computing, and how sharing of multimedia big data has become easier than ever. However, its increased use is faced
with issues of piracy, illegal copying, and misappropriation. To address these challenges, the authors proposed a
system for protecting multimedia big data distribution in social networks. The scheme utilizes a Tree-Structured Haar
(TSH) transform. In this scheme, a homomorphic encrypted domain for fingerprinting by means of social media network
analysis is applied. The scheme aims at mapping hierarchical social networks into the tree structure of the TSH
transform for coding, encrypting, and fingerprinting of JPEG2000. Finally, in [88], the authors discussed how the use
of a traditional security framework for protection of the smart grid comes with several disadvantages, such as late
detection of attacks when damage has already occurred. To address this problem, the authors proposed a security
awareness mechanism based on the analysis of big data in the smart grid. The model has three main parts: the
extraction of network security situation factors, network situational assessment and network situational prediction.
The method works by integrating fuzzy-cluster-based analytical tools, reinforcement learning and game theory. The
integration of these components facilitates security situational analysis in the smart grid. Simulation tests and
experiments showed the proposed system to have high efficiency and a low error rate.
Sometimes, we have to protect data from the people and the systems that interact with it. Pissanetzky [89] examined
the problem of software vulnerability and the accumulation of unprocessed information in big data. According to the
author, these problems are created by human interventions. To solve these problems, the author proposed the complete
elimination of human intervention. In this approach a causal set was taken as the universal language of all
information and computations. Additionally, the author proposed confining the use of programming languages to the
human interface and therefore creating an inner layer of mathematical code that is expressed as a causal set.
Furthermore, this paper also includes experiments and computational verifications of the theory and proposed
applications of this approach to science and technology, computer intelligence, and machine learning. Also, [90]
researched how to protect both the data and the program that processes the data while taking into consideration the
big data processing requirements. They propose a model that aims to address the issue by hiding the operations
performed using steganography and FHE in order to meet the security requirements necessary to protect outsourced
data. However, the user’s computation cost is somewhat high and the solution does not apply to all applications. The
work in [91] addressed the use of cloud computing and how it provides an organization with various services for
meeting its various needs. However, data stored in the cloud could be accessed by cloud operators, compromising
information privacy and security. In this respect, this study proposed an approach for splitting and separating the
stored data on distributed cloud servers and therefore preventing access by cloud operators. The proposed model is
known as Security-Aware Efficient Distributed Storage (SA-EDS) and is based on two algorithms: the Efficient Data
Conflation (EDCon) algorithm and the Secure Efficient Data Distributions (SED2) algorithm. These algorithms were
tested and proved to be efficient. The authors of [92] proposed a Field Programmable Gate Array (FPGA) based solution
for running the BLAST algorithm in a secure manner in the MapReduce framework using cloud computing. The proposed
system protects data from the cloud service provider (CSP) by leveraging the bitstream encryption mechanism and the
FPGA’s tamper-resistant property. The authors also take into consideration the risks that arise from key distribution
and propose countermeasures for handling them. The work in [93] studied an approach that assesses the risk behind
various applications
and provides an explanation of the ability of the application to protect data using a specific security
classification level. The proposed method has three main components: Automatic Risk Assessment of the Application,
Automatic Generation of Criteria for storage of specific data, and Automatic Reporting. The report facilitates the
recommendation of the appropriate security level. The work in [94] proposed a Hadoop system that would both secure
and maintain the privacy of big data. They tried to do this by using four encryption techniques chosen at random.
However, these encryption techniques are time consuming, so they proposed a buffer system in which the buffer stores
incoming information whilst the system works on the previously stored data, in order to prevent information loss.
Knowing the characteristics of the data is an important aspect of protecting the data. Singh [95] studied the value
of real-time BDA and the security challenge that comes with protecting big data. Singh notes that proper protection
of big data should focus on the volume, velocity, and variety of big data. Multilevel security for big data should be
provided at the application, operating system, and network levels. However, using traditional protection mechanisms
is challenging for large volumes of data that are changing continuously. For this reason, Singh recommends the use of
machine learning for protection of big data, with a focus on supervised and unsupervised learning. Yang [96] examined
the visualization of network security under the big data environment. The authors first look at the 5V
characteristics of big data, namely volume, velocity, variety, value, and visualize. These 5V features are then
mapped onto network security data, followed by a description of the visualization of the data security technology.
The network visualization technologies proposed include the use of a radial traffic analyser and SRNET. They also
proposed safety visualization using ClockMap and discussed diversified technologies for visualization of big data.
With the increasing volume of big data, security and privacy issues also continue to increase. Peer to peer (P2P)
protocols such as BitTorrent are now being used to widen the transfer of big data. However, this increase has also
attracted widespread security challenges. Research indicates that P2P protocols are sophisticated in data transfer
but experience challenges when distributing big data. Ban et al. [97] presented a study on the early identification
of attacks using the darknet. The system works by first exploring the regularities in communications from the
attackers. This is achieved using an itemset mining engine. It then characterizes the activity level of each pattern
of attack, creating a time series. A clustering algorithm is then applied to extract the most prominent patterns of
attack. The attack patterns are clustered into groups having similar activities. Visual hints on the relationships
among the various attacks are then provided using a dimension reduction technique. Attacks that feature prominently
are then picked up for further analysis by experts. The authors showed that the proposed system was efficient in
early attack detection.
The union of blockchain and big data will make sure that the data that is generated from the blockchain is
trustworthy. This is because the provenance of the data is known. Also, the likelihood of the data being interfered
with is very low. This is made possible through the blockchain’s consensus mechanism and its secure cryptographic
hash function, which ensures data immutability. Data manipulation would require a tremendous amount of hash power in
order to be achieved. The centralized way of storing data is prone to data breaches and hacks [109]. This method is
susceptible to single point of failure problems as well. Distributed data storage tries to take data away from the
hands of these centralized authorities, thereby taking away various security risks. The work in [106] proposed a
model for security sharing based on blockchain technology to address trust issues often associated with circulation
of data. The proposed model provides a credible platform for sharing data between data producers and demand parties
through building a decentralized security system for the circulation of data. The security system is built using
blockchain and smart contracts. While blockchain technology ensures the traceability of data, the automated execution
of smart contracts provides the security for sharing data. The decentralized architecture ensures the data provider
does not suffer from the risks of sharing data from a centralized storage system. On the user’s side, transparency in
the collection of information is assured by the blockchain operation model, thereby bringing stronger user privacy
protection. The work in [110] proposed a system called MeDShare, which is blockchain based and provides data source
auditing and control for shared medical data in cloud repositories. The MeDShare system helps to transfer and share
data from one source to another, and these actions are recorded in a tamper-proof manner. The marriage of blockchain
and Big Data is imminent as blockchain ensures data integrity.
The work in [102] proposed a secure privacy-preserving scheme using dot product in mobile big data. Privacy-preserving
dot product has been used in data mining for a long time as it helps in curbing statistical analysis attacks. It is
now being used in big data for its anonymous private profile matching. The paper was exploratory research on its use
in mobile big data, and there is still room for further improvement. The work in [103] explored the idea of a data
anonymization technique to support merging of encrypted data. The technique ensures the protection of privacy in the
collection and merging of data and secures multi-party sharing of data without the involvement of third parties. The
merging result as proposed in this study does not lead to the violation of the privacy of the individual.
Additionally, the proposed mechanism allows for storage of different datasets from different parties in multiple
third-party centers without leaking the identity of the owners of that data. The anonymized data can be joined
securely within a reasonable time. Experiments conducted by the authors indicated that 100,000 entries of data can be
merged in about 1.4 seconds using the optimized secure merging procedure. To answer the question of how security
classification can be managed on a system. In addition, the work in [104] proposed a cloud based security
intelligence system for big data processing. The authors provide a highly scalable plug-in based solution that
monitors big data systems in real time and therefore reduces the impact of attacks or threats on a distributed
infrastructure. The solution proposed here was named Advanced Persistent Security Insights System (APSIS). APSIS
works by taking advantage of the functions of a SIEM system, including aggregation, correlation, alerting, and
forensic analysis. This is exposed to big data but with security intelligence to provide accurate results. APSIS
monitors all devices on the network that generate log files and therefore assures security. In the future, the
authors aim at exploring the proof of concept to evaluate the robustness of the proposed architecture. The work in
[105] started by looking at security threats of the new multimedia heterogeneous big data. The first threat is the
lack of effective mechanisms for protecting the ownership of this new media, as DRM is facing challenges. Second,
there is a lack of a clean environment for the consumption of new media. To overcome these challenges, Lu proposed
the use of “blocking as loose” technology for protection, intelligent cleaning of new media big data, and the mining
of big data in a safe manner. The work in [98] summarized how the Bloom filter and its variants are used to secure
big data. After various experiments, they concluded that Bloom filters can be used for efficiency reasons, because
there are space and time issues when it comes to analyzing and indexing big data, which would in turn lead to better
security analytics. The research work in [99] proposed a framework for secure and private management and mining of
data that addresses both security and privacy issues in health-care data management, especially in outsourced
databases. The solution works by executing SQL queries on encrypted data and returning differentially private answers
on the outsourced databases. The Laplace mechanism is used to illustrate the computation of private queries. Private
decision tree learning is also discussed. An experimental evaluation of the proposed solution shows the system incurs
small communication and computation overhead. The authors in [100] proposed a scheme for secure and efficient
distribution of big data on BitTorrent networks. The proposed scheme is built inside the BitTorrent protocol, thus
allowing the servers to regulate and trace users’ behavior and the sensitivity of data.
4 EXISTING SURVEYS ON BIG DATA IN CYBERSECURITY
Bertino [4] presented the security and privacy issues for big data concerning confidentiality, privacy, and
trustworthiness. In data confidentiality, the challenges identified were merging large numbers of access control
policies and enforcing control policies in big data sources. Cybersecurity tasks such as user authentication, access
control, and user monitoring are noted to be key in identifying threats and stopping them. The author noted that both
security and privacy can be achieved by using advanced technologies such as cryptography. Mishra and Singh [5]
examined security and privacy challenges associated with big data analytics for protecting database storage and
transaction log files, and secure computations in distributed frameworks. The authors in [6] highlighted the benefits
of big data analytics and reviewed security and privacy challenges in big data environments using various BDA tools
such as Hadoop, MapReduce, and HDFS. Security and privacy challenges associated with big data environments were also
listed as random distribution, security of big data computations, and access control. The work in [7] examined
emerging big data issues of security and privacy in relation to the use of big data analytic tools such as Hadoop.
The work in [8] presented a review of big data security and privacy challenges while storing, searching and
analyzing. In [9], the authors conducted a systematic literature review covering security and privacy for big data by
categorizing approaches in terms of confidentiality, data integrity, privacy, data analysis, visualization, data
format, and stream processing. Miloslavskaya et al. [10] examined the need for Security Operation Centres (SOCs) for
organizations that want to achieve the highest protection for their data. The work in [111] looked at security
intelligence centres (SICs) for processing of big data. The work in [112] proposed a framework which combined the
techniques of security intelligence and big data analytics to support human analysts for prioritization. The work in
[113] studied the security issues identified within the field of multimedia applications. In [11], Arora et al.
performed a survey on big data and its security. The work in [114] highlighted the pros of big data, and then tackled
the challenges faced in China. In [115], Zou analyzed major issues associated with big data, especially the breach of
personal information, the potential security risks, and the reduction of control rights of users over their personal
information.
Mondek et al. [116] discussed security analytics in this era of big data and the reality of information security.
Mahmood and Afzal [12] presented a review of big data analytics trends, tools, and techniques. The study of security
analytics is motivated by the inadequacy of existing cybersecurity solutions to counteract cybersecurity attacks
associated with big data. Jayasingh, Patra, and Mahesh [117] discussed security issues and challenges that face
security analysts in big data analytics and visualization. In [118], the authors discussed six changes in the
information technology sector that they believe will be the game changers for the next 15 years. The work in [13],
[14] presented security solutions for big data in the health-care industry. Health-care generates a lot of data from
diverse sources, making it difficult to analyze. Similarly, in [119], Patil and Seshadri presented security and
privacy issues in big data relating to health-care security policies. The work in [120] summarized the current
health-care security scenarios in big data environments in the USA.
The work in [15] put forward a model of big data security service for data providers, users, and cloud service
providers. The work in [121] looked at opportunities, challenges, and security concerns associated with the use of
big data in cloud computing. Furthermore, the work in [122] proposed integrated auditing for securing big data in the
cloud. The authors presented their study by reviewing the characteristics of big data and security challenges in the
cloud. The works in [123], [124], [125] proposed security measures for big data, virtualization, the cloud
infrastructure and cloud based big data storage systems. Big data is making its way into the power industry. The
smart grid has characteristics peculiar to it. The work in [126], [127] highlighted different articles that discuss
the peculiarities of smart grid big data and how to properly handle it. The authors in [128] looked at security
issues brought by big data applications in the telecommunication industry, especially those associated with mobile
network operators. In [129], the authors surveyed three different techniques, namely homomorphic encryption,
verifiable computation and multi-party computation. They discuss relevant security threats in the cloud, and a
computation model that captures a large class of big data use cases. The work in [130] studied the impact of security
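The Laplace mechanism mentioned earlier in the summary of [99] can be sketched in a few lines. The example below is
only an illustration under stated assumptions, not the implementation of [99]: it answers a counting query with
noise of scale sensitivity/epsilon, where the helper names `laplace_noise` and `private_count`, the sample records,
and the choice of a count query (sensitivity 1) are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two independent Exp(1) variables is Laplace(0, 1),
    so scaling that difference gives the desired distribution.
    """
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one record changes a count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical patient ages, used only to exercise the function.
    ages = [34, 71, 58, 66, 45, 80]
    rng = random.Random()
    print(private_count(ages, lambda age: age >= 65, epsilon=0.5, rng=rng))
```

A smaller epsilon gives stronger privacy but a noisier answer, so the value used in practice is a policy choice
rather than something this sketch can prescribe.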
monopolized data, therefore bringing in the most revenue. New blockchain startups are now basing their business
models on how to disrupt these monopolies by highlighting the value of data to the public. The selling point for
these startups is that data stored in a centralized fashion is susceptible to attacks, as evident from the Facebook,
Yahoo, and Equifax breaches of recent years. What is being evangelized is that a distributed approach to storing data
is the safer way to prevent attacks. This method is also not susceptible to single point of failure problems.
Companies such as the GAFAs store huge amounts of data and can correctly be termed data silos. Distributed data
storage tries to take data away from the hands of these data silos, thereby taking away various security risks.
Furthermore, the union of blockchain and big data will make sure that the data that is generated from the blockchain
is trustworthy. There is ongoing research, along with open challenges, on decentralized and context-aware data
storage for big data.
5.4 Applying Artificial Intelligence
Artificial Intelligence based polymorphic malware is on the rise. There is now an application that can alter malware
to trick machine learning antivirus software. In an experiment, Endgame (a security company) found that AI has blind
spots that can be found by other AI applications. This is evident in Generative Adversarial Networks (GANs),
discovered by Google researchers. This shows that organizations should not view machine learning as a foolproof way
of defending against malware. More research work is needed in this area because of the rise of GANs. Also, an
immediate approach to solve this would be to combine humans and AI in the malware detection approach. AI is not
foolproof yet, and we see research trends gearing towards human-in-the-loop approaches to detect polymorphic malware.
5.5 Scalability for Cybersecurity Techniques in Big Data Era
In big data, protecting everything is hard. The easier approach is to find what is important and protect it.
Traditional approaches for securing data might not work in a straightforward way. Thus, finding an optimal approach
that is scalable for big data enabled systems is still an active research topic.
6 CONCLUSION
In this paper, we have surveyed state-of-the-art literature on big data in cybersecurity. We segmented the work into
two parts. The first part is research work involving the use of big data for security purposes. The second part is
the research work done on securing big data. We presented current trends on the use of BDA as a security tool. We
also addressed the role of machine learning in this area and some of the challenges machine learning has to overcome
before it becomes an important feature in the cybersecurity toolkit. Furthermore, we discussed current literature on
techniques used to secure big data. The confidentiality of big data is usually the main focus, thus making encryption
and access control techniques the main research areas when it comes to big data security. We also discussed the
alternative approaches used to secure big data, where the proposed approaches rely on methods other than encryption
and access control in trying to secure other aspects of the CIA triad. We make it easier on readers by summarizing
the problem each paper addresses and its approach to solving it in tabular form. Furthermore, we presented the future
trends in big data security that we foresee, and the challenges associated with them.
ACKNOWLEDGMENTS
This work is supported in part by the U.S. National Science Foundation (NSF) under grants CNS-1658972, CNS-1650831
and HRD 1828811. However, any opinion, finding, and conclusions or recommendations expressed in this document are
those of the authors and should not be interpreted as necessarily representing the official policies, either
expressed or implied, of the NSF.
REFERENCES
[1] D. Laney, “3d data management: Controlling data volume, velocity and variety,” META Group Res. Note, vol. 6, no. 70, p. 1, 2001.
[2] N. Miloslavskaya and A. Tolstoy, “Application of big data, fast data, and data lake concepts to information security issues,” in Proc. IEEE Int. Conf. Future Internet Things Cloud Workshops, 2016, pp. 148–153.
[3] D. Rawat and K. Z. Ghafoor, Smart Cities Cybersecurity and Privacy. Amsterdam, The Netherlands: Elsevier, Dec. 2018.
[4] E. Bertino, “Big data-security and privacy,” in Proc. IEEE Int. Congr. Big Data, 2015, pp. 757–761.
[5] A. D. Mishra and Y. B. Singh, “Big data analytics for security and privacy challenges,” in Proc. Int. Conf. Comput. Commun. Autom., 2016, pp. 50–53.
[6] Y. Gahi, M. Guennoun, and H. T. Mouftah, “Big data analytics: Security and privacy challenges,” in Proc. IEEE Symp. Comput. Commun., 2016, pp. 952–957.
[7] K. Abouelmehdi, A. Beni-Hssane, H. Khaloufi, and M. Saadi, “Big data emerging issues: Hadoop security and privacy,” in Proc. 5th Int. Conf. Multimedia Comput. Syst., 2016, pp. 731–736.
[8] B. Matturdi, Z. Xianwei, L. Shuai, and L. Fuhong, “Big data security and privacy: A review,” China Commun., vol. 11, no. 14, pp. 135–145, 2014.
[9] B. Nelson and T. Olovsson, “Security and privacy for big data: A systematic literature review,” in Proc. IEEE Int. Conf. Big Data, 2016, pp. 3693–3702.
[10] N. Miloslavskaya, A. Tolstoy, and S. Zapechnikov, “Taxonomy for unsecure big data processing in security operations centers,” in Proc. IEEE Int. Conf. Future Internet Things Cloud Workshops, 2016, pp. 154–159.
[11] S. Arora, M. Kumar, P. Johri, and S. Das, “Big heterogeneous data and its security: A survey,” in Proc. Int. Conf. Comput., Commun. Autom., 2016, pp. 37–40.
[12] T. Mahmood and U. Afzal, “Security analytics: Big data analytics for cybersecurity: A review of trends, techniques and tools,” in Proc. 2nd Nat. Conf. Inf. Assurance, 2013, pp. 129–134.
[13] S. Rao, S. Suma, and M. Sunitha, “Security solutions for big data analytics in healthcare,” in Proc. 2nd Int. Conf. Adv. Comput. Commun. Eng., 2015, pp. 510–514.
[14] I. Olaronke and O. Oluwaseun, “Big data in healthcare: Prospects, challenges and resolutions,” in Proc. Future Technol. Conf., 2016, pp. 1152–1157.
[15] H.-T. Cui, “Research on the model of big data serve security in cloud environment,” in Proc. IEEE Int. Conf. Comput. Commun. Internet, 2016, pp. 514–517.
[16] E. Damiani, “Toward big data risk analysis,” in Proc. IEEE Int. Conf. Big Data, 2015, pp. 1905–1909.
[17] C. Sinclair, L. Pierce, and S. Matzner, “An application of machine learning to network intrusion detection,” in Proc. 15th Annu. Comput. Secur. Appl. Conf., 1999, pp. 371–377.
[18] E. Chickowski, “A case study in security big data analysis,” Dark Reading, vol. 9, 2012. https://www.darkreading.com/analytics/security-monitoring/a-case-study-in-security-big-data-analysis/d/d-id/1137299
[19] M. C. Raja and M. A. Rabbani, “Big data analytics security issues in data driven information system,” IJIRCCE, vol. 2, no. 10, pp. 6132–6134, 2014.
[20] V. S. Carvalho, M. J. Polidoro, and J. P. Magalhães, “Owlsight: Platform for real-time detection and visualization of cyber threats,” in Proc. IEEE 2nd Int. Conf. Big Data Security on Cloud/IEEE Int. Conf. High Perform. Smart Comput/IEEE Int. Conf. Intell. Data Secur., 2016, pp. 61–66.
[21] Y. Yao, L. Zhang, J. Yi, Y. Peng, W. Hu, and L. Shi, “A framework for big data security analysis and the semantic technology,” in Proc. 6th Int. Conf. IT Convergence Secur., 2016, pp. 1–4.
[22] ProofPoint.com, “The human factor report people-centered threats define the landscape,” 2018. https://cdn2.hubspot.net/hubfs/508286/blog-files/The%20Human%20Factore%20Report%202018.pdf
[23] T. Zaki, M. S. Uddin, M. M. Hasan, and M. N. Islam, “Security threats for big data: A study on enron e-mail dataset,” in Proc. Int. Conf. Res. Innovation Inf. Syst., 2017, pp. 1–6.
[24] P. H. Las-Casas, V. S. Dias, W. Meira, and D. Guedes, “A big data architecture for security data and its application to phishing characterization,” in Proc. IEEE 2nd Int. Conf. Big Data Security on Cloud/IEEE Int. Conf. High Perform. Smart Comput/IEEE Int. Conf. Intell. Data Secur., 2016, pp. 36–41.
[25] A. A. Cárdenas, P. K. Manadhata, and S. Rajan, “Big data analytics for security intelligence,” pp. 1–22, September, 2013, Technical Report by Big Data Working Group of Cloud Security Alliance, https://downloads.cloudsecurityalliance.org/initiatives/bdwg/Big_Data_Analytics_for_Security_Intelligence.pdf.
[26] W. Jia, “Study on network information security based on big data,” in Proc. 9th Int. Conf. Meas. Technol. Mechatronics Autom., 2017, pp. 408–409.
[27] B. G.-N. Crespo and A. Garwood, “Fighting botnets with cyber-security analytics: Dealing with heterogeneous cyber-security information in new generation siems,” in Proc. 9th Int. Conf. Availability Rel. Secur, 2014, pp. 192–198.
[28] D. C. Le, A. N. Zincir-Heywood, and M. I. Heywood, “Data analytics on network traffic flows for botnet behaviour detection,” in Proc. IEEE Symp. Series Comput. Intell., 2016, pp. 1–7.
[29] H. Fatima, S. Satpathy, S. Mahapatra, G. Dash, and S. K. Pradhan, “Data fusion & visualization application for network forensic investigation-a case study,” in Proc. 2nd Int. Conf. Anti-Cyber Crimes, 2017, pp. 252–256.
[30] K. Gai, M. Qiu, and S. A. Elnagdy, “A novel secure big data cyber incident analytics framework for cloud-based cybersecurity insurance,” in Proc. IEEE 2nd Int. Conf. Big Data Secur. Cloud/IEEE Int. Conf. High Perform. Smart Comput/IEEE Int. Conf. Intell. Data Secur, 2016, pp. 171–176.
[31] K. Gai, M. Qiu, and S. A. Elnagdy, “Security-aware information classifications using supervised learning for cloud-based cyber risk management in financial big data,” in Proc. IEEE 2nd Int. Conf. Big Data Secur. Cloud/IEEE Int. Conf. High Per-
[40] S. Suthaharan, “Big data classification: Problems and challenges in network intrusion prediction with machine learning,” ACM SIGMETRICS Perform. Eval. Rev., vol. 41, no. 4, pp. 70–73, 2014.
[41] H.-M. Chen, R. Kazman, I. Monarch, and P. Wang, “Predicting and fixing vulnerabilities before they occur: a big data approach,” in Proc. 2nd ACM Int. Workshop BIG Data Softw. Eng., 2016, pp. 72–75.
[42] S. Marchal, X. Jiang, R. State, and T. Engel, “A big data architecture for large scale security monitoring,” in Proc. IEEE Int. Congr. Big Data, 2014, pp. 56–63.
[43] B. Yang and T. Zhang, “A scalable meta-model for big data security analyses,” in Proc. IEEE 2nd Int. Conf. Big Data Secur. Cloud/IEEE Int. Conf. High Perform. Smart Comput/IEEE Int. Conf. Intell. Data Secur., 2016, pp. 55–60.
[44] P. Casas, F. Soro, J. Vanerio, G. Settanni, and A. D’Alconzo, “Network security and anomaly detection with big-dama, a big data analytics framework,” in Proc. IEEE 6th Int. Conf. Cloud Netw., 2017, pp. 1–7.
[45] S. Kumar, A. Viinikainen, and T. Hamalainen, “Machine learning classification model for network based intrusion detection system,” in Proc. 11th Int. Conf. Internet Technol. Secured Trans., 2016, pp. 242–249.
[46] U. Fiore, F. Palmieri, A. Castiglione, and A. De Santis, “Network anomaly detection with the restricted boltzmann machine,” Elsevier Neurocomputing, vol. 122, pp. 13–23, 2013.
[47] M. A. Salama, H. F. Eid, R. A. Ramadan, A. Darwish, and A. E. Hassanien, “Hybrid intelligent intrusion detection scheme,” in Soft Computing in Industrial Applications. New York, NY, USA: Springer, 2011, pp. 293–303.
[48] A. Javaid, Q. Niyaz, W. Sun, and M. Alam, “A deep learning approach for network intrusion detection system,” in Proc. 9th EAI Int. Conf. Bio-Inspired Inf. Commun. Technol., 2016, pp. 21–26.
[49] M. M. Najafabadi, F. Villanustre, T. M. Khoshgoftaar, N. Seliya, R. Wald, and E. Muharemagic, “Deep learning applications and challenges in big data analytics,” J. Big Data, vol. 2, no. 1, 2015, Art. no. 1.
[50] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” arXiv preprint arXiv:1312.6199, 2013.
[51] N. Papernot, P. McDaniel, A. Sinha, and M. Wellman, “Towards the science of security and privacy in machine learning,” arXiv preprint arXiv:1611.03814, 2016.
[52] W. Xu, D. Evans, and Y. Qi, “Feature squeezing: Detecting adversarial examples in deep neural networks,” arXiv preprint arXiv:1704.01155, 2017. https://machine-learning-and-security.github.io/papers/mlsec17_paper_52.pdf
[53] V. Patel, “A practical solution to improve cyber security on a global scale,” in Proc. 3rd Worldwide Cybersecurity Summit, 2012,
form. Smart Comput/IEEE Int. Conf. Intell. Data Secur, 2016, pp. 1–5.
pp. 197–202. [54] C. Zhong, J. Yen, P. Liu, and R. F. Erbacher, “Automate cyberse-
[32] G. Gardikis, K. Tzoulas, K. Tripolitis, A. Bartzas, S. Costicoglou, curity data triage by leveraging human analysts’ cognitive proc-
A. Lioy, B. Gaston, C. Fernandez, C. Davila, A. Litke, et al., ess,” in Proc. IEEE 2nd Int. Conf. Big Data Secur. Cloud/IEEE Int.
“Shield: A novel nfv-based cybersecurity framework,” in Proc. Conf. High Perform. Smart Comput/IEEE Int. Conf. Intell. Data
IEEE Conf. Netw. Softwarization, 2017, pp. 1–6. Secur., 2016, pp. 357–363.
[33] I. Saenko, I. Kotenko, and A. Kushnerevich, “Parallel process- [55] M. G. Schultz, E. Eskin, F. Zadok, and S. J. Stolfo, “Data mining
ing of big heterogeneous data for security monitoring of iot methods for detection of new malicious executables,” in Proc.
networks,” in Proc. 25th Euromicro Int. Conf. Parallel Distrib. IEEE Symp. Secur. Privacy, 2001, pp. 38–49.
Netw.-Based Process., 2017, pp. 329–336. [56] H. H. Huang and H. Liu, “Big data machine learning and graph
[34] F. Gottwalt and A. P. Karduck, “Sim in light of big data,” in Proc. analytics: Current state and future challenges,” in Proc. IEEE Int.
11th Int. Conf. Innovations Inf. Technol, 2015, pp. 326–331. Conf. Big Data, 2014, pp. 16–17.
[35] T. Y. Win, H. Tianfield, and Q. Mair, “Big data based security analyt- [57] N. Naik, P. Jenkins, N. Savage, and V. Katos, “Big data security
ics for protecting virtualized infrastructures in cloud computing,” analysis approach using computational intelligence techniques
IEEE Trans. Big Data, 2017, vol. 4, no. 1, pp. 11–25, Mar. 2018. in r for desktop users,” in Proc. IEEE Symp. Series Comput. Intell.,
[36] C. Puri and C. Dukatz, “Analyzing and predicting security event 2016, pp. 1–8.
anomalies: Lessons learned from a large enterprise big data [58] K. Kaur, A. Syed, A. Mohammad, and M. N. Halgamuge, “An
streaming analytics deployment,” in Proc. 26th Int. Workshop evaluation of major threats in cloud computing associated with
Database Expert Syst. Appl., 2015, pp. 152–158. big data,” in Proc. IEEE 2nd Int. Conf. Big Data Anal., 2017,
[37] C. Liu, Y. Cai, and T. Wang, “Security evaluation of rc4 using big pp. 368–372.
data analytics,” in Proc. 7th IEEE Int. Conf. Softw. Eng. Serv. Sci., [59] J. Kepner, V. Gadepally, P. Michaleas, N. Schear, M. Varia,
2016, pp. 316–320. A. Yerukhimovich, and R. K. Cunningham, “Computing on
[38] K. Scarfone and P. Mell, “Guide to intrusion detection and masked data: a high performance method for improving big
prevention systems (idps),” NIST Special Publication, vol. 800, data veracity,” in Proc. IEEE High Perform. Extreme Comput. Conf.,
no. 2007, 2007, Art. no. 94. 2014, pp. 1–6.
[39] S. Mukkamala, A. Sung, and A. Abraham, “Cyber security [60] D. Wang, B. Guo, Y. Shen, S.-J. Cheng, and Y.-H. Lin, “A faster
challenges: Designing efficient intrusion detection systems and fully homomorphic encryption scheme in big data,” in Proc.
antivirus tools,” Enhancing Comput. Secur. Smart Technol., Rao V. IEEE 2nd Int. Conf. Big Data Anal., 2017, pp. 345–349.
(Ed.), CRC Press, USA, ISBN 0849330459, pp. 125–161, 2005.
[61] T. B. Patil, G. K. Patnaik, and A. T. Bhole, “Big data privacy using [85] H. A. Al Hamid, S. M. M. Rahman, M. S. Hossain, A. Almogren,
fully homomorphic non-deterministic encryption,” in Proc. IEEE and A. Alamri, “A security model for preserving the privacy of
7th Int. Adv. Comput. Conf., 2017, pp. 138–143. medical big data in a healthcare cloud using a fog computing
[62] B. Cui, B. Zhang, and K. Wang, “A data masking scheme for sen- facility with pairing-based cryptography,” IEEE Access, vol. 5,
sitive big data based on format-preserving encryption,” in Proc. pp. 22313–22328, 2017.
IEEE Int. Conf. Comput. Sci. Eng./IEEE Int. Conf. Embedded Ubiqui- [86] A. Unal, “Security in big data of medical records,” in Proc. Conf.
tous Comput., 2017, vol. 1, pp. 518–524. IT Bus. Ind. Government, 2014, pp. 1–2.
[63] M. Li, S. Yu, K. Ren, and W. Lou, “Securing personal health [87] C. Ye, Z. Xiong, Y. Ding, J. Li, G. Wang, X. Zhang, and K. Zhang,
records in cloud computing: Patient-centric and fine-grained “Secure multimedia big data sharing in social networks using
data access control in multi-owner settings,” in Proc. Int. Conf. fingerprinting and encryption in the jpeg2000 compressed
Secur. Privacy Commun. Syst., 2010, vol. 10, pp. 89–106. domain,” in Proc. IEEE 13th Int. Conf. Trust Secur. Privacy Comput.
[64] T. Yang, P. Shen, X. Tian, and C. Chen, “A fine-grained access Commun., 2014, pp. 616–621.
control scheme for big data based on classification attributes,” in [88] J. Wu, K. Ota, M. Dong, J. Li, and H. Wang, “Big data analysis
Proc. IEEE 37th Int. Conf. Distrib. Comput. Syst. Workshops, 2017, based security situational awareness for smart grid,” IEEE Trans.
pp. 238–245. Big Data, vol. 4, no. 3, pp. 408–417, Sep. 2016.
[65] S. Perez, J. L. Hernandez-Ramos, D. Pedone, D. Rotondi, [89] S. Pissanetzky, “On the future of information: Reunification, com-
L. Straniero, and A. F. Skarmeta, “A digital envelope approach putability, adaptation, cybersecurity, semantics,” IEEE Access,
using attribute-based encryption for secure data exchange in iot vol. 4, pp. 1117–1140, 2016.
scenarios,” in Proc. Global Internet Things Summit, 2017, pp. 1–6. [90] L. Xu, P. D. Khoa, S. H. Kim, W. W. Ro, and W. Shi, “Another
[66] G. Xu, Y. Ren, H. Li, D. Liu, Y. Dai, and K. Yang, “Cryptmdb: A look at secure big data processing: Formal framework and a
practical encrypted mongodb over big data,” in Proc. IEEE Int. potential approach,” in Proc. IEEE 8th Int. Conf. Cloud Comput.,
Conf. Commun., 2017, pp. 1–6. 2015, pp. 548–555.
[67] A. Al Mamun, K. Salah, S. Al-maadeed, and T. R. Sheltami, [91] K. Gai, M. Qiu, and H. Zhao, “Security-aware efficient mass distrib-
“Bigcrypt for big data encryption,” in Proc. 4th Int. Conf. Softw. uted storage approach for cloud systems in big data,” in Proc. IEEE
Defined Syst., 2017, pp. 93–99. 2nd Int. Conf. Big Data Secur. Cloud/IEEE Int. Conf. High Perform. Smart
[68] X. Dong, R. Li, H. He, W. Zhou, Z. Xue, and H. Wu, “Secure sen- Comput/IEEE Int. Conf. Intell. Data Secur., 2016, pp. 140–145.
sitive data sharing on a big data platform,” Tsinghua Sci. Technol., [92] L. Xu, H. Kim, X. Wang, W. Shi, and T. Suh, “Privacy preserving
vol. 20, no. 1, pp. 72–80, 2015. large scale dna read-mapping in MapReduce framework using
[69] A. Sharma and D. Sharma, “Big data protection via neural and fpgas,” in Proc. 24th Int. Conf. Field Programmable Logic Appl.,
quantum cryptography,” in Proc. 3rd Int. Conf. Comput. Sustain- 2014, pp. 1–4.
able Global Develop., 2016, pp. 3701–3704. [93] G. Collard, E. Disson, G. Talens, and S. Ducroquet, “Proposition of
[70] C. Zhao and J. Liu, “Novel group key transfer protocol for big a method to aid security classification in cybersecurity context,” in
data security,” in Proc. IEEE Adv. Inf. Technol. Electron. Autom. Proc. 14th Annu. Conf. Privacy Secur. Trust, 2016, pp. 88–95.
Control Conf., 2015, pp. 161–165. [94] P. Adluru, S. S. Datla, and X. Zhang, “Hadoop eco system for big
[71] S. Almuhammadi and A. Amro, “Double-hashing operation data security and privacy,” in Proc. IEEE Long Island Syst. Appl.
mode for encryption,” in Proc. IEEE 7th Annu. Comput. Commun. Technol. Conf., 2015, pp. 1–6.
Workshop Conf., 2017, pp. 1–7. [95] J. Singh, “Real time big data analytic: Security concern and chal-
[72] F. A. Kadhim, G. H. Abdul-Majeed, and R. S. Ali, “Enhancement lenges with machine learning algorithm,” in Proc. Conf. IT Bus.
cast block algorithm to encrypt big data,” in Proc. Annu. Conf. Ind. Government, 2014, pp. 1–4.
New Trends Inf. Commun. Technol. Appl., 2017, pp. 80–85. [96] T. Yang and S. Jia, “Research on network security visualization
[73] A. Kulkarni, C. Shea, H. Homayoun, and T. Mohsenin, “Less: Big under big data environment,” in Proc. Int. Comput. Symp., 2016,
data sketching and encryption on low power platform,” in Proc. pp. 660–662.
Conf. Des. Autom. Test Eur., pp. 1635–1638, 2017. [97] T. Ban, S. Pang, M. Eto, D. Inoue, K. Nakao, and R. Huang,
[74] C. Xiao, L. Wang, Z. Jie, and T. Chen, “A multi-level intelligent “Towards early detection of novel attack patterns through the
selective encryption control model for multimedia big data secu- lens of a large-scale darknet,” in Proc. Int. IEEE Conf. Ubiquitous
rity in sensing system with resource constraints,” in Proc. IEEE Intell. Comput./Adv. Trusted Comput./Scalable Comput. Commun./
3rd Int. Conf. Cyber Secur. Cloud Comput., 2016, pp. 148–153. Cloud Big Data Comput./Internet People/Smart World Congr., 2016,
[75] A. Soceanu, M. Vasylenko, A. Egner, and T. Muntean, “Manag- pp. 341–349.
ing the privacy and security of ehealth data,” in Proc. 20th Int. [98] S. A. Alsuhibany, “A space-and-time efficient technique for big
Conf. Control Syst. Comput. Sci., 2015, pp. 439–446. data security analytics,” in Proc. Saudi Int. Conf. Inf. Technol. (Big
[76] A. Al-Shomrani, F. Fathy, and K. Jambi, “Policy enforcement for Data Analysis), 2016, pp. 1–6.
big data security,” in Proc. 2nd Int. Conf. Anti-Cyber Crimes, 2017, [99] N. Mohammed, S. Barouti, D. Alhadidi, and R. Chen, “Secure
pp. 70–74. and private management of healthcare databases for data min-
[77] A. Samuel, M. I. Sarfraz, H. Haseeb, S. Basalamah, and A. Ghafoor, ing,” in Proc. IEEE 28th Int. Symp. Comput.-Based Med. Syst., 2015,
“A framework for composition and enforcement of privacy-aware pp. 191–196.
and context-driven authorization mechanism for multimedia [100] L. Xiao, C. Xu, J. Qin, G. Qin, M. Zhu, L. Ruan, Z. Wang, M. Li,
big data,” IEEE Trans. Multimedia, vol. 17, no. 9, pp. 1484–1494, and D. Tan, “Secure distribution of big data based on bittorrent,”
Sep. 2015. in Proc. IEEE 11th Int. Conf. Dependable Autonomic Secure Comput.,
[78] F. Gao, “Research on cloud security control mechanism based on big 2013, pp. 82–90.
data,” in Proc. Int. Conf. Smart Grid Electr. Autom., 2017, pp. 366–370. [101] W. Zhijun and W. Caiyun, “Security-as-a-service in big data of
[79] A. Gupta, A. Verma, P. Kalra, and L. Kumar, “Big data: A secu- civil aviation,” in Proc. IEEE Int. Conf. Comput. Commun., 2015,
rity compliance model,” in Proc. Conf. IT Business Ind. Govern- pp. 240–244.
ment, 2014, pp. 1–5. [102] C. Hu and Y. Huo, “Efficient privacy-preserving dot-product
[80] N.-Y. Lee and B.-H. Wu, “Privacy protection technology and computation for mobile big data,” IET Commun., vol. 11, no. 5,
access control mechanism for medical big data,” in Proc. 6th IIAI pp. 704–712, 2016.
Int. Congr. Adv. Appl. Informat., 2017, pp. 424–429. [103] S. Q. Ren, T. H. Meng, N. Yibin, and K. M. M. Aung, “Privacy-
[81] M. R. Islam, M. Habiba, and M. I. I. Kashem, “A framework for preserved multi-party data merging with secure equality eval-
providing security to personal healthcare records,” in Proc. Int. uation,” in Proc. Int. Conf. Cloud Comput. Res. Innovations, 2016,
Conf. Netw. Syst. Secur., 2017, pp. 168–173. pp. 34–41.
[82] R. Achana, R. S. Hegadi, and T. Manjunath, “A novel data secu- [104] K. Benzidane, H. El Alloussi, O. El Warrak, L. Fetjah,
rity framework using e-mod for big data,” in Proc. IEEE Int. WIE S. J. Andaloussi, and A. Sekkaki, “Toward a cloud-based security
Conf. Electr. Comput. Eng., 2015, pp. 546–551. intelligence with big data processing,” in Proc. IEEE/IFIP Netw.
[83] S.-H. Kim, N.-U. Kim, and T.-M. Chung, “Attribute relationship Operations Manage. Symp., 2016, pp. 1089–1092.
evaluation methodology for big data security,” in Proc. Int. Conf. [105] Z.-W. Lu, “Research about new media security technology base on
IT Convergence Secur., 2013, pp. 1–4. big data era,” in Proc. IEEE 14th Int. Dependable Autonomic Secure
[84] S.-H. Kim, J.-H. Eom, and T.-M. Chung, “Big data security hard- Comput./14th Int Conf. Pervasive Intell. Comput/2nd Int. Conf Big Data
ening methodology using attributes relationship,” in Proc. Int. Intell. Comput./Cyber Sci. Technol. Congr., 2016, pp. 933–936.
Conf. Inf. Sci. Appl., 2013, pp. 1–2.
[106] L. Yue, H. Junqin, Q. Shengzhi, and W. Ruijin, “Big data model of [125] Z. Tan, U. T. Nagar, X. He, P. Nanda, R. P. Liu, S. Wang,
security sharing based on blockchain,” in Proc. 3rd Int. Conf. Big and J. Hu, “Enhancing big data security with collaborative
Data Comput. Commun., 2017, pp. 117–121. intrusion detection,” IEEE Cloud Comput., vol. 1, no. 3,
[107] M. Tang, M. Alazab, and Y. Luo, “Big data for cybersecurity: pp. 27–33, Sep. 2014.
Vulnerability disclosure trends and dependencies,” IEEE [126] J. Zhao, Y. Wang, and Y. Xia, “Analysis of information security of
Trans. Big Data, p. 1, 2018. https://doi.org/10.1109/ electric power big data and its countermeasures,” in Proc. 12th
TBDATA.2017.2723570 Int. Conf. Comput. Intell. Secur., 2016, pp. 243–248.
[108] P. A. Dhande and A. Kadam, “New approach for load reba- [127] J. Hu and A. V. Vasilakos, “Energy big data analytics and secu-
lancer, scheduler in big data with security mechanism in cloud rity: Challenges and opportunities,” IEEE Trans. Smart Grid,
environment,” in Proc. IEEE Int Conf Adv Electron. Commun. Com- vol. 7, no. 5, pp. 2423–2436, Sep. 2016.
put. Technol., 2016, pp. 247–250. [128] C. Dincer, G. Akpolat, and E. Zeydan, “Security issues of big data
[109] D. Puthal, N. Malik, S. P. Mohanty, E. Kougianos, and C. Yang, applications served by mobile operators,” in Proc. 25th Signal
“The blockchain as a decentralized security framework,” IEEE Process. Commun. Appl. Conf., 2017, pp. 1–4.
Consum. Electron. Mag., vol. 7, no. 2, pp. 18–21, Mar. 2018. [129] S. Yakoubov, V. Gadepally, N. Schear, E. Shen, and
[110] Q. Xia, E. B. Sifah, K. O. Asamoah, J. Gao, X. Du, and M. Guizani, A. Yerukhimovich, “A survey of cryptographic approaches to
“Medshare: Trust-less medical data sharing among cloud service securing big-data analytics in the cloud,” in Proc. IEEE High Per-
providers via blockchain,” IEEE Access, vol. 5, pp. 14757–14767, form. Extreme Comput. Conf., 2014, pp. 1–6.
2017. [130] L. Dupre and Y. Demchenko, “Impact of information security
[111] N. Miloslavskaya, “Security intelligence centers for big data measures on the velocity of big data infrastructures,” in Proc. Int.
processing,” in Proc. 5th Int. Conf. Future Internet Things Cloud Conf. High Perform. Comput. Simul., 2016, pp. 492–500.
Workshops, 2017, pp. 7–13. [131] N. Chaudhari and S. Srivastava, “Big data security issues and
[112] M. Marchetti, F. Pierazzi, A. Guido, and M. Colajanni, challenges,” in Proc. Int. Conf. Comput. Commun. Autom., 2016,
“Countering advanced persistent threats through security intelli- pp. 60–64.
gence and big data analytics,” in Proc. 8th Int. Conf. Cyber Conflict, [132] M. Paryasto, A. Alamsyah, B. Rahardjo, et al., “Big-data security
2016, pp. 243–261. management issues,” in Proc. 2nd Int. Conf. Inf. Commun. Technol.,
[113] Q. Jin, Y. Xiang, G. Sun, Y. Liu, and C.-C. Chang, “Cybersecurity 2014, pp. 59–63.
for cyber-enabled multimedia applications,” IEEE MultiMedia, [133] R. Clarke, “Quality assurance for security applications of big
vol. 24, no. 4, pp. 10–13, Oct.-Dec. 2017. data,” in Proc. Eur. Intell. Secur. Informat. Conf., 2016, pp. 1–8.
[134] I. Skrjanc, A. S. de Miguel, J. A. Iglesias, A. Ledezma, and
[114] Y. Mengke, Z. Xiaoguang, Z. Jianqiu, and X. Jianjian, “Challenges
and solutions of information security issues in the age of big D. Dovzan, “Evolving cauchy possibilistic clustering based on
data,” China Commun., vol. 13, no. 3, pp. 193–202, 2016. cosine similarity for monitoring cyber systems,” in Proc. Evolving
[115] H. Zou, “Protection of personal information security in the age Adaptive Intell. Syst., 2017, pp. 1–5.
of big data,” in Proc. 12th Int. Conf. Comput. Intell. Secur., 2016, [135] S. Alouneh, I. Hababeh, F. Al-Hawari, and T. Alajrami, “Innovative
pp. 586–589. methodology for elevating big data analysis and security,” in Proc.
[116] D. Mondek, R. B. Bla zek, and T. Zahradnickỳ, “Security analytics 2nd Int. Conf. Open Source Softw. Comput., 2016, pp. 1–5.
in the big data era,” in Proc. IEEE Int. Conf. Softw. Quality Rel. [136] K. Dhanasekaran and B. Surendiran, “Nature-inspired classifica-
Secur. Companion, 2017, pp. 605–606. tion for mining social space information: National security intel-
[117] B. B. Jayasingh, M. Patra, and D. B. Mahesh, “Security issues ligence and big data perspective,” in Proc. Online Int. Conf. Green
and challenges of big data analytics and visualization,” in Eng. Technol., 2016, pp. 1–6.
Proc. 2nd Int. Conf. Contemporary Comput. Informat., 2016, [137] K. D. Strang and Z. Sun, “Meta-analysis of big data security and
pp. 204–208. privacy: Scholarly literature gaps,” in Proc. IEEE Int. Conf. Big
[118] A. Kott, A. Swami, and P. McDaniel, “Security outlook: six cyber Data., 2016, pp. 4035–4037.
game changers for the next 15 years,” Comput., vol. 47, no. 12, [138] L. Yuqing, “Research on personal information security on social
pp. 104–106, 2014. network in big data era,” in Proc. Int. Conf. Smart Grid Electr.
[119] H. K. Patil and R. Seshadri, “Big data security and privacy issues in Autom., 2017, pp. 676–678.
healthcare,” in Proc. IEEE Int. Congr. Big Data, 2014, pp. 762–765. [139] J. Hariharakrishnan, S. Mohanavalli, K. S. Kumar, et al., “Survey
[120] S. Chandra, S. Ray, and R. Goswami, “Big data security in health- of pre-processing techniques for mining big data,” in Proc. Int.
care: Survey on frameworks and algorithms,” in Proc. IEEE 7th Conf. Comput. Commun. Signal Process., 2017, pp. 1–5.
Int. Adv. Comput. Conf., 2017, pp. 89–94. [140] S. Bi, R. Zhang, Z. Ding, and S. Cui, “Wireless communications in
[121] S. Anandaraj and M. Kemal, “Research opportunities and chal- the era of big data,” IEEE Commun. Mag., vol. 53, no. 10, pp. 190–199,
lenges of security concerns associated with big data in cloud Oct. 2015.
computing,” in Proc. Int. Conf. I-SMAC (IoT Social, Mobile, Anal. [141] T. Lu, X. Guo, B. Xu, L. Zhao, Y. Peng, and H. Yang, “Next big
Cloud), 2017, pp. 746–751. thing in big data: the security of the ict supply chain,” in Proc.
[122] Y. Wang, B. Rawal, and Q. Duan, “Securing big data in the cloud Int. Conf. Social Comput., 2013, pp. 1066–1073.
with integrated auditing,” in Proc. IEEE Int. Conf. Smart Cloud, [142] E. Damiani, C. Ardagna, F. Zavatarelli, E. Rekleitis, and L. Marinos,
2017, pp. 126–131. “Big Data Threat Landscape,” Eur. Union Agency Netw. Inf. Secur.,
[123] S. Bahulikar, “Security measures for the big data, virtualization Jan. 2017. [Online]. Available: https://www.enisa.europa.eu/
and the cloud infrastructure,” in Proc. 1st India Int. Conf. Inf. Pro- publications/bigdata-threat-landscape
cess., 2016, pp. 1–4. [143] R. Geambasu, T. Kohno, A. A. Levy, and H. M. Levy, “Vanish:
[124] A. Sharif, S. Cooney, S. Gong, and D. Vitek, “Current security Increasing data privacy with self-destructing data,” in Proc. USE-
threats and prevention measures relating to cloud services, NIX Secur. Symp., 2009, vol. 316.
hadoop concurrent processing, and big data,” in Proc. IEEE Int.
Conf. Big Data, 2015, pp. 1865–1870.