Nelson 2016

This document presents a systematic literature review of security and privacy research related to big data. It aims to categorize and analyze recent papers from top conferences to provide an overview of the security and privacy topics present in the context of big data. The review collects papers published between 2012-2015 from conferences related to data, security, and privacy. Papers are manually categorized and a qualitative analysis is performed of representative papers within each category to analyze connections between topics and identify areas for further research.


2016 IEEE International Conference on Big Data (Big Data)

Security and Privacy for Big Data: A Systematic Literature Review

Boel Nelson, Tomas Olovsson
Department of Computer Science and Engineering
Chalmers University of Technology
Email: [email protected], [email protected]

Abstract: Big data is currently a hot research topic, with four million hits on Google Scholar in October 2016. One reason for the popularity of big data research is the knowledge that can be extracted from analyzing these large data sets. However, data can contain sensitive information, and data must therefore be sufficiently protected as it is stored and processed. Furthermore, it might also be required to provide meaningful, proven, privacy guarantees if the data can be linked to individuals.

To the best of our knowledge, there exists no systematic overview of the overlap between big data and the area of security and privacy. Consequently, this review aims to explore security and privacy research within big data, by outlining and providing structure to what research currently exists. Moreover, we investigate which papers connect security and privacy with big data, and which categories these papers cover. Ultimately, is security and privacy research for big data different from the rest of the research within the security and privacy domain?

To answer these questions, we perform a systematic literature review (SLR), where we collect recent papers from top conferences, and categorize them in order to provide an overview of the security and privacy topics present within the context of big data. Within each category we also present a qualitative analysis of papers representative for that specific area. Furthermore, we explore and visualize the relationship between the categories. Thus, the objective of this review is to provide a snapshot of the current state of security and privacy research for big data, and to discover where further research is required.

I. INTRODUCTION

Big data processing presents new opportunities due to its analytic powers. Business areas that can benefit from analyzing big data include the automotive industry, the energy distribution industry, health care and retail. Examples from these areas include analyzing driving patterns to discover anomalies in driving behaviour [1], making use of smart grid data to create energy load forecasts [2], analyzing search engine queries to detect influenza epidemics [3] and utilizing customers' purchase history to generate recommendations [4]. However, all of these examples include data linked to individuals, which makes the underlying data potentially sensitive.

Furthermore, while big data provides analytic support, big data in itself is difficult to store, manage and process efficiently due to the inherent characteristics of big data [5]. These characteristics were originally divided into three dimensions referred to as the three Vs [6], but are today often divided into four or even five Vs [2, 5, 7]. The original three Vs are volume, variety and velocity, and the newer Vs are veracity and value. Volume refers to the amount of data, which Kaisler et al. [5] define to be in the range of 10^18 bytes to be considered big data. Variety denotes the problem of big data being able to consist of different formats of data, such as text, numbers, videos and images. Velocity represents the speed at which the data grows, that is, at what speed new data is generated. Furthermore, veracity concerns the accuracy and trustworthiness of data. Lastly, value corresponds to the usefulness of data, indicating that some data points, or a combination of points, may be more valuable than others. Due to the potentially large-scale data processing of big data, there exists a need for efficient, scalable solutions that also take security and privacy into consideration.

To the best of our knowledge, there exist no peer-reviewed articles that systematically review big data papers with a security and privacy perspective. Hence, we aim to fill that gap by conducting a systematic literature review (SLR) of recent big data papers with a security and privacy focus. While this review does not cover the entire, vast, landscape of security and privacy for big data, it provides an insight into the field, by presenting a snapshot of what problems and solutions exist within the area.

In this paper, we select papers from top security and privacy conferences, as well as top conferences on data format and machine learning for further analysis. The papers are recent publications, published between 2012 and 2015, which we manually categorize to provide an overview of security and privacy papers in a big data context. The categories are chosen to be relevant for big data, security or privacy respectively. Furthermore, we investigate and visualize what categories relate to each other in each reviewed paper, to show what connections exist and which ones are still unexplored. We also visualize the proportion of papers belonging to each category, and the proportion of papers published in each conference. Lastly, we analyze and present a representative subset of papers from each of the categories.

The paper is organized as follows. First, the method for gathering and reviewing papers is explained in Section II. Then, the quantitative and qualitative results are presented in Section III, where each of the categories and their corresponding papers are further analyzed in the subsection with their corresponding name. A discussion of the findings and directions for future work is presented in Section IV. Lastly,

978-1-4673-9005-7/16/$31.00 ©2016 IEEE 3693

a conclusion follows in Section V.

II. METHODOLOGY

In this paper, we perform a systematic literature review (SLR) to document what security and privacy research exists within the big data area, and identify possible areas where further research is needed. The purpose of this review is to categorize and analyze, both in a quantitative and a qualitative way, big data papers related to security or privacy. Therefore, in accordance with SLR, we define the following research questions the review should answer:
• What recent security or privacy papers exist in the big data context?
• How many papers cover security or privacy for big data?
• Which security, privacy and big data topics are represented in the area?
• When a paper covers more than one category, which categories intertwine?

SLRs originate from medical research, but have been adapted for computer science, and in particular software engineering, by Kitchenham [8] in 2004. More specifically, an SLR is useful for summarising empirical evidence concerning an existing technology as well as for identifying gaps in current research [8]. We answer our research questions by performing the steps in the review protocol we have constructed, in accordance with Kitchenham's guidelines, displayed in Table I.

1. Data sources and search strategy: Collect papers
2. Study selection/study quality assessment: Filter papers
3. Data extraction: Categorize papers, extract the novelty of the papers' scientific contribution
4. Data synthesis: Visualize papers and highlight the contributions
TABLE I: Review protocol

As the data source, we have used papers from top conferences, ranked A* by the Computing Research and Education Association of Australasia (CORE)^i in 2014. In total, twelve relevant conferences have been chosen, including all three of CORE's top ranked security and privacy conferences. There also exist several new, promising conferences in big data. However, none of these big data specific conferences are ranked yet, and thus they are not included in this review. Arguably, the highest quality papers should appear in the A* ranked conferences, instead of in an unproven venue. Furthermore, it is our belief that new ideas hit conferences before journals, and thus journals have been excluded from the review. Consequently, we have chosen top conferences for closely related topics: machine learning and data format^ii. Thus, the big data conferences are represented by seven conferences from the field of data format and two from machine learning. The chosen conferences are presented in Table II, and we further discuss the consequences of choosing these conferences in Section IV.

Acronym | Conference Name | Field(s) of Research^iii
DCC | Data Compression Conference | Data Format
ICDE | International Conference on Data Engineering | Data Format
ICDM | IEEE International Conference on Data Mining | Data Format
SIGKDD | Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining | Data Format
SIGMOD | Association for Computing Machinery's Special Interest Group on Management of Data | Data Format
VLDB | International Conference on Very Large Databases | Data Format
WSDM | ACM International Conference on Web Search and Data Mining | Data Format, Distributed Computing, Library and Information Studies
ICML | International Conference on Machine Learning | Artificial Intelligence and Image Processing
NIPS | Neural Information Processing System Conference | Artificial Intelligence and Image Processing
CCS | ACM Conference on Computer and Communications Security | Computer Software
S&P | IEEE Symposium on Security and Privacy | Computation Theory and Mathematics, Computer Software
USENIX Security | Usenix Security Symposium | Computer Software
TABLE II: Conferences the papers were collected from, including acronym and field of research

^i http://portal.core.edu.au/conf-ranks/
^ii Field of research code 0804: http://www.abs.gov.au/Ausstats/[email protected]/0/206700786B8EA3EDCA257418000473E3?opendocument
^iii As labeled by CORE

Step 1: To perform the first step from Table I, the collection of papers, we have constructed the following two queries:
• Query A: allintitle: privacy OR private OR security OR secure
  Sources: DCC, ICDE, ICDM, SIGKDD, SIGMOD, VLDB, WSDM, ICML and NIPS
  Timespan: 2012-2015
• Query B: allintitle: "big data"
  Sources: DCC, ICDE, ICDM, SIGKDD, SIGMOD, VLDB, WSDM, ICML, NIPS, CCS, S&P and USENIX Security
  Timespan: 2012-2015

Note that only the title of a paper is used to match on a keyword. The reason for this is to reduce the number of false positives. For example, if the search is not limited to the title, a paper might discuss the keyword in the introduction or as related work, but it might not otherwise be included in the paper. Since the review is performed manually, it
would require a labor intensive analysis just to eliminate those irrelevant papers. Furthermore, we believe that the papers related to security or privacy would mention this in their title. Thus, we have focused on a smaller, relevant, subset.

Query A focuses on finding papers related to security or privacy in one of the big data conferences. This query is intentionally constructed to catch a wide range of security and privacy papers, including relevant papers that have omitted 'big data' from the title. Furthermore, query B is designed to find big data papers in any of the conferences, unlike query A. The reason to also include query B is foremost to capture big data papers in security and privacy conferences. Query B will also be able to find big data papers in the other conferences, which provides the opportunity to catch security or privacy papers that were not already captured by query A.

Step 2: After the papers have been collected, we manually filter them to perform both a selection and a quality assessment, in accordance with the guidelines for an SLR. First, we filter away talks, tutorials, panel discussions and papers only containing abstracts from the collected papers. We also verify that no papers are duplicates to ensure that the data is not skewed. Then, as a quality assessment we analyze the papers' full corpora to determine if they belong to security or privacy. Papers that do not discuss security or privacy are excluded. Thus, the irrelevant papers, mainly captured by query B, and other potential false positives, are eliminated.

To further assess the quality of the papers, we investigate each paper's relevance for big data. To determine if it is a big data paper we include the entire corpus of the paper, and look for evidence of scalability in the proposed solution by examining if the paper relates to the five Vs. The full list of included and excluded papers is omitted in this paper due to space restrictions, but it is available from the authors upon request.

Step 3: Then, each paper is categorized into one or more of the categories shown in Table III. These categories were chosen based on the five Vs, with additional security and privacy categories added to the set. Thus the categories capture both the inherent characteristics of big data, as well as security and privacy.

Category | V | Security or Privacy
Confidentiality^iv | |
Data Analysis | Value |
Data Format | Variety, Volume |
Data Integrity | Veracity |
Privacy^v | |
Stream Processing | Velocity, Volume |
Visualization | Value, Volume |
TABLE III: Categories used in the review, chosen based on the five Vs. A checkmark in the third column means that the category is a security or privacy category.

^iv As defined by ISO 27000:2016 [9]
^v Anonymization as defined by ISO 29100:2011 [10]

In total, 208 papers match the search criteria when we run both queries in Google Scholar. After filtering away papers and performing the quality assessment, 82 papers remain. Query A results in 78 papers, and query B contributes four unique papers that were not already found by query A. In Table IV the number of papers from each conference is shown for query A and query B respectively.

Conference Acronym | Query A: Number of Papers | Query A: Percentage of Papers | Query B: Number of Papers | Query B: Percentage of Papers
DCC | 0 | 0% | 0 | 0%
ICDE | 22 | 28% | 0 | 0%
ICDM | 4 | 5% | 0 | 0%
SIGKDD | 0 | 0% | 0 | 0%
SIGMOD | 21 | 26% | 1 | 25%
VLDB | 25 | 31% | 1 | 25%
WSDM | 0 | 0% | 0 | 0%
ICML | 5 | 6.3% | 0 | 0%
NIPS | 1 | 1.3% | 0 | 0%
S&P | - | - | 1 | 25%
USENIX Security | - | - | 0 | 0%
CCS | - | - | 1 | 25%
Total: | 78 | 100% | 4 | 100%
TABLE IV: The number, and percentage, of papers picked from each conference, for query A and query B

Step 4: Then, as part of the data synthesis, which is the last step in the review protocol in Table I, the quantitative results from the queries are visualized, both as circle packing diagrams, where the proportion of papers and conferences is visualized, and as a circular network diagram where relationships between categories are visualized. Thereafter a qualitative analysis is performed on the papers, where the novel idea and the specific topics covered are extracted from the papers' corpora. A representative set of the papers is then presented.

III. RESULTS

In this section, we quantitatively and qualitatively analyze the 82 papers. Figure 1 (a) visualizes where each paper originates from, using circle packing diagrams. The size of each circle corresponds to the proportion of papers picked from a conference. As can be seen, most papers have been published in ICDE, SIGMOD or VLDB. Furthermore, the distribution of the different categories is illustrated in Figure 1 (b), where the size of a circle represents the number of papers covering that category. Prominent categories are privacy, data analysis and confidentiality.

Furthermore, some papers discuss more than one category and therefore belong to more than one category. Therefore, the total number of papers when all categories are summed will exceed 82. To illustrate this overlap of categories, the relationship between the categories is visualized as a circular network diagram in Figure 2. Each line between two categories means that there exists at least one paper that discusses both categories. The thickness of the line reflects the number of papers that contain the two categories connected by the line. Privacy and data analysis as well as confidentiality and data format are popular combinations. Stream processing and
visualization are each connected to privacy by only a single paper.

(a) Conferences, grouped by research field (b) Categories, grouped by similarity
Fig. 1: Circle packing diagrams, showing the proportion of papers belonging to conferences (a) and categories (b)

Fig. 2: Connections between categories, where the thickness of the link represents the number of papers that connect the two categories

Since there is not enough room to describe each paper in the qualitative analysis, we have chosen a representative set for each category. This representative set is chosen to give an overview of the papers for each category. Each selected paper is then presented in a table to show which categories it belongs to. An overview of the rest of the papers is shown in Table V.

A. Confidentiality

Confidentiality is a key attribute to guarantee when sensitive data is handled, especially since being able to store and process data while guaranteeing confidentiality could be an incentive to get permission to gather data. In total, 23 papers were categorized as confidentiality papers. Most papers used different types of encryption, but there was no specific topic that had a majority of papers. Instead, the papers were spread across a few different topics. In Table VI, an overview of all papers presented in this section is given.

Five papers use homomorphic encryption, which is a technique that allows certain arithmetic operations to be performed on encrypted data. Of those five papers, one uses fully homomorphic encryption, which supports any arithmetic operation, whereas the rest use partial homomorphic encryption, which supports given arithmetic operations. Liu et al. [11] propose a secure method for comparing trajectories, for example to compare different routes using GPS data, by using partial homomorphic encryption. Furthermore, Chu et al. [12] use fully homomorphic encryption to provide a protocol for similarity ranking.

Another topic covered by several papers is access control. In total, four papers discuss access control. For example, Bender et al. [13] proposed a security model where policies must be explainable. By explainable in this setting Bender et al. refer to the fact that every time a query is denied due to missing privileges, an explanation as to what additional privileges are needed is returned. This security model is an attempt to make it easier to implement the principle of least privilege, rather than giving users too generous privileges. Additionally, Meacham and Shasha [14] propose an application that provides access control in a database, where all records are encrypted if the user does not have the appropriate privileges. Even though the solutions by Bender et al. and Meacham and Shasha use SQL, traditionally not associated with big data, their main ideas are still applicable since it only requires changing the database to an RDBMS for big data that has been proposed earlier, such as Vertica [15] or Zhu et al.'s [16] distributed query engine.

Other topics covered were secure multiparty computation, a concept where multiple entities perform a computation while keeping each entity's input confidential, oblivious transfer, where a sender may or may not transfer a piece of information to the receiver without knowing which piece is sent, as well as different encrypted indexes used for improving search time efficiency. In total, three papers use secure multiparty computation, two use oblivious transfer and two use encrypted indexes.

B. Data Integrity

Data integrity is the validity and quality of data. It is therefore strongly connected to veracity, one of the five Vs. In total, five papers covered data integrity. Since there is only a small set of data integrity papers, no apparent topic trend was spotted. Nonetheless, one paper shows an attack on integrity,
Author Short Title C DA DF DI P SP V
Akcora et al. Privacy in Social Networks
Allard et al. Chiaroscuro
Bonomi and Xiong Mining Frequent Patterns with Differential Privacy
Bonomi et al. LinkIT
Cao et al. A hybrid private record linkage scheme
Chen and Zhou Recursive Mechanism
Dev Privacy Preserving Social Graphs for High Precision Community Detection
Dong et al. When Private Set Intersection Meets Big Data
Fan et al. FAST
Gaboardi et al. Dual Query
Guarnieri and Basin Optimal Security-aware Query Processing
Guerraoui et al. D2P
Haney et al. Design of Policy-aware Differentially Private Algorithms
He et al. Blowfish Privacy
He et al. DPT
He et al. SDB
Hu et al. Authenticating Location-based Services Without Compromising Location Privacy
Hu et al. Private search on key-value stores with hierarchical indexes
Hu et al. VERDICT
Jain and Thakurta (Near) Dimension Independent Risk Bounds for Differentially Private Learning
Jorgensen and Cormode Conservative or liberal?
Kellaris and Papadopoulos Practical differential privacy via grouping and smoothing
Khayyat et al. BigDansing
Kozak and Zezula Efficiency and Security in Similarity Cloud Services
Li and Miklau An Adaptive Mechanism for Accurate Query Answering Under Differential Privacy
Li et al. A Data- and Workload-aware Algorithm for Range Queries Under Differential Privacy
Li et al. DPSynthesizer
Li et al. Fast Range Query Processing with Strong Privacy Protection for Cloud Computing
Li et al. PrivBasis
Lin and Kifer Information Preservation in Statistical Privacy and Bayesian Estimation of Unattributed Histograms
Lu et al. Generating private synthetic databases for untrusted system evaluation
Mohan et al. GUPT
Nock et al. Rademacher observations, private data, and boosting
Oktay et al. SEMROD
Pattuk et al. Privacy-aware dynamic feature selection
Potluru et al. CometCloudCare (C3)
Qardaji et al. Differentially private grids for geospatial data
Qardaji et al. PriView
Qardaji et al. Understanding Hierarchical Methods for Differentially Private Histograms
Rahman et al. Privacy Implications of Database Ranking
Rana et al. Differentially Private Random Forest with High Utility
Ryu et al. Curso
Sen et al. Bootstrapping Privacy Compliance in Big Data Systems
Shen and Jin Privacy-Preserving Personalized Recommendation
Terrovitis et al. Privacy Preservation by Disassociation
To et al. A Framework for Protecting Worker Location Privacy in Spatial Crowdsourcing
Wong et al. Secure Query Processing with Data Interoperability in a Cloud Database Environment
Xiao et al. DPCube
Xu et al. Differentially private frequent sequence mining via sampling-based candidate pruning
Xue et al. Destination prediction by sub-trajectory synthesis and privacy protection against such prediction
Yang et al. Bayesian Differential Privacy on Correlated Data
Yaroslavtsev et al. Accurate and efficient private release of datacubes and contingency tables
Yi et al. Practical k nearest neighbor queries with location privacy
Yuan et al. Low-rank Mechanism
Zeng et al. On Differentially Private Frequent Itemset Mining
Zhang et al. Functional Mechanism
Zhang et al. Lightweight privacy-preserving peer-to-peer data integration
Zhang et al. Private Release of Graph Statistics Using Ladder Functions
Zhang et al. PrivBayes
Zhang et al. PrivGene

TABLE V: The reviewed papers omitted from the reference list, showing categories covered by each paper. C = Confidentiality, DA = Data Analysis, DF = Data Format, DI = Data Integrity, P = Privacy, SP = Stream Processing, V = Visualization.
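
The per-paper category assignments tabulated above are the input to both the category proportions (Figure 1 (b)) and the link thickness between categories (Figure 2). As a minimal sketch of how such counts could be derived, and not the authors' actual tooling, the following Python snippet counts papers per category and papers per category pair. The paper identifiers and their category sets are hypothetical placeholders, since the checkmarks of the original tables are not reproduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical assignments (illustration only, not taken from the review).
# Abbreviations follow Table V: C, DA, DF, DI, P, SP, V.
papers = {
    "paper_1": {"P", "DA"},
    "paper_2": {"C", "DF"},
    "paper_3": {"P", "DA"},
    "paper_4": {"P", "SP"},
    "paper_5": {"DI"},
}


def category_counts(assignments):
    """Number of papers covering each category (circle size in Fig. 1 (b))."""
    counts = Counter()
    for categories in assignments.values():
        counts.update(categories)
    return counts


def cooccurrence(assignments):
    """Number of papers covering both categories of a pair (link thickness in Fig. 2)."""
    edges = Counter()
    for categories in assignments.values():
        # Sort so that each unordered pair is always stored under the same key.
        for pair in combinations(sorted(categories), 2):
            edges[pair] += 1
    return edges


counts = category_counts(papers)
edges = cooccurrence(papers)

for (first, second), weight in sorted(edges.items()):
    print(f"{first} - {second}: {weight} paper(s)")
```

Because a paper may belong to several categories, the per-category counts sum to more than the number of papers, which matches the observation in Section III that the category totals exceed 82.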

Author C DA DF DI P SP V
Bender et al. [13]
Chu et al. [12]
Liu et al. [11]
Meacham and Shasha [14]
TABLE VI: A set of confidentiality papers, showing categories covered by each paper. A checkmark indicates the paper on that row contains the category.

two papers are on error correction and data cleansing and two papers use tamper-proof hardware to guarantee integrity of the data. An overview of all papers covered in this section is shown in Table VII.

Xiao et al. [17] show that it is enough to poison 5% of the training values, a data set used solely to train a machine learning algorithm, in order for feature selection to fail. Feature selection is the step where relevant attributes are decided, and it is therefore an important step since the rest of the algorithm will depend on these features. Thus, Xiao et al. show that feature selection is not secure unless the integrity of the data can be verified.

Furthermore, Arasu et al. [18] implemented a SQL database called Cipherbase that focuses on confidentiality of data as well as integrity in the cloud. To maintain the integrity of the cryptographic keys, they use FPGA based custom hardware to provide tamper-proof storage. Lallali et al. [19] also used tamper-resistant hardware, with which they enforce confidentiality for queries performed in personal clouds. The tamper-resistant hardware is in the form of a secure token which prevents any data disclosure during the execution of a query. While the secure tokens ensure a closed execution environment, they possess limited processing power due to the hardware constraints, which adds to the technical challenge.

Author C DA DF DI P SP V
Arasu et al. [18]
Lallali et al. [19]
Xiao et al. [17]
TABLE VII: A set of data integrity papers, showing categories covered by each paper

C. Privacy

Privacy is an important notion for big data, since big data can potentially contain sensitive data about individuals. To mitigate the privacy problem, data can be de-identified by removing attributes that would identify an individual. This is an approach that works, if done correctly, both when data is managed and when released. However, under certain conditions it is still possible to re-identify individuals even when some attributes have been removed [20, 21, 22]. Lu et al. [7] also point out that the risk of re-identification can increase with big data, as more external data from other sources than the set at hand can be used to cross-reference and infer additional information about individuals.

Several privacy models, such as k-anonymity [23], l-diversity [24], t-closeness [25] and differential privacy [26], can be used to anonymize data. The first three are techniques for releasing entire sets of data through privacy-preserving data publishing (PPDP), whereas differential privacy is used for privacy-preserving data mining (PPDM). Thus, differential privacy is obtained without processing the entire data set, unlike the others. Therefore, anonymizing larger data sets can be difficult from an efficiency perspective. However, larger sets have greater potential to hide individual data points within the set [27].

Out of a total of 61 privacy papers, one paper [28] uses k-anonymity, and another paper [29] uses l-diversity and t-closeness but also differential privacy to anonymize data. Furthermore, Cao and Karras [30] introduce a successor to t-closeness, called β-likeness, which they claim is more informative and comprehensible. In comparison, a large portion of the privacy-oriented papers, 46 papers, focuses only on differential privacy as the privacy model. Most of them propose methods for releasing differentially private data structures. Among these are differentially private histograms [31] and different data structures for differentially private multidimensional data [32].

An interesting observation by Hu et al. [33] is that differential privacy can have a large impact on the accuracy of the result. When Hu et al. enforced differential privacy on their telecommunications platform, they observed an accuracy loss of between 15% and 30%. In fact, guaranteeing differential privacy while maintaining high utility of the data is not trivial. Of the reviewed papers, 15 investigated utility in combination with differential privacy.

One example of a paper that investigates the utility of differentially private results, and how to improve it, is Proserpio et al. [34]. The work of Proserpio et al. is a continuation of the differentially private querying language PINQ [35], which they enhance by decreasing the importance of challenging entries, which induce high noise, in order to improve accuracy of the results.

The papers reviewed in this section can be seen in Table VIII.

Author C DA DF DI P SP V
Acs et al. [31]
Cao and Karras [30]
Cormode et al. [32]
Hu et al. [33]
Jurczyk et al. [29]
Proserpio et al. [34]
Wang and Zheng [28]
TABLE VIII: A set of privacy papers, showing categories covered by each paper

D. Data Analysis

Data analysis is the act of extracting knowledge from data. It includes both general algorithms for knowledge discovery, and machine learning. Out of 26 papers categorized as data analysis papers, 15 use machine learning. Apart from machine learning, other topics included frequent sequence mining, where reoccurring patterns are detected, and different versions
of the k-nearest neighbor (kNN) algorithm, which finds the k closest points given a point of reference. All papers from this section are shown in Table IX.

Jain and Thakurta [36] implemented differentially private learning using kernels. The problem investigated by Jain and Thakurta is keeping the features, which are different attributes of an entity, of a learning set private while still providing useful information.

Furthermore, Elmehdwi et al. [37] implemented a secure kNN algorithm, based on partial homomorphic encryption. Here, Elmehdwi et al. propose a method for performing kNN in the cloud, where both the query and the database are encrypted. Similarly, Yao et al. [38] investigated the secure nearest neighbour (SNN) problem, which asks a third party to find the point closest to a given point, without revealing any of the points to the third party. They show attacks for existing methods for SNN, and design a new SNN method that withstands the attacks.

Author C DA DF DI P SP V
Elmehdwi et al. [37]
Jain and Thakurta [36]
Yao et al. [38]
TABLE IX: A set of data analysis papers, showing categories covered by each paper

E. Visualization

Visualization of big data provides a quick overview of the data points. It is an important technique, especially while exploring a new data set. However, it is not trivial to implement for big data. Gordov and Gubarev [39] point out visual noise, large image perception, information loss, high performance requirements and high rate of image change as the main challenges when visualizing big data.

One paper, by To et al. [40], shown in Table X, was categorized as a visualization paper. To et al. implemented a toolbox for visualizing and assigning tasks based on an individual's location. In this toolbox, location privacy is provided while at the same time allowing for allocation strategies of tasks to be analyzed. Thus, it presents a privacy-preserving way of analyzing how parameters in a system should be tuned to result in a satisfactory trade-off between privacy and accuracy.

storage capacity in comparison with saving the entire data set. Furthermore, stream processing can also completely remove the bottleneck of first writing data to disk and then reading it back in order to process it if it is carried out in real-time.

One paper, by Kellaris et al. [41], shown in Table XI, combines stream processing with privacy, and provides a differentially private way of querying streamed data. Their approach enforces w event-level based privacy rather than user-level privacy, which makes each event in the stream private, rather than the user that continuously produces events. Event-level based privacy, originally introduced by Dwork et al. [42], is more suitable in this case due to the fact that differential privacy requires the number of queries connected to the same individual to be known in order to provide user-level based privacy. In the case of streaming however, data is gathered continuously, making it impossible to estimate how many times a certain individual will produce events in the future.

Author C DA DF DI P SP V
Kellaris et al. [41]
TABLE XI: All stream processing papers, showing categories covered by each paper

G. Data Format

In order to store and access big data, it can be structured in different ways. Out of the 19 papers labeled as data format papers, most used a distributed file system, database or cloud that made them qualify in this category. An overview of all papers from this section can be found in Table XII.

One example of combining data format and privacy is the work by Peng et al. [43] that focuses on query optimization under differential privacy. The main challenge faced when enforcing differential privacy on databases is the interactive nature of the database, where new queries are issued in real-time. An unspecified number of queries makes it difficult to spend the privacy budget wisely while still providing high utility of query answers; the privacy budget is used to guarantee differential privacy and essentially keeps track of how many queries can be asked. Therefore, Peng et al. implemented the query optimizer Pioneer, which makes use of old query replies when possible in order to consume as little as possible of the remaining privacy
budget.
Author C DA DF DI P SP V Furthermore, Sathiamoorthy et al. [44] focus on data in-
To et al. [40] tegrity, and present an alternative to standard Reed-Solomon
TABLE X: All visualization papers, showing categories cov- codes, which are erasure codes used for error-correction, that
ered by each paper are more efficient and offer higher reliability. They imple-
mented their erasure codes in the Hadoop’s distributed file
system, HDFS, and were able to show that the network traffic
F. Stream Processing could be reduced, but instead their erasure codes required more
Stream processing is an alternative to the traditional store- storage space than traditional Reed-Solomon codes.
then-process approach, which can allow processing of data Lastly, Wang and Ravishankar [45] point out that pro-
in real-time. The main idea is to perform analysis on data viding both efficient and confidential queries in databases
as it is being gathered, to directly address the issue of data is challenging. Inherently, the problem stems from the fact
velocity. Processing streamed data also allows an analyst to that indexes invented to increase performance of queries also
only save the results from the analysis, thus requiring less leak information that can allow adversaries to reconstruct

3699
the plaintext, as Wang and Ravishankar show. Consequently, Wang and Ravishankar present an encrypted index that provides both confidentiality and efficiency for range queries, tackling the usual trade-off between security and performance.

Author C DA DF DI P SP V
Peng et al. [43]
Sathiamoorthy et al. [44]
Wang and Ravishankar [45]
TABLE XII: A set of data format papers, showing categories covered by each paper

IV. DISCUSSION AND FUTURE WORK

While this review investigates security and privacy for big data, it does not cover all papers available within the topic, since it would be infeasible to manually review them all. Instead, the focus of this review is to explore recent papers and to provide both a qualitative and a quantitative analysis, in order to create a snapshot of the current state-of-the-art. By selecting papers from top conferences and assessing their quality manually before selecting them, we include only papers relevant for big data, security and privacy.

A potential problem with only picking papers from top conferences is that, while the quality of the papers is good, the conferences might only accept papers with groundbreaking ideas. After conducting this review, however, we believe most big data solutions with respect to security and privacy are not necessarily groundbreaking ideas, but rather new twists on existing ideas. From the papers collected for this review, none of the topics covered are specific to big data; rather, the papers present new combinations of existing topics. Thus, it seems that security and privacy for big data is not different from other security and privacy research, as the ideas seem to scale well.

Another part of the methodology that can be discussed is the two queries used to collect papers. Query A was constructed to cover a wide range of papers, and query B was set to only include big data papers. Unfortunately, query A contributed far more hits than query B after the filtering step from Table I. This means that most papers might not have been initially intended for big data, but they were included after the quality assessment step, since the methods used were deemed scalable. Consequently, widening the scope of query B might include papers that present security or privacy solutions solely intended for big data.

Regarding the categories, confidentiality was covered by almost a third of the papers, but had no dominating topic. Rather, it contained a wide spread of different cryptographic techniques and access control. Furthermore, privacy was well represented, with 61 papers in the review. A large portion of these papers used differential privacy, the main reason probably being the fact that most differentially private algorithms are independent of the data set’s size, which makes differential privacy beneficial for large data sets.

While privacy was covered by a large portion of papers, only two papers use an existing privacy-preserving data publishing (PPDP) technique. Moreover, one paper introduces a new PPDP technique called β-likeness. One reason why this topic might not be getting a lot of attention is the fact that PPDP is dependent on the size of the data set. Thus PPDP is harder to apply to big data, since the entire data set must be processed in order to anonymize it. Consequently, further work may be required in this area to see how PPDP can be applied to big data.

We have also detected a gap in the knowledge concerning stream processing and visualization in combination with either data integrity or confidentiality, as no papers covered two of these topics. Data integrity is also one of the topics that were underrepresented, with five papers out of 82 papers in total, which is significantly lower than the number of confidentiality and privacy papers. However, it might be explained by the fact that the word ’integrity’ was not part of any of the queries. This is a possible expansion of the review.

V. CONCLUSION

There are several interesting ideas for addressing security and privacy issues within the context of big data. In this paper, 208 recent papers have been collected from A* conferences, to provide an overview of the current state-of-the-art. In the end, 82 were categorized after passing the filtering and quality assessment stage. All reviewed papers can be found in the tables in Section III.

In summary, bearing in mind that papers can belong to more than one category, 61 papers investigate privacy, 25 data analysis, 23 confidentiality, 19 data format, 5 data integrity, one stream processing and one visualization. Prominent topics were differential privacy, machine learning and homomorphic encryption. None of the identified topics are unique to big data.

Categories such as privacy and data analysis are covered in a large portion of the reviewed papers, and 20 of them investigate the combination of privacy and data analysis. However, there are certain categories where interesting connections could be made that do not yet exist. For example, one combination that is not yet represented is stream processing with either confidentiality or data integrity. Visualization is another category that was only covered by one paper.

In the end, we find that security and privacy for big data, based on the reviewed papers, is not different from security and privacy research in general.

ACKNOWLEDGEMENTS

This research was sponsored by the BAuD II project (2014-03935) funded by VINNOVA, the Swedish Governmental Agency for Innovation Systems.

REFERENCES

[1] G. Fuchs et al. “Constructing semantic interpretation of routine and anomalous mobility behaviors from big data”. In: SIGSPATIAL Special 7.1 (May 2015), pp. 27–34.
[2] M. Chen et al. “Big Data: A Survey”. In: Mobile Networks and Applications 19.2 (Jan. 2014), pp. 171–209.
[3] J. Ginsberg et al. “Detecting influenza epidemics using search engine query data”. In: Nature 457.7232 (Feb. 2009), pp. 1012–1014.
[4] O. Tene and J. Polonetsky. “Privacy in the Age of Big Data: A Time for Big Decisions”. In: Stanford Law Review Online 64 (Feb. 2012), p. 63.
[5] S. Kaisler et al. “Big Data: Issues and Challenges Moving Forward”. In: System Sciences (HICSS), 2013 46th Hawaii International Conference on. IEEE, Jan. 2013, pp. 995–1004.
[6] D. Laney. 3D Data Management: Controlling Data Volume, Velocity, and Variety. Tech. rep. META Group, Feb. 2001.
[7] R. Lu et al. “Toward efficient and privacy-preserving computing in big data era”. In: Network, IEEE 28.4 (Aug. 2014), pp. 46–50.
[8] B. Kitchenham. Procedures for performing systematic reviews. Joint Technical Report. Keele, UK: Software Engineering Group Department of Computer Science Keele University, UK, and Empirical Software Engineering, National ICT Australia Ltd, 2004, p. 26.
[9] International Organization for Standardization. Information technology – Security techniques – Information security management systems – Overview and vocabulary. Standard. Geneva, CH: International Organization for Standardization, Feb. 2016.
[10] International Organization for Standardization. Information technology – Security techniques – Privacy framework. Standard. Geneva, CH: International Organization for Standardization, Dec. 2011.
[11] A. Liu et al. “Efficient secure similarity computation on encrypted trajectory data”. In: 2015 IEEE 31st International Conference on Data Engineering (ICDE). 2015, pp. 66–77.
[12] Y.-W. Chu et al. “Privacy-Preserving SimRank over Distributed Information Network”. In: 2012 IEEE 12th International Conference on Data Mining (ICDM). 2012, pp. 840–845.
[13] G. Bender et al. “Explainable Security for Relational Databases”. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. SIGMOD ’14. New York, NY, USA: ACM, 2014, pp. 1411–1422.
[14] A. Meacham and D. Shasha. “JustMyFriends: Full SQL, Full Transactional Amenities, and Access Privacy”. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. SIGMOD ’12. New York, NY, USA: ACM, 2012, pp. 633–636.
[15] C. Bear et al. “The vertica database: SQL RDBMS for managing big data”. In: Proceedings of the 2012 workshop on Management of big data systems. ACM, 2012, pp. 37–38.
[16] F. Zhu et al. “A Fast and High Throughput SQL Query System for Big Data”. In: Web Information Systems Engineering - WISE 2012. Ed. by X. S. Wang et al. Lecture Notes in Computer Science 7651. DOI: 10.1007/978-3-642-35063-4_66. Springer Berlin Heidelberg, 2012, pp. 783–788.
[17] H. Xiao et al. “Is Feature Selection Secure against Training Data Poisoning?” In: Proceedings of the 32nd International Conference on Machine Learning (ICML-15). 2015, pp. 1689–1698.
[18] A. Arasu et al. “Secure Database-as-a-service with Cipherbase”. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. SIGMOD ’13. New York, NY, USA: ACM, 2013, pp. 1033–1036.
[19] S. Lallali et al. “A Secure Search Engine for the Personal Cloud”. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. SIGMOD ’15. New York, NY, USA: ACM, 2015, pp. 1445–1450.
[20] M. Barbaro and T. Zeller. “A Face Is Exposed for AOL Searcher No. 4417749”. In: The New York Times (Aug. 2006).
[21] A. Narayanan and V. Shmatikov. “Robust De-anonymization of Large Sparse Datasets”. In: IEEE Symposium on Security and Privacy, 2008. SP 2008. May 2008, pp. 111–125.
[22] P. Samarati and L. Sweeney. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Tech. rep. SRI International, 1998.
[23] L. Sweeney. “k-anonymity: A model for protecting privacy”. In: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10.05 (2002), pp. 557–570.
[24] A. Machanavajjhala et al. “L-diversity: Privacy beyond k-anonymity”. In: ACM Transactions on Knowledge Discovery from Data 1.1 (2007), 3–es.
[25] N. Li et al. “t-Closeness: Privacy Beyond k-Anonymity and l-Diversity.” In: ICDE. Vol. 7. 2007, pp. 106–115.
[26] C. Dwork. “Differential privacy”. In: Automata, languages and programming. Springer, 2006, pp. 1–12.
[27] H. Zakerzadeh et al. “Privacy-preserving big data publishing”. In: Proceedings of the 27th International Conference on Scientific and Statistical Database Management. ACM, June 2015, p. 26.
[28] Y. Wang and B. Zheng. “Preserving privacy in social networks against connection fingerprint attacks”. In: 2015 IEEE 31st International Conference on Data Engineering (ICDE). 2015, pp. 54–65.
[29] P. Jurczyk et al. “DObjects+: Enabling Privacy-Preserving Data Federation Services”. In: 2012 IEEE 28th International Conference on Data Engineering (ICDE). 2012, pp. 1325–1328.
[30] J. Cao and P. Karras. “Publishing Microdata with a Robust Privacy Guarantee”. In: Proc. VLDB Endow. 5.11 (2012), pp. 1388–1399.
[31] G. Acs et al. “Differentially Private Histogram Publishing through Lossy Compression”. In: 2012 IEEE 12th International Conference on Data Mining (ICDM). 2012, pp. 1–10.
[32] G. Cormode et al. “Differentially Private Spatial Decompositions”. In: 2012 IEEE 28th International Conference on Data Engineering (ICDE). 2012, pp. 20–31.
[33] X. Hu et al. “Differential Privacy in Telco Big Data Platform”. In: Proc. VLDB Endow. 8.12 (2015), pp. 1692–1703.
[34] D. Proserpio et al. “Calibrating Data to Sensitivity in Private Data Analysis: A Platform for Differentially-private Analysis of Weighted Datasets”. In: Proc. VLDB Endow. 7.8 (2014), pp. 637–648.
[35] F. D. McSherry. “Privacy integrated queries: an extensible platform for privacy-preserving data analysis”. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. ACM, 2009, pp. 19–30.
[36] P. Jain and A. Thakurta. “Differentially private learning with kernels”. In: Proceedings of the 30th International Conference on Machine Learning (ICML-13). 2013, pp. 118–126.
[37] Y. Elmehdwi et al. “Secure k-nearest neighbor query over encrypted data in outsourced environments”. In: 2014 IEEE 30th International Conference on Data Engineering (ICDE). 2014, pp. 664–675.
[38] B. Yao et al. “Secure nearest neighbor revisited”. In: 2013 IEEE 29th International Conference on Data Engineering (ICDE). 2013, pp. 733–744.
[39] E. Y. Gorodov and V. V. Gubarev. “Analytical review of data visualization methods in application to big data”. In: Journal of Electrical and Computer Engineering 2013 (Jan. 2013), p. 22.
[40] H. To et al. “PrivGeoCrowd: A toolbox for studying private spatial Crowdsourcing”. In: 2015 IEEE 31st International Conference on Data Engineering (ICDE). 2015, pp. 1404–1407.
[41] G. Kellaris et al. “Differentially Private Event Sequences over Infinite Streams”. In: Proc. VLDB Endow. 7.12 (2014), pp. 1155–1166.
[42] C. Dwork et al. “Differential privacy under continual observation”. In: Proceedings of the forty-second ACM symposium on Theory of computing. ACM, 2010, pp. 715–724.
[43] S. Peng et al. “Query optimization for differentially private data management systems”. In: 2013 IEEE 29th International Conference on Data Engineering (ICDE). 2013, pp. 1093–1104.
[44] M. Sathiamoorthy et al. “XORing elephants: novel erasure codes for big data”. In: Proceedings of the 39th international conference on Very Large Data Bases. VLDB’13. Trento, Italy: VLDB Endowment, 2013, pp. 325–336.
[45] P. Wang and C. V. Ravishankar. “Secure and efficient range queries on outsourced databases using Rp-trees”. In: 2013 IEEE 29th International Conference on Data Engineering (ICDE). 2013, pp. 314–325.

978-1-4673-9005-7/16/$31.00 ©2016 IEEE 3693
