Big Data Methodology

This document summarizes a research report on security challenges in big data applications. The report discusses how big data provides both opportunities and security risks due to the large volumes of diverse data from many sources. It also describes common data types used in big data like structured, unstructured, and semi-structured data. The research methodology used both qualitative and quantitative analysis of data collected through surveys of organizations using big data. Key findings from the surveys showed security and privacy of data as major challenges for organizations, along with issues of data growth, integration and understanding big data.

Uploaded by

shahab qureshi

Report Title: Security in Big Data Applications

Introduction
Big data exposes large amounts of data to a diverse range of users. It makes data sources easy to access, but this openness comes with security concerns: security has to be provided for the underlying databases, private information becomes open to analysis, and data encoding and encryption methods can lead to data corruption.

The size of big data is undoubtedly a problem, but it is not as serious a problem as the use, control, and security of the data. Cybersecurity is one of the fastest-growing concerns. An example is the security attack on Target. The attackers took the data stored in Target's database, including customer information such as card details, names, addresses, identities, social security details and much more. So regardless of how large the data is, the issue of cybersecurity and control is highly significant. Data is expensive, but it also helps you make decisions faster and speeds up processes more than ever before.

Incorrect data can be a major problem, because relying on data that is not correct can lead to serious failure of the system. It is like having the correct name but not the correct address. Suppose a product is delivered in such a case: the buyer can complain that it never arrived, yet according to the data it was delivered to the right person.

Another example is a customer who has paid all of their bills, but the payment is wrongly recorded against another customer. This could lead to disconnection of services for the first customer. Every decision made on faulty data would lead to a failure. So the data we hold should be clear and accurate.

Data is costly to maintain and it is never going to stop growing. It increases every day, and so does the cost of maintenance. In addition, an organization needs to define policies for who may access the data, who may modify it, and how it is organized. The data can come from various sources in various languages, and it may be repetitive as well. So the organization should have mechanisms to handle huge amounts of data.

Poor-quality data, for example repeated records of the same thing, increases the cost and effort of maintaining it. Poorly organized data will lead to delays in processing and in delivering results.

Data Sets

Big data draws on several kinds of data sets: structured, unstructured and semi-structured. Big data is further characterized by variety, velocity and volume.

Structured Data

Structured data is data that can be stored, processed and retrieved in a fixed format. It is organized so that it can be used directly and searched easily.

Unstructured Data

Unstructured data is data that does not possess a particular structure, which makes it harder to measure and analyze.

Examples of structured data include relational database management systems (RDBMS) and spreadsheets, which only answer questions about what happened. Such databases give insight into a problem only at a small scale. To improve the capability of an organization, to gain more insight into the data and to learn about metadata, unstructured data is used. Big data uses semi-structured and unstructured data and broadens the variety of data gathered from different sources such as customers, audiences or subscribers. After collection, big data transforms it into knowledge-based information.

Semi-structured Data

Semi-structured data combines both forms. More precisely, it is data that does not follow a defined schema, yet still contains important markers or tags that separate the individual elements within the data.
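As a rough illustration of tags separating individual elements, the sketch below parses a small JSON record in Python. The record and its field names (customer_id, orders, item, qty) are hypothetical and are not taken from the report; the point is only that the tags make the elements addressable even though one order lacks a field that the other has.

```python
import json

# Hypothetical semi-structured record. The field names ("customer_id",
# "orders", "item", "qty") are illustrative and do not come from the report.
raw = '{"customer_id": "C-001", "orders": [{"item": "book", "qty": 2}, {"item": "pen"}]}'

record = json.loads(raw)


def order_summary(rec):
    """Return (item, qty) pairs; qty defaults to 1 when the tag is missing."""
    return [(order["item"], order.get("qty", 1)) for order in rec.get("orders", [])]


summary = order_summary(record)
print(summary)
```

A fixed relational table would require every row to carry the same columns, whereas here a missing tag is simply handled with a default when the record is read.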

Variety

Variety refers to data that is gathered from different sources. In the past, data could only be collected from spreadsheets and databases; today it arrives in many forms, for example emails, PDFs, photos, videos, audio, social media posts and much more. Variety is one of the important characteristics of big data.

Velocity

Velocity refers to the speed at which data is generated and gathered in real time. More broadly, it covers the rate of change and the linking of incoming data sets.

Volume

Volume is one of the characteristics of big data. Big data involves huge volumes of data that are generated continuously from sources such as social media platforms, business processes, machines, networks and human interactions, and this large amount of data is stored in data warehouses. This completes the list of characteristics. Storing huge amounts of data together can reduce the cost of storage and helps in providing business intelligence.

Research Methodology
The research presented here draws on secondary sources, and the methodology is both qualitative and quantitative. The qualitative part involves non-numerical data and information, while the quantitative part uses numerical data, which is analyzed to find out what problems organizations face because of big data and on what scale.

Research Philosophy

The research philosophy rests on two types of data. The first is theoretical information collected from the work of others, in which the data challenges and security risks of big data are explained in detail. The second is numeric, i.e. quantitative, data gathered through primary research on companies: a questionnaire was filled in by a manager at each firm, the data was organized on that basis, and pie charts were drawn to present the information.

Research Approach

The approach of the research is straightforward. The data gathered from the literature review, termed secondary data, is used for the qualitative research, while the questionnaire data is collected by surveying organizations that use big data. The methodology provides detailed information about the research approach.

Research Strategy

Of the several strategies available for research, the one used here is the questionnaire: questions related to big data security were put to high-level managers of the organizations and their responses were collected. The data comprises both primary and secondary data. Primary data is gathered first hand, here through the questionnaires, and secondary data is collected from the work of others.

Data Analysis

In this phase, data is collected by the method discussed above; about 10 organizations were selected and surveyed. The necessary information is gathered using a survey with closed-ended questions. This allows for less ambiguity and ensures that the data gathered is accurate and therefore suitable for the research purpose. In contrast, the research design of the paper consulted for this study takes an empirical approach, and the small pool of cases analyzed limits its applicability to a wider context.

To gather the necessary information, particular SMEs are chosen, typically those that have already implemented big data analytics in their business processes. The questions are aimed at understanding the decrease or increase in their spending on maintaining the security and privacy of their data systems. The data gathered is then analyzed using SPSS to produce a set of verifiable and reliable results.

Survey Questions

Q1 Does your organization work with the big data?

Yes = 36.8

No = 19.8

Planned in near future = 43.4

Q2 Which security areas of Big Data are used?

Hadoop = 21

Cloud computing = 54

Monitoring = 18

Auditing = 7

Q3 What are the challenges of big data faced by your organization?

Lack of proper understanding = 15

Growth issues of data = 29

Integrating data = 37

Securing data = 19

Q4 What are the dangers of big data?

Limitations of GDPR = 20

Online marketing can be aggressive = 15

Privacy problem = 35

Easy to remember passwords = 30

Q5 Are current laws and legislation applicable to big data?

Yes = 45

No = 55

Q6 Is Big Data an independent phenomenon?

Yes = 35

No = 45

Maybe = 20

Q7 Do you prefer good data or good models?

Yes = 75

No = 25

Q8 Keeping in view the threats to big data, is this really the future?

Yes = 85

No = 10

Maybe = 5
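The report analyzes the responses in SPSS. As a substitute illustration only, the short Python sketch below tabulates the figures reported for Q1 to Q4, checks that each question's options add up to 100, and picks the most frequent answer. Treating the figures as percentages of respondents is an assumption, since the report does not state the unit, and the helper names (top_answer, sums_to_100) are hypothetical.

```python
# Figures copied from Q1 to Q4 above. Treating them as percentages of
# respondents is an assumption; the report does not state the unit.
survey = {
    "Q1": {"Yes": 36.8, "No": 19.8, "Planned in near future": 43.4},
    "Q2": {"Hadoop": 21, "Cloud computing": 54, "Monitoring": 18, "Auditing": 7},
    "Q3": {
        "Lack of proper understanding": 15,
        "Growth issues of data": 29,
        "Integrating data": 37,
        "Securing data": 19,
    },
    "Q4": {
        "Limitations of GDPR": 20,
        "Online marketing can be aggressive": 15,
        "Privacy problem": 35,
        "Easy to remember passwords": 30,
    },
}


def top_answer(options):
    """Return the answer option with the highest reported share."""
    return max(options, key=options.get)


def sums_to_100(options, tolerance=0.01):
    """Check that the shares add up to 100, allowing for float rounding."""
    return abs(sum(options.values()) - 100) <= tolerance


top_answers = {question: top_answer(options) for question, options in survey.items()}

for question, answer in top_answers.items():
    print(f"{question}: {answer} ({survey[question][answer]})")
```

The same two helpers can be applied to Q5 to Q8 by adding their figures to the dictionary.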

References

Kim, S. H., Kim, N. U., & Chung, T. M. (2013, December). Attribute relationship evaluation methodology for big data security. In 2013 International conference on IT convergence and security (ICITCS) (pp. 1-4). IEEE.

Zhang, Y., Zhang, G., Chen, H., Porter, A. L., Zhu, D., & Lu, J. (2016). Topic analysis and forecasting for science, technology and innovation: Methodology with a case study focusing on big data research. Technological Forecasting and Social Change, 105, 179-191.
