Big Data

Big Data refers to extremely large data sets, typically measured in petabytes, generated from various sources such as social media, e-commerce, and telecommunications. It is characterized by the 5 V's: Volume, Variety, Veracity, Value, and Velocity, which highlight its complexity and the need for advanced analytics. While Big Data offers significant advantages like improved decision-making and operational efficiency, it also presents challenges including a talent gap, security risks, and high costs.

Uploaded by

Avani Joshi
Copyright
© All Rights Reserved

Unit 4.

Big Data
What is Big Data
Data that is very large in size is called Big Data. We normally work with data measured in
megabytes (Word documents, Excel sheets) or at most gigabytes (movies, code), but data
measured in petabytes, i.e. 10^15 bytes, is called Big Data. It is often stated that almost
90% of today's data has been generated in the past 3 years.
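
To get a feel for the scale, the short Python sketch below converts a byte count into decimal units, using the 10^15 bytes per petabyte figure stated above. The function name human_size and the sample values are illustrative assumptions, not part of the original notes.

```python
# Decimal (SI) storage units: each step up is a factor of 1000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def human_size(num_bytes: int) -> str:
    """Express num_bytes in the largest decimal unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000

print(human_size(5 * 10**6))   # a document-sized file
print(human_size(10**15))      # one petabyte

# One petabyte holds a million one-gigabyte files.
print(10**15 // 10**9)
```

Seen this way, a petabyte is roughly a million movie-sized (gigabyte) files, which is why it cannot be handled like everyday desktop data.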

Sources of Big Data


This data comes from many sources, such as:

o Social networking sites: Facebook, Google, and LinkedIn all generate huge amounts
of data on a day-to-day basis, as they have billions of users worldwide.
o E-commerce sites: Sites like Amazon, Flipkart, and Alibaba generate huge amounts of
logs from which users' buying trends can be traced.
o Weather stations: Weather stations and satellites produce very large amounts of data,
which are stored and processed to forecast the weather.
o Telecom companies: Telecom giants like Airtel and Vodafone study user trends and
publish their plans accordingly; for this, they store the data of their millions of users.
o Share market: Stock exchanges across the world generate huge amounts of data
through their daily transactions.

Big Data Characteristics


Five V's describe the characteristics of Big Data.

5 V's of Big Data


o Volume
o Veracity
o Variety
o Value
o Velocity
Volume

The name Big Data itself relates to enormous size. Big Data refers to the vast 'volumes' of
data generated daily from many sources, such as business processes, machines, social media
platforms, networks, human interactions, and many more.

For example, Facebook generates approximately a billion messages, records 4.5 billion
clicks of the "Like" button, and receives more than 350 million new posts each day. Big data
technologies can handle such large amounts of data.

Variety

Big Data can be structured, unstructured, or semi-structured, and it is collected from
different sources. In the past, data was collected only from databases and spreadsheets,
but these days data arrives in an array of forms, such as PDFs, emails, audio, social media
posts, photos, videos, etc.
The data is categorized as below:

a. Structured data: Structured data follows a defined schema with all the required
columns and is in tabular form. It is stored in a relational database management
system. OLTP (Online Transaction Processing) systems are built to work with
structured data, which is stored in relations, i.e., tables.
b. Semi-structured data: In semi-structured data, the schema is not strictly defined,
e.g., JSON, XML, CSV, TSV, and email.
c. Unstructured data: Free-form files such as log files, audio files, and image files
are unstructured data. Some organizations have a great deal of data available
but do not know how to derive value from it, since the data is raw.
d. Quasi-structured data: Textual data with inconsistent formats that can be put into
a structured form only with effort, time, and some tools.

Example: Web server logs, i.e., log files created and maintained by a server that
contain a list of activities.
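
The difference between semi-structured and quasi-structured data can be sketched in a few lines of Python. JSON carries its own field names, so it can be read directly, whereas a web server log line has to be pulled apart with a pattern before it becomes a table row. The sample record, the log line (written in the widely used Common Log Format style), and the names LOG_PATTERN and parse_log_line are hypothetical and chosen only for illustration.

```python
import json
import re
from typing import Optional

# Semi-structured: the JSON text names its own fields, no fixed table schema needed.
record = json.loads('{"user": "alice", "tags": ["sale", "books"]}')
print(record["user"], record["tags"])

# Quasi-structured: a hypothetical web server log line that needs parsing effort.
LOG_LINE = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /cart HTTP/1.1" 200 512'
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)

def parse_log_line(line: str) -> Optional[dict]:
    """Turn one log line into a structured row, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    row["status"] = int(row["status"])
    row["size"] = int(row["size"])
    return row

print(parse_log_line(LOG_LINE))
```

Once parsed, each log line has the same named columns and could be loaded into a relational table, i.e., it has been converted into structured data.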

Veracity

Veracity refers to how reliable the data is. Because data can be filtered or translated in
many ways, veracity also covers the ability to handle and manage data of uncertain quality
efficiently. Reliable data is essential in business development.

For example, Facebook posts with hashtags are a source of data whose reliability varies.

Value
Value is an essential characteristic of big data. What matters is not simply the data that
we process or store, but the valuable and reliable data that we store, process, and analyze.
Velocity
Velocity plays an important role compared to the other characteristics. Velocity is the
speed at which data is created in real time. It covers the speed of incoming data sets, the
rate of change, and bursts of activity. A primary aspect of Big Data is to provide in-demand
data rapidly.

Big data velocity deals with the speed at which data flows from sources like application
logs, business processes, networks, social media sites, sensors, mobile devices, etc.
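
Velocity can be made concrete by counting how many events arrive in each second and flagging seconds that exceed a chosen limit (an activity burst). The sketch below is a minimal illustration in Python; the arrival times, the threshold, and the function names events_per_second and find_bursts are assumptions made for the example.

```python
from collections import Counter

def events_per_second(timestamps: list[float]) -> Counter:
    """Count how many events fall into each one-second window."""
    return Counter(int(t) for t in timestamps)

def find_bursts(timestamps: list[float], threshold: int) -> list[int]:
    """Return the seconds in which the event count exceeds the threshold."""
    counts = events_per_second(timestamps)
    return sorted(sec for sec, n in counts.items() if n > threshold)

# Hypothetical arrival times, in seconds since the stream started.
arrivals = [0.1, 0.7, 1.2, 2.0, 2.1, 2.3, 2.4, 2.9, 3.5]
print(events_per_second(arrivals))
print(find_bursts(arrivals, threshold=3))
```

In a real system the same counting would run continuously over an incoming stream rather than over a fixed list.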

Applications of Big Data


The term Big Data refers to large amounts of complex and unprocessed data. Nowadays,
companies use Big Data to make their business better informed and to take business
decisions by enabling data scientists, analytical modelers, and other professionals to
analyse large volumes of transactional data. Big data is the valuable and powerful fuel that
drives the large IT industries of the 21st century, and it is a spreading technology used in
every business sector. This section discusses applications of Big Data.

Travel and Tourism


Travel and tourism are major users of Big Data. It enables us to forecast travel facility
requirements at multiple locations, improve business through dynamic pricing, and much more.
Financial and banking sector

The financial and banking sectors use big data technology extensively. Big data analytics
helps banks understand customer behaviour on the basis of investment patterns, shopping
trends, motivation to invest, and inputs obtained from personal or financial backgrounds.

Healthcare
Big data has started making a massive difference in the healthcare sector. With the help of
predictive analytics, medical professionals and health care personnel can provide
personalized healthcare to individual patients.

Telecommunication and media


The telecommunications and multimedia sectors are among the main users of Big Data.
Zettabytes of data are generated every day, and handling data at this scale requires
big data technologies.

Government and Military

The government and the military also use big data technology heavily. Governments publish
figures drawn from the records they keep. In the military, a fighter plane needs to process
petabytes of data.

Government agencies use Big Data to run many services: managing utilities, dealing with
traffic jams, and countering crimes such as hacking and online fraud.

Aadhaar Card: The government has a record of 1.21 billion citizens. This vast data is
stored and analyzed to find things like the number of youth in the country, and some schemes
are designed to reach the maximum population. Data at this scale cannot be stored in a
traditional database, so it is stored and analyzed using Big Data analytics tools.
E-commerce
E-commerce is also an application of Big Data. It helps maintain the customer relationships
that are essential for the e-commerce industry. E-commerce websites use Big Data to market
merchandise to retail customers, manage transactions, and implement better strategies and
innovative ideas to improve their business.

o Amazon: Amazon is a very large e-commerce website that deals with a lot of traffic
daily. When there is a pre-announced sale on Amazon, traffic increases so rapidly that
it may crash the website. To handle this type of traffic and data, Amazon uses Big Data.
Big Data helps in organizing and analyzing the data for further use.

Social Media

Social media is the largest data generator. Statistics show that more than 500 terabytes of
fresh data are generated from social media daily, particularly on Facebook. The data mainly
consists of videos, photos, message exchanges, etc. A single activity on a social media
site generates a lot of stored data, which is processed when required. Because the stored
data runs into terabytes (TB), processing it takes a lot of time. Big Data is a solution
to this problem.

Advantages of Big Data


1. Making wiser decisions

Businesses use big data to enhance B2B operations, advertising, and communication. Many
industries, such as travel, real estate, finance, and insurance, use big data primarily to
enhance decision-making. Because big data reveals more information in a usable format,
businesses can use it to predict more accurately what customers want and don't want, as
well as their behavioural tendencies.

Big data provides business intelligence and cutting-edge analytical insights that help with
decision-making. A company can get a more in-depth picture of its target market by
collecting more customer data.

Business trends and behaviours are revealed by data-driven insights, which also help
businesses compete and grow by enhancing their decision-making. Additionally, these
insights help companies develop more specialised goods and services, strategies, and
intelligent marketing campaigns to compete in their sector.

2. Cut back on the expense of business operations

According to surveys done by NewVantage and Syncsort (now Precisely), big data analytics
has helped businesses significantly cut their costs. Big data is being used to cut costs,
according to 66.7% of survey participants from NewVantage. Moreover, 59.4% of Syncsort
survey participants stated that using big data tools improved operational efficiency and
reduced costs. Hadoop and cloud-based analytics, two popular big data analytics tools, can
also help lower the cost of storing big data.

3. Detection of Fraud

Financial companies especially use big data to identify fraud. To find anomalies and
transaction patterns, data analysts use artificial intelligence and machine learning algorithms.
These irregularities in transaction patterns show that something is out of place or that there
is a mismatch, providing us with hints about potential fraud.

For credit unions, banks, and credit card companies, fraud detection is crucial for
identifying unauthorized access to account information, materials, or products. By spotting
fraud before it causes problems, any industry, including finance, can provide better
customer service.

For instance, using big data analytics, banks and credit card companies can identify fraudulent
purchases or credit cards that have been stolen even before the cardholder becomes aware of
the issue.
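
The idea of spotting an irregular transaction can be sketched with a very simple statistical rule: flag an amount that lies far from a customer's usual spending. This is only a stand-in for the artificial intelligence and machine learning algorithms mentioned above, which are far more sophisticated; the function name is_suspicious, the z_limit value, and the purchase amounts are assumptions made for this example.

```python
from statistics import mean, stdev

def is_suspicious(history: list[float], amount: float, z_limit: float = 3.0) -> bool:
    """Flag an amount lying more than z_limit standard deviations from past spending."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # all past amounts identical: any deviation stands out
    return abs(amount - mu) / sigma > z_limit

# Hypothetical past purchase amounts for one cardholder.
past_purchases = [42.0, 38.5, 51.0, 47.25, 40.0, 45.0]
print(is_suspicious(past_purchases, 44.0))    # close to usual spending
print(is_suspicious(past_purchases, 900.0))   # far outside usual spending
```

A check like this can run at the moment of purchase, which is how a suspicious transaction might be flagged before the cardholder notices it.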

4. A rise in productivity

A survey by Syncsort found that 59.9% of respondents said they were using big data
analytics tools like Spark and Hadoop to boost productivity. They have been able to
increase sales and improve customer retention as a result of this rise in productivity. Modern
big data tools make it possible for data scientists and analysts to analyse a lot of data quickly
and effectively, giving them an overview of more data.

They become more productive as a result of this. Additionally, big data analytics aids data
scientists and analysts in learning more about themselves to figure out how to be more
effective in their tasks and job responsibilities. As a result, investing in big data analytics
gives businesses across all sectors a chance to stand out through improved productivity.

5. Enhanced customer support

As part of their marketing strategies, businesses must improve customer interactions. Since
big data analytics give businesses access to more information, they can use that information
to make more specialised, highly personalised offers to each individual customer as well as
more targeted marketing campaigns.

Social media, email exchanges, customer CRM (customer relationship management) systems,
and other major data sources are the main sources of big data. As a result, it provides
businesses with access to a wealth of data about the needs, interests, and trends of their
target market.

Big data also enables businesses to better comprehend the thoughts and feelings of their
clients and so provide them with more individualised goods and services. Providing a
personalised experience can increase client satisfaction, strengthen bonds with clients, and,
most importantly, foster loyalty.

6. Enhanced speed and agility


Increased business agility is a competitive benefit of big data. Big data analytics can
assist businesses in becoming more innovative and adaptable in the marketplace. Large
customer data sets can be analysed to help businesses gain insights ahead of the competition
and address customer pain points more effectively.

Having a wealth of data at their disposal also enables businesses to assess risks, enhance
products and services, and improve communications. In addition, big data helps businesses
strengthen their tactics and strategies, which are crucial in coordinating their operations
to support frequent and quick changes in the industry.

7. Greater innovation

Innovation is another common benefit of big data, and the NewVantage survey found that 11.6
per cent of executives are investing in analytics primarily as a means to innovate and disrupt
their markets. They reason that if they can glean insights that their competitors don't have,
they may be able to get out ahead of the rest of the market with new products and services.

Disadvantages of Big Data


1. A talent gap

A study by AtScale found that for the past three years, the biggest challenge in this industry
has been a lack of big data specialists and data scientists. Given that it requires a different
skill set, big data analytics is currently beyond the scope of many IT professionals. Finding
data scientists who are also knowledgeable about big data can be difficult.

Data scientists and big data specialists are two well-paid professions in the data science
industry. As a result, hiring big data analysts can be very costly for businesses, particularly for
start-ups. Some businesses must wait a long time to hire the necessary personnel to carry
out their big data analytics tasks.

2. Security hazard

For big data analytics, businesses frequently collect sensitive data. This data needs to be
protected, and security lapses can be damaging if protections are not properly maintained.

Additionally, having access to enormous data sets can attract the unwanted attention of
hackers, and a company could become the target of a cyber-attack. For many businesses
today, data breaches are the biggest threat. Unless all necessary precautions are taken,
important information could also be leaked to rivals, which is another risk associated with
big data.

3. Adherence

Another disadvantage of big data is the requirement for legal compliance with governmental
regulations. To store, handle, maintain, and process big data that contains sensitive or
private information, a company must make sure that it adheres to all applicable laws and
industry standards. As a result, managing data governance, transmission, and storage
becomes more challenging as big data volumes grow.

4. High Cost
Because Big Data is a constantly evolving field whose goal is to process ever-increasing
amounts of data, often only large companies can sustain the investment needed to develop
their Big Data techniques.

5. Data quality

Dealing with data quality issues is a main drawback of working with big data. Before they
can use big data for analytics, data scientists and analysts must ensure that the data is
accurate, pertinent, and in the right format for analysis.

This significantly slows down the reporting process, but if businesses don't address data
quality problems, they may discover that the insights their analytics produce are useless or
even harmful if used.
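
A basic quality check of this kind can be sketched in Python as a function that inspects one record and lists what is wrong with it. The field names in REQUIRED_FIELDS, the YYYY-MM-DD date format, and the function name quality_issues are hypothetical choices for illustration; real pipelines define their own rules.

```python
from datetime import datetime

REQUIRED_FIELDS = ("customer_id", "amount", "date")  # hypothetical schema

def quality_issues(record: dict) -> list[str]:
    """Return the problems found in one record; an empty list means it passed."""
    issues = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing {field}")
    # Accuracy: the amount must be a non-negative number.
    amount = record.get("amount")
    if amount not in (None, ""):
        try:
            if float(amount) < 0:
                issues.append("negative amount")
        except (TypeError, ValueError):
            issues.append("amount is not a number")
    # Format: the date must follow YYYY-MM-DD.
    date = record.get("date")
    if date not in (None, ""):
        try:
            datetime.strptime(date, "%Y-%m-%d")
        except (TypeError, ValueError):
            issues.append("date is not YYYY-MM-DD")
    return issues

good = {"customer_id": "C1", "amount": "19.99", "date": "2024-03-01"}
bad = {"customer_id": "", "amount": "abc", "date": "01/03/2024"}
print(quality_issues(good))
print(quality_issues(bad))
```

Running every record through such a check before analysis is what slows reporting down, but it keeps bad rows from distorting the results.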

6. Rapid Change

The fact that technology is evolving quickly is another potential disadvantage of big data
analytics. Businesses must deal with the possibility of spending money on one technology
only to see something better emerge a few months later. This big data drawback was ranked
fourth among all the potential difficulties by Syncsort respondents.
