
Big Data

The document provides an introduction to big data platforms. It defines big data and describes its key characteristics including volume, variety, velocity, and veracity. It explains that a big data platform is an integrated computing solution that combines software, tools, and hardware to manage large amounts of data in an organized manner at scale. Finally, it provides examples of big data platforms like Google Cloud Platform, AWS, and Microsoft Azure.

Uploaded by

NeeharikaPandey

Big Data Platform

INTRODUCTION TO DATA ANALYTICS AND VISUALIZATION
BIG DATA
 Big Data refers to large, complex data sets
that must be processed and analyzed to
uncover valuable information that can benefit
businesses and organizations.
 Such data sets are so large and complicated
that conventional data management tools cannot
store or process them effectively.
SOURCES OF BIG DATA
 Social networking sites
 E-commerce sites
 Weather stations
 Telecom companies
 Share markets
IMPORTANCE OF BIG DATA
 Cost Savings
 Time Reductions
 Understanding Market Conditions
 Social Media Listening
 Using Big Data Analytics to Boost Customer Acquisition and Retention
 Using Big Data Analytics to Solve Advertisers' Problems and Offer Marketing Insights
CHARACTERISTICS OF BIG DATA
 Volume: the size and amount of data that companies manage and
analyze.
 Value: the most important “V” from a business perspective. The
value of big data usually comes from insight discovery and pattern
recognition that lead to more effective operations, stronger customer
relationships and other clear, quantifiable business benefits.
 Variety: the diversity and range of data types, including
unstructured data, semi-structured data and raw data.
 Velocity: the speed at which companies receive, store and manage data,
e.g., the number of social media posts or search queries received
within a day, hour or other unit of time.
 Veracity: the “truth” or accuracy of data and information assets, which often
determines executive-level confidence.
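Velocity, as described above, can be made concrete by counting events per unit of time. The following is only a minimal sketch: the function name `events_per_hour` and the sample timestamps are hypothetical and chosen for illustration.

```python
from collections import Counter
from datetime import datetime

def events_per_hour(timestamps):
    """Count events in each clock hour (a simple velocity measure)."""
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    return dict(sorted(buckets.items()))

# Hypothetical social media post timestamps, for illustration only.
posts = [
    datetime(2024, 1, 1, 9, 5),
    datetime(2024, 1, 1, 9, 40),
    datetime(2024, 1, 1, 10, 15),
]
rates = events_per_hour(posts)
for hour, count in rates.items():
    print(hour, count)
```

The same bucketing idea applies to days or minutes; only the truncation of the timestamp changes.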
BIG DATA PLATFORM
 A big data platform is an integrated computing
solution that combines numerous software systems,
tools, and hardware for big data management.
 Large amounts of data are stored in an organized
manner on a big data platform.
 Big data platforms use a combination of hardware
and software tools for data management to aggregate
data on an enormous scale, typically onto the cloud.
CHARACTERISTICS OF BIG DATA PLATFORM

 Ability to accommodate new applications and tools depending on the evolving business needs
 Support several data formats
 Ability to accommodate large volumes of streaming or at-rest data
 Have a wide variety of conversion tools to transform data to different preferred formats
 Capacity to accommodate data at any speed
 Provide the tools for searching through massive data sets
 Support linear scaling
 Have the tools for data analysis and reporting requirements
STAGES OF BIG DATA PLATFORM
 Data Collection
 Data Storage
 Data Processing
 Data Analytics
 Data Governance
 Data Management
FEW EXAMPLES OF BIG DATA PLATFORMS
 GCP (Google Cloud Platform)
 AWS (Amazon Web Services)
 Microsoft Azure
 Cloudera
 Apache Hadoop
 Apache Spark
NEED OF DATA ANALYTICS
 1. Improved Decision Making
 2. Better Customer Service

 3. Efficient Operations
 4. Effective Marketing
TYPES OF DATA ANALYTICS
Descriptive: Descriptive analytics is when you assess historical data and try to
identify specific patterns. The main goal is to answer what happened and if it was
expected or not, making comparisons with other timeframes.

Diagnostic: When we know what’s going on, the next step is to understand why. So
you may have performed some descriptive analytics techniques and you were able
to identify that sales went up by 12%. Diagnostic analytics is there to help identify
why this happened and what actually worked for your business.

Predictive: Predictive analytics involves sophisticated techniques that can help you
use the patterns observed and make forecasts about future performance, e.g.,
financial data analytics. While this may require specific expertise, it’s extremely
useful in order to be better prepared for the future.

Prescriptive: Last but not least, prescriptive analytics techniques can help you
identify the best course of action. This type of analytics is frequently used by
marketers to draft their strategies and achieve better results.
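The four types can be walked through on a small data set. The sketch below is illustrative only: the sales figures, channel names, and the "invest in the growing channel" rule are assumptions made up for the example, and the forecast is a plain least-squares trend line rather than a recommended forecasting method.

```python
# Hypothetical monthly sales per channel (illustrative numbers only).
sales = {
    "online": [100, 110, 120, 160],
    "retail": [200, 200, 180, 176],
}
totals = [sum(month) for month in zip(*sales.values())]  # [300, 310, 300, 336]

# Descriptive: what happened? Percent change in the latest month.
pct_change = (totals[-1] - totals[-2]) / totals[-2] * 100

# Diagnostic: why? Change contributed by each channel.
contribution = {name: values[-1] - values[-2] for name, values in sales.items()}
driver = max(contribution, key=contribution.get)

# Predictive: what next? Least-squares trend line over the months so far.
def linear_forecast(values, steps_ahead=1):
    n = len(values)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y + slope * (n + steps_ahead - mean_x)

forecast = linear_forecast(totals)

# Prescriptive: what should we do? A naive rule: back the growing channel.
recommendation = f"invest more in {driver}"

print(f"{pct_change:.0f}% change, driven by {driver}; "
      f"next month about {forecast:.0f}; {recommendation}")
```

With these numbers the descriptive step reports a 12% rise, matching the example in the text, and the diagnostic step attributes it to the online channel.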
MODERN DATA ANALYTIC TOOLS
 Apache Hadoop
It’s a Java-based open-source platform used to store
and process big data. It runs on a cluster, which lets it
process data efficiently and in parallel. It can process both
structured and unstructured data and scales from a single
server to many machines.
 Cassandra
Apache Cassandra is an open-source NoSQL distributed database
used to store and retrieve large amounts of data. It is capable of
delivering thousands of operations every second and can handle
petabytes of data with almost zero downtime.
 SAS (Statistical Analysis System)
Statistical Analysis System, or SAS, allows a user to access
data in any format (SAS tables or Excel worksheets). It also
offers a cloud platform for business analytics called SAS Viya,
and SAS has introduced new tools and products for AI and ML.
 Spark
Apache Spark is another framework used to process data
and perform numerous tasks on a large scale. It distributes
the processing across multiple computers.
 Apache Storm
Storm is a robust, user-friendly tool used for data analytics,
especially in small companies. One of its strengths is that it
is not tied to a single programming language and can be used
with any of them. It was designed to handle large volumes of
data in a fault-tolerant and horizontally scalable way.
 RapidMiner
It’s a visual workflow design tool used for data
analytics. It’s a no-code platform, so users aren’t required to
write code to prepare and segment data. Today, it is heavily used in
many industries such as ed-tech, training, research, etc.
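Hadoop and Spark, described above, both rely on splitting data into parts, processing the parts in parallel, and combining the partial results. The sketch below imitates that idea on one machine with a thread pool. It does not use any Hadoop or Spark API, the function names are hypothetical, and the threads merely stand in for cluster nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def count_words(chunk):
    """Map step: count words in one chunk of lines."""
    return Counter(word for line in chunk for word in line.lower().split())

def parallel_word_count(lines, workers=2):
    """Split lines into chunks, count each chunk concurrently, then merge."""
    size = max(1, -(-len(lines) // workers))  # ceiling division
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    # Reduce step: merge the per-chunk counts into one result.
    return reduce(lambda a, b: a + b, partial_counts, Counter())

lines = ["big data platform", "big data tools", "data analytics"]
counts = parallel_word_count(lines)
print(counts.most_common(2))
```

On a real cluster the chunks would live on different machines and the framework would handle scheduling and failures; here only the split, map, and merge structure is shown.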
APPLICATIONS OF DATA ANALYTICS
 Transportation
 Security
 Internet Web search results
 Marketing and digital advertising
ANALYTICS AND REPORTING
Analytics: Analytics is about diving deeper into your data
and reports in order to look for insights. It’s actually an
attempt to answer why something is happening.
Analytics powers up decision-making as the main goal is
to make sense of the data explaining the reason behind
the reported numbers.
Reporting: Data reporting is about taking the available
information (e.g. your dataset), organizing it, and
displaying it in a well-structured and digestible format
we call “reports”.
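The difference can be illustrated on a few records. In this sketch, `build_report` only organizes the data, while `explain_gap` asks why one region earns more than another. The order records and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical order records (region, amount), for illustration only.
orders = [
    ("north", 120.0), ("north", 80.0),
    ("south", 300.0), ("south", 100.0), ("south", 50.0),
]

def build_report(records):
    """Reporting: organize raw records into order counts and revenue per region."""
    summary = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for region, amount in records:
        summary[region]["orders"] += 1
        summary[region]["revenue"] += amount
    return dict(summary)

def explain_gap(report, a, b):
    """Analytics: is region a ahead of b through more orders or larger orders?"""
    def average(region):
        return report[region]["revenue"] / report[region]["orders"]
    return {
        "extra_orders": report[a]["orders"] - report[b]["orders"],
        "avg_order_gap": average(a) - average(b),
    }

report = build_report(orders)
insight = explain_gap(report, "south", "north")
print(report)
print(insight)
```

The report states what the numbers are; the comparison starts to explain them, which is the step the text assigns to analytics.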
ANALYTICS VS REPORTING
 Purpose:
The purpose of reports is to take data and organize it into clear
information. Analytics aims to take that data and provide
insights that drive better business decisions.
 Methods:
When discussing reports or reporting, you may use language such
as organizing, formatting, building, configuring, consolidating,
or summarizing. Analytics employs words and phrases like
investigating, performing a “deep dive,” questioning,
examining, interpreting, comparing, and confirming.
 Value:
Reporting transforms data into information,
whereas data analytics transforms that information into
insights and recommendations.
EXAMPLES OF REPORTS
ANALYSIS
