
Unit 1 Big Data Notes

The document discusses topics related to big data including what big data is, types of big data, examples and use cases, big data architecture, the need for big data analytics, types of big data analytics, and big data applications. It provides details on structured, unstructured and semi-structured data as well as descriptive, predictive, and prescriptive analytics.

Uploaded by

iafdeepakupreti

Topics to be covered...

Evolution of Technology
What is Big Data?
Types of Big Data?
Big Data Examples & Use Cases
Big Data architecture
When to use this architecture
5Vs of Big Data
Big Data technology
Big Data importance
Big Data applications
Big Data Analytics
Need for Big Data Analytics
What is Big Data Analytics
Types of Big Data Analytics
Evolution of Technology
IoT
Social Media
What is Big Data?
Big data is the term for a collection of data sets so large and
complex that it becomes difficult to process using on-hand
database management tools or traditional data processing
applications.
Types of Big Data?
Structured
Unstructured
Semi-structured
Structured
Any data that can be stored, accessed, and processed in a
fixed format is termed 'structured' data.

Table
Semi-structured
Semi-structured data is information that
does not reside in a relational database
or any other data table, but nonetheless
has some organizational properties to
make it easier to analyze, such as
semantic tags.
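
As a minimal sketch of this idea, assume JSON as the semi-structured format (the notes do not name one). The records and field names below are hypothetical. The keys act as the semantic tags: they let a program pick out fields even though the records do not share one fixed schema.

```python
import json

# Hypothetical semi-structured records: each one describes itself through its
# keys (the "semantic tags"), but the records do not share one fixed schema.
RAW = """
[
  {"id": 1, "name": "Asha", "email": "asha@example.com"},
  {"id": 2, "name": "Ravi", "phones": ["555-0101", "555-0102"]},
  {"id": 3, "name": "Mei", "email": "mei@example.com", "city": "Pune"}
]
"""


def collect_fields(records):
    """Return the sorted list of all tags that appear in any record."""
    fields = set()
    for record in records:
        fields.update(record.keys())
    return sorted(fields)


def to_rows(records, columns):
    """Flatten records into fixed-width rows, using None for missing tags."""
    return [tuple(record.get(col) for col in columns) for record in records]


records = json.loads(RAW)
columns = collect_fields(records)
rows = to_rows(records, columns)
```

Flattening the records into rows with a fixed column list is what turns them into structured data; the `None` entries mark tags that a given record did not carry.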
Big Data Examples & Use Cases
Transportation.
Advertising and Marketing.
Banking and Financial Services.
Government.
Media and Entertainment.
Meteorology.
Healthcare.
Cybersecurity.
Big Data architecture

Data sources: All big data solutions start with one or more
data sources.
Examples include:
-> Application data stores, such as relational databases.
-> Static files produced by applications, such as web server log files.
-> Real-time data sources, such as IoT devices.

Data storage: Data for batch processing operations is typically stored
in a distributed file store that can hold high volumes of large files in
various formats. This kind of store is often called a data lake. Options
for implementing this storage include Azure Data Lake Store or blob
containers in Azure Storage.

Batch processing: Because the data sets are so large, often a big data
solution must process data files using long-running batch jobs to
filter, aggregate, and otherwise prepare the data for analysis. Usually
these jobs involve reading source files, processing them and writing
the output to new files. Options include running U-SQL jobs in Azure
Data Lake Analytics, using Hive, Pig, or custom Map/Reduce jobs in
an HDInsight Hadoop cluster, or using Java, Scala, or Python
programs in an HDInsight Spark cluster.
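
The read, filter, aggregate, and write pattern described above can be sketched in plain Python. This is only an illustration on a single machine, not how a Hadoop or Spark job is written; the file names, the CSV layout, and the function `run_batch_job` are all hypothetical.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def run_batch_job(source_dir: Path, output_file: Path) -> dict:
    """Read every CSV in source_dir, keep status-200 rows, sum bytes per URL."""
    totals = defaultdict(int)
    for path in sorted(source_dir.glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                if row["status"] != "200":  # filter step
                    continue
                totals[row["url"]] += int(row["bytes"])  # aggregate step
    # Write the prepared data to a new file for later analysis.
    with output_file.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["url", "total_bytes"])
        for url in sorted(totals):
            writer.writerow([url, totals[url]])
    return dict(totals)


# Hypothetical input: two small files standing in for web server logs.
work_dir = Path(tempfile.mkdtemp())
(work_dir / "out").mkdir()
(work_dir / "log1.csv").write_text("url,status,bytes\n/home,200,120\n/cart,500,40\n")
(work_dir / "log2.csv").write_text("url,status,bytes\n/home,200,80\n/cart,200,60\n")
result = run_batch_job(work_dir, work_dir / "out" / "summary.csv")
```

A real batch job has the same shape, but the engine splits the source files across many machines and merges the partial aggregates.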

Stream processing: After capturing real-time messages, the solution
must process them by filtering, aggregating, and otherwise
preparing the data for analysis. The processed stream data is then
written to an output sink. Azure Stream Analytics provides a
managed stream processing service based on perpetually running
SQL queries that operate on unbounded streams. You can also use
open source Apache streaming technologies like Storm and Spark
Streaming in an HDInsight cluster.
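
One common way to aggregate an unbounded stream is a tumbling (fixed-size, non-overlapping) time window. The sketch below is a plain-Python illustration of that idea, not the API of Stream Analytics, Storm, or Spark Streaming. The event layout, the rule that negative readings are errors, and the function name are assumptions.

```python
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str, float]  # (timestamp in seconds, device id, reading)


def tumbling_window_avg(
    events: Iterable[Event], window_s: int
) -> Iterator[Tuple[int, float]]:
    """Yield (window_start, average reading) each time a window closes.

    Assumes events arrive in timestamp order. Negative readings are treated
    as sensor errors and filtered out (a hypothetical rule).
    """
    window_start = None
    total, count = 0.0, 0
    for ts, _device, value in events:
        if value < 0:  # filter step
            continue
        start = ts - ts % window_s
        if window_start is None:
            window_start = start
        if start != window_start:  # window closed: emit to the output sink
            yield window_start, total / count
            window_start, total, count = start, 0.0, 0
        total += value  # aggregate step
        count += 1
    if count:  # flush the last open window when the input ends
        yield window_start, total / count


# Hypothetical IoT readings; a list stands in for a real message stream.
events = [(0, "a", 10.0), (3, "b", 20.0), (4, "a", -1.0), (5, "a", 30.0), (11, "b", 50.0)]
results = list(tumbling_window_avg(events, window_s=5))
```

Because the function is a generator, it processes one event at a time and never needs the whole stream in memory; the final flush only matters here because the example input is finite.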

Analytical data store: Many big data solutions prepare data for
analysis and then serve the processed data in a structured format
that can be queried using analytical tools. The analytical data store
used to serve these queries can be a Kimball-style relational data
warehouse, as seen in most traditional business intelligence (BI)
solutions.
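
A Kimball-style warehouse is usually laid out as a star schema: one fact table of measurements joined to small dimension tables. The sketch below uses an in-memory SQLite database only to show the shape of such a query; the table names, columns, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the "who/what/when" of each measurement.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
    -- The fact table holds the measurements and points at the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Toys');
    INSERT INTO dim_date VALUES (10, 2022), (11, 2023);
    INSERT INTO fact_sales VALUES
        (1, 10, 100.0), (1, 11, 150.0), (2, 11, 70.0), (2, 11, 30.0);
    """
)

# A typical analytical query: total sales per category and year.
report = conn.execute(
    """
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
    """
).fetchall()
```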

Analysis and reporting: The goal of most big data solutions is to
provide insights into the data through analysis and reporting. To
empower users to analyze the data, the architecture may include a
data modelling layer, such as a multidimensional OLAP cube or
tabular data model in Azure Analysis Services. It might also support
self-service BI, using the modelling and visualization technologies in
Microsoft Power BI or Microsoft Excel. Analysis and reporting can
also take the form of interactive data exploration by data scientists or
data analysts.

Orchestration: Most big data solutions consist of repeated data
processing operations, encapsulated in workflows, that transform
source data, move data between multiple sources and sinks, load the
processed data into an analytical data store, or push the results
straight to a report or dashboard. To automate these workflows, you
can use an orchestration technology such as Azure Data Factory or
Apache Oozie and Sqoop.
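
At its core, an orchestrator runs the steps of a workflow in dependency order. The sketch below shows that idea with Python's standard `graphlib` module (Python 3.9 or later). It is not the Data Factory or Oozie API; the task names and the helper `run_workflow` are hypothetical.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies):
    """Run each task after all of its dependencies; return the run order."""
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        order.append(name)
    return order


log = []
# Hypothetical workflow steps; each lambda stands in for a real job.
tasks = {
    "ingest": lambda: log.append("copied source data"),
    "transform": lambda: log.append("transformed data"),
    "load": lambda: log.append("loaded analytical store"),
    "report": lambda: log.append("refreshed dashboard"),
}
# Each key lists the tasks that must finish before it can start.
dependencies = {
    "transform": {"ingest"},
    "load": {"transform"},
    "report": {"load"},
}
order = run_workflow(tasks, dependencies)
```

Real orchestration tools add scheduling, retries, and monitoring on top of this ordering step.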
When to use this architecture
1. Store and process data in volumes too large for a traditional
database.
2. Transform unstructured data for analysis and reporting.
3. Capture, process, and analyse unbounded streams of data in real
time, or with low latency.
4. Use Azure Machine Learning or Microsoft Cognitive Services.
5Vs of Big Data
Characteristics (5Vs) of Big Data
1. Volume
2. Veracity
3. Variety
4. Value
5. Velocity
Big Data technology
Big Data importance
1. Cost Savings
2. Time Savings
3. Understanding Market Conditions
4. Social Media Listening
5. Boosting Customer Acquisition and Retention
6. Solving Advertisers' Problems
7. Driving Innovation and Product Development
Big Data applications
1. Banking and Securities
2. Communications, Media and Entertainment
3. Healthcare Providers
4. Education
5. Government
6. Insurance
7. Retail and Wholesale trade
8. Transportation
Big Data Analytics
Need for Big Data Analytics
1. Optimize business operations by analyzing customer behaviour
2. Next Generation Products
What is Big Data Analytics?
Big data analytics is the use of advanced analytic
techniques against very large, diverse data sets
that include structured, semi-structured and
unstructured data, from different sources, and in
different sizes from terabytes to zettabytes.
Types of Big Data Analytics
1. Descriptive Analysis
2. Predictive Analysis
3. Prescriptive Analysis
4. Diagnostic Analysis
1. Descriptive Analysis
What is happening now, based on incoming data.
2. Predictive Analysis
What might happen in the future.
3. Prescriptive Analysis
What action should be taken.

Google's self-driving car is a good example of Prescriptive Analysis.
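
To make the difference between the first three types concrete, here is a minimal sketch on a tiny, hypothetical series of daily order counts. The straight-line forecast and the reorder rule are assumptions chosen for illustration; real analytics would use richer models. Diagnostic Analysis is not described in these notes, so it is left out.

```python
from statistics import mean

# Hypothetical daily order counts for one product.
daily_orders = [100, 110, 120, 130, 140]


def describe(values):
    """Descriptive: summarize what has happened so far."""
    return {"mean": mean(values), "latest": values[-1]}


def predict_next(values):
    """Predictive: extrapolate one step ahead with a least-squares line."""
    n = len(values)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar + slope * (n - x_bar)


def prescribe(forecast, stock):
    """Prescriptive: turn the forecast into a recommended action."""
    return "reorder" if forecast > stock else "hold"


summary = describe(daily_orders)
forecast = predict_next(daily_orders)
action = prescribe(forecast, stock=120)
```

Each step builds on the previous one: the summary describes the data, the forecast extends it, and the recommendation acts on the forecast.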
End of Unit 1
