Lec 7 Hadoop Intro
The four Vs of big data
• Variety: different forms of data
• Velocity: analysis of streaming data
• Veracity: uncertainty of data
• Value
Data science
[Diagram: data science at the intersection of domain knowledge, statistics, visualizations, machine learning, pattern recognition, business analysis, presentation, KDD, AI, and databases and data processing]
• Big data exploration: Find, visualize, and understand all big data to improve decision making.
• Enhanced 360° view of the customer: Extend existing customer views by incorporating extra internal and external data sources.
• Security & Intelligence extension: Lower risk, detect fraud, and monitor cybersecurity in real time.
• Operational analysis: Analyze various machine data for improved business results.
• Data warehouse modernization: Integrate big data and data warehouse capabilities to gain new business insights and increase operational efficiency.
• Apache Hadoop
• MapReduce
• HDFS
• Apache Spark
Introduction to big data © Copyright IBM Corporation 2021
Think differently
As you start to work with Hadoop, you must think differently:
• There are different processing paradigms.
• There are different approaches to storing data.
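• Think ELT (extract, load, transform) rather than ETL (extract, transform, load): load the raw data first and transform it afterward.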
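To make the "different processing paradigms" point concrete, here is a minimal sketch of the map, shuffle, and reduce steps behind a word count. It is plain, in-memory Python and does not use any Hadoop API. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample lines are hypothetical, chosen only for illustration. A real Hadoop job would run these phases in parallel across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines read from a file.
    sample = ["big data is big", "data is stored raw"]
    print(word_count(sample))
```

Each call to map_phase sees only one line, and each call to reduce_phase sees only one key. That independence is what lets a framework spread the work across many machines.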