Lec 7 Hadoop Intro
Characteristics of big data (figure):
• Variety: different forms of data
• Velocity: analysis of streaming data
• Veracity: uncertainty of data
• Value
Data science and its related fields (figure): domain knowledge, statistics, visualizations, machine learning, pattern recognition, business analysis, presentation, KDD, AI, and databases and data processing.
• Data warehouse modernization
Integrate big data and data warehouse capabilities to gain new business insights and increase operational efficiency.
• Operational analysis
Analyze various machine data for improved business results.
• Event-driven
If by “real time” you mean the opposite of scheduled, then you mean event-driven. Instead of running at a fixed time interval, event-driven data processing runs when a certain action or condition triggers it. The performance requirement is generally that the processing of one event finishes before the next event occurs.
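The event-driven idea above can be sketched in a few lines of Python. This is only an illustration, not part of any Hadoop API: the EventBus class, the "sensor_reading" event type, the on_reading handler, and the 80 degree threshold are all hypothetical names and values chosen for the example. The point is that the handler runs because an event arrived, not because a timer fired.

```python
from collections import deque


class EventBus:
    """Hypothetical in-memory event bus: handlers run when events arrive."""

    def __init__(self):
        self._handlers = {}    # event type -> list of handler functions
        self._queue = deque()  # pending (event_type, payload) pairs

    def subscribe(self, event_type, handler):
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        self._queue.append((event_type, payload))

    def run(self):
        # Process each event as it comes off the queue. Nothing here
        # depends on a schedule; an empty queue means no work is done.
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._handlers.get(event_type, []):
                handler(payload)


alerts = []


def on_reading(reading):
    # The triggering condition: a reading above an assumed threshold.
    if reading["temp_c"] > 80:
        alerts.append(reading["sensor"])


bus = EventBus()
bus.subscribe("sensor_reading", on_reading)
bus.publish("sensor_reading", {"sensor": "s1", "temp_c": 72})
bus.publish("sensor_reading", {"sensor": "s2", "temp_c": 95})
bus.run()
print(alerts)
```

A scheduled job would instead wake up at a fixed interval and scan whatever readings had accumulated since the last run, whether or not anything of interest had happened.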
• Apache Hadoop: MapReduce, HDFS
• Apache Spark
Introduction to big data © Copyright IBM Corporation 2021
Think differently
As you start to work with Hadoop, you must think differently:
• There are different processing paradigms.
• There are different approaches to storing data.
• Think ELT (extract, load, transform) rather than ETL (extract, transform, load).
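The difference between the two orderings can be shown with a small Python sketch. It uses plain lists in place of real storage, and every name in it (raw_lines, etl, elt_load, total_by, warehouse, raw_store) is hypothetical. Under ETL, records are reshaped before they are stored, so any field dropped by the transform is gone. Under ELT, the raw records are stored unchanged and the transform is applied later, when a question is asked.

```python
import json

# Assumed sample input: one JSON record per line, as it might arrive from a source system.
raw_lines = [
    '{"user": "ann", "amount": "12.50", "country": "DE"}',
    '{"user": "bob", "amount": "7.25", "country": "US"}',
    '{"user": "ann", "amount": "3.00", "country": "DE"}',
]


def etl(lines):
    """ETL: transform first, then load only the reshaped records."""
    store = []
    for line in lines:
        rec = json.loads(line)
        # The schema is fixed before loading; "country" is discarded here.
        store.append({"user": rec["user"], "amount": float(rec["amount"])})
    return store


def elt_load(lines):
    """ELT, load step: keep the raw records exactly as they arrived."""
    return list(lines)


def total_by(raw_store, key):
    """ELT, transform step: parse and aggregate at query time."""
    totals = {}
    for line in raw_store:
        rec = json.loads(line)
        totals[rec[key]] = totals.get(rec[key], 0.0) + float(rec["amount"])
    return totals


warehouse = etl(raw_lines)        # transformed copy, no "country" field
raw_store = elt_load(raw_lines)   # untouched raw data

totals_by_user = total_by(raw_store, "user")
totals_by_country = total_by(raw_store, "country")  # still answerable under ELT
print(totals_by_user)
print(totals_by_country)
```

The per-country question cannot be answered from warehouse, because the transform ran before loading and dropped that field. Keeping the raw data and deferring the transform leaves such questions open, at the cost of doing the parsing work at query time.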