Chapter 6 (P1)
Chapter 13:
Big Data Analytics
Big Data - Definition and Concepts
Big [volume] Data is not new!
Big Data means different things to people with
different backgrounds and interests
Traditionally, “Big Data” = massive volumes of data
E.g., volume of data at CERN, NASA, Google, …
Where does the Big Data come from?
Everywhere! Web logs, RFID, GPS systems, sensor
networks, social networks, Internet-based text documents,
Internet search indexes, call detail records, astronomy,
atmospheric science, biology, genomics, nuclear physics,
biochemical experiments, medical records, scientific
research, military surveillance, multimedia archives, …
13-2 Copyright © 2014 Pearson Education, Inc.
Technology Insights 6.1
The Data Size Is Getting Big, Bigger…
Hadron Collider - 1 PB/sec
Boeing jet - 20 TB/hr
Facebook - 500 TB/day
YouTube - 1 TB/4 min
The proposed Square Kilometer Array telescope (the world’s proposed biggest telescope) - 1 EB/day
[Table: Names for Big Data Sizes]
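The rates above use different time units, which makes them hard to compare at a glance. The following is a minimal sketch that converts each to TB per day. It assumes decimal (SI) units and, purely for illustration, that each source runs continuously for 24 hours; the dictionary and function names are hypothetical.

```python
# Decimal (SI) byte units; binary units (TiB, PiB) would give slightly different figures.
TB = 10**12
PB = 10**15
EB = 10**18

SECONDS_PER_DAY = 24 * 60 * 60

# Rates quoted on the slide, scaled to bytes per day.
# Assumption: each source produces data around the clock.
rates_bytes_per_day = {
    "Hadron Collider (1 PB/sec)": 1 * PB * SECONDS_PER_DAY,
    "Boeing jet (20 TB/hr)": 20 * TB * 24,
    "Facebook (500 TB/day)": 500 * TB,
    "YouTube (1 TB/4 min)": 1 * TB * (24 * 60 // 4),
    "Square Kilometer Array (1 EB/day)": 1 * EB,
}


def to_tb_per_day(n_bytes):
    """Express a bytes-per-day figure in terabytes per day."""
    return n_bytes / TB


for source, n_bytes in rates_bytes_per_day.items():
    print(f"{source}: {to_tb_per_day(n_bytes):,.0f} TB/day")
```

Under these assumptions the collider figure dwarfs the rest: 1 PB/sec corresponds to 86,400 PB, or 86.4 EB, per day.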
[Figure: Big Data architecture. Data sources: ERP, SCM, CRM, images, audio and video, machine logs, text, web and social. Layers: move, manage, access. Components: data platform, integrated data warehouse, discovery platform, event processing. Tools: marketing, applications, business intelligence, data mining, math and stats, languages. Users: marketing executives, operational systems, customers and partners, frontline workers, business analysts, data scientists, engineers.]
Keys to Success with Big Data Analytics
Alignment between the business and IT strategy
The right analytics tools
A fact-based decision-making culture
A strong data infrastructure
In-memory analytics
Storing and processing the complete data set in
RAM
In-database analytics
Placing analytic procedures close to where data is
stored
Grid computing & MPP
Use of many machines and processors in parallel
(MPP- massively parallel processing)
Appliances
Combining hardware, software and storage in a
single unit for performance and scalability
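The difference between in-database and in-memory analytics can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a database server; the `sales` table, its rows, and both function names are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical sales table; SQLite stands in for a real database server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0), ("west", 25.0)],
)


def totals_in_database(conn):
    """In-database style: the engine aggregates; only summary rows leave it."""
    rows = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    return dict(rows.fetchall())


def totals_in_memory(conn):
    """In-memory style: load every row into RAM once, then aggregate in the application."""
    data = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in data:
        totals[region] = totals.get(region, 0.0) + amount
    return totals


print(totals_in_database(conn))
print(totals_in_memory(conn))
```

Both functions return the same totals. What differs is where the work happens: the first moves the computation to the data, while the second moves the complete data set into the application's memory before computing.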
Challenges of Big Data Analytics
Data volume
The ability to capture, store, and process the huge volume of data
Processing capabilities
The ability to process the data quickly, as it is captured
[Figure: Skills of a data scientist, including curiosity and creativity; programming, scripting, and hacking]
… impactful applications of stream analytics were developed in the energy industry, specifically for smart grid (electric power supply chain) systems.
[Figure: Streaming analytics in a smart grid. Inputs: sensor data (energy production system status), meteorological data (wind, light, temperature, etc.), and usage data (smart meters, smart grid devices). Components: data integration and temporary staging, streaming analytics (predicting usage, production, and anomalies), and a permanent storage area.]
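As a rough illustration of detecting anomalies in streaming usage data, the sketch below flags smart meter readings that deviate strongly from a sliding window of recent values. The function name, window size, threshold, and sample readings are all hypothetical; a production smart grid system would likely use a dedicated stream processing engine and a more robust model.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the mean of the preceding window.

    Readings are consumed one at a time, so only the last `window` values
    are held in memory, as in a streaming setting.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Flag the reading if it lies more than `threshold` standard
            # deviations from the recent average.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        recent.append(value)


# Hypothetical hourly smart meter readings in kWh, with one spike.
meter_kwh = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 9.5, 1.0, 1.1]
print(list(detect_anomalies(meter_kwh)))
```

Because the function is a generator over an iterable, the same logic applies whether the readings come from a list, a file, or a live feed.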