Lecture 2
Computational Tools for Data Science
4170201
• Smartphones, with the data they create and consume, and sensors
embedded into everyday objects will soon result in billions of new,
constantly updated data feeds containing environmental, location,
and other information, including video.
Data volume is increasing exponentially
Exponential increase in collected/generated data
• Clickstreams and ad impressions capture user behavior at
millions of events per second.
• High-frequency stock trading algorithms reflect market
changes within microseconds.
• Machine-to-machine processes exchange data between
billions of devices.
• Infrastructure and sensors generate massive log data in
real time.
• Online gaming systems support millions of concurrent
users, each producing multiple inputs per second.
Data is being generated fast and needs to be processed fast
Online Data Analytics
Late decisions mean missed opportunities
Examples
◦ E-Promotions: based on your current location, your purchase
history, and what you like, send promotions right now for the store
next to you
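The E-Promotions example can be sketched in a few lines. This is only an illustrative sketch, not a production design: the `Store` class, the `pick_promotions` function, the 1 km radius, and the sample stores are all hypothetical, and "what you like" is reduced to a set of liked store categories.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Store:
    # Hypothetical record for a store that can issue promotions.
    name: str
    lat: float
    lon: float
    category: str


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def pick_promotions(user_lat, user_lon, liked_categories, stores, max_km=1.0):
    """Return names of nearby stores the user is likely to care about.

    Assumption: a store qualifies if its category is one the user likes
    and it lies within max_km of the user's current location.
    """
    candidates = [
        (haversine_km(user_lat, user_lon, s.lat, s.lon), s)
        for s in stores
        if s.category in liked_categories
    ]
    # Closest stores first, so the promotion "next to you" comes out on top.
    candidates.sort(key=lambda pair: pair[0])
    return [s.name for dist, s in candidates if dist <= max_km]


# Made-up sample data for illustration only.
stores = [
    Store("Corner Books", 0.0, 0.0, "books"),
    Store("Bean Stop", 0.0, 0.005, "coffee"),
    Store("Far Books", 1.0, 1.0, "books"),
    Store("Shoe Hub", 0.0, 0.001, "shoes"),
]
print(pick_promotions(0.0, 0.0, {"books", "coffee"}, stores))
```

In a real online setting the decision has to be made as each location update arrives, which is why processing speed matters as much as the matching rule itself.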
Big Data isn't just numbers, dates, and strings. Big Data
is also geospatial data, 3D data, audio and video, and
unstructured text, including log files and social media.
"Big Data are high-volume, high-velocity, and/or high-variety
information assets that require new forms of processing to enable
enhanced decision making, insight discovery and process optimization."