Lecture 07
HTML:
URL:
HTTP
1996
The price of digital storage falls to the
point where it is more cost-effective than
paper.
1997
Google launch their search engine which
will quickly become the most popular in
the world.
Michael Lesk estimates the digital
universe is increasing tenfold in size
every year.
Lecture 7 – Part 2
First use of the term Internet of Things, in a
business presentation by Kevin Ashton to
Procter and Gamble.
2000
In How Much Information?, Peter Lyman and Hal
Varian (now chief economist at Google) attempted
to quantify the amount of digital information in
the world, and its rate of growth, for the first
time.
2001
Three “Vs” of Big Data
Volume,
Velocity,
Variety
The Next Frontier for Innovation,
Competition and Productivity by
McKinsey Global Institute.
2010
Eric Schmidt, executive chairman of
Google, tells a conference that as much
data is now being created every two days,
as was created from the beginning of
human civilization to the year 2003.
2011
The McKinsey report states that by 2018
the US will face a shortfall of between
140,000 and 190,000 professional data
scientists, and warns that issues including
privacy, security and intellectual property
will have to be resolved before the full value
of Big Data will be realised.
2014
Mobile internet use overtakes desktop for
the first time
WHAT’S BIG DATA?
BIG DATA: 3V’S
VOLUME (SCALE)
Data Volume
44x increase from 2009 to 2020
From 0.8 zettabytes (ZB) to 35 ZB
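The 44x figure can be checked directly from the two endpoints on the slide. The short sketch below does that arithmetic and also derives the compound annual growth rate those endpoints imply. The annual rate is an illustration computed here, not a number taken from the slide, and the variable names are made up for the example.

```python
# Endpoints quoted on the slide: 0.8 ZB in 2009, 35 ZB projected for 2020.
start_zb, end_zb = 0.8, 35.0
start_year, end_year = 2009, 2020

# Total growth over the whole period (the slide rounds this to 44x).
growth_factor = end_zb / start_zb

# Implied compound annual growth rate, assuming smooth year-on-year growth.
# This is an assumption for illustration; the slide gives only the endpoints.
years = end_year - start_year
annual_rate = growth_factor ** (1 / years) - 1

print(f"Total growth: {growth_factor:.2f}x")
print(f"Implied annual growth: {annual_rate:.1%}")
```

Under the smooth-growth assumption this works out to roughly 41% per year, which means the data volume doubles about every two years.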
- 100s of millions of GPS-enabled devices sold annually
- ? TBs of data every day
- 25+ TBs of log data every day
- 2+ billion people on the Web by end 2011
• (http://www.msnbc.msn.com/id/44363598/ns/technology_and_science-future_of_technology/#.TmetOdQ--uI)
VARIETY (COMPLEXITY)
Relational Data (Tables/Transaction/Legacy Data)
Text Data (Web)
Semi-structured Data (XML)
Graph Data
Social Network, Semantic Web (RDF), …
Streaming Data
You can only scan the data once
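The "scan once" constraint on streaming data means any statistic has to be updated incrementally as items arrive, because earlier items cannot be revisited. A minimal sketch of that idea, using an incremental running mean, is shown below. The `sensor_readings` generator and its values are hypothetical stand-ins for a live feed, and this is only one of many single-pass techniques.

```python
from typing import Iterable, Iterator, Tuple


def running_mean(stream: Iterable[float]) -> Iterator[Tuple[int, float]]:
    """Yield (count, mean) after each item, reading the stream exactly once."""
    count = 0
    mean = 0.0
    for x in stream:
        count += 1
        # Incremental update: only the current count and mean are kept,
        # so no past items need to be stored or re-read.
        mean += (x - mean) / count
        yield count, mean


def sensor_readings() -> Iterator[float]:
    """Hypothetical data source; a generator can only be consumed once."""
    for value in [3.0, 5.0, 7.0, 9.0]:
        yield value


final_count, final_mean = 0, 0.0
for final_count, final_mean in running_mean(sensor_readings()):
    pass  # a real consumer would act on each intermediate result here

print(final_count, final_mean)
```

The memory used stays constant no matter how long the stream runs, which is the property that makes single-pass processing practical at high data volumes.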
Figure labels: Customer; Social Media; Banking; Finance; Our Known History; Gaming; Entertain; Purchase
VELOCITY (SPEED)
Examples
E-Promotions: Based on your current location, your purchase
history, and what you like ➔ send promotions right now for the store
next to you
- Mobile devices (tracking all objects all the time)
- Social media and networks (all of us are generating data)
- Scientific instruments (collecting all sorts of data)
- Sensor technology and networks (measuring all kinds of data)
- Product Recommendations that are Relevant & Compelling
- Learning why Customers Switch to competitors and their offers; in time to Counter
- Influence Behavior
- Friend Invitations to join a Game or Activity that expands business
- Improving the Marketing Effectiveness of a Promotion while it is still in Play
- Preventing Fraud as it is Occurring & preventing more proactively
- Customer
SOME MAKE IT 4V’S
HARNESSING BIG DATA
Old Model: Few companies are generating data, all others are consuming data
New Model: all of us are generating data, and all of us are consuming data
WHAT’S DRIVING BIG DATA
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More of a real-time