Lecture 01 - Intro
Computing Lecture 1 Intro
By Shmulik Goldstein, Dan Amiga
[email protected]
• Problems
• History
• What is cloud computing
Development Deployment
• Store Data
• Run Applications
– Combined with:
• Utility model
• Elastic Nature
• It’s possible
• It’s mandatory
• “Web Space Race”
– Build extremely large datacenters (10,000/0’s)
– Driven by growth (more users, more data)
• Operations & Infrastructure expertise
• Broadband got better
45%
Dan Amiga – IDC Cloud Computing 2015
Commerce Department Statistics
% of Utilized Server Capacity on Average?
6%
Economist Survey on IT, 2008
Elasticity – Provisioning for Peak
• Real-World Server Utilization Is 5% to 20%
• Many Services' Peak Exceeds Average by a Factor of 2 to 10
• Most Provision for Peak
• Painful to Under-Provision (Lost Customers)
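The link between the peak-to-average factor and utilization can be sketched with a little arithmetic. This is a minimal illustration, assuming capacity is sized exactly to peak demand; the function name is hypothetical and not from the slides.

```python
def utilization_when_provisioned_for_peak(peak_to_average: float) -> float:
    """Average utilization if capacity is sized exactly to peak demand.

    utilization = average demand / capacity = average / peak = 1 / ratio
    """
    if peak_to_average < 1:
        raise ValueError("peak demand cannot be below average demand")
    return 1.0 / peak_to_average


# The slide quotes peak-to-average factors of 2 to 10.
for ratio in (2, 5, 10):
    u = utilization_when_provisioned_for_peak(ratio)
    print(f"peak/average = {ratio:>2}: utilization = {u:.0%}")
```

Under this simple model, factors of 2 to 10 give 50% down to 10% utilization. The 5% to 20% quoted above is lower still, which would be consistent with extra headroom on top of peak, though the slides do not say so.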
Under-Provisioning #1: Potential Revenue (Shaded Area) Is Sacrificed
Under-Provisioning #2: Some Users Respond to Under-Provisioning by Permanently Deserting the Site... Bad for Revenue!
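The "shaded area" idea can be sketched numerically: demand above fixed capacity is unserved (potential revenue lost), and capacity above demand sits idle. The demand figures below are hypothetical, and the helper names are illustrative only.

```python
def unserved_demand(demand, capacity):
    """Total demand that exceeds capacity (the 'shaded area')."""
    return sum(max(0, d - capacity) for d in demand)


def idle_capacity(demand, capacity):
    """Total capacity that is paid for but not used."""
    return sum(max(0, capacity - d) for d in demand)


# Hypothetical requests per hour over eight hours, with a peak of 120.
hourly_demand = [20, 30, 50, 90, 120, 80, 40, 25]

for capacity in (60, 120):
    print(
        f"capacity={capacity:>3}: "
        f"unserved={unserved_demand(hourly_demand, capacity):>3}, "
        f"idle={idle_capacity(hourly_demand, capacity):>3}"
    )
```

With capacity 60 the sketch loses 110 units of demand; with capacity 120 nothing is lost, but 505 units of capacity go unused. That is the trade-off between under-provisioning and provisioning for peak.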
Fallback: Hosted
• Time
• Cost
• Scale
• Storage Size
• Availability
• Connectivity
• Security
• How-to-do-it-right
[Figure: two charts of Compute vs. Time, each marking Average Usage; one also marks an Inactivity Period]
• On & off workloads (e.g. batch job)
– Over-provisioned capacity is wasted
– Time to market can be cumbersome
• Successful services need to grow/scale
– Keeping up w/ growth is a big IT challenge
– Complex lead time for deployment
[Figure: two further charts of Compute vs. Time, each marking Average Usage]
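For an on & off workload, the cost of capacity that is always on can be compared with paying only for the hours used. This is a rough sketch with a hypothetical batch job and a hypothetical rate of 1.0 cost unit per server-hour; real pricing differs.

```python
def fixed_cost(peak_servers, hours, hourly_rate):
    """Cost of keeping peak capacity running for the whole period."""
    return peak_servers * hours * hourly_rate


def pay_per_use_cost(servers_per_hour, hourly_rate):
    """Cost when paying only for the server-hours actually used."""
    return sum(servers_per_hour) * hourly_rate


# Hypothetical batch job: idle for 20 hours, then 10 servers for 4 hours.
batch_day = [0] * 20 + [10] * 4
rate = 1.0  # hypothetical cost units per server-hour

print("always on  :", fixed_cost(max(batch_day), len(batch_day), rate))
print("pay per use:", pay_per_use_cost(batch_day, rate))
```

In this sketch the always-on option costs 240 units against 40 for pay-per-use, so a per-hour price premium of up to 6x would still break even.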
“IaaS”: Infrastructure-as-a-Service
“PaaS”: Platform-as-a-Service
“SaaS”: Software-as-a-Service
[Figure: four stacks side by side, each with the layers Applications, Data, Middleware, O/S; brackets mark which layers "You manage" and which are "Managed by vendor"]
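The split of responsibility can be written down as a small lookup. The exact bracket positions in the figure did not survive, so the split below is an assumption based on how these models are commonly described, not a transcription of the slide: with IaaS you manage everything from the O/S up, with PaaS only Applications and Data, and with SaaS nothing.

```python
# Layers named in the figure, from top to bottom.
LAYERS = ["Applications", "Data", "Middleware", "O/S"]

# Assumption: how many layers, counted from the top, the customer manages.
CUSTOMER_MANAGED_LAYERS = {"IaaS": 4, "PaaS": 2, "SaaS": 0}


def who_manages(model: str, layer: str) -> str:
    """Return the figure's label for a given service model and layer."""
    index = LAYERS.index(layer)
    if index < CUSTOMER_MANAGED_LAYERS[model]:
        return "You manage"
    return "Managed by vendor"


for layer in LAYERS:
    row = ", ".join(f"{m}: {who_manages(m, layer)}" for m in CUSTOMER_MANAGED_LAYERS)
    print(f"{layer:<12} {row}")
```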
Questions:
1. Is this cheap or expensive? Why?
2. Will Windows be cheaper? Why?
3. Why are the memory numbers floats?
4. How can I make this cheaper?
Spot Instances – bid!
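The bidding idea can be sketched as a toy simulation. It assumes the simplified model in which an instance runs in any hour where the market price is at or below your bid, and you pay the market price rather than the bid. The prices are hypothetical, and real interruption and billing rules are more detailed.

```python
def simulate_spot(bid, hourly_prices):
    """Return (hours_run, total_cost) for a bid against a price series.

    Simplified model: run whenever price <= bid, pay the market price.
    Ignores restart overhead after an interruption.
    """
    hours_run = 0
    total_cost = 0.0
    for price in hourly_prices:
        if price <= bid:
            hours_run += 1
            total_cost += price
    return hours_run, total_cost


# Hypothetical market prices in $/hour, with a spike in hour 3.
prices = [0.03, 0.04, 0.10, 0.05, 0.02]

hours, cost = simulate_spot(bid=0.05, hourly_prices=prices)
print(f"ran {hours} of {len(prices)} hours, cost ${cost:.2f}")
```

With a bid of $0.05 the instance runs in four of the five hours and sits out the price spike, which is why spot capacity suits interruptible work such as batch jobs.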
• 2009 stats:
– +200B pageviews/month
– >3.9T feed actions/day
– +300M active users
– >1B chat messages/day
– 100M search queries/day
– >6B minutes spent/day (ranked #2 on Internet)
– +20B photos, +2B/month growth
– 600,000 photos served/sec
– 25TB log data/day processed thru Scribe
– 120M queries/sec on memcache
• 2007 stats:
– +20 petabytes of data processed / day by +100K MapReduce jobs
– 1 petabyte sort took ~6 hours on ~4K servers replicated onto ~48K disks
– +200 GFS clusters, each at 1-5K nodes, handling +5 petabytes of storage
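To get a feel for the petabyte sort figure, the quoted numbers can be turned into throughput. This is back-of-the-envelope arithmetic only, assuming decimal units (1 PB = 10^15 bytes) and treating the approximate figures as exact.

```python
PB = 10**15  # assumption: decimal petabyte


def sort_throughput(data_bytes, hours, servers):
    """Return (aggregate GB/s, per-server MB/s) for a distributed sort."""
    seconds = hours * 3600
    aggregate_bytes_per_s = data_bytes / seconds
    per_server_bytes_per_s = aggregate_bytes_per_s / servers
    return aggregate_bytes_per_s / 10**9, per_server_bytes_per_s / 10**6


# Figures from the slide: 1 PB sorted in ~6 hours on ~4K servers.
aggregate_gb_s, per_server_mb_s = sort_throughput(1 * PB, hours=6, servers=4000)
print(f"aggregate : {aggregate_gb_s:.1f} GB/s")
print(f"per server: {per_server_mb_s:.1f} MB/s")
```

The per-server rate comes out modest (roughly 12 MB/s); the scale comes from running thousands of servers in parallel.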