Case Study: Amazon AWS: CSE 40822 - Cloud Computing Prof. Douglas Thain University of Notre Dame


Case Study: Amazon AWS

CSE 40822 – Cloud Computing


Prof. Douglas Thain
University of Notre Dame
Caution to the Reader:

Herein are examples of prices consulted in spring 2016, to give a sense
of the magnitude of costs. Do your own research before spending your own
money!
Several Historical Trends
•  Shared Utility Computing
•  1960s – MULTICS – Concept of a Shared Computing Utility
•  1970s – IBM Mainframes – rent by the CPU-hour. (Fast/slow switch.)
•  Data Center Co-location
•  1990s-2000s – Rent machines for months/years, keep them close to the network
access point and pay a flat rate. Avoid running your own building with utilities!
•  Pay as You Go
•  Early 2000s – Submit jobs to a remote service provider where they run on the raw
hardware. Sun Cloud ($1/CPU-hour, Solaris + SGE), IBM Deep Capacity Computing on
Demand (50 cents/hour)
•  Virtualization
•  1960s – OS-VM, VM-360 – Used to split mainframes into logical partitions.
•  1998 – VMware – First practical implementation on x86, but with a significant
performance hit.
•  2003 – Xen paravirtualization provides much better performance, but the guest kernel must assist.
•  Late 2000s – Intel and AMD add hardware support for virtualization.
Virtual-* Allows for the Scale of Abstraction to
Increase Over Time
•  Run one process within certain resource limits.
The operating system has virtual memory, virtual CPU, and virtual storage (file system).
•  Run multiple processes within certain resource limits.
Resource containers (Solaris), virtual servers (Linux), virtual images (Docker)
•  Run an entire operating system within certain limits.
Virtual machine technology: VMware, Xen, KVM, etc.
•  Run a set of virtual machines connected via a private network.
Virtual networks (SDNs) provision bandwidth between virtual machines.
•  Run a private virtual architecture for every customer.
Automated tools replicate virtual infrastructure as needed.
Amazon AWS
•  Grew out of Amazon’s need to rapidly provision and configure machines of
standard configurations for its own business.
•  Early 2000s – Both private and shared data centers began using
virtualization to perform “server consolidation”
•  2003 – Internal memo by Chris Pinkham describing an “infrastructure
service for the world.”
•  2006 – S3 first deployed in the spring, EC2 in the fall
•  2008 – Elastic Block Store available.
•  2009 – Relational Database Service
•  2012 – DynamoDB
•  Does it turn a profit?
Terminology
•  Instance = One running virtual machine.
•  Instance Type = Hardware configuration: cores, memory, disk.
•  Instance Store Volume = Temporary disk associated with instance.
•  Image (AMI) = Stored bits which can be turned into instances.
•  Key Pair = Credentials used to access VM from command line.
•  Region = Geographic location, price, laws, network locality.
•  Availability Zone = Subdivision of a region that is fault-independent.
EC2 Pricing Model
• Free Usage Tier
• On-Demand Instances
•  Start and stop instances whenever you like, costs are rounded up
to the nearest hour. (Worst price)
• Reserved Instances
•  Pay up front for one/three years in advance. (Best price)
•  Unused instances can be sold on a secondary market.
• Spot Instances
•  Specify the price you are willing to pay, and instances get started
and stopped without any warning as the market changes. (Kind of
like Condor!)
http://aws.amazon.com/ec2/pricing/
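The hourly rounding of on-demand charges can be sketched as below. This is only an illustration of the arithmetic, not Amazon's billing logic: the function name and the $0.013/hour rate are hypothetical placeholders, so look up the current price before relying on any number here.

```python
import math

def on_demand_cost(seconds_running: float, hourly_rate: float) -> float:
    """Cost of one on-demand run, with any partial hour billed as a full hour."""
    if seconds_running <= 0:
        return 0.0
    billed_hours = math.ceil(seconds_running / 3600)
    return billed_hours * hourly_rate

# Hypothetical rate, for illustration only.
RATE = 0.013

# A 61-minute run is billed as 2 hours under this model.
print(on_demand_cost(61 * 60, RATE))
```

Under this model, stopping an instance one minute into a new hour costs the same as running it for the whole hour.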
Free Usage Tier
•  750 hours of EC2 running Linux, RHEL, or SLES t2.micro instance
usage
•  750 hours of EC2 running Microsoft Windows Server t2.micro
instance usage
•  750 hours of Elastic Load Balancing plus 15 GB data processing
•  30 GB of Amazon Elastic Block Storage in any combination of General
Purpose (SSD) or Magnetic, plus 2 million I/Os (with Magnetic) and 1
GB of snapshot storage
•  15 GB of bandwidth out aggregated across all AWS services
•  1 GB of Regional Data Transfer

Reserved Instance Example
Surprisingly, you can’t scale up that large.
Simple Storage Service (S3)
•  A bucket is a container for objects and describes location, logging,
accounting, and access control. A bucket can hold any number of objects,
which are files of up to 5TB. A bucket has a name that must be globally
unique.
•  Fundamental operations corresponding to HTTP actions:
•  http://bucket.s3.amazonaws.com/object
•  PUT (or POST, e.g. from a browser form) a new object or overwrite an existing object.
•  GET an existing object from a bucket.
•  DELETE an object from the bucket.
•  LIST keys present in a bucket, with a filter.
•  A bucket has a flat directory structure (despite the appearance given by
the interactive web interface).
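The flat namespace can be illustrated with a toy model: a bucket is just a mapping from key strings to bytes, and "folders" appear only when a listing groups keys by a prefix and a delimiter. The sketch below is an in-memory imitation, not the S3 API; the function name and the sample keys are made up for illustration.

```python
def list_keys(bucket: dict, prefix: str = "", delimiter: str = "") -> tuple:
    """Return (keys, common_prefixes) for a flat key -> bytes mapping."""
    keys, common = [], set()
    for key in sorted(bucket):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Collapse everything up to the first delimiter into one "folder".
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            keys.append(key)
    return keys, sorted(common)

# Hypothetical bucket contents: there are no directories, only keys.
bucket = {
    "index.html": b"...",
    "photos/readme.txt": b"...",
    "photos/2016/a.jpg": b"...",
    "photos/2016/b.jpg": b"...",
}

print(list_keys(bucket))                 # every key, no grouping
print(list_keys(bucket, "photos/", "/")) # looks like a directory listing
```

The second call returns one key and one grouped prefix, which is the kind of view a web console can present as a folder even though the store itself holds only flat keys.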
Easily Integrated into Web Applications
<form action="http://examplebucket.s3.amazonaws.com/" method="post" enctype="multipart/form-data">
<input type="input" name="key" value="user/user1/" />
<input type="hidden" name="acl" value="public-read" />
<input type="hidden" name="success_action_redirect"
value="http://examplebucket.s3.amazonaws.com/successful_upload.html" />
...
<input type="text" name="X-Amz-Credential"
value="AKIAIOSFODNN7EXAMPLE/20130806/us-east-1/s3/aws4_request" />
...
<input type="submit" name="submit" value="Upload to Amazon S3" />
</form>

http://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-post-example.html
Bucket Properties

•  Versioning – If enabled, POST/DELETE result in the creation of new
versions without destroying the old.
•  Lifecycle – Delete or archive objects in a bucket a certain time after
creation or last access or number of versions.
•  Access Policy – Control when and where objects can be accessed.
•  Access Control – Control who may access objects in this bucket.
•  Logging – Keep track of how objects are accessed.
•  Notification – Be notified when failures occur.
S3 Weak Consistency Model
Direct quote from the Amazon developer API:
“Updates to a single key are atomic….”
“Amazon S3 achieves high availability by replicating data across multiple servers
within Amazon's data centers. If a PUT request is successful, your data is safely
stored. However, information about the changes must replicate across Amazon S3,
which can take some time, and so you might observe the following behaviors:
•  A process writes a new object to Amazon S3 and immediately attempts to read it. Until the
change is fully propagated, Amazon S3 might report "key does not exist."
•  A process writes a new object to Amazon S3 and immediately lists keys within its bucket.
Until the change is fully propagated, the object might not appear in the list.
•  A process replaces an existing object and immediately attempts to read it. Until the change is
fully propagated, Amazon S3 might return the prior data.
•  A process deletes an existing object and immediately attempts to read it. Until the deletion is
fully propagated, Amazon S3 might return the deleted data.”

Always read the fine print….
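The behaviors quoted above can be reproduced with a toy replicated store. This is a deliberately simplified model written for illustration, not a description of how S3 is implemented: the class, the fixed replica count, and the explicit propagate() step are all assumptions.

```python
class ReplicatedStore:
    """Toy model: a write lands on one replica first and reaches the others later."""

    def __init__(self, replicas: int = 3):
        self.replicas = [dict() for _ in range(replicas)]
        self.pending = []  # (replica_index, key, value) not yet applied

    def put(self, key, value):
        # The write is acknowledged once replica 0 has it.
        self.replicas[0][key] = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def get(self, key, replica: int):
        # A read may be served by any replica, including a stale one.
        return self.replicas[replica].get(key, "key does not exist")

    def propagate(self):
        # Apply all outstanding updates, in the order they were written.
        for i, key, value in self.pending:
            self.replicas[i][key] = value
        self.pending.clear()

store = ReplicatedStore()
store.put("report.txt", "v1")
print(store.get("report.txt", replica=1))  # stale replica has not seen the write
store.propagate()
store.put("report.txt", "v2")
print(store.get("report.txt", replica=2))  # stale replica still returns prior data
store.propagate()
print(store.get("report.txt", replica=2))
```

A client that needs to read its own write in such a system has to either retry until the expected value appears or keep a local copy of what it wrote.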
Elastic Block Store
•  An EBS volume is a virtual disk of a fixed size with a block read/write
interface. It can be mounted as a filesystem on a running EC2
instance where it can be updated incrementally. Unlike an instance
store, an EBS volume is persistent.
•  (Compare to an S3 object, which is essentially a file that must be
accessed in its entirety.)
•  Fundamental operations:
•  CREATE a new volume (1GB-1TB)
•  COPY a volume from an existing EBS volume or S3 object.
•  MOUNT on one instance at a time.
•  SNAPSHOT current state to an S3 object.
EBS is approx. 3x more expensive by volume and
10x more expensive by IOPS than S3.


Use Glacier for Cold Data
•  Glacier is structured like S3: a vault is a container for an arbitrary
number of archives. Policies, accounting, and access control are
associated with vaults, while an archive is a single object.
•  However:
•  All operations are asynchronous and notified via SNS.
•  Vault listings are updated once per day.
•  Archive downloads may take up to four hours.
•  Only 5% of total data can be accessed in a given month.
•  Pricing:
•  Storage: $0.01 per GB-month
•  Operations: $0.05 per 1000 requests
•  Data Transfer: Like S3, free within AWS.
•  S3 Policies can be set up to automatically move data into Glacier.
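To get a feel for the magnitude of these costs, the prices listed above can be plugged into a small calculator. The sketch uses only the two quoted rates and the 5% monthly access limit; the function names are hypothetical, data transfer is ignored, and the prices are the 2016 figures from this slide, so they should not be taken as current.

```python
STORAGE_PER_GB_MONTH = 0.01   # dollars, as quoted above
REQUESTS_PER_1000 = 0.05      # dollars, as quoted above
MONTHLY_ACCESS_FRACTION = 0.05

def glacier_monthly_cost(stored_gb: float, requests: int) -> float:
    """Storage plus request charges for one month, ignoring data transfer."""
    storage = stored_gb * STORAGE_PER_GB_MONTH
    operations = (requests / 1000) * REQUESTS_PER_1000
    return storage + operations

def max_monthly_retrieval_gb(stored_gb: float) -> float:
    """Amount of data that may be accessed in a month under the 5% limit."""
    return stored_gb * MONTHLY_ACCESS_FRACTION

# Example: 1 TB (taken as 1000 GB) archived with 2000 requests in the month.
print(round(glacier_monthly_cost(1000, 2000), 2))
print(max_monthly_retrieval_gb(1000))
```

The storage term dominates for any realistic archive, which is why the access limits, rather than the price, are the main design constraint.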
Durability
•  Amazon claims about S3:
•  Amazon S3 is designed to sustain the concurrent loss of data in two facilities, e.g. 3+ copies
across multiple available domains.
•  99.999999999% durability of objects over a given year.
•  Amazon claims about EBS:
•  Amazon EBS volume data is replicated across multiple servers in an Availability Zone to
prevent the loss of data from the failure of any single component.
•  Volumes with less than 20GB of modified data since the last snapshot have an annual failure
rate of 0.1% - 0.5%, resulting in complete loss of the volume.
•  Commodity hard disks have an AFR of about 4%.
•  Amazon's claims about Glacier are the same as for S3:
•  Amazon S3 is designed to sustain the concurrent loss of data in two facilities, e.g. 3+ copies
across multiple available domains PLUS periodic internal integrity checks.
•  99.999999999% durability of objects over a given year.

•  Beware of oversimplified arguments about low-probability events!
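The quoted figures can be turned into rough expectations with a few lines of arithmetic. The sketch below treats every object and every volume as failing independently, which is exactly the kind of simplification the warning above is about, so read the results as orders of magnitude only. The function names and the fleet size of 100 are hypothetical.

```python
S3_ANNUAL_LOSS_PROB = 1e-11   # 1 - 0.99999999999, from the durability claim
EBS_AFR_RANGE = (0.001, 0.005)
COMMODITY_DISK_AFR = 0.04

def expected_annual_losses(n_objects: int, annual_loss_prob: float) -> float:
    """Expected number of objects lost per year, assuming independent losses."""
    return n_objects * annual_loss_prob

def prob_any_failure(n_units: int, annual_failure_rate: float) -> float:
    """Chance that at least one of n units fails in a year, assuming independence."""
    return 1 - (1 - annual_failure_rate) ** n_units

# Ten million S3 objects: expected losses per year under the stated durability.
print(expected_annual_losses(10_000_000, S3_ANNUAL_LOSS_PROB))

# A hypothetical fleet of 100 volumes or disks over one year.
for afr in (*EBS_AFR_RANGE, COMMODITY_DISK_AFR):
    print(afr, round(prob_any_failure(100, afr), 3))
```

Correlated events such as a facility outage, an operator mistake, or a software bug are not captured by this model, and they may well dominate the real risk.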
Architecture Center
•  Ideas for constructing large scale infrastructures using AWS:
http://aws.amazon.com/architecture/
Command Line Setup
•  Go to your profile menu (your name) in the upper right hand corner,
select “Security Credentials” and “Continue to Security Credentials”
•  Select “Access Keys”
•  Select “New Access Key” and save the generated keys somewhere.
•  Edit ~/.aws/config and set it up like this:
[default]
output = json
region = us-west-2
aws_access_key_id = XXXXXX
aws_secret_access_key = YYYYYYYYYYYY

Note the syntax here is different from how it was given in the web console:
AWSAccessKey=XXXXXX
AWSSecretAccessKey=YYYYYYYYY

•  Now test it: aws ec2 describe-instances
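Because the key names in the config file differ from the ones shown in the web console, a quick check of the file can save some confusion. The sketch below uses Python's standard configparser to report which of the two credential settings the AWS CLI looks for (aws_access_key_id and aws_secret_access_key) are missing from a profile. The helper name and the sample text are hypothetical, and the XXXXXX/YYYY values are placeholders.

```python
import configparser

REQUIRED = ("aws_access_key_id", "aws_secret_access_key")

def missing_keys(config_text: str, profile: str = "default") -> list:
    """Return the required credential settings absent from the given profile."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(profile):
        return list(REQUIRED)
    return [key for key in REQUIRED if not parser.has_option(profile, key)]

# Placeholder values only; never commit real keys to a repository.
cli_style = """
[default]
output = json
region = us-west-2
aws_access_key_id = XXXXXX
aws_secret_access_key = YYYYYYYYYYYY
"""

console_style = """
[default]
AWSAccessKey=XXXXXX
AWSSecretAccessKey=YYYYYYYYY
"""

print(missing_keys(cli_style))      # nothing missing
print(missing_keys(console_style))  # console-style names are not recognized
```

To check a real file, read it with open(os.path.expanduser("~/.aws/config")).read() and pass the text to the helper.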


S3 Command Line Examples
aws s3 mb s3://bucket
aws s3 cp localfile s3://bucket/key
aws s3 mv s3://bucket/key s3://bucket/newname
aws s3 ls s3://bucket
aws s3 rm s3://bucket/key
aws s3 rb s3://bucket

aws s3 help
aws s3 ls help


EC2 Command Line Examples
aws ec2 describe-instances
aws ec2 run-instances --image-id ami-xxxxx --count 1 \
    --instance-type t1.micro --key-name keyfile
aws ec2 stop-instances --instance-ids i-xxxxxx

aws ec2 help
aws ec2 start-instances help

Warmup: Get Started with Amazon
•  Skim through the AWS documentation.
•  Sign up for AWS at http://aws.amazon.com
•  (Skip the IAM management for now)
•  Apply the service credit you received by email.
•  Create and download a Key-Pair, save it in your home directory.
•  Create a VM via the AWS Console
•  Connect to your newly-created VM like this:
•  ssh -i my-aws-keypair.pem ec2-user@ip-address-of-vm
•  Create a bucket in S3 and upload/download some files.
Demo Time
http://aws.amazon.com
