Unit 4 - Cloud Applications and AWS Cloud Platform
CLOUD COMPUTING
UNIT 4 – AWS CLOUD PLATFORM
1. Introduction
Cloud computing has gained huge popularity in industry because it can host applications whose
services are delivered to consumers rapidly and at minimal cost.
Figure 9.1 shows all the services available in the AWS ecosystem. At the base of the solution stack
are services that provide raw compute and raw storage: Amazon Elastic Compute Cloud (EC2) and Amazon
Simple Storage Service (S3). These are the two most popular services, which are generally
complemented with other offerings for building a complete system. At the higher level, Elastic
MapReduce and AutoScaling provide additional capabilities for building smarter and more elastic
computing systems. On the data side, Elastic Block Store (EBS), Amazon SimpleDB, Amazon RDS, and
Amazon ElastiCache provide solutions for reliable data snapshots and the management of structured
and semistructured data. Communication needs are covered at the networking level by
Amazon Virtual Private Cloud (VPC), Elastic Load Balancing, Amazon Route 53, and Amazon Direct
Connect. More advanced services for connecting applications are Amazon Simple Queue Service
(SQS), Amazon Simple Notification Service (SNS), and Amazon Simple Email Service (SES). Other
services include:
• Amazon CloudFront: content delivery network solution
• Amazon CloudWatch: monitoring solution for several Amazon services
• Amazon Elastic Beanstalk and CloudFormation: flexible application packaging and deployment
1. Compute Services
The fundamental service in this space is Amazon EC2, which delivers an IaaS solution that has served
as a reference model for several offerings from other vendors in the same market segment. Amazon
EC2 allows deploying servers in the form of virtual machines created as instances of a specific image.
Images come with a preinstalled operating system and a software stack, and instances can be
configured for memory, number of processors, and storage. Users are provided with credentials to
remotely access the instance and further configure or install software if needed.
3. High-memory Instances: This class targets applications that need to process huge workloads
and require large amounts of memory. Three-tier Web applications characterized by high
traffic are the target profile. Three categories of increasing memory and CPU are available,
with memory proportionally larger than computing power.
4. High-CPU Instances: This class targets compute-intensive applications. Two configurations
are available where computing power proportionally increases more than memory.
5. Cluster Compute Instances: This class is used to provide virtual cluster services. Instances in
this category are characterized by high CPU compute power, large memory, and
extremely high I/O and network performance, which make them suitable for HPC applications.
6. Cluster GPU Instances: This class provides instances featuring graphics processing units
(GPUs) and high compute power, large memory, and extremely high I/O and network
performance. This class is particularly suited for cluster applications that perform heavy
graphics computations, such as rendering clusters.
attaching storage volumes, and configuring security in terms of access control and network
connectivity.
By default, instances are created with an internal IP address, which makes them capable of
communicating within the EC2 network and accessing the Internet as clients. It is possible to
associate an Elastic IP with each instance, which can then be remapped to a different instance over
time. Elastic IPs allow instances running in EC2 to act as servers reachable from the Internet and,
since they are not strictly bound to specific instances, to implement failover capabilities. Together
with an external IP, EC2 instances are also given a domain name that generally is in the form
ec2-xxx-xxx-xxx.compute-x.amazonaws.com, where xxx-xxx-xxx normally represents the four parts
of the external IP address separated by dashes, and compute-x gives information about the
availability zone where instances are deployed. At the time of writing, there are five availability zones that are
priced differently: two in the United States (Virginia and Northern California), one in Europe
(Ireland), and two in Asia Pacific (Singapore and Tokyo).
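The relationship between the external IP address and the domain name can be illustrated with a short sketch. It assumes names of the form ec2-A-B-C-D.compute-x.amazonaws.com, with the four parts of the address separated by dashes; the function name parse_ec2_public_dns and the sample address are hypothetical, and other naming layouts used by AWS are not handled.

```python
import re
from typing import Optional, Tuple

# Assumed layout: ec2-<4 dash-separated IP parts>.<compute label>.amazonaws.com
_EC2_DNS = re.compile(
    r"^ec2-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.([a-z0-9-]+)\.amazonaws\.com$"
)


def parse_ec2_public_dns(name: str) -> Optional[Tuple[str, str]]:
    """Return (external_ip, compute_label), or None if the name does not fit."""
    match = _EC2_DNS.match(name.strip().lower())
    if match is None:
        return None
    octets = match.groups()[:4]
    # Each part of an IPv4 address must be in the range 0..255.
    if any(int(octet) > 255 for octet in octets):
        return None
    return ".".join(octets), match.group(5)


if __name__ == "__main__":
    # Sample name built from a documentation-only IP address.
    print(parse_ec2_public_dns("ec2-203-0-113-25.compute-1.amazonaws.com"))
```

Because the name is derived from the address, remapping an Elastic IP to another instance does not require clients to learn a new name.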
Amazon Elastic MapReduce provides AWS users with a cloud computing platform for
MapReduce applications. It utilizes Hadoop as the MapReduce engine, deployed on a virtual
infrastructure composed of EC2 instances, and uses Amazon S3 for storage needs.
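For readers unfamiliar with the programming model that Elastic MapReduce hosts, the following is a minimal single-process sketch of a word count in MapReduce style. It does not use Hadoop or any AWS API; the function names map_phase, shuffle, reduce_phase, and word_count are illustrative only, and a real job would distribute these phases across EC2 instances.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the cloud hosts the job", "the job runs"]))
```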
2. Storage Services
AWS provides a collection of services for data storage and information management. The core service
in this area is represented by Amazon Simple Storage Service (S3). This is a distributed object store
that allows users to store information in different formats. S3 has two core components:
buckets and objects.
Buckets represent virtual containers in which to store objects; objects represent the content that is
actually stored. Objects can also be enriched with metadata that can be used to tag the stored content
with additional information.
Access to S3 is provided through RESTful Web services. These express all the operations that can be
performed on the storage in the form of HTTP requests (GET, PUT, DELETE, HEAD, and POST),
which operate differently according to the element they address. As a rule of thumb, PUT/POST
requests add new content to the store, GET/HEAD requests are used to retrieve content and
information, and DELETE requests are used to remove elements or information attached to them.
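The rule of thumb above can be made concrete by constructing, without sending, the HTTP requests involved. This is a sketch using Python's standard urllib.request; the helper build_s3_request, the bucket name, and the object key are hypothetical, and real S3 requests additionally need authentication headers that are omitted here.

```python
from typing import Optional
from urllib.request import Request

S3_ENDPOINT = "http://s3.amazonaws.com"
ALLOWED_METHODS = {"GET", "PUT", "DELETE", "HEAD", "POST"}


def build_s3_request(method: str, bucket: str, key: str = "",
                     body: Optional[bytes] = None) -> Request:
    """Build (but do not send) a path-style request for a bucket or object."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    url = f"{S3_ENDPOINT}/{bucket}/{key}"
    return Request(url, data=body, method=method)


if __name__ == "__main__":
    # PUT adds content, GET/HEAD retrieve content or information, DELETE removes.
    for method in ("PUT", "GET", "HEAD", "DELETE"):
        body = b"hello" if method == "PUT" else None
        request = build_s3_request(method, "my-bucket", "notes.txt", body)
        print(request.get_method(), request.full_url)
```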
a) Resource Naming:
Buckets, objects, and attached metadata are made accessible through a REST interface.
Therefore, they are represented by uniform resource identifiers (URIs) under the
s3.amazonaws.com domain. All the operations are then performed by expressing the entity they
are directed to in the form of a request for a URI. Amazon offers three different ways of
addressing a bucket:
Canonical form: http://s3.amazonaws.com/bucket_name/. The bucket name is expressed as a path
component of the domain name s3.amazonaws.com. This is the naming convention with the fewest
restrictions in terms of allowed characters, since all the characters that are allowed for a path
component can be used.
Subdomain form: http://bucketname.s3.amazonaws.com/. Alternatively, it is also possible to reference a
bucket as a subdomain of s3.amazonaws.com. To express a bucket name in this form, the name
has to do all of the following:
• Be between 3 and 63 characters long
• Contain only letters, numbers, periods, and dashes
• Start with a letter or a number
• Contain at least one letter
• Have no fragments between periods that start with a dash or end with a dash or that are empty strings
The subdomain form is the one to be preferred, since it works more effectively for all the geographical
locations serving resources stored in S3.
Virtual hosting form: http://bucket-name.com/. Amazon also allows referencing of its resources with
custom URLs. This is accomplished by entering a CNAME record into the DNS that points to the
subdomain form of the bucket URI.
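The subdomain naming rules listed above translate directly into a validation routine. The sketch below checks only the rules stated in this section; the function names are hypothetical, and the rules enforced by the service today may differ.

```python
import re


def is_valid_subdomain_bucket_name(name: str) -> bool:
    """Check a bucket name against the subdomain-form rules listed in the text."""
    if not 3 <= len(name) <= 63:                      # 3 to 63 characters long
        return False
    if not re.fullmatch(r"[A-Za-z0-9.-]+", name):     # letters, numbers, periods, dashes
        return False
    if not name[0].isalnum():                         # starts with a letter or a number
        return False
    if not any(char.isalpha() for char in name):      # contains at least one letter
        return False
    for fragment in name.split("."):                  # fragments between periods
        if fragment == "" or fragment.startswith("-") or fragment.endswith("-"):
            return False
    return True


def canonical_url(bucket: str) -> str:
    """Canonical form: bucket name as a path component."""
    return f"http://s3.amazonaws.com/{bucket}/"


def subdomain_url(bucket: str) -> str:
    """Subdomain form: only valid for names that pass the check above."""
    if not is_valid_subdomain_bucket_name(bucket):
        raise ValueError(f"not usable as a subdomain: {bucket!r}")
    return f"http://{bucket}.s3.amazonaws.com/"


if __name__ == "__main__":
    for candidate in ("my-bucket.data", "my_bucket", "123456", "logs..2011"):
        print(candidate, is_valid_subdomain_bucket_name(candidate))
```

A name such as my_bucket can still be addressed in canonical form, which is why that form is described as the least restrictive.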
b) Buckets:
A bucket is a container of objects. Buckets are top-level elements of the S3 storage architecture and do not
support nesting. That is, it is not possible to create “subbuckets” or other kinds of physical divisions.
A bucket is located in a specific geographic location. Users can select the location at which to create
buckets, which by default are created in Amazon’s U.S. datacenters. Once a bucket is created, all the
objects that belong to the bucket will be stored in the same availability zone as the bucket. Users create a
bucket by sending a PUT request to http://s3.amazonaws.com/ with the name of the bucket and, if they
want to specify the availability zone, additional information about the preferred location. The content of a
bucket can be listed by sending a GET request specifying the name of the bucket. Once created, the bucket
cannot be renamed or relocated. If it is necessary to do so, the bucket needs to be deleted and recreated.
The deletion of a bucket is performed by a DELETE request, which can be successful if and only if the
bucket is empty.
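The bucket life cycle described above (create, list, delete only when empty, no nesting) can be mimicked with a small in-memory model. This is a toy illustration, not the S3 API: the class ToyObjectStore, its method names, and the default location value are all hypothetical.

```python
from typing import Dict, List


class BucketNotEmptyError(Exception):
    """Raised when deleting a bucket that still holds objects."""


class ToyObjectStore:
    """In-memory stand-in for a flat collection of buckets holding objects."""

    def __init__(self) -> None:
        # bucket name -> {"location": str, "objects": {key: data}}
        self._buckets: Dict[str, dict] = {}

    def put_bucket(self, name: str, location: str = "US") -> None:
        if name in self._buckets:
            raise ValueError(f"bucket already exists: {name}")
        # The location is fixed at creation; there is no rename or relocate call.
        self._buckets[name] = {"location": location, "objects": {}}

    def put_object(self, bucket: str, key: str, data: bytes) -> None:
        self._buckets[bucket]["objects"][key] = data

    def delete_object(self, bucket: str, key: str) -> None:
        del self._buckets[bucket]["objects"][key]

    def list_bucket(self, name: str) -> List[str]:
        return sorted(self._buckets[name]["objects"])

    def delete_bucket(self, name: str) -> None:
        # Deletion succeeds if and only if the bucket is empty.
        if self._buckets[name]["objects"]:
            raise BucketNotEmptyError(name)
        del self._buckets[name]


if __name__ == "__main__":
    store = ToyObjectStore()
    store.put_bucket("reports", location="EU")
    store.put_object("reports", "2011/q1.txt", b"draft")
    print(store.list_bucket("reports"))
```

Note that a key such as 2011/q1.txt only looks like a path; it is a single flat key inside the bucket, consistent with the absence of sub-buckets.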
The Amazon Elastic Block Store (EBS) allows AWS users to provide EC2 instances with persistent
storage in the form of volumes that can be mounted at instance startup. They accommodate up to 1 TB of
space and are accessed through a block device interface, thus allowing users to format them according to
the needs of the instance they are connected to (raw storage, file system, or other). EBS volumes can be
cloned, used as boot partitions, and constitute durable storage since they rely on S3 and it is possible to
take incremental snapshots of their content. EBS volumes normally reside within the same availability
zone as the EC2 instances that will use them to maximize the I/O performance. It is also possible to
connect volumes located in different availability zones. Once mounted as volumes, their content is lazily
loaded in the background and according to the request made by the operating system. This reduces the
number of I/O requests that go to the network. The expense related to a volume comprises the cost
generated by the amount of storage occupied in S3 and by the number of I/O requests performed against
the volume. At the time of writing, Amazon charges $0.10/GB/month of allocated storage and $0.10 per 1 million
requests made to the volume.
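As a worked example of the pricing just quoted, the sketch below computes a monthly bill from the two rates given in the text. The function name and the sample workload (100 GB allocated, 5 million requests) are assumptions chosen for illustration, and the rates are those stated above rather than current prices.

```python
STORAGE_PRICE_PER_GB_MONTH = 0.10     # dollars, rate quoted in the text
PRICE_PER_MILLION_REQUESTS = 0.10     # dollars, rate quoted in the text


def ebs_monthly_cost(allocated_gb: float, io_requests: int) -> float:
    """Monthly cost in dollars: allocated storage plus I/O requests."""
    if allocated_gb < 0 or io_requests < 0:
        raise ValueError("inputs must be non-negative")
    storage_cost = allocated_gb * STORAGE_PRICE_PER_GB_MONTH
    request_cost = (io_requests / 1_000_000) * PRICE_PER_MILLION_REQUESTS
    return storage_cost + request_cost


if __name__ == "__main__":
    # 100 GB * $0.10 = $10.00, plus 5 million requests * $0.10 per million = $0.50
    print(f"${ebs_monthly_cost(100, 5_000_000):.2f}")
```

Storage is billed on the allocated size, so the storage term does not shrink when a volume is only partly filled.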
Amazon ElastiCache works as an in-memory data store and cache to support the most demanding
applications requiring sub-millisecond response times. By utilizing an end-to-end optimized stack running
on customer-dedicated nodes, Amazon ElastiCache provides secure, high-speed performance.
ElastiCache nodes are priced according to the EC2 costing model, with a small price difference due to the
use of the caching service installed on such instances. It is possible to choose between different types of
instances; Table 9.3 provides an overview of the pricing options.
Enterprise applications quite often rely on databases to store data in a structured form, index it, and perform
analytics against it. Traditionally, RDBMSs have been the common data back-end for a wide range of
applications, even though recently more scalable and lightweight solutions have been proposed. Amazon
provides applications with structured storage services in three different forms:
Preconfigured EC2 AMIs are predefined templates featuring an installation of a given database
management system. EC2 instances created from these AMIs can be completed with an EBS volume
for storage persistence. Available AMIs include installations of IBM DB2, Microsoft SQL Server,
MySQL, Oracle, PostgreSQL, Sybase, and Vertica. Instances are priced hourly according to the EC2
cost model. This solution places most of the administrative burden on the EC2 user, who has to
configure, maintain, and manage the relational database, but offers the greatest variety of products to
choose from.
Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data,
videos, applications, and APIs to customers globally with low latency and high transfer speeds, all within a
developer-friendly environment. CloudFront is integrated with AWS, both through physical locations that are
directly connected to the AWS global infrastructure and through other AWS services.
3. Communication Services
Amazon provides facilities to structure and facilitate the communication among existing applications and
services residing within the AWS infrastructure. These facilities can be organized into two major categories:
1. Virtual Networking
2. Messaging
Virtual Networking comprises a collection of services that allow AWS users to control the
connectivity to and between compute and storage services. Amazon Virtual Private Cloud (VPC)
and Amazon Direct Connect provide connectivity solutions in terms of infrastructure; Route 53
facilitates connectivity in terms of naming.
1.2 Messaging
Messaging services constitute the next step in connecting applications by leveraging AWS
capabilities. The three different types of messaging services offered are:
i. Amazon Simple Queue Service (SQS),
ii. Amazon Simple Notification Service (SNS), and
iii. Amazon Simple Email Service (SES).
With Amazon SQS, users can create an unlimited number of message queues and configure access to them,
either through the AWS console or directly through the underlying Web service. Applications
can send messages to any queue they have access to. These messages are securely and
redundantly stored within the AWS infrastructure for a limited period of time, and they can be
accessed by other (authorized) applications. While a message is being read, it is kept locked to
avoid spurious processing from other applications. Such a lock will expire after a given period.
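The locking behavior described above can be sketched with a small in-memory queue. This is a toy model and not the SQS API: the class ToyQueue, the visibility_timeout parameter, and the 30-second default are assumptions, and the limited retention period mentioned in the text is not modeled.

```python
import itertools
import time
from typing import Callable, Dict, Optional, Tuple


class ToyQueue:
    """In-memory queue where a received message stays locked for a while."""

    def __init__(self, visibility_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = visibility_timeout
        self._clock = clock                        # injectable so tests can fake time
        self._ids = itertools.count(1)
        self._messages: Dict[int, str] = {}        # message id -> body
        self._locked_until: Dict[int, float] = {}  # message id -> lock expiry time

    def send(self, body: str) -> int:
        message_id = next(self._ids)
        self._messages[message_id] = body
        return message_id

    def receive(self) -> Optional[Tuple[int, str]]:
        """Return the oldest unlocked message and lock it, or None if none is free."""
        now = self._clock()
        for message_id, body in self._messages.items():
            if self._locked_until.get(message_id, float("-inf")) <= now:
                self._locked_until[message_id] = now + self._timeout
                return message_id, body
        return None

    def delete(self, message_id: int) -> None:
        """Remove a message once the consumer has finished processing it."""
        self._messages.pop(message_id, None)
        self._locked_until.pop(message_id, None)


if __name__ == "__main__":
    queue = ToyQueue(visibility_timeout=30.0)
    queue.send("resize image 42")
    print(queue.receive())   # first reader gets the message
    print(queue.receive())   # second reader sees nothing while the lock holds
```

If the first reader fails before deleting the message, the lock simply expires and another application can pick the message up.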
Amazon SNS allows applications to be notified when new content of interest is available. This
feature is accessible through a Web service whereby AWS users can create a topic, which other
applications can subscribe to. At any time, applications can publish content on a given topic and
subscribers can be automatically notified. The service provides subscribers with different
notification models (HTTP/HTTPS, email/email JSON, and SQS).
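The topic-based model described above follows the publish/subscribe pattern, which the sketch below shows in a few lines. It is a toy, in-process illustration rather than the SNS API: the class ToyTopic is hypothetical, and plain Python callables stand in for the HTTP/HTTPS, email, and SQS delivery options.

```python
from typing import Callable, List


class ToyTopic:
    """A named topic that pushes each published message to all subscribers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, handler: Callable[[str], None]) -> None:
        """Register a callable that will be invoked for every new message."""
        self._subscribers.append(handler)

    def publish(self, message: str) -> int:
        """Deliver the message to every subscriber; return how many were notified."""
        for handler in list(self._subscribers):
            handler(message)
        return len(self._subscribers)


if __name__ == "__main__":
    topic = ToyTopic("new-releases")
    topic.subscribe(lambda message: print("email:", message))
    topic.subscribe(lambda message: print("queue:", message))
    topic.publish("version 2.0 is available")
```

The publisher never needs to know who is subscribed, which is the decoupling that distinguishes this model from sending to a specific queue.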
Amazon SES provides AWS users with a scalable email service that leverages the AWS
infrastructure. Once users are signed up for the service, they have to provide an email address that SES will
use to send emails on their behalf. To activate the service, SES will send an email to verify the
given address and provide the users with the necessary information for the activation.