
4.AutoScaling-Elastic Load Balancing

Auto Scaling allows you to ensure that you have the correct number of EC2 instances available to handle load for your application. You create Auto Scaling groups that specify minimum and maximum instance counts. Auto Scaling maintains instance counts and replaces unhealthy instances. Scaling policies allow instances to be launched or terminated based on demand. CloudWatch metrics are used to trigger scaling based on utilization. Auto Scaling provides better availability, fault tolerance, and cost management.

Uploaded by

Biplab Parida

Auto Scaling

 Auto Scaling helps you ensure that you have the
correct number of EC2 instances available to handle
the load for your application
 You create collections of EC2 instances, called Auto
Scaling groups
 You can specify the minimum number of instances in
each Auto Scaling group, and Auto Scaling ensures
that your group never goes below this size
Auto Scaling

 You can specify the maximum number of instances
in each Auto Scaling group, and Auto Scaling ensures
that your group never goes above this size
 If you specify the desired capacity, either when you
create the group or at any time thereafter, Auto
Scaling ensures that your group has this many
instances
 If you specify scaling policies, then Auto Scaling can
launch or terminate instances as demand on your
application increases or decreases.
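The size bounds described above can be sketched in a few lines of Python. This is a minimal model, not an AWS API call: the function name `clamp_capacity` and its parameters are hypothetical, and it assumes that a capacity requested by a scaling policy is simply limited to the group's minimum and maximum.

```python
def clamp_capacity(requested: int, min_size: int, max_size: int) -> int:
    """Return the requested capacity limited to the group's size bounds."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(requested, max_size))


# A scale-out request for 12 instances in a group bounded to 2..10
print(clamp_capacity(12, min_size=2, max_size=10))  # 10
# A scale-in request for 0 instances in the same group
print(clamp_capacity(0, min_size=2, max_size=10))   # 2
```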
Auto Scaling: Benefits

 Better fault tolerance. Auto Scaling can detect when an
instance is unhealthy, terminate it, and launch an
instance to replace it.
 Better availability. You can configure Auto Scaling to use
multiple Availability Zones. If one Availability Zone
becomes unavailable, Auto Scaling can launch instances
in another one to compensate.
 Better cost management. Auto Scaling can dynamically
increase and decrease capacity as needed. Because you
pay for the EC2 instances you use, you save money by
launching instances when they are actually needed and
terminating them when they aren't needed.
Launch Configuration

 A launch configuration is a template that an Auto
Scaling group uses to launch EC2 instances
 When you create a launch configuration, you specify
information for the instances such as the ID of the
Amazon Machine Image (AMI), the instance type, a
key pair, one or more security groups, and a block
device mapping.
Launch Configuration Rules

 You can use the same launch configuration with
multiple Auto Scaling groups
 You can only specify one launch configuration for an
Auto Scaling group at a time
 You can't modify a launch configuration after you've
created it
 When you change the launch configuration for your
Auto Scaling group, any new instances are launched
using the new configuration parameters, but existing
instances are not affected
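The rules above can be illustrated with a small Python model. It is only a sketch under stated assumptions: the classes `LaunchConfiguration` and `AutoScalingGroup`, the AMI IDs, and the instance IDs are all hypothetical, and a frozen dataclass stands in for the rule that a launch configuration cannot be modified after creation.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class LaunchConfiguration:
    # Frozen: attributes cannot be changed once the object exists.
    name: str
    image_id: str
    instance_type: str


class AutoScalingGroup:
    def __init__(self, launch_configuration):
        self.launch_configuration = launch_configuration  # one at a time
        self.instances = []  # (instance ID, name of the configuration used)

    def launch(self, instance_id):
        self.instances.append((instance_id, self.launch_configuration.name))


v1 = LaunchConfiguration("web-v1", "ami-11111111", "t2.micro")
group = AutoScalingGroup(v1)
group.launch("i-a")

try:
    v1.instance_type = "t2.small"  # modifying in place is rejected
except FrozenInstanceError:
    modified = False
else:
    modified = True

# Create a new configuration instead and attach it to the group.
v2 = replace(v1, name="web-v2", instance_type="t2.small")
group.launch_configuration = v2
group.launch("i-b")
print(group.instances)  # [('i-a', 'web-v1'), ('i-b', 'web-v2')]
```

The existing instance `i-a` keeps the configuration it was launched with; only `i-b` picks up the new one.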
Auto Scaling Group

 An Auto Scaling group contains a collection of EC2
instances that share similar characteristics and are
treated as a logical grouping for the purposes of
instance scaling and management
 You can use the Auto Scaling group to scale the
number of instances automatically based on criteria
that you specify, or maintain a fixed number of
instances even if an instance becomes unhealthy
Auto Scaling Group Rules

 You must specify a name, launch configuration,
minimum number of instances, and maximum
number of instances
 You can optionally specify a desired capacity, which
is the number of instances that the group must have
at all times
 If you don't specify a desired capacity, the default
desired capacity is the minimum number of
instances that you specified
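A short sketch of the default described above, assuming a hypothetical helper `resolve_desired_capacity` (not part of any AWS SDK). The check that an explicit desired capacity must lie between the minimum and maximum is modeled here as a simple validation.

```python
from typing import Optional


def resolve_desired_capacity(min_size: int, max_size: int,
                             desired: Optional[int] = None) -> int:
    """Return the desired capacity, defaulting to the minimum size."""
    if min_size < 0 or max_size < min_size:
        raise ValueError("expected 0 <= min_size <= max_size")
    if desired is None:
        return min_size  # no desired capacity given: use the minimum
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min_size and max_size")
    return desired


print(resolve_desired_capacity(2, 10))             # 2
print(resolve_desired_capacity(2, 10, desired=4))  # 4
```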
Auto Scaling Group

 An Auto Scaling group starts by launching enough
EC2 instances to meet its desired capacity
 The Auto Scaling group maintains this number of
instances by performing periodic health checks on
the instances in the group
 If an instance becomes unhealthy, the group
terminates the unhealthy instance and launches
another instance to replace it
 You can use scaling policies to increase or decrease
the number of running EC2 instances in your group
automatically to meet changing conditions
Scaling Plans

 Maintain current instance levels at all times
 Manual scaling
 Scale based on a schedule
 Scale based on demand
Maintain current instance levels at all times

 You can configure your Auto Scaling group to
maintain a minimum or specified number of running
instances at all times.
 To maintain the current instance levels, Auto Scaling
performs a periodic health check on running
instances within an Auto Scaling group.
 When Auto Scaling finds an unhealthy instance, it
terminates that instance and launches a new one
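The health check and replacement cycle above can be sketched as a single reconciliation step. This is an illustrative model only: `launch_instance` and `reconcile` are hypothetical helpers, instance IDs are generated locally, and health is a plain boolean rather than the result of a real check.

```python
import itertools

_ids = itertools.count(1)


def launch_instance() -> str:
    """Stand-in for launching an EC2 instance; returns a made-up ID."""
    return f"i-{next(_ids):04d}"


def reconcile(instances: dict, desired: int) -> dict:
    """Drop unhealthy instances, then launch replacements up to `desired`.

    `instances` maps an instance ID to True when the instance is healthy.
    """
    healthy = {iid: True for iid, ok in instances.items() if ok}
    while len(healthy) < desired:
        healthy[launch_instance()] = True
    return healthy


group = {launch_instance(): True for _ in range(3)}  # i-0001 .. i-0003
group["i-0002"] = False                              # a health check fails
group = reconcile(group, desired=3)
print(sorted(group))  # ['i-0001', 'i-0003', 'i-0004']
```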
Manual Scaling

 Specify the change in the maximum, minimum, or
desired capacity of your Auto Scaling group
 Auto Scaling manages the process of creating or
terminating instances to maintain the updated
capacity
Scale Based on Schedule

 Scaling actions are performed automatically as a
function of time and date
 For example, every week the traffic to your web
application starts to increase on Wednesday,
remains high on Thursday, and starts to decrease on
Friday
 To configure your Auto Scaling group to scale based
on a schedule, you need to create scheduled actions
 A scheduled action tells Auto Scaling to perform a
scaling action at a certain time in the future
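The Wednesday-to-Friday example above could be modeled roughly as follows. Everything here is an assumption made for illustration: the `ScheduledAction` class, the chosen hours and capacities, and the simplification that each week starts on Monday at a default capacity.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScheduledAction:
    weekday: int           # Monday is 0, as in datetime.weekday()
    hour: int
    desired_capacity: int


def capacity_at(now, actions, default):
    """Return the capacity set by the latest action already due this week."""
    capacity = default
    for action in sorted(actions, key=lambda a: (a.weekday, a.hour)):
        if (action.weekday, action.hour) <= (now.weekday(), now.hour):
            capacity = action.desired_capacity
    return capacity


actions = [
    ScheduledAction(weekday=2, hour=8, desired_capacity=6),   # Wednesday 08:00
    ScheduledAction(weekday=4, hour=18, desired_capacity=2),  # Friday 18:00
]
print(capacity_at(datetime(2024, 1, 2, 12), actions, default=2))  # Tuesday: 2
print(capacity_at(datetime(2024, 1, 4, 12), actions, default=2))  # Thursday: 6
```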
Scale Based on Demand

 For example, you can create a policy that calls for
enlarging your fleet of EC2 instances whenever the
average CPU utilization rate stays above ninety
percent for fifteen minutes
 An Auto Scaling group uses a combination of policies
and alarms to determine when the specified
conditions for launching and terminating instances
are met
 Auto Scaling integrates with CloudWatch for
identifying metrics and defining alarms
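The CPU example above (above ninety percent for fifteen minutes) can be sketched as an alarm check followed by a policy action. The sketch assumes one averaged CPU datapoint every five minutes, so three consecutive datapoints cover fifteen minutes; `alarm_breached` and `scale_out` are hypothetical names, not CloudWatch or Auto Scaling API calls.

```python
def alarm_breached(datapoints, threshold=90.0, periods=3):
    """True when the last `periods` datapoints all exceed the threshold."""
    if len(datapoints) < periods:
        return False
    return all(value > threshold for value in datapoints[-periods:])


def scale_out(current, max_size, step=1):
    """Policy action: add `step` instances without exceeding the maximum."""
    return min(current + step, max_size)


cpu = [72.0, 93.5, 95.1, 97.0]  # average CPU %, one value per 5 minutes
capacity = 4
if alarm_breached(cpu):
    capacity = scale_out(capacity, max_size=10)
print(capacity)  # 5
```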
Dynamic Scaling Architecture
Dynamic Scaling

 The following sequence begins when the Auto Scaling
group launches its EC2 instances and a client sends a
request to the AWS user's application:
 The application is ready to communicate with users
after the Auto Scaling group has launched all
Amazon EC2 application instances for the
application.
 While requests are being sent by users and received
by the application instances, CloudWatch monitors
the specified metrics of all the instances in the Auto
Scaling group.
Dynamic Scaling

 As the demand for the application either grows or
shrinks, the specified metrics change.
 The change in metrics invokes the CloudWatch
alarm to perform an action. The action is a message
sent to either the scaling-in policy or the scaling-out
policy, depending on the metrics that were breached.
 The Auto Scaling policy that receives the message
then invokes the scaling activity within the Auto
Scaling group.
 This Auto Scaling process continues until the policies
are deleted or the Auto Scaling group is terminated.
Auto Scaling Lifecycle

 Instances in an Auto Scaling group follow a specific
path, or lifecycle.
 For Auto Scaling instances, this lifecycle starts when
you create a new Auto Scaling group or when a scale-out
event occurs.
 At that point, a new instance launches and is put into
service by the Auto Scaling group.
 The lifecycle ends when a corresponding scale-in
event occurs, at which point the Auto Scaling group
detaches the instance and terminates it.
Auto Scaling Basic Lifecycle
Auto Scaling Lifecycle

 Scale out
 These events direct the Auto Scaling group to launch new
instances and add them to the group. For example:
 You manually increase the number of instances, either by
setting a new minimum number of instances or desired
capacity for the group.
 You use an Amazon CloudWatch alarm to monitor your
application and scale based on specified criteria.
 You use a schedule-based policy to increase or decrease the
number of instances in the group at a specific time.
 An existing instance fails required health checks, or you
manually configure an instance to have an Unhealthy
status.
Auto Scaling Instance States

 Instances in an Auto Scaling group can be in one of
four main states:
 Pending
 InService
 Terminating
 Terminated
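The four main states can be written as a small state machine. This sketch covers only the states named above and assumes the simple forward path through them; the enum and the `advance` helper are hypothetical.

```python
from enum import Enum


class InstanceState(Enum):
    PENDING = "Pending"
    IN_SERVICE = "InService"
    TERMINATING = "Terminating"
    TERMINATED = "Terminated"


# Allowed forward transitions in this simplified model.
NEXT_STATE = {
    InstanceState.PENDING: InstanceState.IN_SERVICE,
    InstanceState.IN_SERVICE: InstanceState.TERMINATING,
    InstanceState.TERMINATING: InstanceState.TERMINATED,
}


def advance(state):
    """Move an instance to its next state; Terminated is final."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} is a final state")
    return NEXT_STATE[state]


state = InstanceState.PENDING
path = [state.value]
while state in NEXT_STATE:
    state = advance(state)
    path.append(state.value)
print(" -> ".join(path))  # Pending -> InService -> Terminating -> Terminated
```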
Auto Scaling Limits
Auto Scaling: Cooldowns

 The Auto Scaling cooldown period is a configurable
setting that determines when Auto Scaling should
suspend any scaling activities related to a specific
Auto Scaling group
 This cooldown period is important, because it helps
to ensure you don’t launch or terminate more
resources than you need.
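A cooldown can be pictured as a gate that rejects scaling activities for a fixed period after the previous one. The class below is a hypothetical sketch: times are plain seconds supplied by the caller, and the 300-second period is just an example value for the configurable setting.

```python
class CooldownGate:
    def __init__(self, cooldown_seconds=300.0):
        self.cooldown_seconds = cooldown_seconds
        self._last_activity = None  # time of the last allowed scaling activity

    def try_scale(self, now):
        """Return True and start a new cooldown if scaling is allowed at `now`."""
        if (self._last_activity is not None
                and now - self._last_activity < self.cooldown_seconds):
            return False  # still cooling down: suspend this activity
        self._last_activity = now
        return True


gate = CooldownGate(cooldown_seconds=300)
print([gate.try_scale(t) for t in (0, 120, 299, 300, 450)])
# [True, False, False, True, False]
```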
Auto Scaling Cooldowns
Default Termination Policy

 The default termination policy is designed to help
ensure that your network architecture spans
Availability Zones evenly. When using the default
termination policy, Auto Scaling selects an instance
to terminate as follows:
 Auto Scaling determines whether there are instances
in multiple Availability Zones. If so, it selects the
Availability Zone with the most instances. If there is
more than one Availability Zone with this number of
instances, Auto Scaling selects one of these
Availability Zones at random.
Default Termination Policy

 Auto Scaling determines which instances in the selected
Availability Zone use the oldest launch configuration. If
there is one such instance, it terminates it.
 If there are multiple instances that use the oldest launch
configuration, Auto Scaling determines which instances
are closest to the next billing hour. (This helps you
maximize the use of your EC2 instances while
minimizing the number of hours you are billed for
Amazon EC2 usage.) If there is one such instance, Auto
Scaling terminates it.
 If there is more than one instance closest to the next
billing hour, Auto Scaling selects one of these instances
at random.
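The selection steps of the default termination policy can be sketched as successive filters. This is an illustrative model, not AWS code: the `Instance` fields (`config_age`, `minutes_to_billing_hour`) and the function name are assumptions chosen to mirror the steps described above.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: str
    zone: str
    config_age: int               # higher means an older launch configuration
    minutes_to_billing_hour: int


def choose_instance_to_terminate(instances, rng=random):
    # Step 1: pick the Availability Zone with the most instances (random on ties).
    counts = Counter(i.zone for i in instances)
    most = max(counts.values())
    zone = rng.choice(sorted(z for z, n in counts.items() if n == most))
    candidates = [i for i in instances if i.zone == zone]
    # Step 2: keep the instances that use the oldest launch configuration.
    oldest = max(i.config_age for i in candidates)
    candidates = [i for i in candidates if i.config_age == oldest]
    # Step 3: keep the instances closest to the next billing hour.
    closest = min(i.minutes_to_billing_hour for i in candidates)
    candidates = [i for i in candidates if i.minutes_to_billing_hour == closest]
    # Step 4: choose at random among whatever remains.
    return rng.choice(candidates)


fleet = [
    Instance("i-1", "us-east-1a", config_age=2, minutes_to_billing_hour=40),
    Instance("i-2", "us-east-1a", config_age=2, minutes_to_billing_hour=10),
    Instance("i-3", "us-east-1a", config_age=1, minutes_to_billing_hour=5),
    Instance("i-4", "us-east-1b", config_age=2, minutes_to_billing_hour=3),
]
print(choose_instance_to_terminate(fleet).instance_id)  # i-2
```

In this fleet there are no ties, so the result does not depend on the random choices.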
Elastic Load Balancing

 Elastic Load Balancing automatically distributes
incoming web traffic across multiple EC2 instances
 With Elastic Load Balancing, you can add and
remove EC2 instances as your needs change without
disrupting the overall flow of information
 If an EC2 instance fails, Elastic Load Balancing
automatically reroutes the traffic to the remaining
running EC2 instances
 If a failed EC2 instance is restored, Elastic Load
Balancing restores the traffic to that instance
Elastic Load Balancing

 Elastic Load Balancing offers clients a single point of
contact, and it can also serve as the first line of
defense against attacks on your network
 ELB allows SSL termination of multiple websites
hosted in EC2 instances
ELB Benefits

 Requests are distributed to EC2 instances in multiple
Availability Zones, minimizing the risk of overloading
one single instance. If an entire Availability Zone goes
offline, the load balancer routes traffic to instances in
other Availability Zones.
 The load balancer monitors the health of the EC2
instances registered with it so that requests are sent only
to the healthy instances. If an instance becomes unhealthy,
Elastic Load Balancing stops sending traffic to that instance
and spreads the load across the remaining healthy instances.
 Support for end-to-end traffic encryption on those
networks that use secure (HTTPS/SSL) connections.
ELB Benefits

 The ability to take over the encryption and decryption work
from the EC2 instances, and manage it centrally on the load
balancer.
 Support for sticky sessions, which is the ability to "stick" user
sessions to specific EC2 instances.
 Association of the load balancer with your domain name.
Because the load balancer is the only computer that is exposed
to the Internet, you don't have to create and manage public
domain names for the instances that the load balancer
manages. You can point the instance's domain records at the
load balancer instead and scale as needed (either adding or
removing capacity) without having to update the records with
each scaling activity.
 Support for security groups associated with your load
balancer to provide additional networking and security
options.
How ELB Works

 Elastic Load Balancing (ELB) consists of two
components: the load balancers and the controller
service
 The load balancers monitor the traffic and handle
requests that come in through the Internet
 The controller service monitors the load balancers,
adding and removing load balancers as needed and
verifying that the load balancers are functioning
properly
How ELB Works

 Elastic Load Balancing automatically generates a
unique Domain Name System (DNS) name for each
load balancer instance you create
 For example, if you create a load balancer named
myLB in the us-east-1 region, your load balancer might
have a DNS name such as
myLB-1234567890.us-east-1.elb.amazonaws.com.
 Clients can access your load balancer by
using the ELB-generated DNS name.
How ELB Works

 You can use a custom domain name, for example
www.example.com, and associate it with the load
balancer DNS name.
 When a request is placed to your load balancer using
the custom domain name that you created, it
resolves to the load balancer DNS name.
 You can create a CNAME record for www.example.com
to achieve this.
How ELB Works

 When a client makes a request to your application using
either your load balancer's DNS name or the custom
domain name, the DNS server returns one or more IP
addresses.
 The client then makes a connection to your load balancer
at the provided IP address.
 When Elastic Load Balancing scales, it updates the DNS
record for the load balancer.
 The DNS record for the load balancer has the time-to-live
(TTL) set to 60 seconds. This setting ensures that IP
addresses can be re-mapped quickly to respond to events
that cause Elastic Load Balancing to scale up or down.
How ELB Works

 When you create a load balancer, you must configure it to
accept incoming traffic and route requests to your EC2
instances.
 The controller ensures that load balancers are operating
with the correct configuration.
 After you create your load balancer, you have to register
the EC2 instances that you want to load balance with the
load balancer.
 Your load balancer monitors and routes the incoming
traffic to the registered instances. The instances are
registered with the load balancer using the IP addresses
associated with the instances.
How ELB Works

 Your load balancer also monitors the health of the
registered instances and ensures that the traffic goes
to healthy instances.
 When the load balancer detects an unhealthy
instance, it stops routing the traffic to that instance
and resumes the routing when the instance has been
restored to a healthy state.
 Elastic Load Balancing performs health checks on all
your registered instances using the configuration you
provide, regardless of whether the instance is in a
healthy or unhealthy state.
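The routing behaviour above (skip unhealthy instances, resume once they recover) can be sketched with a toy load balancer. Round-robin selection is an assumption made for this example; the slides do not state which routing algorithm is used, and the class and instance IDs are hypothetical.

```python
from itertools import cycle


class LoadBalancer:
    def __init__(self, instance_ids):
        self.health = {iid: True for iid in instance_ids}
        self._ring = cycle(list(instance_ids))  # round-robin order

    def set_health(self, iid, healthy):
        """Record the latest health check result for an instance."""
        self.health[iid] = healthy

    def route(self):
        """Return the next healthy instance, skipping unhealthy ones."""
        for _ in range(len(self.health)):
            iid = next(self._ring)
            if self.health[iid]:
                return iid
        raise RuntimeError("no healthy instances registered")


lb = LoadBalancer(["i-a", "i-b", "i-c"])
lb.set_health("i-b", False)
first = [lb.route() for _ in range(4)]   # i-b is skipped
lb.set_health("i-b", True)
second = [lb.route() for _ in range(3)]  # i-b receives traffic again
print(first, second)
```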
ELB Multi-AZ Deployment

 You can configure your load balancer to load balance
incoming application traffic across multiple instances in
a single Availability Zone or across multiple instances in
several Availability Zones in the same region.
 For example, if you choose to load balance multiple
instances across two Availability Zones, and all the
instances in the first Availability Zone become unhealthy,
the load balancer will route traffic to the healthy
instances in the other Availability Zone.
 When you use multiple Availability Zones, it is important
to keep approximately the same capacity in each
Availability Zone registered with the load balancer.
ELB Architecture

 This example assumes that you have created a load
balancer, created a custom domain name and associated
your load balancer with the domain name using a
CNAME entry in DNS, and have registered your
instances with it.
 The client sends a URL request to DNS servers to access
your application. The DNS server responds with a DNS
name, for example
myLB-1234567890.us-east-1.elb.amazonaws.com.
 The client looks for the resolution of the DNS name sent
by the DNS server. The DNS entry is controlled by
Amazon because your application instances are under
the amazonaws.com domain. The Amazon DNS servers
return one or more IP addresses.
ELB Architecture

 The client then opens a connection to the machine at the provided
IP address. The instance at this address is the load balancer you
created.
 The load balancer checks the health states of all the registered EC2
application instances within the selected Availability Zones and will
begin routing traffic to instances that have met the healthy
threshold defined in the health check configuration.
 The load balancer routes the client request to the healthy EC2
application instance identified in the previous step. At this point,
the client is communicating with one of your EC2 instances through
your load balancer. The load balancer listeners can be configured to
use the HTTP, HTTPS, TCP, or SSL protocol for both the front-end
connection (client to load balancer) and the back-end connection (load
balancer to back-end instance).
ELB Health Checks

 To discover the availability of your EC2 instances,
the load balancer periodically sends pings, attempts
connections, or sends requests to test the EC2
instances. These tests are called health checks.
 Instances that are healthy at the time of the health
check are marked as "InService" and the instances
that are unhealthy at the time of the health check are
marked as "OutOfService".
 The load balancer performs health checks on all
registered instances, regardless of whether the
instance is in a healthy or unhealthy state.
Health Check Configuration

 Health Check Configuration contains:
 Ping Protocol - The protocol to use to connect with
the instance. The protocol can be TCP, HTTP,
HTTPS, or SSL.
 Ping Port - The port to use to connect with the
instance, for example 80 in TCP:80
 Ping Path - The destination for sending the
HTTP/HTTPS request. If you specify an HTTP or
HTTPS protocol, you have to include a ping port and
a ping path, such as HTTP:80/index.html
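The protocol, port, and path rules above can be checked with a small parser for target strings such as HTTP:80/index.html. The `HealthCheckTarget` type and `parse_target` function are hypothetical helpers written for this sketch, not part of any AWS SDK.

```python
from typing import NamedTuple, Optional


class HealthCheckTarget(NamedTuple):
    protocol: str
    port: int
    path: Optional[str]


def parse_target(target):
    """Split 'PROTOCOL:PORT[/path]' and enforce the ping path rule."""
    protocol, _, rest = target.partition(":")
    protocol = protocol.upper()
    if protocol not in {"TCP", "SSL", "HTTP", "HTTPS"}:
        raise ValueError(f"unsupported ping protocol: {protocol!r}")
    port_text, slash, path = rest.partition("/")
    if not port_text.isdigit():
        raise ValueError("a numeric ping port is required")
    if protocol in {"HTTP", "HTTPS"}:
        if not slash:
            raise ValueError("HTTP and HTTPS targets need a ping path")
        return HealthCheckTarget(protocol, int(port_text), "/" + path)
    if slash:
        raise ValueError("TCP and SSL targets take no ping path")
    return HealthCheckTarget(protocol, int(port_text), None)


print(parse_target("HTTP:80/index.html"))
print(parse_target("TCP:80"))
```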
Health Check Configuration

 Response Timeout - Time to wait when receiving a
response from the health check (2 sec - 60 sec)
 Health Check Interval - Amount of time between
health checks (5 sec - 300 sec).
 Unhealthy Threshold - Number of consecutive health
check failures before declaring an EC2 instance
unhealthy
 Healthy Threshold - Number of consecutive health
check successes before declaring an EC2 instance
healthy
Health Check

 The load balancer interprets the health check
configuration as follows: Send a request to the registered
instance at the ping port and ping path,
http://EC2InstanceIPaddress:80/welcome.html,
every 30 seconds.
 Allow a response timeout period of 5 seconds for the
instance to respond. If the load balancer gets 2
consecutive failures, take the instance out of service.
 If the load balancer gets 5 consecutive successful
responses, put the instance back in service
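The threshold logic in this example (2 consecutive failures to leave service, 5 consecutive successes to return) can be sketched as a small tracker. The `HealthTracker` class is hypothetical, and it assumes the instance starts as InService purely for illustration.

```python
class HealthTracker:
    def __init__(self, unhealthy_threshold=2, healthy_threshold=5):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.state = "InService"  # assumed starting state
        self._streak = 0          # consecutive results contradicting the state

    def record(self, success):
        """Record one health check result and return the resulting state."""
        in_service = self.state == "InService"
        contradicts = success != in_service
        self._streak = self._streak + 1 if contradicts else 0
        limit = self.unhealthy_threshold if in_service else self.healthy_threshold
        if self._streak >= limit:
            self.state = "OutOfService" if in_service else "InService"
            self._streak = 0
        return self.state


tracker = HealthTracker()
results = [False, False, True, True, True, True, True]
states = [tracker.record(r) for r in results]
print(states)
```

With these results the instance goes OutOfService after the second failure and returns to InService only on the fifth consecutive success.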
SSL Certificate for ELB

 If you use HTTPS or SSL for your front-end listener, you
must install an SSL certificate on your load balancer.
 The load balancer uses the certificate to terminate the
connection and then decrypt requests from clients before
sending them to the back-end instances.
 Before you can install the SSL certificate on your load
balancer, you must create the certificate, get the
certificate signed by a CA, and then upload the certificate
using the AWS Identity and Access Management (IAM)
service.
Lab: ELB

HTTPS://RUN.QWIKLABS.COM
INTRODUCTION TO ELASTIC LOAD
BALANCING
