Amazon EC2 Autoscaling Groups and Network Load Balancer
The Amazon Web Services (AWS) Auto Scaling service automatically adds or removes compute
resources allocated for your cloud application, in response to changes in demand. For applications
configured to run on a cloud infrastructure, scaling is an important part of cost control and resource
management.
Scaling is the ability to increase or decrease the compute capacity of your application either by
changing the number of servers (horizontal scaling) or by changing the size of the servers (vertical
scaling).
Auto Scaling helps you maintain application availability and allows you to scale your Amazon EC2
capacity up or down automatically according to the defined conditions. You can use Auto Scaling to
help ensure that you are running your desired number of Amazon EC2 instances. Auto Scaling can
also automatically increase the number of Amazon EC2 instances during demand spikes to maintain
performance and decrease capacity during lulls to reduce costs. Auto Scaling is well suited to
applications that have stable demand patterns, or that experience hourly, daily, or weekly variability
in usage.
In this lab, you will create an Auto Scaling group and place it behind a Network Load Balancer (NLB).
Don't worry if you don't fully understand all the components yet. Each one will be discussed in
greater detail as you create and configure them.
Lab Objectives:
Configure Auto Scaling to automatically launch EC2 instances using conditions described by
CloudWatch alarms
Create and configure a Network Load Balancer
Utilize Auto Scaling and a Network Load Balancer to ensure the availability of compute
resources
Build an elastic cluster by integrating Auto Scaling with an Elastic Load Balancer
Perform end-to-end testing of the system and understand how to diagnose issues
Groups
Your EC2 instances are organized into groups so that they can be treated as a logical unit for the
purposes of scaling and management. When you create a group, you can specify its minimum,
maximum, and desired number of EC2 instances.
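As a rough illustration of how the three size settings relate, the sketch below models them in a small Python class. The class name GroupSize and its clamp method are hypothetical and not part of any AWS SDK; the values mirror the minimum of 1 and maximum of 4 used later in this lab, and the desired value of 1 is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupSize:
    """Hypothetical container for an Auto Scaling group's size limits."""
    minimum: int
    maximum: int
    desired: int

    def __post_init__(self) -> None:
        # The desired capacity has to fall inside the [minimum, maximum] range.
        if self.minimum < 0:
            raise ValueError("minimum must not be negative")
        if not self.minimum <= self.desired <= self.maximum:
            raise ValueError("expected minimum <= desired <= maximum")

    def clamp(self, requested: int) -> int:
        """Return the requested capacity limited to the group's bounds."""
        return max(self.minimum, min(self.maximum, requested))


lab_group = GroupSize(minimum=1, maximum=4, desired=1)
print(lab_group.clamp(7))  # limited to the maximum of 4
print(lab_group.clamp(0))  # raised to the minimum of 1
```

Whatever a scaling policy asks for, the group size stays between the minimum and the maximum.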
Launch configurations
Your group uses a launch configuration as a template for its EC2 instances. When you create a launch
configuration, you can specify information such as the AMI ID, instance type, key pair, security
groups, and block device mapping for your instances.
Launch template
Instructions
2. In the left-hand menu, under Network & Security, click Security Groups:
4. Under Basic details, in the Security group name field, enter webserver-cluster:
You have added this rule so that later you can access instances using SSH.
8. To add another inbound rule, click Add rule again in the Inbound rules section.
In this lab, you are allowing access from anywhere (0.0.0.0/0). In a non-lab environment, you are
likely to be required to have a much more restrictive Source. For example, you may be required to
specify a corporate IP range to reduce the likelihood of unauthorized access.
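To make the difference between an open Source and a restricted one concrete, here is a minimal sketch using Python's standard ipaddress module. The helper names and the 203.0.113.0/24 range (a documentation range standing in for a corporate network) are assumptions for illustration only.

```python
import ipaddress


def is_open_to_world(source_cidr: str) -> bool:
    """True when the rule's source matches every IPv4 address."""
    return ipaddress.ip_network(source_cidr) == ipaddress.ip_network("0.0.0.0/0")


def is_within(source_cidr: str, allowed_cidr: str) -> bool:
    """True when source_cidr is fully contained in allowed_cidr."""
    source = ipaddress.ip_network(source_cidr)
    allowed = ipaddress.ip_network(allowed_cidr)
    return source.subnet_of(allowed)


# Hypothetical corporate range (203.0.113.0/24 is reserved for documentation).
CORPORATE_RANGE = "203.0.113.0/24"

print(is_open_to_world("0.0.0.0/0"))                  # True
print(is_within("203.0.113.16/28", CORPORATE_RANGE))  # True
print(is_within("0.0.0.0/0", CORPORATE_RANGE))        # False
```

A check like this could be run against planned rules before they are created, so that an open Source does not slip into a non-lab environment unnoticed.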
10. To finish creating your security group, scroll to the bottom of the page and click Create security
group:
You will see a notification that your security group has been created:
You have created a security group that you can specify in the launch template you are about to
create.
12. Open the Launch Templates page and click the Create launch template button:
13. In the Launch template name and description section, enter the following values accepting the
defaults for fields not specified:
14. Scroll down to the Application and OS Images (Amazon Machine Image) section, click Quick Start,
select the Amazon Linux box, and select the Amazon Linux 2 AMI (HVM) option:
Note: Ensure that the AMI for the 64-bit (x86) architecture is selected.
15. In the Instance type field, enter t2.micro and click the t2.micro result:
16. In the Key pair (login) section, under Key pair name, select the numeric option:
You will use the key pair to access EC2 instances launched with this configuration later in the lab. In
situations where you don't need SSH, you don't have to configure a key pair. Omitting it is often
preferred for security reasons.
17. Under the Network Settings section, click the Security groups drop-down and select
webserver-cluster:
Notice that you have created separate security groups for the instances and the load balancer. In a
non-lab environment, it is often the case that the instances are listening on a different port than the
load balancer, and the load balancer re-directs the traffic. Assigning separate security groups allows
you to more flexibly control and restrict traffic at the network level.
This part of the form allows you to add EBS volumes or increase the size of any EBS volume attached to each EC2
instance started by the Auto Scaling group. Leave the defaults and do not add any EBS volumes.
Typically, large EBS volumes are only needed if your software requires storage space to process the
application data. Many applications store raw or processed data with Amazon S3, Redshift,
DynamoDB, or another storage/database service provided by Amazon. When that is the use case,
large EBS volumes are usually not required. This lab environment does not need extra disk space.
20. Scroll down to the Detailed CloudWatch monitoring option, and select Enable:
By default, CloudWatch monitors EC2 instances approximately every 5 minutes. Detailed monitoring
enables monitoring more often (each minute).
#!/bin/bash
This bash script installs PHP, an Apache webserver (httpd), and a tool for stress testing called Stress.
Warning: The EC2 instances will never reach 100% CPU utilization due to the limitations of the
burstable credit. They should reach a usage of about 80%.
You will see a notification that your launch template has been created:
24. To return to the EC2 management console, click View launch templates:
Elastic Load Balancing (ELB) automatically distributes incoming application traffic across multiple
Amazon EC2 instances. It enables you to achieve greater fault tolerance in your applications and
seamlessly provides the correct amount of load balancing capacity needed in response to incoming
application requests.
ELB detects unhealthy instances within a pool and automatically reroutes traffic to healthy instances
until the unhealthy instances have been restored. Elastic Load Balancers can be enabled within a
single Availability Zone or across multiple zones for more consistent application performance.
In the AWS Management Console search bar, enter EC2, and click the EC2 result under Services:
2. In the left-hand menu, click Load Balancers:
4. Take a moment to read the information for the load balancer types before clicking Create in
the Network Load Balancer tile:
Basic configuration:
o Load balancer name: Web
Network mapping:
o Mappings: Check all availability zones
Notice that the default listener is TCP port 80 which is used for serving HTTP traffic.
6. Look at the Security groups section, deselect the default security group, and select the
webserver-cluster security group:
7. Look at the Listeners and routing section, and click on Create target group which opens a new
browser tab:
8. In the new target group browser tab that opened, set the following values leaving the others at
their defaults:
Basic configuration:
o Choose a target type: Instances
o Target group name: Website
o Protocol: Select TCP from the drop-down menu
o Port: Ensure 80 is set as the port value
Health checks:
o Health check protocol: TCP
o Advanced health check settings: (Click the triangle to expand the section)
o Interval: 10 seconds (This will cause instances to reach a healthy state faster for this
Lab, but may be too fast for certain applications)
The target type option allows you to specify IP addresses or a Lambda function in addition
to Instances. Using an IP address gives you the ability to use a network load balancer with compute
instances outside of AWS. You will use Instances within AWS for this lab.
Note: Ensure you set the protocol to TCP. Because network load balancers operate at layer 4 and
aren't HTTP aware, if you set the protocol as HTTP you will be unable to use the target group with
your network load balancer.
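Because a TCP health check only verifies that a connection can be opened, it can be approximated in a few lines of Python. This is a simplified sketch, not the load balancer's actual implementation; the function name tcp_health_check and the timeout value are assumptions, and the demo probes a temporary local listener rather than a real instance.

```python
import socket


def tcp_health_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, and unreachable hosts all count as failures.
        return False


# Self-contained demo: listen on a free local port, then probe it.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    demo_port = listener.getsockname()[1]
    print(tcp_health_check("127.0.0.1", demo_port))  # True while the listener is open
```

Note that the check passes as soon as something accepts connections on the port. It says nothing about whether the web server returns a valid HTTP response.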
The message appears because you have not yet created an Auto Scaling group or launched any EC2
instances. That is not a problem. You will configure your Auto Scaling group to register its EC2
instances in the Network Load Balancer's target group.
12. In the Listener section, click the refresh icon beside the Default action drop-down menu, and
select the newly created target group from the drop-down:
14. Wait for the Successfully created load balancer message to display before clicking View load
balancer:
15. In the load balancers table, ensure the Web load balancer is selected and click Actions > Edit load
balancer attributes:
16. In the Edit load balancer attributes form, set the following value before clicking Save changes:
You must enable Cross-zone load balancing to achieve the highest level of availability. Without
enabling this feature, clients could cache the DNS address of the load balancer node in one
availability zone and that node would only distribute requests to instances within the availability
zone. Cross-Zone Load Balancing allows every load balancer node to distribute requests across all
availability zones, although for the Network Load Balancer there are data transfer charges when this
feature is enabled. (There are no data charges for other types of load balancers.)
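The effect of cross-zone load balancing can be illustrated with a small calculation. The sketch below is a simplified model, not AWS's routing logic: it assumes one load balancer node per zone, that clients spread requests evenly across the nodes, and it uses the hypothetical zone names zone-a and zone-b.

```python
from fractions import Fraction


def per_target_share(targets_per_zone: dict[str, int], cross_zone: bool) -> dict[str, Fraction]:
    """Share of total traffic that each single target receives, keyed by zone.

    Assumes clients spread requests evenly across the load balancer nodes,
    with one node per zone that has at least one target.
    """
    zones = {zone: count for zone, count in targets_per_zone.items() if count > 0}
    if cross_zone:
        # Every node may use every target, so all targets get an equal share.
        total_targets = sum(zones.values())
        return {zone: Fraction(1, total_targets) for zone in zones}
    # Each node keeps its traffic inside its own zone.
    node_share = Fraction(1, len(zones))
    return {zone: node_share / count for zone, count in zones.items()}


layout = {"zone-a": 1, "zone-b": 3}  # hypothetical zones with uneven instance counts
without = per_target_share(layout, cross_zone=False)
enabled = per_target_share(layout, cross_zone=True)
print(without["zone-a"], without["zone-b"])  # 1/2 1/6
print(enabled["zone-a"], enabled["zone-b"])  # 1/4 1/4
```

In this model, the lone instance in zone-a handles half of all traffic when the feature is disabled, which is three times the load of each instance in zone-b.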
21. Change the Deregistration delay to 30 seconds and click Save changes:
The deregistration delay specifies how long the load balancer should wait before removing an
instance from the target group. The default value of 300 seconds gives connections to the instance
five minutes to drain before they are forcefully closed. Depending on your application, you may be
able to reduce the delay to remove instances more quickly. Thirty seconds is enough for this lab.
An Auto Scaling group is a representation of multiple Amazon EC2 instances that share similar
characteristics and that are treated as a logical grouping for the purposes of instance scaling and
management. For example, if a single application operates across multiple instances, you might want
to increase or decrease the number of instances in that group to improve the performance of the
application. You can use the Auto Scaling group to automatically scale the number of instances or
maintain a fixed number of instances. You create Auto Scaling groups by defining the minimum,
maximum, and the desired number of running EC2 instances the group must have at any given point
of time.
An Auto Scaling group starts by launching the minimum number (or the desired number, if specified)
of EC2 instances and then increases or decreases the number of running EC2 instances automatically
according to the conditions that you define. Auto Scaling also maintains the current instance levels
by conducting periodic health checks on all the instances within the Auto Scaling group. If an EC2
instance within the Auto Scaling group becomes unhealthy, Auto Scaling terminates the unhealthy
instance and launches a new one to replace the unhealthy instance. This automatic scaling and
maintenance of the instances in an Auto Scaling group is the core value of the Auto Scaling service. It's
what puts the "elastic" in EC2.
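The replace-on-failure behavior described above can be sketched as a single planning pass. This is an illustrative simplification rather than the service's real algorithm; the function name reconcile and the instance IDs are hypothetical, and scale-in of surplus healthy instances is deliberately left out.

```python
def reconcile(instances: dict[str, bool], desired: int) -> tuple[list[str], int]:
    """Plan one maintenance pass over an Auto Scaling group.

    instances maps an instance ID to True (healthy) or False (unhealthy).
    Returns the IDs to terminate and the number of new instances to launch.
    """
    # Unhealthy instances are terminated rather than repaired.
    to_terminate = sorted(i for i, healthy in instances.items() if not healthy)
    healthy_count = len(instances) - len(to_terminate)
    # Launch enough replacements to get back to the desired capacity.
    to_launch = max(0, desired - healthy_count)
    return to_terminate, to_launch


# Hypothetical group with one healthy and one unhealthy instance.
plan = reconcile({"i-aaa": True, "i-bbb": False}, desired=2)
print(plan)  # (['i-bbb'], 1)
```

The same pass also covers an instance that disappears entirely: with no entries left and a desired capacity of one, the plan is to launch one new instance.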
9. Under the Load balancing section, enter the following leaving unspecified fields at their defaults:
12. In the Group size section, in the Maximum capacity field, enter 4:
Auto-scaling groups allow you to scale out (add more instances) or scale in (remove instances) based
on metrics such as:
It's possible to create your own metrics and use these to decide when to scale. Such metrics can be
published to CloudWatch directly from the application instead of being derived from the instance's
resource usage. This is useful when the point at which you need to scale is determined by something
unrelated to the instance, such as reaching a database connection limit or some other
application-specific bottleneck.
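As a hedged sketch of publishing such an application-level metric, the function below builds the arguments for CloudWatch's PutMetricData operation. The namespace Lab/WebApp, the metric name OpenDatabaseConnections, and the helper name are assumptions chosen for illustration; the actual boto3 call is shown only in a comment because it needs boto3 and AWS credentials.

```python
def connection_metric(namespace: str, group_name: str, open_connections: int) -> dict:
    """Build keyword arguments for CloudWatch's PutMetricData operation."""
    return {
        "Namespace": namespace,
        "MetricData": [
            {
                "MetricName": "OpenDatabaseConnections",  # hypothetical metric name
                "Dimensions": [
                    {"Name": "AutoScalingGroupName", "Value": group_name},
                ],
                "Value": float(open_connections),
                "Unit": "Count",
            }
        ],
    }


params = connection_metric("Lab/WebApp", "webserver-cluster", 42)
print(params["MetricData"][0]["Value"])  # 42.0

# With boto3 installed and AWS credentials configured, the payload could be sent with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(**params)
```

Once a metric like this exists in CloudWatch, a scaling policy could be based on it instead of CPU utilization.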
By default, the Metric type is Average CPU utilization, and the Target value is 50 percent. Leave
these values unchanged.
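To build intuition for what a 50 percent target means, the sketch below estimates the capacity needed to bring average CPU back to the target. It is a deliberately simplified proportional model and not the algorithm AWS uses: it assumes the average metric falls in inverse proportion to the number of instances, and the function name is hypothetical. The limits of 1 and 4 match this lab's group size.

```python
import math


def target_tracking_capacity(current: int, metric: float, target: float,
                             minimum: int, maximum: int) -> int:
    """Estimate the capacity that would bring the average metric back to target.

    Simplification: assumes the metric scales inversely with instance count.
    """
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    wanted = math.ceil(current * metric / target)
    # The group never leaves its configured size limits.
    return max(minimum, min(maximum, wanted))


print(target_tracking_capacity(1, 95.0, 50.0, minimum=1, maximum=4))  # 2
print(target_tracking_capacity(3, 90.0, 50.0, minimum=1, maximum=4))  # 4 (capped at the maximum)
print(target_tracking_capacity(2, 10.0, 50.0, minimum=1, maximum=4))  # 1
```

The real policy reacts through CloudWatch alarms and tends to scale in more cautiously than it scales out, so actual timing will differ from this estimate.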
You will see a wizard step that allows you to configure notifications when scaling events occur.
Notifications aren't used in this lab.
16. To advance to the Add tags page of the wizard, click Next.
In a non-lab environment, it is best practice to tag resources when you create them so they can be
easily filtered and discovered. Tags are not required in this lab.
18. Check the details for accuracy and when ready, click Create Auto Scaling group:
19. To see details about your auto-scaling group, check the box next to webserver-cluster:
You will see some tabs appear under the list of groups.
If you don't see an instance that has a LifeCycle value of InService, wait a minute or two, and click the
refresh button:
The auto-scaling group has started an instance because the minimum capacity of the group is 1.
Because the webserver is not CPU-intensive and there is no load on the webserver, the high CPU
alarm won't trigger. The number of instances will stay at 1 unless CPU utilization on the existing
instance increases above 50 percent.
21. To see monitoring information about your auto-scaling group, click the Monitoring tab:
Please note it may take several minutes for data points to display. Click the refresh button
periodically to update the chart.
You will see the CPU utilization is currently near zero. You may see a higher value in the past. This
spike in CPU utilization occurred when the instance was started and executed the commands you
specified in the user data of your launch template.
You configured the auto-scaling group to have a minimum number of instances of one. This means
that even though the CPU utilization is currently below 50 percent, the auto-scaling group is keeping
this instance running instead of terminating it.
23. In the left-hand side menu, under Load Balancing, click Target Groups:
You will see the target group list page with one item, the Website target group you created earlier.
24. Select the Website target group, and then click the Targets tab:
Note: You may need to click the refresh button to update the table if the instance is still in
the initial status. The load balancer waits for three successful health checks before assigning a
healthy status.
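The wait for three successful checks can be modeled as a simple counter. The sketch below assumes the successes must be consecutive and uses the 10-second interval configured for the target group earlier in the lab; the function name and the sample results are hypothetical.

```python
from typing import Optional


def checks_until_healthy(results: list[bool], healthy_threshold: int = 3) -> Optional[int]:
    """Return the 1-based index of the check that makes the target healthy.

    results holds successive health check outcomes, oldest first. The target
    becomes healthy after healthy_threshold consecutive successes (assumed).
    """
    streak = 0
    for index, passed in enumerate(results, start=1):
        # A failed check resets the run of successes.
        streak = streak + 1 if passed else 0
        if streak >= healthy_threshold:
            return index
    return None


INTERVAL_SECONDS = 10  # interval set in the target group's health check settings

first_healthy = checks_until_healthy([False, False, True, True, True])
print(first_healthy)                     # 5
print(first_healthy * INTERVAL_SECONDS)  # 50, roughly the seconds spent in checks
```

In this example the first two checks fail while the web server is still being installed, so the target only turns healthy on the fifth check.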
Observe that there is an instance in the Registered targets list and that it is the same instance
created by the Auto Scaling group. Also, notice the Status is healthy, meaning the instance is
reachable on TCP port 80 (HTTP). That means the launch template's user data script completed
successfully and started the Apache webserver on the instance. Everything appears to be working.
You will perform more thorough tests in the next lab step.
If the instance doesn't become healthy after five minutes, a likely reason is that the user data is
absent or incorrect. Check the user data field on your launch template. If necessary, you can delete
and re-create the launch template and Auto Scaling group.
Performing end-to-end tests to make sure everything is working as you expect is very
important. Although this may be an automated procedure, a quick sanity test performed by other
individuals or groups directly from the AWS Console is often helpful as well. This lab step points out a
few ways to test that your launch template is working in conjunction with the Auto Scaling
group and the CloudWatch alarm (which uses Amazon Simple Notification Service (SNS)).
1. In the left-hand menu, under Load Balancing, click Load Balancers:
You will see a list of load balancers with one item selected, the load balancer you created previously.
2. In the Description tab, look for the DNS name field, and copy the value (you can click the copy
button at the end of the field).
3. Open a new browser tab and paste the DNS name you just copied into the address bar.
You will see the Apache webserver test page that is served by default upon a fresh unconfigured web
server installation:
If the page is not displayed, there are several places you can check to troubleshoot the issue starting
with the following:
Ensure the security groups of the load balancer and the instances allow HTTP ingress traffic
Ensure the user data script in the launch template correctly installs and runs the Apache
webserver
Ensure the Auto Scaling group is configured to add its instances to the load balancer's target
group
Ensure the health checks are configured for TCP port 80. Otherwise, the instances will never
reach a healthy status and will be terminated and replaced with new instances by the
Auto Scaling group. Each new instance will in turn never reach a healthy status and will be
replaced, and the process repeats.
o To debug the instances without having them replaced, you can
block instance termination by performing the following steps:
Navigate to Auto Scaling Groups, check the box next to your group, and click Edit
Set Suspended processes to Terminate (This prevents instances in
your group from being terminated. Don't forget to remove the
configuration once the issue is resolved.)
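The same suspension could also be applied programmatically. The sketch below only builds the request parameters so that it runs without AWS access; the helper name is hypothetical, the group name webserver-cluster is the one used in this lab, and the boto3 calls in the comments assume boto3 is installed and credentials are configured.

```python
def suspend_terminate_params(group_name: str) -> dict:
    """Keyword arguments for suspending the Terminate process of one group."""
    return {
        "AutoScalingGroupName": group_name,
        "ScalingProcesses": ["Terminate"],
    }


params = suspend_terminate_params("webserver-cluster")
print(params)

# With boto3 installed and AWS credentials configured:
#   import boto3
#   autoscaling = boto3.client("autoscaling")
#   autoscaling.suspend_processes(**params)  # pause termination while debugging
#   autoscaling.resume_processes(**params)   # restore normal behavior afterwards
```

Pairing every suspend with a resume keeps the group from silently losing its ability to replace unhealthy instances.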
4. To list EC2 instances, in the left-hand menu, click Instances:
You will see a list of instances with one running instance. This instance was created by your
auto-scaling group.
6. To terminate the running instance, click Instance state > Terminate instance
The note on EBS-backed instances will appear. You can ignore this warning.
The Auto Scaling group will detect this change and relaunch an instance automatically to meet the
minimum desired capacity of one.
You will see the new instance launch and settle into a running state:
9. Connect to the running instance using the PEM (macOS/Linux) or PPK (Windows) key file in
the Your lab data section of this lab.
Tip: To remind yourself how to connect, select the running instance, click Actions, and click Connect:
Cloud Academy recommends connecting using EC2 Instance Connect. You can also connect using an
SSH client and PEM file in the credentials section of this lab. Connecting using session manager is not
supported in this lab.
10. Enter the following command at the command line to run stress causing the CPU utilization to
increase for five minutes:
stress --cpu 2 --io 1 --vm 1 --vm-bytes 128M --timeout 5m
11. Navigate to the Auto Scaling group's Monitoring tab in the EC2 Console and click the EC2 tab.
You will see the CPU utilization metric increase to nearly one hundred percent:
The auto-scaling group will detect the increase in utilization and launch a second instance in
response.
Note that it will take a few minutes for the metrics and graph to update.
You will see events detailing your earlier deletion of the initial instance.
If you don't see a new instance launching, wait a minute or two and click the refresh button.
Eventually, because the stress command stops after five minutes, the number of instances will
drop down to the minimum of one.
You will see two running instances, or if your auto-scaling group has already initiated a scale-in
(because the stress command finished and CPU utilization dropped), you will see two terminated
instances including the one you manually terminated, and one running instance.