AWS Certified Developer Study Content
Services covered: Amazon EC2, Amazon ECR, Amazon ECS, AWS Elastic Beanstalk, AWS Lambda, Elastic Load Balancing, Amazon CloudFront, Amazon Kinesis, Amazon Route 53, Amazon S3, Amazon RDS, Amazon Aurora, Amazon DynamoDB, Amazon ElastiCache, Amazon SQS, Amazon SNS, AWS Step Functions, Auto Scaling, Amazon API Gateway, Amazon SES, Amazon Cognito, IAM, Amazon CloudWatch, AWS Systems Manager, AWS CloudFormation, AWS CloudTrail, AWS CodeCommit, AWS CodeBuild, AWS CodeDeploy, AWS CodePipeline, AWS KMS, AWS X-Ray
Navigating the AWS spaghetti bowl
Getting started with AWS
AWS Cloud History
• 2002: Internally launched
• 2003: Amazon infrastructure is recognized as one of their core strengths; idea to market
• 2004: Launched publicly with SQS
• 2006: Re-launched publicly with SQS, S3 & EC2
• 2007: Launched in Europe
AWS Cloud Number Facts
• In 2019, AWS had $35.02
billion in annual revenue
• AWS accounts for 47% of the
market in 2019 (Microsoft is
2nd with 22%)
• Pioneer and Leader of the
AWS Cloud Market for the
9th consecutive year
• Over 1,000,000 active users
• https://infrastructure.aws/
AWS Regions
• AWS has Regions all around the world
• Names can be us-east-1, eu-west-3…
• A region is a cluster of data centers
• Most AWS services are region-scoped
https://aws.amazon.com/about-aws/global-infrastructure/
How to choose an AWS Region?
https://aws.amazon.com/cloudfront/features/
Tour of the AWS Console
• AWS has Global Services:
• Identity and Access Management (IAM)
• Route 53 (DNS service)
• CloudFront (Content Delivery Network)
• WAF (Web Application Firewall)
• Most AWS services are Region-scoped:
• Amazon EC2 (Infrastructure as a Service)
• Elastic Beanstalk (Platform as a Service)
• Lambda (Function as a Service)
• Rekognition (Software as a Service)
• Region Table: https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services
AWS Identity & Access Management (AWS IAM)
IAM: Users & Groups
• IAM = Identity and Access Management, Global service
• Root account created by default, shouldn’t be used or shared
• Users are people within your organization, and can be grouped
• Groups only contain users, not other groups
• Users don’t have to belong to a group, and user can belong to multiple groups
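IAM permissions are defined in JSON policy documents. A minimal, hypothetical example (the action and resource here are illustrative, allowing read-only EC2 describe calls) looks like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    }
  ]
}
```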
IAM Policies inheritance
[Diagram: policies attached to the Developers, Operations, and Audit Team groups are inherited by their members; the user Alice also has an inline policy attached directly]
m5.2xlarge
• m: instance class
• 5: generation (AWS improves them over time)
• 2xlarge: size within the instance class
EC2 Instance Types – General Purpose
• Great for a diversity of workloads such as web servers or code repositories
• Balance between:
• Compute
• Memory
• Networking
• In the course, we will be using the t2.micro which is a General Purpose EC2
instance
* this list will evolve over time, please check the AWS website for the latest information
EC2 Instance Types – Compute Optimized
• Great for compute-intensive tasks that require high performance
processors:
• Batch processing workloads
• Media transcoding
• High performance web servers
• High performance computing (HPC)
• Scientific modeling & machine learning
• Dedicated gaming servers
EC2 Instance Types – Memory Optimized
• Fast performance for workloads that process large data sets in memory
• Use cases:
• High performance, relational/non-relational databases
• Distributed web scale cache stores
• In-memory databases optimized for BI (business intelligence)
• Applications performing real-time processing of big unstructured data
EC2 Instance Types – Storage Optimized
• Great for storage-intensive tasks that require high, sequential read and write
access to large data sets on local storage
• Use cases:
• High frequency online transaction processing (OLTP) systems
• Relational & NoSQL databases
• Cache for in-memory databases (for example, Redis)
• Data warehousing applications
• Distributed file systems
EC2 Instance Types: example
t2.micro is part of the AWS free tier (up to 750 hours per month)
Security Groups
[Diagram: a security group controls inbound traffic from the www to an EC2 instance, and outbound traffic from it]
Referencing other security groups
[Diagram: an EC2 instance with Security Group 1 attached authorises inbound traffic on port 123 from instances whose attached security group is Security Group 1 or Security Group 2; instances with only Security Group 3 attached are not authorised]
Classic Ports to know
• 22 = SSH (Secure Shell) - log into a Linux instance
• 21 = FTP (File Transfer Protocol) – upload files into a file share
• 22 = SFTP (Secure File Transfer Protocol) – upload files using SSH
• 80 = HTTP – access unsecured websites
• 443 = HTTPS – access secured websites
• 3389 = RDP (Remote Desktop Protocol) – log into a Windows instance
SSH Summary Table
                 SSH    Putty    EC2 Instance Connect
Mac              yes             yes
Linux            yes             yes
Windows < 10            yes      yes
Windows >= 10    yes    yes      yes
Which Lectures to watch
• Mac / Linux:
• SSH on Mac/Linux lecture
• Windows:
• Putty Lecture
• If Windows 10: SSH on Windows 10 lecture
• All:
• EC2 Instance Connect lecture
SSH troubleshooting
• Students have the most problems with SSH
SSH – Port 22
[Diagram: SSH from the www over port 22 to a Linux EC2 instance with a public IP]
Putty
• We will configure all the required parameters necessary for doing SSH on Windows using the free tool Putty.
EC2 Instance Connect
• Connect to your EC2 instance within your browser
• No need to use your key file that was downloaded
• The “magic” is that a temporary key is uploaded onto EC2 by AWS
EC2 Dedicated Hosts
• Useful for software with complicated licensing models (BYOL – Bring Your Own License)
• Or for companies that have strong regulatory or compliance needs
EC2 Dedicated Instances
• Instances run on hardware that’s dedicated to you
[Diagram: an EC2 instance in US-EAST-1A is turned into a custom AMI (backed by an EBS snapshot); the AMI is then used to launch a new instance in US-EAST-1B]
EC2 Instance Store
• EBS volumes are network drives with good but “limited” performance
• If you need a high-performance hardware disk, use EC2 Instance
Store
EBS Volume Types
• EBS Volumes are characterized in Size | Throughput | IOPS (I/O Ops Per Sec)
• When in doubt always consult the AWS documentation – it’s good!
• Only gp2/gp3 and io1/io2 can be used as boot volumes
EBS Volume Types Use cases
General Purpose SSD
• Cost effective storage, low-latency
• System boot volumes, Virtual desktops, Development and test environments
• 1 GiB - 16 TiB
• gp3:
• Baseline of 3,000 IOPS and throughput of 125 MiB/s
• Can increase IOPS up to 16,000 and throughput up to 1000 MiB/s independently
• gp2:
• Small gp2 volumes can burst IOPS to 3,000
• Size of the volume and IOPS are linked, max IOPS is 16,000
• 3 IOPS per GB means that at 5,334 GB we reach the max IOPS
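The gp2 size-to-IOPS link can be checked numerically. A simplified sketch (ignores burst behavior; assumes the documented 3 IOPS per GiB baseline, the 100 IOPS floor, and the 16,000 IOPS cap):

```python
def gp2_iops(size_gib):
    """Baseline IOPS of a gp2 volume: 3 IOPS per GiB,
    with a 100 IOPS floor and a 16,000 IOPS cap (burst aside)."""
    return min(max(3 * size_gib, 100), 16000)

# 3 IOPS/GiB means a 5,334 GiB volume already sits at the 16,000 cap
assert gp2_iops(5334) == 16000
```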
EBS Volume Types Use cases
Provisioned IOPS (PIOPS) SSD
• Critical business applications with sustained IOPS performance
• Or applications that need more than 16,000 IOPS
• Great for databases workloads (sensitive to storage perf and consistency)
• io1/io2 (4 GiB - 16 TiB):
• Max PIOPS: 64,000 for Nitro EC2 instances & 32,000 for other
• Can increase PIOPS independently from storage size
• io2 have more durability and more IOPS per GiB (at the same price as io1)
• io2 Block Express (4 GiB – 64 TiB):
• Sub-millisecond latency
• Max PIOPS: 256,000 with an IOPS:GiB ratio of 1,000:1
• Supports EBS Multi-attach
EBS Volume Types Use cases
Hard Disk Drives (HDD)
• Cannot be a boot volume
• 125 GiB to 16 TiB
• Throughput Optimized HDD (st1)
• Big Data, Data Warehouses, Log Processing
• Max throughput 500 MiB/s – max IOPS 500
• Cold HDD (sc1):
• For data that is infrequently accessed
• Scenarios where lowest cost is important
• Max throughput 250 MiB/s – max IOPS 250
EBS – Volume Types Summary
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html#solid-state-drives
EBS Multi-Attach – io1/io2 family
• Attach the same EBS volume to multiple EC2 instances in the same AZ
• Each instance has full read & write permissions to the high-performance volume
• Use case:
• Achieve higher application availability in clustered
Linux applications (ex: Teradata)
• Applications must manage concurrent write
operations
• Up to 16 EC2 instances at a time
• Must use a file system that’s cluster-aware (not XFS, EXT4, etc…)
[Diagram: an io2 volume with Multi-Attach shared by EC2 instances in Availability Zone 1]
EFS – Elastic File System
• Managed NFS (network file system) that can be mounted on many EC2
• EFS works with EC2 instances in multi-AZ
• Highly available, scalable, expensive (3x gp2), pay per use
[Diagram: EC2 instances in us-east-1a, us-east-1b and us-east-1c all mount the same EFS FileSystem through a security group]
EFS – Elastic File System
• Use cases: content management, web serving, data sharing, WordPress
• Uses NFSv4.1 protocol
• Uses security group to control access to EFS
• Compatible with Linux based AMI (not Windows)
• Encryption at rest using KMS
EFS – Storage Classes
• Availability and durability:
• Standard: Multi-AZ, great for prod
• Files can be moved between storage classes by a Lifecycle Policy
High Availability & Scalability
Scalability & High Availability
• Scalability means that an application / system can handle greater loads
by adapting.
• There are two kinds of scalability:
• Vertical Scalability
• Horizontal Scalability (= elasticity)
• Scalability is linked but different to High Availability
• Let’s deep dive into the distinction, using a call center as an example
Vertical Scalability
• Vertical scalability means increasing the size of the instance
• For example, your application runs on a
t2.micro
• Scaling that application vertically means
running it on a t2.large
• Vertical scalability is very common for non
distributed systems, such as a database.
• RDS, ElastiCache are services that can scale
vertically.
• There’s usually a limit to how much you can
vertically scale (hardware limit)
[Call-center analogy: vertical scaling replaces a junior operator with a senior operator; horizontal scaling adds more operators]
Horizontal Scalability
• Horizontal scalability means increasing the number of instances / systems for your application
• High Availability: Run instances for the same application across multi AZ
• Auto Scaling Group multi AZ
• Load Balancer multi AZ
What is load balancing?
• Load balancers are servers that forward traffic to multiple servers (e.g., EC2 instances) downstream
Why use a load balancer?
• Spread load across multiple downstream instances
• Expose a single point of access (DNS) to your application
• Seamlessly handle failures of downstream instances
• Do regular health checks to your instances
• Provide SSL termination (HTTPS) for your websites
• Enforce stickiness with cookies
• High availability across zones
• Separate public traffic from private traffic
Why use an Elastic Load Balancer?
• An Elastic Load Balancer is a managed load balancer
• AWS guarantees that it will be working
• AWS takes care of upgrades, maintenance, high availability
• AWS provides only a few configuration knobs
• It costs less to setup your own load balancer but it will be a lot more effort
on your end
• ALB are a great fit for micro services & container-based applications (example: Docker & Amazon ECS)
• Has a port mapping feature to redirect to a dynamic port in ECS
• In comparison, we’d need multiple Classic Load Balancers per application
Application Load Balancer (v2)
[Diagram: an external Application Load Balancer receives HTTP traffic from the www and routes /user to a target group (with health checks) running the application for Users, and /search to another target group running the application for Search]
Application Load Balancer (v2)
Target Groups
• EC2 instances (can be managed by an Auto Scaling Group) – HTTP
• ECS tasks (managed by ECS itself) – HTTP
• Lambda functions – HTTP request is translated into a JSON event
• IP Addresses – must be private IPs
[Diagram: the external ALB can also route on query strings (e.g., ?Platform=Desktop) to different target groups, including on-premises servers reached by private IP]
Application Load Balancer (v2)
Good to Know
• Fixed hostname (XXX.region.elb.amazonaws.com)
• The application servers don’t see the IP of the client directly
• The true IP of the client is inserted in the header X-Forwarded-For
• We can also get Port (X-Forwarded-Port) and proto (X-Forwarded-Proto)
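Recovering the original client IP from X-Forwarded-For can be sketched as follows (a simplification; it assumes the header was set by your own load balancer, since clients can forge it):

```python
def client_ip(headers):
    """Return the original client IP behind a load balancer.

    X-Forwarded-For carries a comma-separated chain of addresses;
    the left-most entry is the original client."""
    xff = headers.get("X-Forwarded-For", "")
    return xff.split(",")[0].strip() if xff else None

assert client_ip({"X-Forwarded-For": "12.34.56.78, 10.0.0.5"}) == "12.34.56.78"
```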
Network Load Balancer (v2)
• Network load balancers (Layer 4) allow you to:
• Forward TCP & UDP traffic to your instances
• Handle millions of requests per second
• Less latency ~100 ms (vs 400 ms for ALB)
• NLB has one static IP per AZ, and supports assigning Elastic IP (helpful for whitelisting specific IP)
[Diagram: an external Network Load Balancer receives TCP traffic from the www and, based on rules, forwards TCP to a target group (with health checks) for the Users application, and HTTP to another target group for the Search application]
Network Load Balancer – Target Groups
• EC2 instances
• IP Addresses – must be private IPs
• Application Load Balancer
• Health Checks support the TCP, HTTP and HTTPS Protocols
Load Balancer – SSL Certificates
• SSL certificates have an expiration date (you set) and must be renewed
[Diagram: users connect over HTTPS (encrypted) over the www to the load balancer, which forwards plain HTTP over the private VPC to the EC2 instance; the SSL certificate is for www.mycorp.com]
• SNI (Server Name Indication, for loading multiple SSL certificates) does not work for CLB (older gen)
Elastic Load Balancers – SSL Certificates
• Classic Load Balancer (v1)
• Support only one SSL certificate
• Must use multiple CLBs for multiple hostnames with multiple SSL certificates
Connection Draining (Deregistration Delay)
• Time to complete “in-flight requests” while the instance is de-registering or unhealthy (DRAINING state)
• Stops sending new requests to the EC2 instance which is de-registering
• Between 1 to 3600 seconds (default: 300 seconds)
• Can be disabled (set value to 0)
• Set to a low value if your requests are short
[Diagram: the ELB drains connections from the de-registering EC2 instance; new connections are established to all the other instances]
What’s an Auto Scaling Group?
• In real-life, the load on your websites and application can change
• In the cloud, you can create and get rid of servers very quickly
• ASG are free (you only pay for the underlying EC2 instances)
Auto Scaling Group in AWS
[Diagram: an ASG scales between minimum, desired, and maximum capacity]
Auto Scaling Group in AWS With Load Balancer
[Diagram: users reach the ASG instances through a load balancer]
Auto Scaling Group Attributes
• A Launch Template, which contains:
• AMI + Instance Type
• EBS Volumes
• Security Groups
• SSH Key Pair
• IAM Roles for your EC2 Instances
• Network + Subnets Information
• Load Balancer Information
• Min Size / Max Size / Initial Capacity
• Scaling Policies
Auto Scaling - CloudWatch Alarms & Scaling
• It is possible to scale an ASG based on CloudWatch alarms
• An alarm monitors a metric (such as Average CPU, or a custom metric)
• Metrics such as Average CPU are computed for the overall ASG instances
• Based on the alarm:
• We can create scale-out policies (increase the number of instances)
• We can create scale-in policies (decrease the number of instances)
Auto Scaling Groups – Scaling Cooldowns
• After a scaling activity happens, you are in the cooldown period (default 300 seconds)
• During the cooldown period, the ASG will not launch or terminate additional instances (to allow for metrics to stabilize)
[Diagram: after launching or terminating an instance, the ASG ignores further scaling actions while the default cooldown is in effect]
RDS Read Replicas – Network Cost
[Diagram: ASYNC replication from an RDS DB instance to a read replica in the same region is free; ASYNC replication to a read replica in another region costs money ($$$)]
RDS Multi AZ (Disaster Recovery)
• SYNC replication
• One DNS name – automatic app failover to standby
• Increase availability
• Failover in case of loss of AZ, loss of network, instance or storage failure
• No manual intervention in apps
• Not used for scaling
Amazon Aurora
• Aurora is a proprietary technology from AWS (not open sourced)
• Postgres and MySQL are both supported as Aurora DB (that means your
drivers will work as if Aurora was a Postgres or MySQL database)
• Aurora is “AWS cloud optimized” and claims 5x performance improvement
over MySQL on RDS, over 3x the performance of Postgres on RDS
• Aurora storage automatically grows in increments of 10GB, up to 128 TB.
• Aurora can have up to 15 replicas and the replication process is faster than
MySQL (sub 10 ms replica lag)
• Failover in Aurora is instantaneous. It’s HA (High Availability) native.
• Aurora costs more than RDS (20% more) – but is more efficient
Aurora High Availability and Read Scaling
• 6 copies of your data across 3 AZ:
• 4 copies out of 6 needed for writes
• 3 copies out of 6 needed for reads
• Self healing with peer-to-peer replication
• Storage is striped across 100s of volumes
• One Aurora Instance takes writes (master)
• Automated failover for master in less than 30 seconds
• Master + up to 15 Aurora Read Replicas serve reads
• Support for Cross Region Replication
[Diagram: one writer (W) and multiple readers (R) across AZ 1, AZ 2 and AZ 3, on a shared storage volume with replication, self healing and auto expanding]
Aurora DB Cluster
[Diagram: the client connects to the cluster; one writer instance (W) takes writes, and read replicas (R) scale with Auto Scaling]
ElastiCache session store example (continued)
• The user hits another instance of our application
• The instance retrieves the data and the user is already logged in
ElastiCache – Redis vs Memcached
REDIS (replication):
• Multi AZ with Auto-Failover
• Read Replicas to scale reads and have high availability
• Data Durability using AOF persistence
• Backup and restore features
• Supports Sets and Sorted Sets
MEMCACHED (sharding):
• Multi-node for partitioning of data (sharding)
• No high availability (replication)
• Non persistent
• No backup and restore
• Multi-threaded architecture
Caching Implementation Considerations
• Read more at: https://aws.amazon.com/caching/implementation-
considerations/
• If too many evictions happen due to memory, you should scale up or out
Final words of wisdom
• Lazy Loading / Cache aside is easy to implement and works for many
situations as a foundation, especially on the read side
• Write-through is usually combined with Lazy Loading as targeted for the
queries or workloads that benefit from this optimization
• Setting a TTL is usually not a bad idea, except when you’re using Write-
through. Set it to a sensible value for your application
• Only cache the data that makes sense (user profiles, blogs, etc…)
• Quote: There are only two hard things in Computer Science: cache
invalidation and naming things
Amazon MemoryDB for Redis
• Redis-compatible, durable, in-memory database service
• Ultra-fast performance with over 160 million requests/second
• Durable in-memory data storage with Multi-AZ transactional log
• Scale seamlessly from 10s GBs to 100s TBs of storage
• Use cases: web and mobile apps, online gaming, media streaming, …
[Diagram: MemoryDB for Redis spans AZ 1, AZ 2 and AZ 3]
Domain hierarchy example: .com → example.com → www.example.com, api.example.com
DNS Terminologies
• Domain Registrar : Amazon Route 53, GoDaddy, …
• DNS Records: A, AAAA, CNAME, NS, …
• Zone File: contains DNS records
• Name Server : resolves DNS queries (Authoritative or Non-Authoritative)
• Top Level Domain (TLD): .com, .us, .in, .gov, .org, …
• Second Level Domain (SLD): amazon.com, google.com, …
URL anatomy
In http://api.www.example.com., “http” is the Protocol, the trailing dot is the Root, “.com” is the TLD, “example.com” is the SLD, “www.example.com” is a Sub Domain, and “api.www.example.com.” is the FQDN (Fully Qualified Domain Name)
How DNS Works
1. You want to access example.com: the Web Browser asks the Local DNS Server (assigned and managed by your company, or assigned dynamically by your ISP)
2. The Local DNS Server asks the Root DNS Server (managed by ICANN): “example.com?” → answer: “.com NS 1.2.3.4”
3. It then asks the TLD DNS Server for .com (managed by IANA, a branch of ICANN): “example.com?” → answer: “example.com NS 5.6.7.8”
4. It then asks the SLD DNS Server for example.com (managed by the Domain Registrar, e.g., Amazon Registrar, Inc.): “example.com?” → answer: “IP 9.10.11.12”
5. The Local DNS Server caches the answer (subject to the TTL) and returns 9.10.11.12 to the browser, which reaches the Web Server for example.com at 9.10.11.12
Amazon Route 53
• A highly available, scalable, fully managed and Authoritative DNS
• Authoritative = the customer (you) can update the DNS records
• Route 53 is also a Domain Registrar
• Ability to check the health of your resources
[Diagram: a client asks Amazon Route 53 “example.com?” and receives 54.22.33.44]
Route 53 – Public vs. Private Hosted Zones
[Diagram: a Public Hosted Zone answers queries from public clients (example.com? → 54.22.33.44); a Private Hosted Zone answers queries from within the VPC in the AWS Cloud, e.g. api.example.internal? and db.example.internal? resolving to private IPs such as 10.0.0.10 and 10.0.0.35 (the DB Instance db.example.internal)]
Route 53 – CNAME vs Alias
• CNAME:
• Points a hostname to any other hostname. (app.mydomain.com => blabla.anything.com)
• ONLY FOR NON ROOT DOMAIN (aka. something.mydomain.com)
• Alias:
• Points a hostname to an AWS Resource (app.mydomain.com => blabla.amazonaws.com)
• Works for ROOT DOMAIN and NON ROOT DOMAIN (aka mydomain.com)
• Free of charge
• Native health check
Route 53 – Alias Records
• Maps a hostname to an AWS resource
• An extension to DNS functionality
• Automatically recognizes changes in the resource’s IP addresses
• Unlike CNAME, it can be used for the top node
[Example record: Name example.com, Type A, Alias enabled, Value MyALB-123456789.us-east-1.elb.amazonaws.com]
Route 53 – Health Checks
• Interval – 30 sec (can set to 10 sec – higher cost)
• Supported protocols: HTTP, HTTPS and TCP
• If > 18% of health checkers report the endpoint is healthy, Route 53 considers it Healthy. Otherwise, it’s Unhealthy
• Ability to choose which locations you want Route 53 to use
• Health Checks pass only when the endpoint responds with the 2xx and 3xx status codes
• Health Checks can be setup to pass / fail based on the text in the first 5120 bytes of the response
• The endpoint must allow incoming requests from the Route 53 Health Checkers IP address range
[Diagram: health checkers send “HTTP /health” requests and expect a 200 code from an ALB fronting an Auto Scaling group in eu-west-1]
Routing Policies – Failover
[Diagram: Amazon Route 53 answers the client’s DNS requests with the Primary EC2 instance, on which a health check is mandatory; on failover, Route 53 returns the Secondary EC2 instance (Disaster Recovery)]
Routing Policies – Geolocation
• Different from Latency-based!
• This routing is based on user location
• Specify location by Continent, Country or by US State (if there’s overlapping, the most precise location is selected)
• Should create a “Default” record (in case there’s no match on location)
• Use cases: website localization, restrict content distribution, load balancing, …
• Can be associated with Health Checks
[Diagram: different user locations map to A records 11.22.33.44 and 55.66.77.88, with 99.11.22.33 as the Default]
Routing Policies – Geoproximity
• Route traffic to your resources based on the geographic location of users and
resources
• Ability to shift more traffic to resources based on the defined bias
• To change the size of the geographic region, specify bias values:
• To expand (1 to 99) – more traffic to the resource
• To shrink (-1 to -99) – less traffic to the resource
[Diagrams: with us-west-1 and us-east-1 both at bias 0, traffic splits at the geographic midpoint; raising the us-east-1 bias to 50 shifts more of the map, and therefore more traffic, to us-east-1]
GoDaddy as Registrar & Route 53 as DNS Service
[Diagram: the user purchases example.com at GoDaddy (the registrar) but manages the DNS records in Amazon Route 53]
• At the AWS Certified Developer Level, you should know about:
• VPC, Subnets, Internet Gateways & NAT Gateways
• Security Groups, Network ACL (NACL), VPC Flow Logs
• VPC Peering, VPC Endpoints
• Site to Site VPN & Direct Connect
• I will just give you an overview; expect no more than 1 or 2 questions at your exam.
• Later in the course, I will be highlighting when VPC concepts are helpful
VPC & Subnets Primer
[Diagram: within a Region, a VPC (CIDR Range 10.0.0.0/16) spans Availability Zone 1 and Availability Zone 2, each containing subnets (e.g., public subnets) connected to the www]
• Network ACL (NACL)
• A firewall which controls traffic from and to the subnet
• Can have ALLOW and DENY rules
• Are attached at the Subnet level
• Rules only include IP addresses
• Security Groups
• A firewall that controls traffic to and from an ENI / an EC2 Instance
• Can have only ALLOW rules
• Rules include IP addresses and other security groups
Network ACLs vs Security Groups
https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Secur
ity.html#VPC_Security_Comparison
VPC Flow Logs
• Capture information about IP traffic going into your interfaces:
• VPC Flow Logs
• Subnet Flow Logs
• Elastic Network Interface Flow Logs
• Helps to monitor & troubleshoot connectivity issues. Example:
• Subnets to internet
• Subnets to subnets
• Internet to subnets
• Captures network information from AWS managed interfaces too: Elastic Load
Balancers, ElastiCache, RDS, Aurora, etc…
• VPC Flow logs data can go to S3, CloudWatch Logs, and Kinesis Data Firehose
VPC Peering
• Connect two VPC, privately, using AWS’ network
• Make them behave as if they were in the same network
• Must not have overlapping CIDR (IP address range)
• VPC Peering connection is not transitive (must be established for each pair of VPCs that need to communicate)
[Diagram: VPC A peers with VPC B (A↔B) and with VPC C (A↔C); B↔C requires its own peering connection]
VPC Closing Comments
• VPC: Virtual Private Cloud
• Subnets: Tied to an AZ, network partition of the VPC
• Internet Gateway: at the VPC level, provide Internet Access
• NAT Gateway / Instances: give internet access to private subnets
• NACL: Stateless, subnet rules for inbound and outbound
• Security Groups: Stateful, operate at the EC2 instance level or ENI
• VPC Peering: Connect two VPC with non overlapping IP ranges, non transitive
• VPC Endpoints: Provide private access to AWS Services within VPC
• VPC Flow Logs: network traffic logs
• Site to Site VPN: VPN over public internet between on-premises DC and AWS
• Direct Connect: direct private connection to AWS
VPC note – AWS Certified Developer
• Don’t stress if you didn’t understand everything in that section
• I will be highlighting in the course the specific VPC features we need
• Feel free to revisit that section after you’re done in the course !
• Moving on :)
Typical 3 tier solution architecture
[Diagram: Route 53 resolves users to an ELB, which spreads traffic over an Auto Scaling group across Availability zones 1, 2 and 3; the instances store/retrieve session data and cached data in ElastiCache (Multi AZ) and read/write data in Amazon RDS (Multi AZ)]
WordPress on AWS
[Diagram: WordPress instances in Availability zones 1 and 2 each attach an ENI and send/share images through a common EFS file system (Multi AZ)]
WordPress on AWS (more complicated)
https://aws.amazon.com/blogs/architecture/wordpress-best-practices-on-aws/
Amazon S3
Section introduction
• Amazon S3 is one of the main building blocks of AWS
• It’s advertised as ”infinitely scaling” storage
• User-Based
• IAM Policies – which API calls should be allowed for a specific user from IAM
• Resource-Based
• Bucket Policies – bucket wide rules from the S3 console - allows cross account
• Object Access Control List (ACL) – finer grain (can be disabled)
• Bucket Access Control List (ACL) – less common (can be disabled)
Example: Public Access – Use Bucket Policy
[Diagram: an S3 Bucket Policy allowing public access lets www users read the S3 bucket]
Example: User Access to S3 – IAM permissions
[Diagram: an IAM User with an IAM Policy can access the S3 bucket]
Example: EC2 instance access - Use IAM Roles
[Diagram: an EC2 Instance with an EC2 Instance Role attached gets IAM permissions to the S3 bucket]
Advanced: Cross-Account Access – Use Bucket Policy
[Diagram: an S3 Bucket Policy allowing cross-account access lets an IAM User from another AWS account access the S3 bucket]
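A bucket policy is a JSON document. A hedged sketch of a public-read policy (the bucket name `examplebucket` is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::examplebucket/*"
    }
  ]
}
```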
Bucket settings for Block Public Access
S3 – Static Website Hosting
• Website URL: http://bucket-name.s3-website.aws-region.amazonaws.com
• If you get a 403 Forbidden error, make sure the bucket (demo-bucket) policy allows public reads
• Availability:
• Measures how readily available a service is
• Varies depending on storage class
• Example: S3 standard has 99.99% availability = not available 53 minutes a year
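The availability arithmetic can be checked directly:

```python
def downtime_minutes_per_year(availability):
    """Yearly downtime implied by an availability fraction."""
    return (1 - availability) * 365 * 24 * 60

# 99.99% availability implies roughly 53 minutes of downtime per year
assert round(downtime_minutes_per_year(0.9999)) == 53
```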
S3 Standard – General Purpose
• 99.99% Availability
• Used for frequently accessed data
• Low latency and high throughput
• Sustain 2 concurrent facility failures
• Use Cases: Big Data analytics, mobile & gaming applications, content
distribution…
S3 Storage Classes – Infrequent Access
• For data that is less frequently accessed, but requires rapid access when needed
• Lower cost than S3 Standard
S3 Storage Classes – Comparison (excerpt)
• Standard: >= 3 Availability Zones; no min. storage duration charge; no min. billable object size; no retrieval fee
• Intelligent-Tiering: >= 3 AZs; no min. duration charge; no min. billable size; no retrieval fee
• Standard-IA: >= 3 AZs; 30 days min. duration; 128 KB min. billable size; per-GB retrieval fee
• One Zone-IA: 1 AZ; 30 days min. duration; 128 KB min. billable size; per-GB retrieval fee
• Glacier Instant Retrieval: >= 3 AZs; 90 days min. duration; 128 KB min. billable size; per-GB retrieval fee
• Glacier Flexible Retrieval: >= 3 AZs; 90 days min. duration; 40 KB min. billable size; per-GB retrieval fee
• Glacier Deep Archive: >= 3 AZs; 180 days min. duration; 40 KB min. billable size; per-GB retrieval fee
https://aws.amazon.com/s3/storage-classes/
S3 Storage Classes – Price Comparison
Example: us-east-1 (storage per GB per month; request prices per 1,000 requests)
• Standard: $0.023; GET $0.0004, POST $0.005; instantaneous retrieval
• Intelligent-Tiering: $0.0025 - $0.023; GET $0.0004, POST $0.005; instantaneous; monitoring cost $0.0025 per 1,000 objects
• Standard-IA: $0.0125; GET $0.001, POST $0.01; instantaneous
• One Zone-IA: $0.01; GET $0.001, POST $0.01; instantaneous
• Glacier Instant Retrieval: $0.004; GET $0.01, POST $0.02; instantaneous
• Glacier Flexible Retrieval: $0.0036; GET $0.0004, POST $0.03; retrieval requests Expedited $10, Standard $0.05, Bulk free; retrieval time Expedited (1 – 5 mins), Standard (3 – 5 hours), Bulk (5 – 12 hours)
• Glacier Deep Archive: $0.00099; GET $0.0004, POST $0.05; retrieval requests Standard $0.10, Bulk $0.025; retrieval time Standard (12 hours), Bulk (48 hours)
https://aws.amazon.com/s3/pricing/
AWS CLI, SDK, IAM Roles & Policies
EC2 Instance Metadata (IMDS)
• AWS EC2 Instance Metadata (IMDS) is powerful but one of the least known
features to developers
• It allows AWS EC2 instances to ”learn about themselves” without using an
IAM Role for that purpose.
• The URL is http://169.254.169.254/latest/meta-data
• You can retrieve the IAM Role name from the metadata, but you CANNOT
retrieve the IAM Policy.
• Metadata = Info about the EC2 instance
• Userdata = launch script of the EC2 instance
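Newer instances default to IMDSv2, which adds a session-token step before reading metadata. A minimal standard-library sketch (to be run on an EC2 instance; the two-step token flow and header names follow the documented IMDSv2 protocol):

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"

def imds_token_request(ttl=21600):
    # Step 1 of IMDSv2: PUT to the token endpoint, with a TTL header
    return urllib.request.Request(
        f"{IMDS}/api/token", method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl)})

def imds_get_request(path, token):
    # Step 2: GET a metadata path, presenting the session token
    return urllib.request.Request(
        f"{IMDS}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token})

# On an actual instance:
# token = urllib.request.urlopen(imds_token_request()).read().decode()
# ami = urllib.request.urlopen(imds_get_request("ami-id", token)).read().decode()
```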
• Fun fact… the AWS CLI uses the Python SDK (boto3)
• The exam expects you to know when you should use an SDK
• We’ll practice the AWS SDK when we get to the Lambda functions
AWS CLI Credentials Provider Chain
• The CLI will look for credentials in this order
• Example scenario: an IAM Instance Profile was assigned to the EC2 instance, but it still had access to all S3 buckets. Why?
• Because the credentials chain still gives priority to the environment variables
AWS Credentials Best Practices
• Overall, NEVER EVER STORE AWS CREDENTIALS IN YOUR CODE
• Best practice is for credentials to be inherited from the credentials chain
S3 – Moving between Storage Classes
• For infrequently accessed objects, move them to Standard IA (or let Intelligent Tiering handle it)
• Moving objects can be automated with Lifecycle Rules – a good first step to put together
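A lifecycle configuration (the JSON shape accepted by `aws s3api put-bucket-lifecycle-configuration`; the rule ID and the 30-day threshold are illustrative) might look like:

```json
{
  "Rules": [
    {
      "ID": "MoveToStandardIA",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" }
      ]
    }
  ]
}
```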
S3 Event Notifications
[Diagram: Amazon S3 sends event notifications to SNS (requires an SNS Resource (Access) Policy), to SQS (requires an SQS Resource (Access) Policy), and to Lambda Functions (requires a Lambda Resource Policy)]
S3 Event Notifications with Amazon EventBridge
[Diagram: with the EventBridge integration, all events from the Amazon S3 bucket go to Amazon EventBridge, where rules can target over 18 AWS services as destinations]
S3 Select & Glacier Select
• Retrieve less data using SQL by performing server-side filtering
• Can filter by rows & columns (simple SQL statements)
• Less network transfer, less CPU cost client-side
[Diagram: S3 performs server-side filtering on a CSV file and returns only the filtered subset to the client]
https://aws.amazon.com/blogs/aws/s3-glacier-select/
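A hedged boto3 sketch of S3 Select (the helper and names below are illustrative; `select_object_content` and its serialization parameters are the real boto3 API):

```python
def select_params(bucket, key, expression):
    """Build the arguments for s3.select_object_content:
    a SQL expression evaluated server-side over a CSV object."""
    return dict(
        Bucket=bucket, Key=key,
        ExpressionType="SQL", Expression=expression,
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}})

# Usage (requires AWS credentials and boto3):
# s3 = boto3.client("s3")
# resp = s3.select_object_content(**select_params(
#     "my-bucket", "data.csv",
#     "SELECT s.name FROM S3Object s WHERE s.city = 'Paris'"))
# for event in resp["Payload"]:
#     if "Records" in event:
#         print(event["Records"]["Payload"].decode())
```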
S3 User-Defined Object Metadata & S3 Object Tags
• S3 User-Defined Object Metadata
• When uploading an object, you can also assign metadata
• Name-value (key-value) pairs
• User-defined metadata names must begin with "x-amz-meta-"
• Amazon S3 stores user-defined metadata keys in lowercase
• Metadata can be retrieved while retrieving the object
• S3 Object Tags
• Key-value pairs for objects in Amazon S3
• Useful for fine-grained permissions (only access specific objects with specific tags)
• Useful for analytics purposes (using S3 Analytics to group by tags)
• You cannot search the object metadata or object tags
• Instead, you must use an external DB as a search index such as DynamoDB (index the data in a DynamoDB table, which is searchable)
[Example metadata: Content-Length 7.5 KB, Content-Type html, x-amz-meta-origin paris; example tags: Project=Blue, PHI=True]
Amazon S3 – Security
Amazon S3 – Object Encryption
• You can encrypt objects in S3 buckets using one of 4 methods
• It’s important to understand which ones are for which situation for the exam
Amazon S3 Encryption – SSE-S3
• Encryption using keys handled, managed, and owned by AWS
• Object is encrypted server-side
• Encryption type is AES-256
• Must set header "x-amz-server-side-encryption": "AES256"
• Enabled by default for new buckets & new objects
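A hedged boto3 sketch of how that header is set in practice (bucket/key names are placeholders; the helper is illustrative, but `ServerSideEncryption` is the real `put_object` parameter):

```python
def put_object_args(bucket, key, body):
    """Arguments for boto3's s3.put_object requesting SSE-S3.

    boto3 translates ServerSideEncryption="AES256" into the
    "x-amz-server-side-encryption: AES256" request header."""
    return dict(Bucket=bucket, Key=key, Body=body,
                ServerSideEncryption="AES256")

# Usage (requires credentials):
# boto3.client("s3").put_object(**put_object_args("my-bucket", "file.txt", b"hi"))
```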
[Diagram residue from the encryption upload flows (SSE-S3 / SSE-KMS / SSE-C / Client-Side Encryption): the object is uploaded over HTTP(S); with server-side encryption, Amazon S3 encrypts it after upload using the applicable key; with SSE-C, the client supplies its own key alongside the upload; with client-side encryption, the file is encrypted before upload and stored encrypted in the S3 bucket]
Amazon S3 – Encryption in transit (SSL/TLS)
• Encryption in flight is also called SSL/TLS
• HTTPS is recommended
• HTTPS is mandatory for SSE-C
• Most clients would use the HTTPS endpoint by default
Amazon S3 – Force Encryption in Transit
[Diagram: using the aws:SecureTransport condition, a bucket policy on the S3 bucket (my-bucket) denies requests made over http while allowing requests over https, e.g. from a user in Account B]
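A bucket policy enforcing encryption in transit might look like the following sketch (the bucket name `my-bucket` and the Sid are placeholders; `aws:SecureTransport` is the real condition key):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ],
      "Condition": { "Bool": { "aws:SecureTransport": "false" } }
    }
  ]
}
```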
What is CORS?
[Diagram: the Web Browser loads https://www.example.com from the Origin Web Server, then makes a cross-origin HTTPS request (GET /, Host: www.other.com, Origin: https://www.example.com) to the Cross-Origin Web Server https://www.other.com; the preflight response contains Access-Control-Allow-Origin: https://www.example.com and Access-Control-Allow-Methods: GET, PUT, DELETE, so with the CORS headers received the Web Browser can make the requests]
Amazon S3 – CORS
• If a client makes a cross-origin request on our S3 bucket, we need to enable
the correct CORS headers
• It’s a popular exam question
• You can allow for a specific origin or for * (all origins)
(Diagram: the Web Browser loads index.html with GET /index.html, Host: http://my-bucket-html.s3-website.us-west-2.amazonaws.com (S3 Bucket my-bucket-html, Static Website Enabled); it then requests GET /images/coffee.jpg from Host: http://my-bucket-assets.s3-website.us-west-2.amazonaws.com (S3 Bucket my-bucket-assets, Static Website Enabled) with Origin: http://my-bucket-html.s3-website.us-west-2.amazonaws.com; the assets bucket must reply with Access-Control-Allow-Origin: http://my-bucket-html.s3-website.us-west-2.amazonaws.com)
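A sketch of the CORS configuration the assets bucket would need (same JSON shape as the S3 CORS configuration document; the origin is the website bucket from the diagram):

```python
import json

# CORS configuration allowing the HTML bucket's website origin
# to GET assets from this bucket.
cors_configuration = {
    "CORSRules": [{
        "AllowedOrigins": ["http://my-bucket-html.s3-website.us-west-2.amazonaws.com"],
        "AllowedMethods": ["GET"],
        "AllowedHeaders": ["*"],
        "MaxAgeSeconds": 3000,  # how long the browser may cache the preflight
    }]
}

print(json.dumps(cors_configuration, indent=2))
```

You can allow a specific origin as above, or "*" for all origins.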
Amazon S3 – MFA Delete
• MFA (Multi-Factor Authentication) – force users to generate a code on a device (usually a mobile phone or hardware) before doing important operations on S3
• MFA will be required to:
• Permanently delete an object version
• Suspend Versioning on the bucket
• MFA won't be required to:
• Enable Versioning
• List deleted versions
• MFA devices: virtual (e.g., Google Authenticator) or an MFA Hardware Device
• To use MFA Delete, Versioning must be enabled on the bucket
• Only the bucket owner (root account) can enable/disable MFA Delete
S3 Access Logs
• For audit purpose, you may want to log all access to S3 buckets
• Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket (the Logging Bucket)
• That data can be analyzed using data analysis tools…
• The target logging bucket must be in the same AWS region
• The log format is at: https://docs.aws.amazon.com/AmazonS3/latest/dev/LogFormat.html
S3 Access Logs: Warning
• Do not set your logging bucket to be the monitored bucket
• It will create a logging loop (every PutObject of a log file is itself logged), and your bucket will grow exponentially
S3 – Pre-Signed URLs
• Generate pre-signed URLs using the S3 Console, AWS CLI or SDK
• URL Expiration
• S3 Console – 1 min up to 720 mins (12 hours)
• AWS CLI – configure expiration with --expires-in parameter in seconds (default 3600 secs, max. 604800 secs ~ 168 hours)
• Users given a pre-signed URL inherit the permissions of the user that generated the URL for GET / PUT
• Examples:
• Allow only logged-in users to download a premium video from your S3 bucket (Private)
• Allow an ever-changing list of users to download files by generating URLs dynamically
• Allow temporarily a user to upload a file to a precise location in your S3 bucket
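To illustrate the idea of an expiring, tamper-proof URL, here is a toy sketch. This is NOT the real AWS Signature Version 4 algorithm that S3 uses; the secret and paths are hypothetical:

```python
import hashlib
import hmac
from urllib.parse import urlencode

SECRET = b"demo-secret"  # stands in for the signer's credentials (hypothetical)

def presign(path, expires_in, now):
    # embed an absolute expiration time and sign path + expiry
    expires = now + expires_in
    sig = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'Expires': expires, 'Signature': sig})}"

def is_valid(url, now):
    # recompute the signature; reject if expired or tampered with
    path, query = url.split("?", 1)
    params = dict(p.split("=", 1) for p in query.split("&"))
    expected = hmac.new(SECRET, f"{path}:{params['Expires']}".encode(),
                        hashlib.sha256).hexdigest()
    return now <= int(params["Expires"]) and hmac.compare_digest(expected, params["Signature"])

url = presign("/premium/video.mp4", expires_in=3600, now=1_700_000_000)
print(is_valid(url, now=1_700_000_000 + 10))    # True – within the expiration
print(is_valid(url, now=1_700_000_000 + 7200))  # False – the URL has expired
```

The real service additionally binds the signature to the signer's identity, which is why the URL inherits that user's permissions.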
S3 – Access Points
(Diagram: one Simple Bucket Policy on an S3 Bucket with /finance/… and /sales/… prefixes, fronted by three Access Points, each with its own policy:
• Finance Access Point – grants the Finance users R/W to the /finance prefix
• Sales Access Point – grants the Sales users R/W to the /sales prefix
• Analytics Access Point – grants the Analytics users R to the entire bucket)
(Diagram: on a cache miss, the Edge location forwards the request to your Origin (S3 or HTTP) and stores the response in its local cache)
CloudFront – S3 as an Origin
(Diagram: users connect over the public www to the nearest Edge location (e.g., Los Angeles, Mumbai) inside the AWS Cloud; the Edge locations fetch content over the private AWS network from the origin, such as an S3 bucket or an EC2 Instance)
CloudFront Policies – Cache Policy
• Cache based on:
• HTTP Headers: None – Whitelist
• Cookies: None – Whitelist – Include All-Except – All
• Query Strings: None – Whitelist – Include All-Except – All
• Control the TTL (0 seconds to 1 year), can be set by the origin using
the Cache-Control header, Expires header…
• Create your own policy or use Predefined Managed Policies
• All HTTP headers, cookies, and query strings that you include in the
Cache Key are automatically included in origin requests
CloudFront Caching – Cache Policy
HTTP Headers
Example request from the Client to CloudFront:
GET /blogs/myblog.html HTTP/1.1
Host: mywebsite.com
User-Agent: Mozilla/5.0 (Mac OS X 10_15_2….)
Date: Tue, 28 Jan 2021 17:01:57 GMT
Authorization: SAPISIDHASH fdd00ecee39fe….
Keep-Alive: 300
Language: fr-fr
• None:
• Don't include any headers in the Cache Key (except default)
• Headers are not forwarded (except default)
• Best caching performance
• Whitelist:
• Only specified headers included in the Cache Key
• Specified headers are also forwarded to Origin
CloudFront Cache – Cache Policy
Query Strings
Example request from the Client to CloudFront:
GET /image/cat.jpg?border=red&size=large HTTP/1.1
• None
• Don't include any query strings in the Cache Key
• Query strings are not forwarded
• Whitelist
• Only specified query strings included in the Cache Key
• Only specified query strings are forwarded
• Include All-Except
• Include all query strings in the Cache Key except the specified list
• All query strings are forwarded except the specified list
• All
• Include all query strings in the Cache Key
• All query strings are forwarded
• Worst caching performance
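A sketch of how a cache policy shapes the Cache Key: only whitelisted headers / query strings take part, so two requests differing only in non-whitelisted values hit the same cached object (the whitelists below are illustrative, not CloudFront defaults):

```python
def cache_key(path, headers, query, header_whitelist, query_whitelist):
    # only whitelisted headers and query strings contribute to the key
    h = tuple(sorted((k.lower(), v) for k, v in headers.items()
                     if k.lower() in header_whitelist))
    q = tuple(sorted((k, v) for k, v in query.items() if k in query_whitelist))
    return (path, h, q)

# Two requests that differ only in User-Agent and the "border" query string:
a = cache_key("/image/cat.jpg", {"User-Agent": "Mozilla"},
              {"border": "red", "size": "large"},
              header_whitelist=set(), query_whitelist={"size"})
b = cache_key("/image/cat.jpg", {"User-Agent": "curl"},
              {"border": "blue", "size": "large"},
              header_whitelist=set(), query_whitelist={"size"})
print(a == b)  # True: only the whitelisted "size" query string matters
```

The bigger the key, the more cache variants, hence "All" giving the worst caching performance.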
CloudFront Policies – Origin Request Policy
• Specify values that you want to include in origin requests without
including them in the Cache Key (no duplicated cached content)
• You can include:
• HTTP headers: None – Whitelist – All viewer headers
• Cookies: None – Whitelist – All
• Query Strings: None – Whitelist – All
• Ability to add CloudFront HTTP headers and Custom Headers to an
origin request that were not included in the viewer request
• Create your own policy or use Predefined Managed Policies
Cache Policy vs. Origin Request Policy
Viewer request (received by CloudFront):
GET /content/stories/example-story.html?ref=123abc&split-pages=false HTTP/1.1
Host: mywebsite.com
User-Agent: Mozilla/5.0 (Mac OS X 10_15_2….)
Date: Tue, 28 Jan 2021 17:01:57 GMT
Authorization: SAPISIDHASH fdd00ecee39fe….
Keep-Alive: 300
Accept-Ranges: bytes
Cookie: session_id=12344321
Origin request (forwarded to the S3 Bucket origin):
GET /content/stories/example-story.html?ref=123abc HTTP/1.1
Host: mywebsite.com
User-Agent: Mozilla/5.0 (Mac OS X 10_15_2….)
Authorization: SAPISIDHASH fdd00ecee39fe….
Cookie: session_id=12344321
(Diagram: Users authenticate against the /login behavior of the CloudFront Distribution, backed by an EC2 Instance that generates Signed Cookies; subsequent requests to /* are authorized with the Signed Cookies)
CloudFront – Maximize cache hits by separating static and dynamic distributions
(Diagram: at the CDN Layer, static requests go through CloudFront straight to the static content, maximizing cache hits; dynamic content (REST, HTTP server) is served by ALB + EC2 and cached based on the correct headers and cookies)
http://d7uri8nf7uskq.cloudfront.net/tools/list-cloudfront-ips
• Signed URL = access to individual files (one signed URL per file)
• Signed Cookies = access to multiple files (one signed cookie for many files)
CloudFront Signed URL Diagram
(Diagram: the Client goes through Authentication + Authorization in your application, which generates and returns a Signed URL; the Client then uses the Signed URL against an Edge location, and Amazon CloudFront fetches the Object from Amazon S3 via OAC. In contrast, an S3 Pre-Signed URL lets the Client reach the S3 origin directly, bypassing the Edge locations)
CloudFront Signed URL Process
• Two types of signers:
• Either a trusted key group (recommended)
• Can leverage APIs to create and rotate keys (and IAM for API security)
• An AWS Account that contains a CloudFront Key Pair
• Need to manage keys using the root account and the AWS console
• Not recommended because you shouldn’t use the root account for this
• In your CloudFront distribution, create one or more trusted key groups
• You generate your own public / private key
• The private key is used by your applications (e.g. EC2) to sign URLs
• The public key (uploaded) is used by CloudFront to verify URLs
CloudFront - Pricing
• CloudFront Edge locations are all around the world
• The cost of data out per edge location varies (from lower to higher depending on the region)
CloudFront – Price Classes
• You can reduce the number of edge locations for cost reduction
• Three price classes:
1. Price Class All: all regions – best performance
2. Price Class 200: most regions, but excludes the most expensive regions
3. Price Class 100: only the least expensive regions
CloudFront - Price Class
(Diagram: world map of the edge locations included in Price Class 100, Price Class 200, and Price Class All)
CloudFront – Multiple Origin
• To route to different kinds of origins based on the content type
• Based on path pattern:
• /images/*
• /api/*
• /*
(Diagram: Amazon CloudFront cache behaviors route /api/* to an Application Load Balancer origin and /* to an S3 Bucket origin)
CloudFront – Origin Groups
• To increase high-availability and do failover
• Origin Group: one primary and one secondary origin
• If the primary origin fails, the second one is used
(Diagram: the Client's request goes to Origin A (Primary Origin); if it responds with an error status code, CloudFront tries the same request against Origin B. Replication between the two origins' infrastructure keeps them in sync)
Getting Started with Docker
(Diagram: a Dockerfile is built into a Docker image, which is pushed to and pulled from a Docker Repository such as Amazon ECR)
Docker Containers Management on AWS
• Amazon Elastic Container Service (Amazon ECS)
• Amazon's own container platform
• AWS Fargate
• Amazon's own Serverless container platform
• Works with ECS and with EKS
Amazon ECS – Data Volumes (EFS)
• Mount EFS file systems onto ECS tasks
• Works for both EC2 and Fargate launch types
• Tasks running in any AZ will share the same data in the EFS file system
• Fargate + EFS = Serverless
(Diagram: tasks on an EC2 Instance and on Fargate both mount the same EFS File System)
ECS Service Auto Scaling
• Automatically increase/decrease the desired number of ECS tasks
• Target Tracking – scale based on target value for a specific CloudWatch metric
• Step Scaling – scale based on a specified CloudWatch Alarm
• Scheduled Scaling – scale based on a specified date/time (predictable changes)
• ECS Service Auto Scaling (task level) ≠ EC2 Auto Scaling (EC2 instance level)
• Fargate Auto Scaling is much easier to set up (because Serverless)
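ECS Service Auto Scaling is configured through Application Auto Scaling. As a sketch, these are the kinds of parameters you would pass for Target Tracking on CPU (cluster and service names are hypothetical):

```python
# Scalable target: the ECS service's DesiredCount, bounded between 1 and 10 tasks
scalable_target = {
    "ServiceNamespace": "ecs",
    "ResourceId": "service/my-cluster/my-service",  # hypothetical names
    "ScalableDimension": "ecs:service:DesiredCount",
    "MinCapacity": 1,
    "MaxCapacity": 10,
}

# Target Tracking policy: keep average CPU of the service around 50%
scaling_policy = {
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
}

print(scalable_target["ResourceId"], scaling_policy["PolicyType"])
```

These dicts mirror what you would pass to the Application Auto Scaling RegisterScalableTarget and PutScalingPolicy calls.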
EC2 Launch Type – Auto Scaling EC2 Instances
• Accommodate ECS Service Scaling by adding underlying EC2 Instances
(Diagram: a CloudWatch Metric (ECS Service CPU Usage) triggers an optional CloudWatch Alarm; Service A scales out (Task 3 is new), and the Auto Scaling Group or ECS Capacity Providers scale the underlying EC2 instances)
ECS Rolling Updates
• When updating from v1 to v2, we can control how many tasks can be started and stopped, and in which order (set in the ECS Service update screen, relative to the Actual Running Capacity of 100%)
(Diagram: batches of v1 tasks are stopped and replaced by v2 tasks until all tasks run v2)
ECS Rolling Update – Min 100%, Max 150%
• Starting number of tasks: 4
(Diagram: 2 new v2 tasks are started first (150% capacity), then v1 tasks are stopped and replaced in batches until all 4 tasks run v2)
ECS tasks invoked by EventBridge
(Diagram: a Client uploads an object to an S3 Bucket in the Region; Amazon EventBridge matches the event and a Rule runs a new ECS Task on AWS Fargate inside the VPC's Amazon ECS Cluster; the task's ECS Task Role (access to S3 & DynamoDB) lets it get the object and save the result into Amazon DynamoDB)
ECS tasks invoked by EventBridge Schedule
(Diagram: an Amazon EventBridge Rule fires every 1 hour and runs a new ECS Task (Batch Processing) on AWS Fargate in the Amazon ECS Cluster; the ECS Task Role grants access to Amazon S3)
ECS – SQS Queue Example
(Diagram: messages arrive in an SQS Queue; Service A's tasks poll for messages, and ECS Service Auto Scaling adds tasks (Task 3 is new) as the load grows)
ECS – Intercept Stopped Tasks using EventBridge
(Diagram: when the containers of an ECS Task exit, the task-stopped event matches an EventBridge Event Pattern, which publishes to SNS to notify the Administrator)
Amazon ECS – Task Definitions
• Task definitions are metadata in JSON form to tell ECS how to run a Docker container
• It contains crucial information, such as:
• Image Name
(Diagram – Load Balancing, EC2 Launch Type: ECS Tasks on an EC2 Instance (e.g., 172.17.35.88) expose container port 80 through a Host Port such as 8080; Users reach the Application Load Balancer, which routes to the right host port on the instances)
• Fargate: only define the container port (the host port is not applicable)
(Diagram – Load Balancing, Fargate: each ECS Task gets its own ENI and private IP (e.g., 172.18.8.192, 172.16.4.6) and exposes port 80; Users reach the ALB on port 80/443, which forwards to each task's port 80 in the ECS Cluster)
• Example security groups:
• ECS ENI Security Group – Allow port 80 from the ALB
• ALB Security Group – Allow port 80/443 from web
One IAM Role per Task Definition
(Diagram: in Amazon ECS, Task Definition A specifies the ECS Task A Role, which Service A's tasks use to access S3; Task Definition B specifies the ECS Task B Role, which Service B's tasks use to access DynamoDB)
Amazon ECS – Environment Variables
• Environment Variable
• Hardcoded – e.g., URLs
• SSM Parameter Store – sensitive variables (e.g., API keys, shared configs)
• Secrets Manager – sensitive variables (e.g., DB passwords)
• Environment Files (bulk) – Amazon S3
(Diagram: at launch, the Task Definition fetches values from SSM Parameter Store and Secrets Manager, and fetches environment files from an S3 Bucket)
Amazon ECS – Data Volumes (Bind Mounts)
• Share data between multiple containers in the same Task Definition
• Works for both EC2 and Fargate tasks
• EC2 Tasks – using EC2 instance storage
• Data are tied to the lifecycle of the EC2 instance
• Use cases:
• Share ephemeral data between multiple containers
• "Sidecar" container pattern, where the "sidecar" container is used to send metrics/logs to other destinations (separation of concerns)
(Diagram: inside an ECS Task, the Application Containers and a Metrics & Logs Container (Sidecar) both bind-mount the Shared Storage (/var/logs/))
Amazon ECS – Task Placement
• When an ECS task is started with EC2 Launch Type, ECS must determine where to place the new Docker container, within the constraints of CPU and memory (RAM)
• When Amazon ECS places a task, it uses the following process to select the appropriate EC2 container instance:
1. Identify the instances that satisfy the CPU, memory, and port requirements
2. Identify the instances that satisfy the Task Placement Constraints
3. Identify the instances that satisfy the Task Placement Strategies
4. Select the instances
Amazon ECS – Task Placement Strategies
• Binpack
• Tasks are placed on the least available amount of CPU and Memory
• Minimizes the number of EC2 instances in use (cost savings)
Amazon ECS – Task Placement Strategies
• You can mix them together
Amazon ECS – Task Placement Constraints
• distinctInstance
• Tasks are placed on a different EC2 instance
• memberOf
• Tasks are placed on EC2 instances that satisfy a specified expression
• Uses the Cluster Query Language (advanced)
Amazon ECR
• ECR = Elastic Container Registry
• Store and manage Docker images on AWS
(Diagram: an ECR Repository holds Docker Image A and Docker Image B)
• Docker Commands
• Push
docker push aws_account_id.dkr.ecr.region.amazonaws.com/demo:latest
• Pull
docker pull aws_account_id.dkr.ecr.region.amazonaws.com/demo:latest
• In case an EC2 instance (or you) can’t pull a Docker image, check IAM
permissions
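The image URI in the commands above encodes the account, region, and repository. A small sketch pulling it apart (the account id is a placeholder):

```python
def parse_ecr_uri(uri: str) -> dict:
    """Split an ECR image URI of the form
    aws_account_id.dkr.ecr.region.amazonaws.com/repo:tag"""
    registry, image = uri.split("/", 1)
    account_id, _dkr, _ecr, region, *_ = registry.split(".")
    repo, _, tag = image.partition(":")
    return {"account_id": account_id, "region": region,
            "repository": repo, "tag": tag or "latest"}

print(parse_ecr_uri("123456789012.dkr.ecr.us-east-1.amazonaws.com/demo:latest"))
# {'account_id': '123456789012', 'region': 'us-east-1', 'repository': 'demo', 'tag': 'latest'}
```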
AWS Copilot
• CLI tool to build, release, and operate production-ready containerized apps
• Run your apps on AppRunner, ECS, and Fargate
• Helps you focus on building apps rather than setting up infrastructure
• Provisions all required infrastructure for containerized apps (ECS, VPC, ELB, ECR…)
• Automated deployments with one command using CodePipeline
• Deploy to multiple environments
• Troubleshooting, logs, health status…
(Diagram: Copilot provisions a well-architected Amazon ECS infrastructure setup)
Amazon EKS
(Diagram: an EKS cluster deployed in a VPC across Public subnet 1, Public subnet 2, and Public subnet 3, with ELBs and NAT Gateways (NGW), exposing a public Service LB)
• Self-Managed Nodes
• Nodes created by you and registered to the EKS cluster and managed by an ASG
• You can use prebuilt AMI - Amazon EKS Optimized AMI
• Supports On-Demand or Spot Instances
• AWS Fargate
• No maintenance required; no nodes managed
Amazon EKS – Data Volumes
• Need to specify StorageClass manifest on your EKS cluster
• Leverages a Container Storage Interface (CSI) compliant driver
• Support for…
• Amazon EBS
• Amazon EFS (works with Fargate)
• Amazon FSx for Lustre
• Amazon FSx for NetApp ONTAP
AWS Elastic Beanstalk
Deploying applications in AWS safely and predictably
Typical architecture: Web App 3-tier
(Diagram: Route 53 resolves users to an ELB spanning Availability Zones 1–3; the ELB targets EC2 instances in an Auto Scaling group; the instances read / write data in Amazon RDS (Multi AZ) and store / retrieve session data and cached data in ElastiCache)
Elastic Beanstalk Deployment
Rolling
• Application is running below capacity during the deployment
• Can set the bucket size
• Application is running both versions simultaneously
• No additional cost
• Long deployment
(Diagram: with a bucket size of 2, two instances at a time are taken out of service behind the ELB, updated from v1 to v2, and put back, until all instances across the Availability Zones run v2)
Elastic Beanstalk Deployment
Rolling with additional batches
• Application is running at capacity
• Can set the bucket size
• Application is running both versions simultaneously
• Small additional cost
• Additional batch is removed at the end of the deployment
• Longer deployment
• Good for prod
(Diagram: a new batch of v2 instances is launched first, the existing instances are then updated in batches, and the additional batch is terminated at the end)
Elastic Beanstalk Deployment
Immutable
• Zero downtime
• New Code is deployed to new instances on a temporary ASG
• High cost, double capacity
• Longest deployment
(Diagram: v2 instances are launched into a Temp ASG alongside the Current ASG of v1 instances; once healthy, the v2 instances are moved to the Current ASG and the v1 instances are terminated)
Elastic Beanstalk Deployment
Blue / Green
• Not a "direct feature" of Elastic Beanstalk
• Zero downtime and release facility
• Create a new "stage" environment and deploy v2 there
• The new environment (green) can be validated independently and rolled back if there are issues
• Route 53 can be setup using weighted policies to redirect a little bit of traffic to the stage environment
• Using Beanstalk, "swap URLs" when done with the environment test
(Diagram: Amazon Route 53 sends 90% of web traffic to Environment "blue" running v1 and 10% to Environment "green" running v2)
Elastic Beanstalk - Traffic Splitting
• Canary Testing
• New application version is deployed to a temporary ASG with the same capacity
• A small % of traffic is sent to the temporary ASG for a configurable amount of time
• Deployment health is monitored
• If there's a deployment failure, this triggers an automated rollback (very quick)
• No application downtime
• New instances are migrated from the temporary to the original ASG
• Old application version is then terminated
(Diagram: the ALB sends 90% of traffic to v1 instances in the Main ASG and 10% to v2 instances in the Temporary ASG; the v2 instances are then migrated to the Main ASG)
Elastic Beanstalk Deployment Summary
from AWS Doc
• https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-
features.deploy-existing-version.html
Elastic Beanstalk CLI
• We can install an additional CLI called the “EB cli” which makes working with
Beanstalk from the CLI easier
• Basic commands are:
• eb create
• eb status
• eb health
• eb events
• eb logs
• eb open
• eb deploy
• eb config
• eb terminate
• It’s helpful for your automated deployment pipelines!
Elastic Beanstalk Deployment Process
• Package code as zip, and describe dependencies
• Python: requirements.txt
• Node.js: package.json
• Console: upload zip file (creates new app version), and then deploy
• CLI: create new app version using CLI (uploads zip), and then deploy
• Elastic Beanstalk will deploy the zip on each EC2 instance, resolve
dependencies and start the application
Beanstalk Lifecycle Policy
• Elastic Beanstalk can store at most 1000 application versions
• If you don’t remove old versions, you won’t be able to deploy anymore
• To phase out old application versions, use a lifecycle policy
• Based on time (old versions are removed)
• Based on space (when you have too many versions)
• Versions that are currently used won’t be deleted
• Option not to delete the source bundle in S3 to prevent data loss
Elastic Beanstalk Extensions
• A zip file containing our code must be deployed to Elastic Beanstalk
• All the parameters set in the UI can be configured with code using files
• Requirements:
• in the .ebextensions/ directory in the root of source code
• YAML / JSON format
• .config extensions (example: logging.config)
• Able to modify some default settings using: option_settings
• Ability to add resources such as RDS, ElastiCache, DynamoDB, etc…
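As an illustration, a hypothetical .ebextensions/environment.config (the namespaces are real Elastic Beanstalk option namespaces; the variable name and values are made up):

```yaml
# .ebextensions/environment.config (hypothetical example)
option_settings:
  aws:elasticbeanstalk:application:environment:
    API_URL: https://api.example.com   # custom environment variable
  aws:autoscaling:asg:
    MinSize: 2                         # modify default ASG settings
    MaxSize: 4
```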
Elastic Beanstalk & CloudFormation
• Under the hood, Elastic Beanstalk relies on CloudFormation: you declare the resources you need
• Then CloudFormation creates those for you, in the right order, with the exact configuration that you specify
Benefits of AWS CloudFormation (1/2)
• Infrastructure as code
• No resources are manually created, which is excellent for control
• The code can be version controlled for example using git
• Changes to the infrastructure are reviewed through code
• Cost
• Each resource within the stack is tagged with an identifier so you can easily see how much a stack costs you
• You can estimate the costs of your resources using the CloudFormation template
• Savings strategy: in Dev, you could automate deletion of stacks at 5 PM and recreation at 8 AM, safely
Benefits of AWS CloudFormation (2/2)
• Productivity
• Ability to destroy and re-create an infrastructure on the cloud on the fly
• Automated generation of Diagram for your templates!
• Declarative programming (no need to figure out ordering and orchestration)
• Separation of concern: create many stacks for many apps, and many layers. Ex:
• VPC stacks
• Network stacks
• App stacks
• Deploying CloudFormation templates – Automated way:
• Editing templates in a YAML file
• Using the AWS CLI (Command Line Interface) to deploy the templates
• Recommended way when you fully want to automate your flow
CloudFormation Building Blocks
Templates components (one course section for each):
1. Resources: your AWS resources declared in the template (MANDATORY)
2. Parameters: the dynamic inputs for your template
3. Mappings: the static variables for your template
4. Outputs: References to what has been created
5. Conditionals: List of conditions to perform resource creation
6. Metadata
Templates helpers:
1. References
2. Functions
Note:
This is an introduction to CloudFormation
• It can take over 3 hours to properly learn and master CloudFormation
• This section is meant so you get a good idea of how it works
• We’ll be slightly less hands-on than in other sections
• We’ll see how in no-time, we are able to get started with CloudFormation!
YAML Crash Course
• YAML and JSON are the languages you can
use for CloudFormation.
• JSON is horrible for CF
• YAML is great in so many ways
• Let’s learn a bit about it!
• The logical ID is for you to choose. It's how you name the condition
• The intrinsic function (logical) can be any of the following:
• Fn::And
• Fn::Equals
• Fn::If
• Fn::Not
• Fn::Or
Using a Condition
• Conditions can be applied to resources / outputs / etc…
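A minimal sketch of defining and using a condition (the parameter and resource names are hypothetical; the pattern follows the AWS documentation):

```yaml
Parameters:
  EnvType:
    Type: String
    AllowedValues: [dev, prod]
Conditions:
  CreateProdResources: !Equals [!Ref EnvType, prod]
Resources:
  ProdVolume:                       # hypothetical resource name
    Type: AWS::EC2::Volume
    Condition: CreateProdResources  # only created when EnvType is prod
    Properties:
      Size: 100
      AvailabilityZone: us-east-1a
```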
CloudFormation
Must Know Intrinsic Functions
• Ref
• Fn::GetAtt
• Fn::FindInMap
• Fn::ImportValue
• Fn::Join
• Fn::Sub
• Condition Functions (Fn::If, Fn::Not, Fn::Equals, etc…)
Fn::Ref
• The Fn::Ref function can be leveraged to reference
• Parameters => returns the value of the parameter
• Resources => returns the physical ID of the underlying resource (ex: EC2 ID)
• The shorthand for this in YAML is !Ref
Fn::GetAtt
• Attributes are attached to any resources you create
• To know the attributes of your resources, the best place to look at is
the documentation.
• For example: the AZ of an EC2 machine!
Fn::FindInMap
Accessing Mapping Values
• We use Fn::FindInMap to return a named value from a specific key
• !FindInMap [ MapName, TopLevelKey, SecondLevelKey ]
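For example, a Mappings section keyed by region, looked up with !FindInMap (the AMI IDs are hypothetical placeholders):

```yaml
Mappings:
  RegionMap:
    us-east-1:
      AMI: ami-0123456789abcdef0   # hypothetical AMI ID
    eu-west-3:
      AMI: ami-0fedcba9876543210   # hypothetical AMI ID
Resources:
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      # look up the AMI for whichever region the stack is deployed in
      ImageId: !FindInMap [RegionMap, !Ref "AWS::Region", AMI]
```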
Fn::ImportValue
• Import values that are exported in other templates
• For this, we use the Fn::ImportValue function
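A sketch of the export / import pair across two templates (resource and export names are hypothetical):

```yaml
# Template A – exports a value
Outputs:
  SSHSecurityGroup:
    Value: !Ref MyCompanySSHSecurityGroup   # hypothetical resource
    Export:
      Name: SSHSecurityGroup

# Template B – imports it
Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    Properties:
      SecurityGroups:
        - !ImportValue SSHSecurityGroup
```

Note that a stack whose outputs are imported elsewhere cannot be deleted until the importing stacks are.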
Fn::Join
• Join values with a delimiter
CloudFormation Rollbacks
• Stack Creation Fails:
• Default: everything rolls back (gets deleted). We can look at the log
• Option to disable rollback and troubleshoot what happened
ChangeSets
• When you update a stack, you need to know what changes before it
happens for greater confidence
• ChangeSets won’t say if the update will be successful
3. (optional) Create
Additional change sets
From: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stac
changesets.html
Nested stacks
• Nested stacks are stacks as part of other stacks
• They allow you to isolate repeated patterns / common components in
separate stacks and call them from other stacks
• Example:
• Load Balancer configuration that is re-used
• Security Group that is re-used
• Nested stacks are considered best practice
• To update a nested stack, always update the parent (root stack)
CloudFormation – Cross vs Nested Stacks
• Cross Stacks
• Helpful when stacks have different lifecycles
• Use Outputs Export and Fn::ImportValue
• When you need to pass export values to many stacks (VPC Id, etc…)
(Diagram: a VPC Stack exports values consumed by Stack 1, Stack 2, and Stack 3)
• Nested Stacks
• Helpful when components must be re-used
• Ex: re-use how to properly configure an Application Load Balancer
• The nested stack only is important to the higher level stack (it's not shared)
(Diagram: two App Stacks each contain their own RDS Stack, ASG Stack, and ELB Stack as nested stacks)
CloudFormation - StackSets
• Create, update, or delete stacks across multiple accounts and regions with a single operation
• Administrator account to create StackSets
• Trusted accounts to create, update, delete stack instances from StackSets
• When you update a stack set, all associated stack instances are updated throughout all accounts and regions
(Diagram: a CloudFormation StackSet in the Admin Account deploys stack instances into Account A us-east-1, Account A ap-south-1, and Account B eu-west-2)
CloudFormation Drift
• CloudFormation allows you to create infrastructure
• But it doesn’t protect you against manual configuration changes
• How do we know if our resources have drifted?
Amazon SQS – What's a queue?
(Diagram: Producers send messages into the SQS Queue; Consumers poll messages from it)
Amazon SQS – Standard Queue
• Oldest offering (over 10 years old)
• Fully managed service, used to decouple applications
• Attributes:
• Unlimited throughput, unlimited number of messages in queue
• Default retention of messages: 4 days, maximum of 14 days
• Low latency (<10 ms on publish and receive)
• Limitation of 256KB per message sent
(Diagram: the Consumer polls / receives messages, processes them (e.g., inserts into a database), and then calls DeleteMessage)
SQS – Multiple EC2 Instances Consumers
• Consumers receive and process messages in parallel
• At least once delivery
• Best-effort message ordering
• Consumers delete messages after processing them
• We can scale consumers horizontally to improve throughput of processing
(Diagram: multiple EC2 consumers poll the same SQS Queue in parallel)
SQS with Auto Scaling Group (ASG)
(Diagram: EC2 Instances in an Auto Scaling Group poll for messages from the SQS Queue, and the group scales with the load. Example: an auto-scaling Front-end web app sends requests to an SQS Queue, and an auto-scaling Back-end processing Application (video processing) polls and processes them)
Amazon SQS - Security
• Encryption:
• In-flight encryption using HTTPS API
• At-rest encryption using KMS keys
• Client-side encryption if the client wants to perform encryption/decryption itself
Amazon SQS – Message Visibility Timeout
• After a message is polled by a consumer, it becomes invisible to other consumers for the duration of the visibility timeout
• If a message is not processed within the visibility timeout, it will be processed twice
• A consumer could call the ChangeMessageVisibility API to get more time
• If visibility timeout is high (hours), and consumer crashes, re-processing will take time
• If visibility timeout is too low (seconds), we may get duplicates
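A toy model of the visibility timeout (this is a sketch of the behavior, not the SQS API): a received message is hidden from other consumers until the timeout elapses, and reappears if it was never deleted.

```python
import itertools

class ToyQueue:
    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # message id -> (body, invisible_until)
        self.ids = itertools.count()

    def send(self, body):
        self.messages[next(self.ids)] = (body, 0)

    def receive(self, now):
        for mid, (body, invisible_until) in self.messages.items():
            if now >= invisible_until:
                # hide the message from other consumers for the timeout
                self.messages[mid] = (body, now + self.visibility_timeout)
                return mid, body
        return None

    def delete(self, mid):
        self.messages.pop(mid, None)

q = ToyQueue(visibility_timeout=30)
q.send("process video")
first = q.receive(now=0)    # consumer 1 gets the message
hidden = q.receive(now=10)  # consumer 2 sees nothing (still invisible)
again = q.receive(now=40)   # not deleted within the timeout => delivered again
print(first, hidden, again)
```

Calling `delete` after successful processing is what prevents the third receive, which is exactly the "processed twice" failure mode described above.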
Amazon SQS – FIFO Queue
• FIFO = First In First Out (ordering of messages in the queue)
SQS – Dead Letter Queue (DLQ) – Redrive to Source
• When our code is fixed, we can redrive the messages from the DLQ back into the source queue (or any other queue) in batches without writing custom code
(Diagram: failed messages land in the DLQ for manual inspection and debugging; once fixed, they are redriven to the source queue for the Consumer)
Amazon SQS – Delay Queue
• Delay a message (consumers don’t see it immediately) up to 15 minutes
• Default is 0 seconds (message is available right away)
• Can set a default at queue level
• Can override the default on send using the DelaySeconds parameter
SQS – Must know API
• CreateQueue (MessageRetentionPeriod), DeleteQueue
• PurgeQueue: delete all the messages in queue
• SendMessage (DelaySeconds), ReceiveMessage, DeleteMessage
• MaxNumberOfMessages: default 1, max 10 (for ReceiveMessage API)
• ReceiveMessageWaitTimeSeconds: Long Polling
• ChangeMessageVisibility: change the message visibility timeout
SQS FIFO – Deduplication
(Diagram: the Producer sends "Hello world"; with content-based deduplication, the message body is hashed with SHA-256 (b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9), and sending the same body again within the deduplication interval is refused)
SQS FIFO – Message Grouping
• If you specify the same value of MessageGroupID in an SQS FIFO queue,
you can only have one consumer, and all the messages are in order
• To get ordering at the level of a subset of messages, specify different values
for MessageGroupID
• Messages that share a common Message Group ID will be in order within the group
• Each Group ID can have a different consumer (parallel processing!)
• Ordering across groups is not guaranteed
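A sketch of what message grouping buys you (group ids and bodies are made up): queue order is preserved within each MessageGroupId, and each group can be handled by its own consumer.

```python
from collections import defaultdict

messages = [  # arrival order in the FIFO queue
    {"group": "user-123", "body": "order placed"},
    {"group": "user-456", "body": "order placed"},
    {"group": "user-123", "body": "order paid"},
    {"group": "user-456", "body": "order cancelled"},
]

# one consumer per MessageGroupId; within a group, order is preserved
per_consumer = defaultdict(list)
for msg in messages:
    per_consumer[msg["group"]].append(msg["body"])

print(dict(per_consumer))
# {'user-123': ['order placed', 'order paid'],
#  'user-456': ['order placed', 'order cancelled']}
```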
Amazon SNS
(Diagram: instead of the Buying Service sending messages directly to the Fraud Service, Shipping Service, and every other consumer, it publishes once to an SNS Topic; subscribers such as the Fraud Service, Shipping Service, SQS Queues, HTTP(S) Endpoints, SMS & Emails, and Mobile Notifications all receive the message)
SNS integrates with a lot of AWS services
• Many AWS services can send data directly to SNS for notifications
• Examples: CloudWatch Alarms, AWS Budgets, Lambda, Auto Scaling Group (Notifications), S3 Bucket (Events), DynamoDB, CloudFormation (State Changes), AWS DMS (New Replica), RDS Events, and more all publish to SNS
Amazon SNS – How to publish
• Topic Publish (using the SDK)
• Create a topic
• Create a subscription (or many)
• Publish to the topic
SNS + SQS: Fan Out
• Push once in SNS, receive in all SQS queues that are subscribers
• Fully decoupled, no data loss
• SQS allows for: data persistence, delayed processing and retries of work
• Ability to add more SQS subscribers over time
• Make sure your SQS queue access policy allows for SNS to write
• Cross-Region Delivery: works with SQS Queues in other regions
Application: S3 Events to multiple queues
• For the same combination of: event type (e.g. object create) and prefix
(e.g. images/) you can only have one S3 Event rule
• If you want to send the same S3 event to many SQS queues, use fan-out
(Diagram: Amazon S3 object-created events are published to an SNS Topic, which fans out to multiple SQS Queues and a Lambda Function)
Application: SNS to Amazon S3 through
Kinesis Data Firehose
• SNS can send to Kinesis and therefore we can have the following
solutions architecture:
(Diagram: the Buying Service publishes to an SNS Topic, which delivers to Kinesis Data Firehose and on to Amazon S3 or any supported KDF Destination)
Amazon SNS – FIFO Topic
• FIFO = First In First Out (ordering of messages in the topic)
Kinesis Overview
• Makes it easy to collect, process, and analyze streaming data in real-time
• Ingest real-time data such as: Application logs, Metrics, Website clickstreams,
IoT telemetry data…
(Diagram – Kinesis Data Streams: producers using the SDK, KPL, or Kinesis Agent write 1 MB/sec or 1000 msg/sec per shard into the Stream (Shard 1 … Shard N); consumers such as Kinesis Data Firehose and Kinesis Data Analytics read at 2 MB/sec per shard shared across all consumers, or 2 MB/sec per shard per consumer with enhanced fan-out)
• On-demand mode:
• No need to provision or manage the capacity
• Default capacity provisioned (4 MB/s in or 4000 records per second)
• Scales automatically based on observed throughput peak during the last 30 days
• Pay per stream per hour & data in/out per GB
Kinesis Data Streams Security
• Control access / authorization using IAM policies
• Encryption in flight using HTTPS endpoints
• Encryption at rest using KMS
• You can implement encryption/decryption of data on client side (harder)
• VPC Endpoints available for Kinesis to access within VPC
(Diagram: an EC2 Instance in a private subnet of the VPC reaches the Stream's shards over HTTPS through a VPC Endpoint)
Kinesis Producers
(Diagram: IoT Devices send records (a partition key such as the Device Id, e.g., 444555666, plus a Data Blob of up to 1 MB) into the Kinesis Data Stream at 1 MB/sec or 1000 records/sec per shard)
• Use highly distributed partition key to avoid "hot partition"
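The "hot partition" warning can be sketched as follows: Kinesis routes each record by the MD5 hash of its partition key into one shard's hash-key range, so a low-cardinality key (e.g., a single device id for all records) sends everything to one shard. This simplified model assumes equal hash-key ranges per shard:

```python
import hashlib

def shard_for(partition_key: str, num_shards: int) -> int:
    """Map a partition key into one of num_shards equal slices
    of the 128-bit MD5 hash-key space (simplified model)."""
    digest = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    return digest // (2**128 // num_shards)

# Many distinct device ids spread the load across all shards...
keys = [f"device-{i}" for i in range(1000)]
print(sorted({shard_for(k, num_shards=4) for k in keys}))

# ...while a single repeated key always lands on the same ("hot") shard
print({shard_for("device-1", 4) for _ in range(100)})
```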
Kinesis - ProvisionedThroughputExceeded
(Diagram: applications writing more than 1 MB/sec or 1000 records/sec into a shard receive a ProvisionedThroughputExceeded exception)
• Solutions: use a highly distributed partition key, retry with exponential backoff, increase the number of shards (scaling)
Kinesis Consumers
• AWS Lambda
• Kinesis Data Analytics
• Kinesis Data Firehose
• Custom Consumer (AWS SDK) – Classic or Enhanced Fan-Out
• Kinesis Client Library (KCL): library to simplify reading from data stream
Kinesis Consumers – Custom Consumer
• Shared (Classic) Fan-out Consumer – GetRecords(): 2 MB/sec per shard shared across all consumers
• Enhanced Fan-out Consumer – SubscribeToShard(): 2 MB/sec per consumer per shard
(Diagram: with classic fan-out, Application A and Application B both call GetRecords() against Shard 1 / Shard 2 and share the 2 MB/sec per shard; with enhanced fan-out, each application receives its own 2 MB/sec per shard via SubscribeToShard())
KCL Example: 4 shards
(Diagram: two KCL apps running on EC2 read the Stream; one handles Shards 1–2, the other Shards 3–4, and both checkpoint progress to an Amazon DynamoDB table)
KCL Example: 4 shards, Scaling KCL App
(Diagram: four KCL apps running on EC2, one per shard, each checkpointing progress to Amazon DynamoDB)
KCL Example: 6 shards, Scaling Kinesis
(Diagram: the Stream is resharded to 6 shards; the existing KCL apps running on EC2 pick up the new shards (e.g., one app now also reads Shard 6) and continue checkpointing progress to Amazon DynamoDB)
KCL Example: 6 shards, Scaling KCL App
(Diagram: with 6 shards, the KCL app can scale up to 6 instances — at most one KCL app instance per shard — checkpointing progress to Amazon DynamoDB)
Kinesis Operation – Shard Splitting
• Used to increase the Stream capacity (1 MB/s data in per shard)
• Used to divide a “hot shard”
• The old shard is closed and will be deleted once the data is expired
• No automatic scaling (manually increase/decrease capacity)
• Can’t split into more than two shards in a single operation
(Diagram: Shard 2 is split into Shard 4 (new) and Shard 5 (new); Shards 1 and 3 are unchanged)
Kinesis Data Firehose
(Diagram: producers — Applications, Client via SDK / KPL, Kinesis Data Streams, CloudWatch Logs & Events — send records of up to 1 MB into Kinesis Data Firehose; an optional Lambda function performs data transformation; Firehose then batch-writes to AWS destinations such as Amazon S3, Amazon Redshift (COPY through S3), Amazon OpenSearch, or 3rd-party destinations like Datadog)
Kinesis Data Streams vs Firehose
• Kinesis Data Streams:
  • Streaming service for ingest at scale
  • Write custom code (producer / consumer)
  • Real-time (~200 ms)
  • Manage scaling (shard splitting / merging)
  • Data storage for 1 to 365 days
  • Supports replay capability
• Kinesis Data Firehose:
  • Load streaming data into S3 / Redshift / OpenSearch / 3rd party / custom HTTP
  • Fully managed
  • Near real-time (buffer time min. 60 sec)
  • Automatic scaling
  • No data storage
  • Doesn’t support replay capability
Kinesis Data Analytics for SQL applications
(Diagram: Sources — Kinesis Data Streams, Kinesis Data Firehose, and reference data in S3 — feed SQL statements running in Kinesis Data Analytics for SQL Applications; Sinks — Kinesis Data Streams (to AWS Lambda or anywhere) and Kinesis Data Firehose (to Amazon S3, Amazon Redshift via COPY through S3, or other Firehose destinations…))
Kinesis Data Analytics (SQL application)
• Real-time analytics on Kinesis Data Streams & Firehose using SQL
• Add reference data from Amazon S3 to enrich streaming data
• Fully managed, no servers to provision
• Automatic scaling
• Pay for actual consumption rate
• Output:
• Kinesis Data Streams: create streams out of the real-time analytics queries
• Kinesis Data Firehose: send analytics query results to destinations
• Use cases:
• Time-series analytics
• Real-time dashboards
• Real-time metrics
Kinesis Data Analytics for Apache Flink
• Use Flink (Java, Scala or SQL) to process and analyze streaming data
(Diagram: the Flink application reads from Kinesis Data Streams)
Ordering data into SQS
• You want to scale the number of consumers, but you want messages to be “grouped” when they are related to each other
• Then you use a Group ID (similar to Partition Key in Kinesis)
https://aws.amazon.com/blogs/compute/solving-complex-ordering-challenges-with-amazon-sqs-fifo-queues/
Kinesis vs SQS ordering
• Let’s assume 100 trucks, 5 kinesis shards, 1 SQS FIFO
• Kinesis Data Streams:
• On average you’ll have 20 trucks per shard
• Trucks will have their data ordered within each shard
• The maximum amount of consumers in parallel we can have is 5
• Can receive up to 5 MB/s of data
• SQS FIFO
• You only have one SQS FIFO queue
• You will have 100 Group IDs
• You can have up to 100 consumers (due to the 100 Group IDs)
• You have up to 300 messages per second (or 3000 if using batching)
SQS vs SNS vs Kinesis
• SQS:
  • Consumer “pull data”
  • Data is deleted after being consumed
  • Can have as many workers (consumers) as we want
  • No need to provision throughput
  • Ordering guarantees only on FIFO queues
  • Individual message delay capability
• SNS:
  • Push data to many subscribers
  • Up to 12,500,000 subscribers
  • Data is not persisted (lost if not delivered)
  • Pub/Sub
  • Up to 100,000 topics
  • No need to provision throughput
  • Integrates with SQS for fan-out architecture pattern
  • FIFO capability for SQS FIFO
• Kinesis:
  • Standard: pull data – 2 MB per shard
  • Enhanced fan-out: push data – 2 MB per shard per consumer
  • Possibility to replay data
  • Meant for real-time big data, analytics and ETL
  • Ordering at the shard level
  • Data expires after X days
  • Provisioned mode or on-demand capacity mode
AWS Monitoring, Troubleshooting
& Audit
CloudWatch, X-Ray and CloudTrail
Why Monitoring is Important
• We know how to deploy applications
• Safely
• Automatically
• Using Infrastructure as Code
• Leveraging the best AWS components!
• Our applications are deployed, and our users don’t care how we did it…
• Our users only care that the application is working!
• Application latency: will it increase over time?
• Application outages: customer experience should not be degraded
• Users contacting the IT department or complaining is not a good outcome
• Troubleshooting and remediation
• Internal monitoring:
• Can we prevent issues before they happen?
• Performance and Cost
• Trends (scaling patterns)
• Learning and Improvement
Monitoring in AWS
• AWS CloudWatch:
• Metrics: Collect and track key metrics
• Logs: Collect, monitor, analyze and store log files
• Events: Send notifications when certain events happen in your AWS account
• Alarms: React in real-time to metrics / events
• AWS X-Ray:
• Troubleshooting application performance and errors
• Distributed tracing of microservices
• AWS CloudTrail:
• Internal monitoring of API calls being made
• Audit changes to AWS Resources by your users
AWS CloudWatch Metrics
• CloudWatch provides metrics for every service in AWS
• Metric is a variable to monitor (CPUUtilization, NetworkIn…)
• Metrics belong to namespaces
• Dimension is an attribute of a metric (instance id, environment, etc…).
• Up to 30 dimensions per metric
• Metrics have timestamps
• Can create CloudWatch dashboards of metrics
EC2 Detailed monitoring
• EC2 instances have metrics “every 5 minutes”
• With detailed monitoring (for a cost), you get data “every 1 minute”
• Use detailed monitoring if you want to scale faster for your ASG!
https://mng.workshop.aws/operations-2022/detect/cwlogs.html
CloudWatch Logs Insights
• Search and analyze log data stored in CloudWatch Logs
• Example: find a specific IP inside a log, count occurrences of
“ERROR” in your logs…
• Provides a purpose-built query language
• Automatically discovers fields from AWS services and JSON log
events
• Fetch desired event fields, filter based on conditions, calculate
aggregate statistics, sort events, limit number of events…
• Can save queries and add them to CloudWatch Dashboards
• Can query multiple Log Groups in different AWS accounts
• It’s a query engine, not a real-time engine
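As an illustration of the query language, a Logs Insights query that counts occurrences of “ERROR” per 5-minute window might look like this (using the auto-discovered `@timestamp` and `@message` fields):

```
fields @timestamp, @message
| filter @message like /ERROR/
| stats count(*) as errorCount by bin(5m)
| sort errorCount desc
| limit 20
```

The same `filter` clause with a regex or an exact IP string covers the “find a specific IP inside a log” example above.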
CloudWatch Logs – S3 Export
CloudWatch Logs Subscriptions
(Diagram: logs can go real-time to a Lambda function feeding OpenSearch Service, or near real-time through Kinesis Data Firehose into S3; via Kinesis Data Streams they can also reach KDF, KDA, EC2, Lambda, …)
CloudWatch Logs Aggregation
Multi-Account & Multi-Region
(Diagram: CloudWatch Logs from Account A / Region 1, Account B / Region 2, and Account B / Region 3 flow through a Subscription Filter into Kinesis Data Streams, then Kinesis Data Firehose, landing in Amazon S3 in near real-time)
• Reminder: out-of-the-box metrics for EC2 – disk, CPU, network (high level)
CloudWatch Logs Metric Filter
• CloudWatch Logs can use filter expressions
• For example, find a specific IP inside of a log
• Or count occurrences of “ERROR” in your logs
• Metric filters can be used to trigger alarms
• Filters do not retroactively filter data. Filters only publish the metric data
points for events that happen after the filter was created.
• Ability to specify up to 3 Dimensions for the Metric Filter (optional)
(Diagram: EC2 Instance → CloudWatch Logs Agent streams to CW Logs → Metric Filters → CW Alarm → SNS)
CloudWatch Alarms
• Alarms are used to trigger notifications for any metric
• Various options (sampling, %, max, min, etc…)
• Alarm States:
• OK
• INSUFFICIENT_DATA
• ALARM
• Period:
• Length of time in seconds to evaluate the metric
• High resolution custom metrics: 10 sec, 30 sec or multiples of 60 sec
CloudWatch Alarm Targets
• Stop, Terminate, Reboot, or Recover an EC2 Instance
• Trigger Auto Scaling Action
• Send notification to SNS (from which you can do pretty much anything)
EC2 Instance Recovery
• Status Check:
• Instance status = check the EC2 VM
• System status = check the underlying hardware
(Diagram: CW Logs / metrics are monitored by a CW Alarm, which sends an alert via Amazon SNS)
• To test alarms and notifications, set the alarm state to Alarm using CLI
aws cloudwatch set-alarm-state --alarm-name "myalarm" --state-value ALARM --state-reason "testing purposes"
CloudWatch Synthetics Canary
Amazon EventBridge (formerly CloudWatch Events)
• Example: IAM Root User Sign in Event → SNS Topic with Email Notification
(Diagram: example sources — EC2 Instance (ex: Start Instance), CodeBuild (ex: failed build), S3 Event (ex: upload object), Trusted Advisor (ex: new Finding), CloudTrail (any API call), Schedule or Cron (ex: every 4 hours) — pass through an optional event filter into Amazon EventBridge, which emits a JSON event such as:
{
  "version": "0",
  "id": "6a7e8feb-b491",
  "detail-type": "EC2 Instance State-change Notification",
  ….
}
and routes it to destinations — Compute: Lambda, AWS Batch, ECS Task; Integration: SQS, SNS, Kinesis Data Streams; Orchestration: Step Functions, CodePipeline, CodeBuild; Maintenance: …)
• Event buses can be accessed by other AWS accounts using Resource-based Policies
• You can archive events (all/filter) sent to an event bus (indefinitely or set period)
• Ability to replay archived events
Amazon EventBridge – Schema Registry
• EventBridge can analyze the events in
your bus and infer the schema
Amazon EventBridge – Multi-account Aggregation
(Diagram: Accounts C and D PutEvents to an Event Bus in a Central Account; a resource policy on the bus grants them access, and an Event Rule triggers SNS)
• The X-Ray daemon / agent has a config to send traces cross account:
• make sure the IAM permissions are correct – the agent will assume the role
• This allows you to have a central account for all your application tracing
X-Ray Sampling Rules
• With sampling rules, you control the amount of data that you record
• You can modify sampling rules without changing your code
• By default, the X-Ray SDK records the first request each second, and
five percent of any additional requests.
• One request per second is the reservoir, which ensures that at least one trace is recorded each second as long as the service is serving requests.
• Five percent is the rate at which additional requests beyond the
reservoir size are sampled.
X-Ray Custom Sampling Rules
• You can create your own rules with the reservoir and rate
X-Ray Write APIs (used by the X-Ray daemon)
• PutTraceSegments: Uploads segment
documents to AWS X-Ray
• PutTelemetryRecords: Used by the AWS
X-Ray daemon to upload telemetry.
• SegmentsReceivedCount,
SegmentsRejectedCounts,
BackendConnectionErrors…
• GetSamplingRules: Retrieve all sampling
rules (to know what/when to send)
• GetSamplingTargets &
GetSamplingStatisticSummaries: advanced
• The X-Ray daemon needs to have an IAM policy authorizing the correct API calls to function correctly (e.g., arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess)
X-Ray Read APIs – continued
• GetServiceGraph: main graph
• BatchGetTraces: Retrieves a list of
traces specified by ID. Each trace is a
collection of segment documents that
originates from a single request.
• GetTraceSummaries: Retrieves IDs
and annotations for traces available for
a specified time frame using an
optional filter. To get the full traces,
pass the trace IDs to BatchGetTraces.
• GetTraceGraph: Retrieves a service
graph for one or more specific trace
IDs.
X-Ray with Elastic Beanstalk ❤
• Make sure to give your instance profile the correct IAM permissions so that
the X-Ray daemon can function correctly
• Then make sure your application code is instrumented with the X-Ray SDK
• Note: The X-Ray daemon is not provided for Multicontainer Docker
ECS + X-Ray integration options ❤
https://docs.aws.amazon.com/xray/latest/devguide/xray-daemon-ecs.html#xray-daemon-ecs-build
AWS Distro for OpenTelemetry
• Secure, production-ready, AWS-supported distribution of the open-source OpenTelemetry project
• Provides a single set of APIs, libraries, agents, and collector services
• Collects distributed traces and metrics from your apps
• Collects metadata from your AWS resources and services
• Auto-instrumentation Agents to collect traces without changing your code
• Send traces and metrics to multiple AWS services and partner solutions
• X-Ray, CloudWatch, Prometheus…
• Instrument your apps running on AWS (e.g., EC2, ECS, EKS, Fargate, Lambda) as
well as on-premises
• Migrate from X-Ray to AWS Distro for OpenTelemetry if you want to standardize with open-source APIs from OpenTelemetry or send traces to multiple destinations simultaneously
AWS Distro for OpenTelemetry
(Diagram: traces and metrics are sent to AWS X-Ray, Amazon CloudWatch, and partner monitoring solutions)
AWS CloudTrail
• Provides governance, compliance and audit for your AWS Account
• CloudTrail is enabled by default!
• Get a history of events / API calls made within your AWS Account by:
• Console
• SDK
• CLI
• AWS Services
• Can put logs from CloudTrail into CloudWatch Logs or S3
• A trail can be applied to All Regions (default) or a single Region.
• If a resource is deleted in AWS, investigate CloudTrail first!
CloudTrail Diagram
(Diagram: API calls from the SDK, CLI, and Console are recorded by CloudTrail, which can deliver them to CloudWatch Logs or an S3 Bucket for inspection & audit)
• Data Events:
• By default, data events are not logged (because of high-volume operations)
• Amazon S3 object-level activity (ex: GetObject, DeleteObject, PutObject): can separate Read and Write Events
• AWS Lambda function execution activity (the Invoke API)
CloudTrail Insights
CloudTrail Events Retention
• Events are stored for 90 days in CloudTrail
• To keep events beyond this period, log them to S3 and use Athena
(Diagram: Management, Data, and Insights events are retained for 90 days in CloudTrail, then sent to an S3 Bucket for long-term retention)
Amazon EventBridge – Intercept API Calls
(Diagram 1: a user calls AssumeRole on an IAM Role → logged by CloudTrail → EventBridge → SNS)
(Diagram 2: a user calls AuthorizeSecurityGroupIngress to edit a Security Group’s inbound rules on EC2 → logged by CloudTrail → EventBridge → SNS)
CloudTrail vs CloudWatch vs X-Ray
• CloudTrail:
• Audit API calls made by users / services / AWS console
• Useful to detect unauthorized calls or root cause of changes
• CloudWatch:
• CloudWatch Metrics over time for monitoring
• CloudWatch Logs for storing application log
• CloudWatch Alarms to send notifications in case of unexpected metrics
• X-Ray:
• Automated Trace Analysis & Central Service Map Visualization
• Latency, Errors and Fault analysis
• Request tracking across distributed systems
AWS Lambda
It’s a serverless world
What’s serverless?
• Serverless is a new paradigm in which the developers don’t have to
manage servers anymore…
• They just deploy code
• They just deploy… functions!
• Initially... Serverless == FaaS (Function as a Service)
• Serverless was pioneered by AWS Lambda but now also includes
anything that’s managed: “databases, messaging, storage, etc.”
• Serverless does not mean there are no servers…
it means you just don’t manage / provision / see them
Serverless in AWS
• AWS Lambda
• DynamoDB
• AWS Cognito
• AWS API Gateway
• Amazon S3
• AWS SNS & SQS
• AWS Kinesis Data Firehose
• Aurora Serverless
• Step Functions
• Fargate
(Diagram: users fetch static content from an S3 bucket, log in via Cognito, and call a REST API on API Gateway that invokes Lambda, which reads and writes DynamoDB)
Why AWS Lambda
• Amazon EC2:
  • Virtual Servers in the Cloud
  • Limited by RAM and CPU
  • Continuously running
  • Scaling means intervention to add / remove servers
Example: Serverless Thumbnail Creation
(Diagram: a new image in S3 triggers an AWS Lambda function that creates a thumbnail; the new thumbnail is pushed to S3 and its metadata — image name, image size, creation date, etc… — to DynamoDB)
Example: Serverless CRON Job
(Diagram: CloudWatch Events / EventBridge triggers an AWS Lambda function every 1 hour to perform a task)
AWS Lambda Pricing: example
• You can find overall pricing information here:
https://aws.amazon.com/lambda/pricing/
• Pay per calls:
• First 1,000,000 requests are free
• $0.20 per 1 million requests thereafter ($0.0000002 per request)
• Pay per duration: (in increment of 1 ms)
• 400,000 GB-seconds of compute time per month for FREE
• == 400,000 seconds if function is 1GB RAM
• == 3,200,000 seconds if function is 128 MB RAM
• After that $1.00 for 600,000 GB-seconds
• It is usually very cheap to run AWS Lambda so it’s very popular
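As a sanity check on the figures above, here is a rough monthly cost calculator — a sketch using only the free-tier and per-unit numbers quoted on this slide (real bills depend on region and tiering):

```python
# Free tier: 1M requests and 400,000 GB-seconds per month.
# After that: $0.20 per 1M requests, $1.00 per 600,000 GB-seconds.
def lambda_monthly_cost(requests, avg_duration_ms, memory_mb):
    gb_seconds = requests * (avg_duration_ms / 1000.0) * (memory_mb / 1024.0)
    request_cost = max(requests - 1_000_000, 0) * 0.20 / 1_000_000
    compute_cost = max(gb_seconds - 400_000, 0) * 1.00 / 600_000
    return request_cost + compute_cost

# 3M requests/month, 100 ms average duration, 1 GB of RAM
print(round(lambda_monthly_cost(3_000_000, 100, 1024), 2))
```

In this case compute stays inside the 400,000 GB-second free tier, so only the 2M requests beyond the free million are billed.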
Lambda – Synchronous Invocations
• Synchronous: CLI, SDK, API Gateway, Application Load Balancer
• Result is returned right away
• Error handling must happen client side (retries, exponential backoff, etc…)
(Diagram: SDK / CLI invokes the Lambda function directly and waits for the response; a client calls API Gateway, which proxies the request to Lambda and returns its response)
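Since error handling for synchronous invocations is the client’s job, a common pattern is retries with exponential backoff and jitter. A minimal sketch — `invoke` stands in for a hypothetical SDK call (e.g., a boto3 `lambda.invoke`), and every exception is treated as retryable for simplicity:

```python
import random
import time

def invoke_with_backoff(invoke, max_attempts=5, base_delay=0.5):
    for attempt in range(max_attempts):
        try:
            return invoke()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # exponential backoff (0.5s, 1s, 2s, ...) with random jitter
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))

# Demo with a stand-in that fails twice before succeeding
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("throttled")
    return "ok"

print(invoke_with_backoff(flaky, base_delay=0.01))
```

In production you would retry only on retryable errors (throttling, 5xx), which the AWS SDKs already do by default.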
Lambda - Synchronous Invocations - Services
• User Invoked:
• Elastic Load Balancing (Application Load Balancer)
• Amazon API Gateway
• Amazon CloudFront (Lambda@Edge)
• Amazon S3 Batch
• Service Invoked:
• Amazon Cognito
• AWS Step Functions
• Other Services:
• Amazon Lex
• Amazon Alexa
• Amazon Kinesis Data Firehose
Lambda Integration with ALB
• To expose a Lambda function as an HTTP(S) endpoint…
• You can use the Application Load Balancer (or an API Gateway)
• The Lambda function must be registered in a target group
(Diagram: an HTTP/HTTPS request hits the ELB, which synchronously invokes the Lambda function registered in the Target Group)
ALB + Lambda – Permissions
Simple S3 Event Pattern – Metadata Sync
(Diagram: a new file event in the S3 bucket invokes Lambda, which updates a metadata table in RDS or a DynamoDB Table)
Lambda – Event Source Mapping
• Kinesis Data Streams
• SQS & SQS FIFO queue
• DynamoDB Streams
(Diagram: the Event Source Mapping polls the source and invokes Lambda with the returned batches of records)
https://aws.amazon.com/blogs/compute/new-aws-lambda-scaling-controls-for-kinesis-and-dynamodb-event-sources/
Streams & Lambda – Error Handling
• By default, if your function returns an error, the entire batch is
reprocessed until the function succeeds, or the items in the batch
expire.
• To ensure in-order processing, processing for the affected shard is
paused until the error is resolved
• You can configure the event source mapping to:
• discard old events
• restrict the number of retries
• split the batch on error (to work around Lambda timeout issues)
• Discarded events can go to a Destination
Lambda – Event Source Mapping: SQS & SQS FIFO
• Event Source Mapping will poll SQS (Long Polling)
• Specify batch size (1-10 messages)
• Recommended: Set the queue visibility timeout to 6x the timeout of your Lambda function
• To use a DLQ:
  • set-up on the SQS queue, not Lambda (DLQ for Lambda is only for async invocations)
  • Or use a Lambda destination for failures
(Diagram: the Event Source Mapping polls SQS, receives batches, and invokes the Lambda function with each event batch)
Queues & Lambda
• Lambda also supports in-order processing for FIFO (first-in, first-out) queues,
scaling up to the number of active message groups.
• For standard queues, items aren't necessarily processed in order.
• Lambda scales up to process a standard queue as quickly as possible.
• When an error occurs, batches are returned to the queue as individual items
and might be processed in a different grouping than the original batch.
• Occasionally, the event source mapping might receive the same item from
the queue twice, even if no function error occurred.
• Lambda deletes items from the queue after they're processed successfully.
• You can configure the source queue to send items to a dead-letter queue if
they can't be processed.
Lambda Event Mapper Scaling
• Kinesis Data Streams & DynamoDB Streams:
• One Lambda invocation per stream shard
• If you use parallelization, up to 10 batches processed per shard simultaneously
• SQS Standard:
• Lambda adds 60 more instances per minute to scale up
• Up to 1000 batches of messages processed simultaneously
• SQS FIFO:
• Messages with the same GroupID will be processed in order
• The Lambda function scales to the number of active message groups
Lambda – Event and Context Objects
(Diagram: EventBridge invokes a Lambda function, passing the Event and Context objects)
• Event Object
  • JSON-formatted document containing the data for the function to process
• Context Object
  • Provides methods and properties that provide information about the invocation, function, and runtime environment
  • Passed to your function by Lambda at runtime
  • Example: aws_request_id, function_name, memory_limit_in_mb, …
Lambda – Event and Context Objects
Access Event & Context Objects using Python
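The Python screenshot for this slide did not survive extraction, so here is a minimal sketch of a handler reading both objects; the `FakeContext` stand-in is ours (for running outside Lambda), not part of the runtime:

```python
def lambda_handler(event, context):
    # The context object exposes invocation/runtime metadata
    print("request id:", context.aws_request_id)
    print("function name:", context.function_name)
    print("memory (MB):", context.memory_limit_in_mb)
    # The event object is a plain dict built from the JSON event
    return {"received_key": event.get("key")}

class FakeContext:  # stand-in for local testing only
    aws_request_id = "test-id"
    function_name = "demo-fn"
    memory_limit_in_mb = 128

result = lambda_handler({"key": "value"}, FakeContext())
print(result)
```

Inside Lambda, both arguments are supplied by the runtime; you never construct the context yourself.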
Lambda – Destinations
• Nov 2019: Can configure to send result to a
destination
• Asynchronous invocations – can define destinations for successful and failed events:
• Amazon SQS
• Amazon SNS
• AWS Lambda
https://docs.aws.amazon.com/lambda/latest/dg/invocation-async.html
• Amazon EventBridge bus
• Note: AWS recommends you use destinations instead of
DLQ now (but both can be used at the same time)
h_ps://docs.aws.amazon.com/lambda/latest/dg/invocaZon-eventsourcemapping.html
Lambda Execution Role (IAM Role)
• Grants the Lambda function permissions to AWS services / resources
• Sample managed policies for Lambda:
• AWSLambdaBasicExecutionRole – Upload logs to CloudWatch.
• AWSLambdaKinesisExecutionRole – Read from Kinesis
• AWSLambdaDynamoDBExecutionRole – Read from DynamoDB Streams
• AWSLambdaSQSQueueExecutionRole – Read from SQS
• AWSLambdaVPCAccessExecutionRole – Deploy Lambda function in VPC
• AWSXRayDaemonWriteAccess – Upload trace data to X-Ray.
• When you use an event source mapping to invoke your function, Lambda
uses the execution role to read event data.
• Best practice: create one Lambda Execution Role per function
Lambda Resource Based Policies
• Use resource-based policies to give other accounts and AWS services
permission to use your Lambda resources
• Similar to S3 bucket policies for S3 bucket
• An IAM principal can access Lambda:
• if the IAM policy attached to the principal authorizes it (e.g. user access)
• OR if the resource-based policy authorizes (e.g. service access)
• When an AWS service like Amazon S3 calls your Lambda function, the
resource-based policy gives it access.
Lambda Environment Variables
• Environment variable = key / value pair in “String” form
• Adjust the function behavior without updating code
• The environment variables are available to your code
• Lambda Service adds its own system environment variables as well
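A sketch of adjusting behavior via an environment variable — `ENVIRONMENT` is a hypothetical variable name you would set in the function configuration, not in code (here we set it in `os.environ` only to simulate that configuration locally):

```python
import os

def lambda_handler(event, context):
    # Read the variable with a default, so the code also runs unconfigured
    stage = os.environ.get("ENVIRONMENT", "dev")
    return {"stage": stage}

os.environ["ENVIRONMENT"] = "prod"   # simulates the Lambda configuration
print(lambda_handler({}, None))
```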
Lambda in VPC
(Diagram: a Lambda function deployed in your VPC can reach a private Amazon RDS database)
Lambda in VPC – Internet Access
(Diagram: BAD – a Lambda function in a VPC cannot reach an external API directly; GOOD – route its outbound traffic through the VPC, e.g. via a NAT, to reach the external API)
Lambda Layers
(Diagram: instead of packaging my_heavy_library_1 and my_heavy_library_2 files into every function, publish them once as Lambda Layer 1 (10 MB) and Lambda Layer 2 (30 MB) and reference the layers from the function)
Lambda – File Systems Mounting
• Lambda functions can access EFS file systems mounted across Availability Zones (requires the function to run in a VPC)
Lambda – Storage Options
• Ephemeral storage (/tmp): max size 10,240 MB; ephemeral; dynamic content; file system; any file system operation
• Lambda Layers: 5 layers per function, up to 250 MB total; durable; static content; archive; immutable
• Amazon S3: elastic size; durable; dynamic content; object storage; atomic with versioning
• Amazon EFS: elastic size; durable; dynamic content; file system; any file system operation
Lambda Concurrency and Throttling
• Concurrency limit: up to 1000 concurrent executions (account-level)
(Diagram: many users through an Application Load Balancer, few users through API Gateway, and SDK / CLI calls all draw from the same concurrency pool; once it is exhausted, invocations THROTTLE!)
Concurrency and Asynchronous Invocations
• If the function doesn’t have enough concurrency available to process all events, additional requests are throttled.
• For throttling errors (429) and system errors (500-series), Lambda returns the event to the queue and attempts to run the function again for up to 6 hours.
• The retry interval increases exponentially from 1 second after the first attempt to a maximum of 5 minutes.
(Diagram: multiple new file events from an S3 bucket arrive faster than the function’s available concurrency)
Cold Starts & Provisioned Concurrency
• Cold Start:
  • New instance => code is loaded and code outside the handler run (init)
  • If the init is large (code, dependencies, SDK…) this process can take some time
  • First request served by new instances has higher latency than the rest
• Provisioned Concurrency:
  • Concurrency is allocated before the function is invoked (in advance)
  • So the cold start never happens and all invocations have low latency
  • Application Auto Scaling can manage concurrency (schedule or target utilization)
• Note: cold starts in VPC have been dramatically reduced in Oct & Nov 2019
• https://aws.amazon.com/blogs/compute/announcing-improved-vpc-networking-for-aws-lambda-functions/
Reserved and Provisioned Concurrency
https://docs.aws.amazon.com/lambda/latest/dg/configuration-concurrency.html
Lambda Function Dependencies
• If your Lambda function depends on external libraries:
for example AWS X-Ray SDK, Database Clients, etc…
• You need to install the packages alongside your code and zip it
together
• For Node.js, use npm & “node_modules” directory
• For Python, use pip --target options
• For Java, include the relevant .jar files
• Upload the zip straight to Lambda if less than 50MB, else to S3 first
• Native libraries work: they need to be compiled on Amazon Linux
• AWS SDK comes by default with every Lambda function
Lambda and CloudFormation – inline
Lambda and CloudFormation – through S3 (Multiple accounts)
(Diagram: Account 1 stores the Lambda code in an S3 bucket whose bucket policy allows the other accounts — “Allow Principal: [Accounts ID…]”; the other accounts (e.g., Account 3) run CloudFormation with an execution role allowing get & list on that S3 bucket)
Lambda Container Images
• Deploy Lambda function as container images of up to 10GB from ECR
• Pack complex dependencies, large dependencies in a container
• Base image must implement the Lambda Runtime API
(Image layout: Base Image + Application Code + Dependencies, datasets)
Lambda – Function URL
(Diagram: users reach example.com through CloudFront backed by an S3 bucket (static website hosting), while api.example.com points at a Lambda Function URL)
Lambda – Function URL Security
• AuthType NONE – allow public and unauthenticated access
• Resource-based Policy is always in effect (must grant public access)
Lambda – Function URL Security
• AuthType AWS_IAM – IAM is used to authenticate and authorize requests
• Both Principal’s Identity-based Policy & Resource-based Policy are evaluated
• Principal must have lambda:InvokeFunctionUrl permissions
• Same account – Identity-based Policy OR Resource-based Policy as ALLOW
• Cross account – Identity-based Policy AND Resource Based Policy as ALLOW
Traditional Architecture
(Diagram: Clients → Elastic Load Balancer → EC2 Instances → Amazon RDS (MySQL, PostgreSQL, …))
Example table (Partition Key “User_ID”):
User_ID     First_Name  Last_Name  Age
7791a3d6-…  John        William    46
873e0634-…  Oliver                 24
a80f73a1-…  Katie       Lucas      31
DynamoDB – Primary Keys
• Option 2: Partition Key + Sort Key (HASH + RANGE)
• The combination must be unique for each item
• Data is grouped by partition key
• Example: users-games table, “User_ID” for Partition Key and “Game_ID” for Sort Key
• On-Demand Mode
• Read/writes automatically scale up/down with your workloads
• No capacity planning needed
• Pay for what you use, more expensive ($$$)
• WCUs and RCUs are spread evenly across partitions
(Diagram: each item’s partition key — e.g. 7791a3d6-…, 873e0634-… — is hashed to route it to Partition 1, Partition 2, …)
DynamoDB – Throttling
• If we exceed provisioned RCUs or WCUs, we get
“ProvisionedThroughputExceededException”
• Reasons:
• Hot Keys – one partition key is being read too many times (e.g., popular item)
• Hot Partitions
• Very large items, remember RCU and WCU depends on size of items
• Solutions:
• Exponential backoff when exception is encountered (already in SDK)
• Distribute partition keys as much as possible
• If RCU issue, we can use DynamoDB Accelerator (DAX)
R/W Capacity Modes – On-Demand
• Read/writes automatically scale up/down with your workloads
• No capacity planning needed (WCU / RCU)
• Unlimited WCU & RCU, no throttle, more expensive
• You’re charged for reads/writes that you use in terms of RRU and
WRU
• Read Request Units (RRU) – throughput for reads (same as RCU)
• Write Request Units (WRU) – throughput for writes (same as WCU)
• 2.5x more expensive than provisioned capacity (use with care)
• Use cases: unknown workloads, unpredictable application traffic, …
DynamoDB – Writing Data
• PutItem
• Creates a new item or fully replace an old item (same Primary Key)
• Consumes WCUs
• UpdateItem
• Edits an existing item’s attributes or adds a new item if it doesn’t exist
• Can be used to implement Atomic Counters – a numeric attribute that’s
unconditionally incremented
• Conditional Writes
• Accept a write/update/delete only if conditions are met, otherwise returns an error
• Helps with concurrent access to items
• No performance impact
DynamoDB – Reading Data
• GetItem
• Read based on Primary key
• Primary Key can be HASH or HASH+RANGE
• Eventually Consistent Read (default)
• Option to use Strongly Consistent Reads (more RCU - might take longer)
• ProjectionExpression can be specified to retrieve only certain attributes
DynamoDB – Reading Data (Query)
• Query returns items based on:
• KeyConditionExpression
• Partition Key value (must be = operator) – required
• Sort Key value (=, <, <=, >, >=, Between, Begins with) – optional
• FilterExpression
• Additional filtering after the Query operation (before data returned to you)
• Use only with non-key attributes (does not allow HASH or RANGE attributes)
• Returns:
• The number of items specified in Limit
• Or up to 1 MB of data
• Ability to do pagination on the results
• Can query table, a Local Secondary Index, or a Global Secondary Index
DynamoDB – Reading Data (Scan)
• Scan the entire table and then filter out data (inefficient)
• Returns up to 1 MB of data – use pagination to keep on reading
• Consumes a lot of RCU
• Limit impact using Limit or reduce the size of the result and pause
• For faster performance, use Parallel Scan
• Multiple workers scan multiple data segments at the same time
• Increases the throughput and RCU consumed
• Limit the impact of parallel scans just like you would for Scans
• Can use ProjectionExpression & FilterExpression (no changes to
RCU)
DynamoDB – Deleting Data
• DeleteItem
• Delete an individual item
• Ability to perform a conditional delete
• DeleteTable
• Delete a whole table and all its items
• Much quicker deletion than calling DeleteItem on all items
DynamoDB – Batch Operations
• Allows you to reduce latency by reducing the number of API calls
• Operations are done in parallel for better efficiency
• Part of a batch can fail; in which case we need to try again for the failed items
• BatchWriteItem
• Up to 25 PutItem and/or DeleteItem in one call
• Up to 16 MB of data written, up to 400 KB of data per item
• Can’t update items (use UpdateItem)
• UnprocessedItems for failed write operations (exponential backoff or add WCU)
• BatchGetItem
• Return items from one or more tables
• Up to 100 items, up to 16 MB of data
• Items are retrieved in parallel to minimize latency
• UnprocessedKeys for failed read operations (exponential backoff or add RCU)
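A sketch of respecting the 25-item BatchWriteItem limit and retrying unprocessed items — `send_batch` stands in for a hypothetical SDK call that returns whatever the service left unprocessed (real code would add exponential backoff between retries):

```python
def chunk(items, size=25):
    # Split items into BatchWriteItem-sized groups of at most 25
    return [items[i:i + size] for i in range(0, len(items), size)]

def batch_write_all(items, send_batch):
    pending = list(items)
    while pending:
        batch = pending[:25]
        pending = pending[25:]
        # Re-queue any UnprocessedItems returned by the call
        pending.extend(send_batch(batch))

batches = chunk(list(range(60)))
print([len(b) for b in batches])  # [25, 25, 10]
```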
DynamoDB – PartiQL
• SQL-compatible query language for DynamoDB
• Allows you to select, insert, update, and delete
data in DynamoDB using SQL
• Run queries across multiple DynamoDB tables
• Run PartiQL queries from:
• AWS Management Console
• NoSQL Workbench for DynamoDB
• DynamoDB APIs
• AWS CLI
• AWS SDK
DynamoDB – Conditional Writes
• For PutItem, UpdateItem, DeleteItem, and BatchWriteItem
• You can specify a Condition expression to determine which items should be modified:
• attribute_exists
• attribute_not_exists
• attribute_type
• contains (for string)
• begins_with (for string)
• ProductCategory IN (:cat1, :cat2) and Price between :low and :high
• size (string length)
• Note: Filter Expression filters the results of read queries, while Condition
Expressions are for write operations
Conditional Writes – Example on Update Item
values.json
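The screenshots for these examples did not survive extraction, so here is a sketch of what the request parameters look like for a conditional update. Table, key, and attribute names (`ProductCatalog`, `Price`, `:low`, `:high`) are hypothetical:

```python
# ConditionExpression combines the functions listed above;
# ExpressionAttributeValues plays the role of values.json
condition_expression = "attribute_exists(Price) AND Price BETWEEN :low AND :high"

request = {
    "TableName": "ProductCatalog",
    "Key": {"Id": {"N": "456"}},
    "UpdateExpression": "SET Price = :newPrice",
    "ConditionExpression": condition_expression,
    "ExpressionAttributeValues": {
        ":low": {"N": "10"},
        ":high": {"N": "100"},
        ":newPrice": {"N": "50"},
    },
}
print(request["ConditionExpression"])
```

The same dict could be passed to an UpdateItem call via the SDK or serialized as JSON for the CLI; if the condition fails, DynamoDB rejects the write with a conditional check failure instead of overwriting the item.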
Conditional Writes – Example on Delete Item
• attribute_not_exists
• Only succeeds if the attribute doesn’t exist yet (no value)
• attribute_exists
• Opposite of attribute_not_exists
Conditional Writes – Do Not Overwrite Elements
• attribute_not_exists(partition_key)
• Make sure the item isn’t overwritten
values.json
Conditional Writes – Example of String
Comparisons
• begins_with – check if prefix matches
• contains – check if string is contained in another string
values.json
DynamoDB – Local Secondary Index (LSI)
• Alternative Sort Key for your table (same Partition Key as that of base table)
• The Sort Key consists of one scalar attribute (String, Number, or Binary)
• Up to 5 Local Secondary Indexes per table
• Must be defined at table creation time
• Attribute Projections – can contain some or all the attributes of the base table
(KEYS_ONLY, INCLUDE, ALL)
DynamoDB – Global Secondary Index (GSI)
TABLE (query by “User_ID”): Partition Key User_ID, Sort Key Game_ID, attribute Game_TS
GSI INDEX (query by “Game_ID”): Partition Key Game_ID, Sort Key Game_TS, attribute User_ID
Example rows: (7791a3d6-…, 4421, “2021-03-15T17:43:08”), (873e0634-…, 4521, “2021-06-20T19:02:32”), (a80f73a1-…, 1894, “2021-02-11T04:11:31”)
DynamoDB – Indexes and Throttling
• Global Secondary Index (GSI):
• If the writes are throttled on the GSI, then the main table will be throttled!
• Even if the WCU on the main table is fine
• Choose your GSI partition key carefully!
• Assign your WCU capacity carefully!
DynamoDB – Conditional Writes (example)
• Client 1 sends: Update Name = John, only if version = 1
• Client 2 sends: Update Name = Lisa, only if version = 1
• The item (User_ID 7791a3d6-…) starts at First_Name = Michael, Version = 1; one conditional update wins (here Lisa, Version = 2), and the other fails because its condition (version = 1) no longer holds
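This "update only if the version still matches" pattern (optimistic locking) can be sketched as UpdateItem parameters; table and attribute names are illustrative, and with boto3 you would pass the dict to `client.update_item(**update_kwargs)`:

```python
# Update the name only if the item's Version attribute still equals the
# value we last read; also bump the version as part of the same write.
update_kwargs = {
    "TableName": "Users",
    "Key": {"User_ID": {"S": "7791a3d6"}},
    "UpdateExpression": "SET First_Name = :name, Version = :new_v",
    "ConditionExpression": "Version = :expected_v",
    "ExpressionAttributeValues": {
        ":name": {"S": "John"},
        ":expected_v": {"N": "1"},
        ":new_v": {"N": "2"},
    },
}
# If another client bumped Version first, the call raises
# ConditionalCheckFailedException instead of silently overwriting their write.
```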
DynamoDB Accelerator (DAX)
• Fully-managed, highly available, seamless in-memory cache for DynamoDB
• Microseconds latency for cached reads & queries
• Doesn't require application logic modification (compatible with existing DynamoDB APIs)
• Solves the "Hot Key" problem (too many reads)
• 5 minutes TTL for cache (default)
• Up to 10 nodes in the DAX cluster
DynamoDB Accelerator (DAX) vs. ElastiCache
• DAX sits in front of Amazon DynamoDB and caches individual objects and query/scan results
• Amazon ElastiCache can be used alongside DAX, e.g., to store aggregation results
DynamoDB Streams
• Ordered stream of item-level modifications (create/update/delete) in a table
• Stream records can be:
• Sent to Kinesis Data Streams
• Read by AWS Lambda
• Read by Kinesis Client Library applications
• Data Retention for up to 24 hours
• Use cases:
• React to changes in real-time (e.g., send a welcome email to new users)
• Analytics
• Insert into derivative tables
• Insert into OpenSearch Service
• Implement cross-region replication
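A minimal sketch of a Lambda handler consuming DynamoDB Streams records, e.g., to react to new users (field names follow the DynamoDB Streams event format; the email-sending part is a stub, and the "email" attribute is an assumption):

```python
def handler(event, context):
    """Collect emails of newly inserted users from a DynamoDB Streams event."""
    new_users = []
    for record in event["Records"]:
        # Each record describes one item-level modification (INSERT/MODIFY/REMOVE)
        if record["eventName"] == "INSERT":
            new_image = record["dynamodb"]["NewImage"]
            new_users.append(new_image["email"]["S"])
    # Here you would e.g. call Amazon SES to send the welcome emails.
    return new_users
```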
DynamoDB Streams – Architecture
• create/update/delete operations on the DDB Table are written to the stream
• A processing layer (KCL App or Lambda function) reads the stream for filtering, transforming, …
• Downstream destinations include messaging/notifications via Amazon SNS and archiving to Amazon S3
DynamoDB – Time To Live (TTL)
• Automatically delete items after an expiry timestamp (a background deletion process scans for and deletes expired items)
• Use cases: reduce stored data by keeping only current items, adhere to regulatory obligations, …
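A TTL attribute is just a Number holding an epoch timestamp; a sketch of building such an item (the attribute name "expire_at" and the table layout are assumptions — TTL must be enabled on that attribute for the table):

```python
import time

def session_item(session_id, ttl_seconds=3600):
    """Return a DynamoDB item whose 'expire_at' attribute (epoch seconds)
    lets TTL delete it some time after it expires."""
    return {
        "session_id": {"S": session_id},
        "expire_at": {"N": str(int(time.time()) + ttl_seconds)},
    }
```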
DynamoDB CLI – Good to Know
• --projection-expression: one or more attributes to retrieve
• --filter-expression: filter items before they are returned to you
DynamoDB as a Session State Cache
• It's common to use DynamoDB to store session states
• vs. ElastiCache
• ElastiCache is in-memory, but DynamoDB is serverless
• Both are key/value stores
• vs. EFS
• EFS must be attached to EC2 instances as a network drive
• vs. EBS & Instance Store
• EBS & Instance Store can only be used for local caching, not shared caching
• vs. S3
• S3 is higher latency, and not meant for small objects
DynamoDB Write Sharding
• Imagine we have a voting application with two candidates, candidate A and candidate B
• If the Partition Key is "Candidate_ID", this results in two partitions, which will generate issues (e.g., Hot Partition)
• A solution is to add a suffix to the Partition Key value (e.g., "Candidate_A-11"), distributing writes across many partitions
• Table layout: Partition Key: Candidate_ID | Sort Key: Vote_ts | Attributes: Voter_ID (e.g., Candidate_A-11 | 1631188571 | 7791)
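A write-sharding helper might look like the sketch below; the shard count and key naming are assumptions:

```python
import random

N_SHARDS = 20  # illustrative; pick based on expected write throughput

def sharded_partition_key(candidate_id):
    """Append a random suffix so writes for one candidate spread
    across N_SHARDS partition key values."""
    return f"{candidate_id}-{random.randint(1, N_SHARDS)}"

# To count all votes for a candidate, query each of the N_SHARDS key values
# (Candidate_A-1 ... Candidate_A-20) and sum the results.
```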
DynamoDB – Write Types (examples)
• Concurrent Writes: one client writes value = 1 while another writes value = 2; the second write overwrites the first
• Atomic Writes: one client issues "INCREASE value by 1" while another issues "INCREASE value by 2"; both succeed (the value increases by 3)
DynamoDB – Storing Large Objects / Indexing S3 Objects Metadata
• An application uploads a large object (e.g., 617055.jpg) to an S3 bucket (media-assets-bucket) and stores its metadata in a DynamoDB table; another application reads the metadata and downloads the object from S3
• A client application can then use the metadata table (e.g., Products) through an API to:
• Search by date
• Compute the total storage used by a customer
• List all objects with certain attributes
• Find all objects uploaded within a date range
DynamoDB Operations
• Table Cleanup
• Option 1: Scan + DeleteItem (very slow, consumes RCU & WCU, expensive)
• Option 2: Drop Table + Recreate table (fast, efficient, cheap)
• Copying a DynamoDB Table
• Option 1: Using AWS Data Pipeline (launches an Amazon EMR cluster that reads from the source DynamoDB table, writes to an S3 bucket, then reads from the S3 bucket and writes to the new DynamoDB table)
• Option 2: Backup and restore into a new table (takes some time)
• Option 3: Scan + PutItem or BatchWriteItem (write your own code)
DynamoDB – Security & Other Features
• Security
• VPC Endpoints available to access DynamoDB without using the Internet
• Access fully controlled by IAM
• Encryption at rest using AWS KMS and in-transit using SSL/TLS
• Backup and Restore feature available
• Point-in-time Recovery (PITR) like RDS
• No performance impact
• Global Tables
• Multi-region, multi-active, fully replicated, high performance
• DynamoDB Local
• Develop and test apps locally without accessing the DynamoDB web service (without Internet)
• AWS Database Migration Service (AWS DMS) can be used to migrate to
DynamoDB (from MongoDB, Oracle, MySQL, S3, …)
DynamoDB – Users Interact with DynamoDB Directly
• Users log in through an Identity Provider (e.g., Facebook, or any OpenID Connect provider) and obtain temporary AWS credentials with permissions to access the DynamoDB Table directly
DynamoDB – Fine-Grained Access Control
• Using Web Identity Federation or
Cognito Identity Pools, each user
gets AWS credentials
• You can assign an IAM Role to
these users with a Condition to
limit their API access to
DynamoDB
• LeadingKeys – limit row-level
access for users on the Primary
Key
• Attributes – limit specific
attributes the user can see
More at: https://docs.aws.amazon.com/amazondynamodb/latest/
developerguide/specifying-conditions.html
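A sketch of such an IAM policy as a Python dict (the table name, account ID, and attribute names are illustrative; the LeadingKeys condition pins row-level access to the caller's Cognito identity ID):

```python
# Fine-grained access control policy: the user may only touch rows whose
# partition key equals their Cognito identity ID, and only these attributes.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
        "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/UserData",
        "Condition": {
            "ForAllValues:StringEquals": {
                "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"],
                "dynamodb:Attributes": ["user_id", "game_score"],
            }
        },
    }],
}
```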
Amazon API Gateway
Build, Deploy and Manage APIs
Example: Building a Serverless API
• (Diagram: client requests go through API Gateway; records are sent onwards and stored as .json files)
API Gateway – Stages (example)
• https://api.example.com/v1 → v1 Stage → V1 backend (used by V1 clients)
• https://api.example.com/v2 → v2 Stage (a new URL!) → V2 backend (used by V2 clients)
API Gateway – Stage Variables
• Stage variables are like environment variables for API Gateway
• Use them for frequently changing configuration values
• They can be used in:
• Lambda function ARN
• HTTP Endpoint
• Parameter mapping templates
• Use cases:
• Configure HTTP endpoints your stages talk to (dev, test, prod…)
• Pass configuration parameters to AWS Lambda through mapping templates
• Stage variables are passed to the ”context” object in AWS Lambda
• Format: ${stageVariables.variableName}
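With a Lambda proxy integration, the stage variables also arrive in the event; a minimal handler sketch (the variable name "backendUrl" and the fallback endpoint are assumptions):

```python
def handler(event, context):
    """Pick a backend endpoint per stage from API Gateway stage variables."""
    stage_vars = event.get("stageVariables") or {}
    backend = stage_vars.get("backendUrl", "https://dev.example.internal")
    return {"statusCode": 200, "body": f"would call {backend}"}
```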
API Gateway Stage Variables & Lambda Aliases
• We create a stage variable to indicate the corresponding Lambda alias
• Our API Gateway will automatically invoke the right Lambda function!
• (Example: the Prod stage's stage variable points to the PROD alias, which routes e.g. 95% of traffic to V1 and the remainder to a newer version)
HTTP_PROXY
• HTTP requests and responses are proxied by API Gateway to the backend (e.g., an Application Load Balancer)
• API Gateway can optionally add HTTP headers before forwarding (e.g., an API key: asjdh2j3jh3j…)
Mapping Templates (AWS & HTTP Integration)
• Mapping templates can be used to modify request / responses
• Rename / Modify query string parameters
• Modify body content
• Add headers
• Uses Velocity Template Language (VTL): for loop, if etc…
• Filter output results (remove unnecessary data)
• Content-Type can be set to application/json or application/xml
Mapping Example: JSON to XML with SOAP
• SOAP APIs are XML based, whereas REST APIs are JSON based
• API Gateway receives a RESTful request with a JSON payload and applies a mapping template to transform it into an XML payload for the SOAP backend
Mapping Example: Renaming Query String Parameters
• HTTP request: http://example.com/path?name=foo&other=bar
• JSON passed to Lambda: { "my_variable": "foo", "other_variable": "bar" }
• You can rename variables (map them to anything you want)
API Gateway - Open API spec
• Common way of defining REST APIs, using API definition as code
• Import existing OpenAPI 3.0 spec to API Gateway
• Method
• Method Request
• Integration Request
• Method Response
• + AWS extensions for API gateway and setup every single option
• Can export current API as OpenAPI spec
• OpenAPI specs can be written in YAML or JSON
• Using OpenAPI we can generate SDKs for our applications
REST API – Request Validation
• You can configure API Gateway to perform basic validation of an API
request before proceeding with the integration request
• When the validation fails, API Gateway immediately fails the request and returns a 400 error response to the caller
• This reduces unnecessary calls to the backend
• Checks:
• The required request parameters in the URI, query string, and headers of an
incoming request are included and non-blank
• The applicable request payload adheres to the configured JSON Schema
request model of the method
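Such a request model is a JSON Schema (API Gateway uses draft-04); a sketch with illustrative field names:

```python
# JSON Schema request model that API Gateway could validate request bodies
# against before invoking the integration.
order_model = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "OrderInput",
    "type": "object",
    "required": ["productId", "quantity"],   # missing fields -> 400 response
    "properties": {
        "productId": {"type": "string"},
        "quantity": {"type": "integer", "minimum": 1},
    },
}
```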
REST API – Request Validation – OpenAPI
• Set up request validation by importing an OpenAPI definitions file
• Example: require callers of the API to supply an assigned API key in the x-api-key header in requests to the API
API Gateway – Logging & Tracing
• CloudWatch Logs
• Log contains information about request/response body
• Enable CloudWatch logging at the Stage level (with Log Level - ERROR, DEBUG, INFO)
• Can override settings on a per API basis
• (Diagram: the user sends a request to API Gateway, which forwards it to the backend and relays the response; API Gateway writes request and response logs to CloudWatch Logs)
• X-Ray
• Enable tracing to get extra information about requests in API Gateway
• X-Ray API Gateway + AWS Lambda gives you the full picture
API Gateway – CloudWatch Metrics
• Metrics are by stage, with the possibility to enable detailed metrics
• CacheHitCount & CacheMissCount: efficiency of the cache
• Count: The total number of API requests in a given period.
• IntegrationLatency: The time between when API Gateway relays a
request to the backend and when it receives a response from the
backend.
• Latency: The time between when API Gateway receives a request from
a client and when it returns a response to the client. The latency
includes the integration latency and other API Gateway overhead.
• 4XXError (client-side) & 5XXError (server-side)
API Gateway Throttling
• Account Limit
• API Gateway throttles requests at 10,000 rps across all APIs
• Soft limit that can be increased upon request
• In case of throttling => 429 Too Many Requests (retriable error)
• Can set Stage limit & Method limits to improve performance
• Or you can define Usage Plans to throttle per customer
• Just like Lambda Concurrency, one API that is overloaded, if not limited,
can cause the other APIs to be throttled
API Gateway - Errors
• 4xx means Client errors
• 400: Bad Request
• 403: Access Denied, WAF filtered
• 429: Quota exceeded, Throttle
API Gateway – CORS (example)
• A web browser loads static content from an S3 bucket (https://www.example.com), then calls the API:
GET /
Host: api.example.com
Origin: https://www.example.com
• The preflight response includes:
Access-Control-Allow-Origin: https://www.example.com
Access-Control-Allow-Methods: GET, PUT, DELETE
• Since the CORS headers received previously allowed the origin, the web browser can now make the requests
API Gateway – Security
IAM Permissions
• Create an IAM policy authorization and attach to User / Role
• Authentication = IAM | Authorization = IAM Policy
• Good to provide access within AWS (EC2, Lambda, IAM users…)
• Leverages "Sig v4" capability where IAM credentials are in headers
Connecting to the API
• WebSocket URL: wss://[some-uniqueid].execute-api.[region].amazonaws.com/[stage-name]
• Clients connect to the Amazon API Gateway WebSocket API; the connect route invokes a Lambda function (onConnect) with the connectionId, which can be persisted in Amazon DynamoDB
Client to Server Messaging
• The connectionId is re-used while the connection stays open
• WebSocket URL: wss://abcdef.execute-api.us-west-1.amazonaws.com/dev
• Clients send message frames to the API Gateway WebSocket API, which invokes a Lambda function (sendMessage) with the connectionId (which can be looked up in Amazon DynamoDB)
Server to Client Messaging
• WebSocket URL: wss://abcdef.execute-api.us-west-1.amazonaws.com/dev
• The server (e.g., a Lambda function) uses the stored connectionId to send messages back through API Gateway to the connected client
Operation | Action
POST | Sends a message from the Server to the connected WS Client
GET | Gets the latest connection status of the connected WS Client
DELETE | Disconnects the connected Client from the WS connection
API Gateway – WebSocket API – Routing
• Incoming JSON messages are routed to different backends
• If no route matches => sent to $default
• You define a route selection expression to select the field on the JSON to route from
• Sample expression: $request.body.action
• The result is evaluated against the route keys available in your API Gateway (e.g., $connect, $disconnect, $default, join, quit, delete, …)
• The route is then connected to the backend you've set up through the API Gateway integration
Incoming data example:
{
"service": "chat",
"action": "join",
"data": { "room": "room1234" }
}
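As a sketch of what the route selection does conceptually, here is the expression `$request.body.action` applied in Python (the route key set is illustrative):

```python
import json

ROUTE_KEYS = {"join", "quit", "delete"}  # custom routes defined in API Gateway

def select_route(message_body):
    """Apply the route selection expression $request.body.action and
    fall back to $default when no route key matches."""
    action = json.loads(message_body).get("action")
    return action if action in ROUTE_KEYS else "$default"
```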
API Gateway - Architecture
• Create a single interface for all the microservices in your company
• Use API endpoints with various resources
• Apply a simple domain name and SSL certificates
• Can apply forwarding and transformation rules at the API Gateway level
• (Example: Route 53 resolves customer1.example.com and customer2.example.com to API Gateway, which routes /service1 and /service2 to ELBs and /docs to an S3 bucket)
• We would like our code “in a repository” and have it deployed onto AWS
• Automatically
• The right way
• Making sure it’s tested before being deployed
• With possibility to go into different stages (dev, test, staging, prod)
• With manual approval where needed
• To be a proper AWS developer… we need to learn AWS CICD
CICD – Introduction
• This section is all about automating the deployment we’ve done so far
while adding increased safety
(Pipeline: developers push code to a repository; the CI server fetches the code and builds it, and every passed build is deployed, moving the Application Server from v1 to v2)
Technology Stack for CICD
• Code: GitHub or any 3rd-party code repository
• Build/Test: Jenkins CI or any 3rd-party CI servers
• Deploy: AWS CodeDeploy, targeting EC2 Instances, On-premises Instances, AWS Lambda, or Amazon ECS
• Orchestrate using: AWS CodePipeline
AWS CodeCommit
• Version control is the ability to understand the various changes that
happened to the code over time (and possibly roll back)
• All these are enabled by using a version control system such as Git
• A Git repository can be synchronized on your computer, but it usually is
uploaded on a central online repository
• Benefits are:
• Collaborate with other developers
• Make sure the code is backed-up somewhere
• Make sure it’s fully viewable and auditable
AWS CodeCommit
• Git repositories can be expensive
• The industry includes GitHub, GitLab, Bitbucket, …
• And AWS CodeCommit:
• Private Git repositories
• No size limit on repositories (scale seamlessly)
• Fully managed, highly available
• Code only in AWS Cloud account => increased security and compliance
• Security (encrypted, access control, …)
• Integrated with Jenkins, AWS CodeBuild, and other CI tools
CodeCommit – Security
• Interactions are done using Git (standard)
• Authentication
• SSH Keys – AWS Users can configure SSH keys in their IAM Console
• HTTPS – with AWS CLI Credential helper or Git Credentials for IAM user
• Authorization
• IAM policies to manage users/roles permissions to repositories
• Encryption
• Repositories are automatically encrypted at rest using AWS KMS
• Encrypted in transit (can only use HTTPS or SSH – both secure)
• Cross-account Access
• Do NOT share your SSH keys or your AWS credentials
• Use an IAM Role in your AWS account and use AWS STS (AssumeRole API)
CodeCommit vs. GitHub
• Both support Code Review (Pull Requests) and integration with AWS CodeBuild
CodePipeline – Artifacts
• Each pipeline stage can create artifacts
• Artifacts are stored in an S3 bucket and passed on to the next stage
• (Example: a developer pushes code; the Source stage stores output artifacts in the S3 bucket; the Build stage takes them as input artifacts and stores its own output artifacts; the Deploy stage takes those as input artifacts and deploys)
CodePipeline – Troubleshooting
• For CodePipeline Pipeline/Action/Stage Execution State Changes
• Use CloudWatch Events (Amazon EventBridge). Example:
• You can create events for failed pipelines
• You can create events for cancelled stages
• If CodePipeline fails a stage, your pipeline stops, and you can get
information in the console
• If the pipeline can't perform an action, make sure the "IAM Service Role" attached has enough IAM permissions (IAM Policy)
• AWS CloudTrail can be used to audit AWS API calls
CodePipeline – Events vs. Webhooks vs. Polling
• Events (preferred): e.g., a CodeCommit change emits an event to EventBridge, which triggers CodePipeline
• Webhooks: a script pushes a payload to the pipeline's webhook endpoint to trigger it
• Polling: CodePipeline performs regular checks on the source
CodePipeline – Action Types Constraints for Artifacts
• Owner: AWS – for AWS services
• Example Source actions: S3, CodeCommit, ECR (each takes 0 input artifacts and produces 1 output artifact)
CodeBuild – Local Build
• Test and debug your builds locally using the CodeBuild Agent:
• https://docs.aws.amazon.com/codebuild/latest/userguide/use-codebuild-agent.html
CodeBuild – Inside VPC
• By default, your CodeBuild containers are launched outside your VPC
• They therefore cannot access resources in a private subnet
• Specify a VPC configuration for your builds so they can access resources in your VPC
AWS CodeDeploy
• Deployment service that automates application deployment
• Deploy new application versions to EC2 Instances, On-premises servers, Lambda functions, ECS Services
• Automated Rollback capability in case of failed deployments, or trigger CloudWatch Alarm
• (In-place example: half of the instances are upgraded from v1 to v2, then the other half)
CodeDeploy – Blue-Green Deployment
• A new Auto Scaling Group is provisioned with v2 instances alongside the existing v1 group
• The Application Load Balancer shifts traffic from the v1 group to the v2 group
CodeDeploy – EC2 Deployments
• The CodeDeploy Agent runs on the EC2 instances
• Will do in-place updates to your fleet of EC2 instances (e.g., half the fleet, then the other half)
• Can use hooks to verify the deployment after each deployment phase
CodeDeploy – Redeploy & Rollbacks
• Rollback = redeploy a previously deployed revision of your application
• Deployments can be rolled back:
• Automatically – rollback when a deployment fails, or when CloudWatch Alarm thresholds are met
• Manually
• Disable Rollbacks — do not perform rollbacks for this deployment
CodeDeploy – Troubleshooting
• Deployment error: InvalidSignatureException (Signature expired)
• If the date and time on your EC2 instances are not set correctly, they might not match the signature date of your deployment request, which CodeDeploy rejects
• Check log files to understand deployment issues
• For Amazon Linux, Ubuntu, and RHEL, log files are stored at /opt/codedeploy-agent/deployment-root/deployment-logs/codedeploy-agent-deployments.log
AWS CodeStar
• An integrated solution that groups: GitHub, CodeCommit, CodeBuild,
CodeDeploy, CloudFormation, CodePipeline, CloudWatch, …
• Quickly create “CICD-ready” projects for EC2, Lambda, Elastic Beanstalk
• Supported languages: C#, Go, HTML 5, Java, Node.js, PHP, Python, Ruby
• Issue tracking integration with JIRA / GitHub Issues
• Ability to integrate with Cloud9 to obtain a web IDE (not all regions)
• One dashboard to view all your components
• Free service, pay only for the underlying usage of other services
• Limited Customization
AWS CodeArtifact
• Software packages depend on each other to be built (also called code dependencies), and new ones are created
• Storing and retrieving these dependencies is called artifact management
• Traditionally you would need to set up your own artifact management system
• CodeArtifact is a secure, scalable, and cost-effective artifact management service for software development
• Works with common dependency management tools such as Maven, Gradle, npm, yarn, twine, pip, and NuGet
• Developers and CodeBuild can then retrieve dependencies straight from CodeArtifact
AWS CodeArtifact – Architecture
• Within a CodeArtifact Domain you create Repositories (e.g., Repository A, Repository B)
• Developers and AWS CodeBuild inside your VPC fetch dependencies (npm for JavaScript, pip for Python, NuGet, Maven for Java) from CodeArtifact
• CodeArtifact acts as a proxy in front of public artifact repositories
• (The diagram also shows an IT Leader managing the Domain)
CodeArtifact – EventBridge Integration
• An event is created when a package version is created, modified, or deleted
• The event can invoke a Lambda function, activate a Step Functions state machine, send a message to SNS, or start a CodePipeline (e.g., rebuild & redeploy an application with the latest security fixes, using CodeCommit, CodeBuild, and CodeDeploy)
CodeArtifact – Resource Policy
• Can be used to authorize another
account to access CodeArtifact
• A given principal can either read all the
packages in a repository or none of them
• (Example: Account B (222333344555) reads packages from a repository in Account A (123456789012))
CodeArtifact – Upstream Repositories
• Example – Connect to npmjs.com:
• Configure one CodeArtifact Repository in your domain with an external connection to npmjs.com
• Configure all the other repositories (Repo B, Repo C, Repo D, …) with an upstream to it
• Packages fetched from npmjs.com are cached in the Upstream Repository, rather than being fetched and stored in each Repository
• (Developers fetch via npm)
CodeArtifact – Retention
• If a requested package version is found in an Upstream Repository, a reference to it is retained and is always available from the Downstream Repository
• The retained package version is not affected by changes to the Upstream Repository (deleting it, updating the package, …)
• Intermediate repositories do not keep the package
• Example – Fetching Lodash (v4.17.20) from npmjs.com: the developer requests it from Repository C; it is fetched through the repository with the External Connection to the Public Repository, where it is retained
https://aws.amazon.com/codeguru/features/
Amazon CodeGuru Profiler
• Helps understand the runtime behavior of your
application
• Example: identify if your application is consuming
excessive CPU capacity on a logging routine
• Features:
• Identify and remove code inefficiencies
• Improve application performance (e.g., reduce CPU
utilization)
• Decrease compute costs
• Provides heap summary (identify which objects are using up memory)
• Anomaly Detection
• Supports applications running on AWS or on-premises
• Minimal overhead on application
https://aws.amazon.com/codeguru/features/
Amazon CodeGuru – Agent Configuration
• MaxStackDepth – the maximum depth of the stacks in the code that is
represented in the profile
• Example: if CodeGuru Profiler finds a method A, which calls method B, which calls
method C, which calls method D, then the depth is 4
• If the MaxStackDepth is set to 2, then the profiler evaluates A and B
• MemoryUsageLimitPercent – the memory percentage used by the profiler
• MinimumTimeForReportingInMilliseconds – the minimum time between
sending reports (milliseconds)
• ReportingIntervalInMilliseconds – the reporting interval used to report profiles (milliseconds)
• SamplingIntervalInMilliseconds – the sampling interval that is used to profile
samples (milliseconds)
• Reduce to have a higher sampling rate
AWS Cloud9
• Cloud-based Integrated Development
Environment (IDE)
• Code editor, debugger, terminal in a browser
• Work on your projects from anywhere with
an Internet connection
• Prepackaged with essential tools for popular
programming languages (JavaScript, Python,
PHP, …)
• Share your development environment with
your team (pair programming)
• Fully integrated with AWS SAM & Lambda
to easily build serverless applications https://aws.amazon.com/cloud9/
AWS Serverless Application
Model (SAM)
Taking your Serverless Development to the next level
AWS SAM
• SAM = Serverless Application Model
• Framework for developing and deploying serverless applications
• All the configuration is YAML code
• Generate complex CloudFormation from simple SAM YAML file
• Supports anything from CloudFormation: Outputs, Mappings,
Parameters, Resources…
• Only two commands to deploy to AWS
• SAM can use CodeDeploy to deploy Lambda functions
• SAM can help you to run Lambda, API Gateway, DynamoDB locally
AWS SAM – Recipe
• Transform Header indicates it’s SAM template:
• Transform: 'AWS::Serverless-2016-10-31'
• Write Code
• AWS::Serverless::Function
• AWS::Serverless::Api
• AWS::Serverless::SimpleTable
• Package & Deploy:
• aws cloudformation package / sam package
• aws cloudformation deploy / sam deploy
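A minimal SAM template sketch tying the recipe together; the resource name, handler, runtime, and path are all illustrative:

```yaml
# Minimal SAM template (illustrative names); sam deploy expands this
# into full CloudFormation.
Transform: 'AWS::Serverless-2016-10-31'
Resources:
  HelloFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.12
      CodeUri: src/
      Events:
        HelloApi:
          Type: Api            # implicitly creates an AWS::Serverless::Api
          Properties:
            Path: /hello
            Method: get
```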
Deep Dive into SAM Deployment
• Build the application locally (sam build)
• Package the application (sam package OR aws cloudformation package)
• Deploy the application (sam deploy OR aws cloudformation deploy)
SAM – Exam Summary
• SAM is built on CloudFormation
• SAM requires the Transform and Resources sections
• Commands to know:
• sam build: fetch dependencies and create local deployment artifacts
• sam package: package and upload to Amazon S3, generate CF template
• sam deploy: deploy to CloudFormation
• SAM Policy templates for easy IAM policy definition
• SAM is integrated with CodeDeploy to deploy to Lambda aliases
Serverless Application Repository (SAR)
• Managed repository for serverless applications
• The applications are packaged using SAM
• Build and publish applications that can be re-used by organizations
• Can share publicly
• Can share with specific AWS accounts
• This prevents duplicate work; you can go straight to publishing or deploying applications
• Application settings and behaviour can be customized using Environment variables
AWS Cloud Development Kit
AWS Cloud Development Kit (CDK)
• Define your cloud infrastructure using a
familiar language:
• JavaScript/TypeScript, Python, Java, and .NET
• Contains high level components called
constructs
• The code is “compiled” into a
CloudFormation template (JSON/YAML)
• You can therefore deploy infrastructure
and application runtime code together
• Great for Lambda functions
• Great for Docker containers in ECS / EKS
CDK in a diagram
• You write a CDK Application using Constructs in one of the supported programming languages
• The CDK CLI (cdk synth) turns it into a CloudFormation Template
• CloudFormation then deploys the template
CDK vs SAM
• SAM:
• Serverless focused
• Write your template declaratively in JSON or YAML
• Great for quickly getting started with Lambda
• Leverages CloudFormation
• CDK:
• All AWS services
• Write infra in a programming language JavaScript/TypeScript, Python, Java, and
.NET
• Leverages CloudFormation
CDK + SAM
• You can use SAM CLI to locally test your CDK apps
• You must first run cdk synth
• cdk synth produces the CloudFormation template (e.g., MyCDKStack.template.json), which the SAM CLI can use to locally invoke the Lambda function (myFunction)
CDK Hands-On
• Example application: an image is uploaded to S3, which triggers a Lambda function; the function uses Amazon Rekognition to analyze the image and saves the results
Command | Description
npm install -g aws-cdk | Install the CDK CLI (the construct library itself is aws-cdk-lib)
cdk init app | Create a new CDK project from a specified template
cdk synth | Synthesizes and prints the CloudFormation template
cdk bootstrap | Deploys the CDK Toolkit staging Stack
cdk deploy | Deploy the Stack(s)
cdk diff | View differences of local CDK and deployed Stack
cdk destroy | Destroy the Stack(s)
CDK – Bootstrapping
• The process of provisioning resources for CDK before you can deploy CDK apps into an AWS environment (an AWS account & Region combination, e.g., aws://123456789012/eu-west-1)
• A CloudFormation Stack called CDKToolkit is created and contains:
• S3 Bucket – to store files
• IAM Roles – to grant permissions to perform deployments
• You must run the following command for each new environment:
• cdk bootstrap aws://<aws_account>/<aws_region>
• Otherwise, you will get an error "Policy contains a statement with one or more invalid principal"
CDK – Testing
• To test CDK apps, use the CDK Assertions Module combined with popular test frameworks such as Jest (JavaScript) or Pytest (Python)
• Verify we have specific resources, rules, conditions, parameters…
• Two types of tests:
• Fine-grained Assertions (common) – test specific aspects of the CloudFormation template (e.g., check if a resource has this property with this value)
• Snapshot Tests – test the synthesized CloudFormation template against a previously stored baseline template
• To import a template:
• Template.fromStack(MyStack): stack built in CDK
• Template.fromString(mystring): stack built outside CDK
Amazon Cognito
Amazon Cognito
• Give users an identity to interact with our web or mobile application
• Cognito User Pools:
• Sign-in functionality for app users
• Integrate with API Gateway & Application Load Balancer
• Serve as a database of users for your web applications, with federation through third-party Identity Providers (IdP)
Cognito User Pools (CUP) - Integrations
• CUP integrates with API Gateway and Application Load Balancer, which pass authenticated requests to your backend
Cognito User Pools – Lambda Triggers
• CUP can invoke a Lambda function synchronously on these triggers:
User Pool Flow | Operation | Description
Authentication Events | Pre Authentication Lambda Trigger | Custom validation to accept or deny the sign-in request
Authentication Events | Post Authentication Lambda Trigger | Event logging for custom analytics
Authentication Events | Pre Token Generation Lambda Trigger | Augment or suppress token claims
Sign-Up | Pre Sign-up Lambda Trigger | Custom validation to accept or deny the sign-up request
Sign-Up | Post Confirmation Lambda Trigger | Custom welcome messages or event logging for custom analytics
Sign-Up | Migrate User Lambda Trigger | Migrate a user from an existing user directory to user pools
Messages | Custom Message Lambda Trigger | Advanced customization and localization of messages
Token Creation | Pre Token Generation Lambda Trigger | Add or remove attributes in Id tokens
https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-user-identity-pools-working-with-aws-lambda-triggers.html
Cognito User Pools – Hosted Authentication UI
• Cognito has a hosted authentication UI
that you can add to your app to handle
sign-up and sign-in workflows
• Using the hosted UI, you have a
foundation for integration with social
logins, OIDC or SAML
• Can customize with a custom logo and
custom CSS
https://aws.amazon.com/blogs/aws/launch-amazon-cognito-user-pools-general-availability-app-integration-and-federation/
CUP – Hosted UI Custom Domain
• For custom domains, you must create an ACM certificate in us-east-1
• The custom domain must be defined in the “App Integration” section
CUP – Adaptive Authentication
• Block sign-ins or require MFA if the login appears suspicious
• Cognito examines each sign-in attempt (e.g., a sign-in using a password) and generates a risk score (low, medium, high) for how likely the sign-in request is to be from a malicious attacker
• Users are prompted for a second MFA only when risk is detected
• Risk score is based on different factors such as if the user has used the same device, location, or IP address
CUP – JSON Web Tokens (JWT)
• Signature
• The signature must be verified to ensure the JWT can be trusted
• Libraries can help you verify the validity of JWT tokens issued by Cognito User Pools
• Payload
• Contains the user information (sub UUID, given_name, email, phone_number, attributes…) as well as expiry and issued-at timestamps
• From the sub UUID, you can retrieve all user details from Cognito / OIDC
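As a sketch of what "the payload contains the user information" means in practice, here is a stdlib-only decode of a JWT payload; in production you must first verify the signature against the User Pool's public keys (e.g., with a JWT library):

```python
import base64
import json

def jwt_payload(token):
    """Return the (unverified) claims from a JWT's payload segment."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```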
ALB – Auth through Cognito User Pools
• Create Cognito User Pool, Client and
Domain
• Make sure an ID token is returned
• Add the social or Corporate IdP if needed
• Several URL redirections are necessary
• Allow your Cognito User Pool Domain on
your IdP app's callback URL. For example:
• https://domain-
prefix.auth.region.amazoncognito.com/saml2/
idpresponse
• https://user-pool-domain/oauth2/idpresponse
Application Load Balancer – OIDC Auth.
1. Client makes an HTTPS request
2. ALB redirects the user for authentication
3. The user authenticates; an authorization grant code is returned
4. The ALB exchanges the grant code at the Token Endpoint
5. ID Token + Access Token are returned
6. The ALB presents the Access Token to the User Info Endpoint
7. User claims are returned
8. Redirect to the original request
(The Identity Provider exposes Authentication, Token, and User Info endpoints)
ALB – Auth. Through an Identity Provider (IdP)
That is OpenID Connect (OIDC) Compliant
• Configure a Client ID & Client Secret
Cognito Identity Pools
• Web & Mobile Applications exchange a token for temporary AWS credentials
• The token is validated (e.g., against Cognito User Pools and their internal DB of users)
Step Functions – Wait for Task Token
• Allows you to pause Step Functions during a Task until a Task Token is
returned
• Task might wait for other AWS services, human approval, 3rd party
integration, call legacy systems…
• Append .waitForTaskToken to the Resource field to tell Step Functions
to wait for the Task Token to be returned
• Task will pause until it receives that Task Token back with a
SendTaskSuccess or SendTaskFailure API call
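A sketch of an Amazon States Language task using this pattern; the queue URL and state names are illustrative:

```python
# Task state that pauses for a task token: ".waitForTaskToken" is appended
# to the resource, and the token is handed to the integration target.
wait_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
    "Parameters": {
        "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/approvals",
        "MessageBody": {
            "input.$": "$",                    # pass the state input along
            "taskToken.$": "$$.Task.Token",    # context object's task token
        },
    },
    "Next": "Approved",
}
# The workflow resumes only when something calls SendTaskSuccess /
# SendTaskFailure with that token.
```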
Step Functions – Wait for Task Token (example)
• A workflow runs a Lambda Credit Check, then pauses while messages are processed by an external service (e.g., on ECS)
• When the task completes, the service makes a SendTaskSuccess API call with the Task Token, and the workflow continues to the Approved / Denied / Failed branches
Step Functions – Activity Tasks
• Enables you to have the Task work performed by an Activity Worker
• Activity Worker apps can be running on EC2, Lambda, a mobile device, …
• The Activity Worker polls Step Functions for a Task (receiving the input & TaskToken)
AWS AppSync
• Clients (web apps, mobile apps, real-time dashboards) send GraphQL queries in JSON and receive a JSON response
• AppSync evaluates the GraphQL Schema and uses Resolvers to fetch data from DynamoDB, Aurora, OpenSearch, Lambda, or any public HTTP API
• Supports offline sync
• CloudWatch Metrics & Logs for monitoring
AppSync – Security
• There are four ways you can authorize applications to interact with your
AWS AppSync GraphQL API:
• API_KEY
• AWS_IAM: IAM users / roles / cross-account access
• OPENID_CONNECT: OpenID Connect provider / JSON Web Token
• AMAZON_COGNITO_USER_POOLS
AWS Amplify – Important Features
AUTHENTICATION
• Leverages Amazon Cognito
• User registration, authentication, account recovery & other operations
• Supports MFA, Social Sign-in, etc…
• Pre-built UI components
• Fine-grained authorization
DATASTORE
• Leverages AWS AppSync and Amazon DynamoDB
• Work with local data and have automatic synchronization to the cloud without complex code
• Powered by GraphQL
• Offline and real-time capabilities
• Visual data modeling w/ Amplify Studio
AWS Amplify Hosting
• Connect a source code repository; Amplify builds the app and (optionally) deploys the back end
• Monitoring
• Redirect and Custom Headers
• Password protection
AWS Amplify – End-to-End (E2E) Testing
• Run end-to-end (E2E) tests in the test phase in
Amplify
• Catch regressions before pushing code to
production
• Use the test step to run any test commands at
build time (amplify.yml)
• Integrated with Cypress testing framework
• Allows you to generate UI report for your tests
amplify.yml
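A sketch of what such an amplify.yml could look like (commands, port, and report paths are illustrative assumptions for a typical Node/Cypress app, not a drop-in file):

```yaml
version: 1
frontend:
  phases:
    preBuild:
      commands:
        - npm ci
    build:
      commands:
        - npm run build
  artifacts:
    baseDirectory: build
    files:
      - '**/*'
test:
  phases:
    preTest:
      commands:
        - npm ci
        - npm start & npx wait-on http://localhost:3000   # assumed dev-server port
    test:
      commands:
        - npx cypress run                                 # E2E tests at build time
    postTest:
      commands:
        - npx mochawesome-merge cypress/report/mochawesome-report/mochawesome*.json > cypress/report/mochawesome.json
  artifacts:
    baseDirectory: cypress
    configFilePath: '**/mochawesome.json'
    files:
      - '**/*.png'
      - '**/*.mp4'
```

The `test` artifacts (screenshots, videos, merged report) are what powers the UI report for your tests.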
Advanced Identity in AWS
AWS STS – Security Token Service
• Allows you to grant limited and temporary access to AWS resources (up to 1 hour).
• AssumeRole: Assume roles within your account or cross account
• AssumeRoleWithSAML: return credentials for users logged with SAML
• AssumeRoleWithWebIdentity
• return creds for users logged with an IdP (Facebook Login, Google Login, OIDC compatible…)
• AWS recommends against using this API directly; use Cognito Identity Pools instead
• GetSessionToken: for MFA, from a user or AWS account root user
• GetFederationToken: obtain temporary creds for a federated user
• GetCallerIdentity: return details about the IAM user or role used in the API call
• DecodeAuthorizationMessage: decode error message when an AWS API is denied
Using STS to Assume a Role
• Define an IAM Role within your account or cross-account
• Define which principals can access this IAM Role
• Use AWS STS (Security Token Service) to retrieve credentials and impersonate the IAM Role you have access to (AssumeRole API)
• Temporary credentials can be valid between 15 minutes and 1 hour
• (Diagram: user → AssumeRole API → AWS STS → temporary security credential → permissions of the Role, in the same or another account, defined in IAM)
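With boto3 this flow is a single AssumeRole call; a minimal sketch (the role ARN and session name are hypothetical, and the client is passed in so the helper itself makes no AWS call):

```python
def assume_role_credentials(sts_client, role_arn, session_name, duration_seconds=900):
    """Retrieve temporary credentials for a role via the AssumeRole API.

    duration_seconds must fall in the 15-minute to 1-hour window
    described above (900-3600 seconds).
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=duration_seconds,
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken, Expiration
    return response["Credentials"]

# Typical call (requires AWS credentials configured locally):
#   import boto3
#   creds = assume_role_credentials(
#       boto3.client("sts"),
#       "arn:aws:iam::123456789012:role/demo-role",  # hypothetical role ARN
#       "demo-session")
```

The returned credentials can then be fed into a new client/session to impersonate the role.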
Cross account access with STS
https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_common-scenarios_aws-accounts.html
STS with MFA
• Use GetSessionToken from STS
• Appropriate IAM policy using IAM Conditions
• aws:MultiFactorAuthPresent:true
• Reminder, GetSessionToken returns:
• Access ID
• Secret Key
• Session Token
• Expiration date
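Such a condition can be sketched in an IAM policy statement (the action is a hypothetical example; the condition key is the real `aws:MultiFactorAuthPresent`):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:StopInstances",
      "Resource": "*",
      "Condition": {
        "Bool": {"aws:MultiFactorAuthPresent": "true"}
      }
    }
  ]
}
```

With this policy attached, the action is only allowed when the caller authenticated with MFA, i.e. when using credentials obtained via GetSessionToken with an MFA device.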
IAM Best Practices – General
• Never use Root Credentials, enable MFA for Root Account
• Grant Least Privilege
• Each Group / User / Role should only have the minimum level of permission it
needs
• Never grant a policy with “*” access to a service
• Monitor API calls made by a user in CloudTrail (especially Denied ones)
• Never store IAM key credentials on any machine except a personal computer or an on-premises server
• On-premises server best practice is to call STS to obtain temporary security credentials
IAM Best Practices – IAM Roles
• EC2 machines should have their own roles
• Lambda functions should have their own roles
• ECS Tasks should have their own roles
(ECS_ENABLE_TASK_IAM_ROLE=true)
• CodeBuild should have its own service role
• Create a least-privileged role for any service that requires it
• Create a role per application / lambda function (do not reuse roles)
IAM Best Practices – Cross Account Access
• Define an IAM Role for another AssumeRole API
account to access
• Define which accounts can access
AWS STS
this IAM Role
user
• Use AWS STS (Security Token temporary
security
Service) to retrieve credentials and credential
impersonate the IAM Role you permission
have access to (AssumeRole API)
• Temporary credentials can be valid
between 15 minutes to 1 hour
Role (same or
other account) IAM
Advanced IAM - Authorization Model
Evaluation of Policies, simplified
1. If there's an explicit DENY, end decision and DENY
2. Else if there's an ALLOW, end decision with ALLOW
3. Else DENY
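The three steps above can be sketched as a tiny evaluator (a toy model of the simplified logic on this slide, not the real IAM engine):

```python
def evaluate(policy_decisions):
    """Simplified IAM decision logic: explicit deny wins, then allow,
    else implicit deny.

    policy_decisions: iterable of "allow" / "deny" results gathered
    from every applicable policy statement.
    """
    decisions = list(policy_decisions)
    if "deny" in decisions:       # 1. an explicit DENY always wins
        return "deny"
    if "allow" in decisions:      # 2. otherwise any ALLOW grants access
        return "allow"
    return "deny"                 # 3. no matching statement -> implicit deny
```

For example, `evaluate(["allow", "deny"])` is "deny" (explicit deny beats allow), and `evaluate([])` is "deny" (implicit deny by default).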
IAM Policies & S3 Bucket Policies
• IAM Policies are attached to users, roles, groups
• S3 Bucket Policies are attached to buckets
• When evaluating if an IAM Principal can perform an operation X on a
bucket, the union of its assigned IAM Policies and S3 Bucket Policies will
be evaluated.
AWS Security & Encryption
KMS, Encryption SDK, SSM Parameter Store
Why encryption?
Encryption in flight (SSL)
• Data is encrypted before sending and decrypted after receiving
• SSL certificates help with encryption (HTTPS)
• Encryption in flight ensures no MITM (man in the middle attack) can happen
(Diagram: credentials such as U: admin / P: supersecret are encrypted before sending over HTTPS and decrypted after receiving)
Why encryption?
Client side encryption
• Data is encrypted by the client and never decrypted by the server
• Data will be decrypted by a receiving client
• The server should not be able to decrypt the data
• Could leverage Envelope Encryption
(Diagram: the client encrypts the object with a data key before upload to any store (FTP, S3, etc.); the encrypted DEK, encrypted with the CMK, travels alongside the encrypted file; the receiving client decrypts it to obtain the final file)
Deep dive into Envelope Encryption – client-side decryption (diagram)
• The client sends the encrypted DEK from the envelope file to the KMS Decrypt API
• KMS checks IAM permissions, decrypts the DEK with the CMK, and returns the plaintext DEK
• Using the DEK, the client decrypts the encrypted file into the final file
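The envelope mechanics can be sketched in a few lines of Python. This is a toy model only: the XOR "cipher" stands in for a real algorithm like AES, and the fake in-memory CMK stands in for KMS (in reality the CMK never leaves KMS; you would call GenerateDataKey and Decrypt):

```python
import os

def _toy_cipher(data: bytes, key: bytes) -> bytes:
    """Stand-in for a real cipher (repeating-key XOR) - NOT secure,
    illustration of the envelope structure only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Fake "CMK" held locally for the sketch; with AWS KMS this key
# would stay inside the service.
FAKE_KMS_CMK = os.urandom(16)

def encrypt_envelope(plaintext: bytes) -> dict:
    dek = os.urandom(16)                             # data encryption key (DEK)
    ciphertext = _toy_cipher(plaintext, dek)         # encrypt data locally with the DEK
    encrypted_dek = _toy_cipher(dek, FAKE_KMS_CMK)   # DEK encrypted under the CMK
    # The "envelope": encrypted file + encrypted DEK travel together
    return {"ciphertext": ciphertext, "encrypted_dek": encrypted_dek}

def decrypt_envelope(envelope: dict) -> bytes:
    # Stand-in for the KMS Decrypt API call on the encrypted DEK
    dek = _toy_cipher(envelope["encrypted_dek"], FAKE_KMS_CMK)
    # Client-side decryption of the file using the recovered DEK
    return _toy_cipher(envelope["ciphertext"], dek)
```

The point of the pattern: only the small DEK ever goes to KMS, so large files never transit the KMS API.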
CloudHSM
• KMS => AWS manages the software for encryption
• CloudHSM => AWS provisions encryption hardware
• Dedicated Hardware (HSM = Hardware Security Module)
• You manage your own encryption keys entirely (not AWS)
• HSM device is tamper resistant, FIPS 140-2 Level 3 compliance
• Supports both symmetric and asymmetric encryption (SSL/TLS keys)
• No free tier available
• Must use the CloudHSM Client Software
• Redshift supports CloudHSM for database encryption and key management
• Good option to use with SSE-C encryption
CloudHSM Diagram
• AWS manages the hardware; the user manages the keys
• The CloudHSM Client connects over an SSL connection to CloudHSM devices (e.g. CloudHSM 1 and CloudHSM 2, in separate Availability Zones)
CloudHSM – Integration with AWS Services
• Key usage is logged to CloudTrail
CloudHSM vs. KMS
• Tenancy: KMS – multi-tenant; CloudHSM – single-tenant
• Standard: KMS – FIPS 140-2 Level 3; CloudHSM – FIPS 140-2 Level 3
• Master Keys: KMS – AWS Owned CMK, AWS Managed CMK, Customer Managed CMK; CloudHSM – Customer Managed CMK
• Key Types: KMS – symmetric, asymmetric, digital signing; CloudHSM – symmetric, asymmetric, digital signing & hashing
• Key Accessibility: KMS – accessible in multiple AWS Regions (can't access keys outside the Region they're created in); CloudHSM – deployed and managed in a VPC, can be shared across VPCs (VPC Peering)
• Cryptographic Acceleration: KMS – none; CloudHSM – SSL/TLS acceleration, Oracle TDE acceleration
• Access & Authentication: KMS – AWS IAM; CloudHSM – you create users and manage their permissions
CloudHSM vs. KMS (continued)
• High Availability: KMS – AWS managed service; CloudHSM – add multiple HSMs over different AZs
• Audit Capability: KMS – CloudTrail, CloudWatch; CloudHSM – CloudTrail, CloudWatch, MFA support
• Free Tier: KMS – yes; CloudHSM – no
SSM Parameter Store
• Secure storage for configuration and secrets
• Optional seamless encryption using KMS
• (Diagram: applications read parameters; encryption is handled by AWS KMS)
SSM Parameter Store Hierarchy
• /my-department/
  • my-app/
    • dev/
      • db-url
      • db-password
    • prod/
      • db-url
      • db-password
  • other-app/
• /other-department/
• /aws/reference/secretsmanager/secret_ID_in_Secrets_Manager
• /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2 (public)
• Retrieve with the GetParameters or GetParametersByPath APIs (e.g. a Dev Lambda function reads the dev/ path, a Prod Lambda function reads the prod/ path)
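Reading a whole branch of the hierarchy is one GetParametersByPath call; a small sketch with a pure helper (the boto3 call itself is shown only in a comment since it needs AWS credentials, and the paths are from the example hierarchy above):

```python
def parameters_to_config(parameters):
    """Turn the 'Parameters' list returned by GetParametersByPath into a
    {leaf-name: value} dict (pure helper, no AWS call)."""
    return {p["Name"].rsplit("/", 1)[-1]: p["Value"] for p in parameters}

# Typical call (requires AWS credentials configured):
#   import boto3
#   ssm = boto3.client("ssm")
#   resp = ssm.get_parameters_by_path(
#       Path="/my-department/my-app/dev/",
#       Recursive=True,
#       WithDecryption=True)   # decrypt SecureString parameters via KMS
#   config = parameters_to_config(resp["Parameters"])
```

`WithDecryption=True` is what makes the optional KMS decryption seamless for SecureString parameters.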
Standard and advanced parameter tiers
• Total number of parameters allowed (per AWS account and Region): Standard – 10,000; Advanced – 100,000
• Maximum size of a parameter value: Standard – 4 KB; Advanced – 8 KB
• Parameter policies available: Standard – no; Advanced – yes
• Cost: Standard – no additional charge; Advanced – charges apply
• Storage pricing: Standard – free; Advanced – $0.05 per advanced parameter per month
Parameters Policies (for advanced parameters)
• Allows you to assign a TTL to a parameter (expiration date) to force updating or deleting sensitive data such as passwords
• Can assign multiple policies at a time
AWS Secrets Manager & CloudFormation – Dynamic Reference (diagram)
• The user creates a CloudFormation stack; CloudFormation creates a secret in Secrets Manager, then creates the RDS DB and configures its username/password
• The secret is generated first, then referenced (as a dynamic reference) in the RDS DB instance definition
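A CloudFormation sketch of this pattern (resource names and properties are illustrative, not a tested template): the secret is generated first, then pulled into the RDS resource through the `{{resolve:secretsmanager:...}}` dynamic reference syntax:

```yaml
Resources:
  MyDBSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      GenerateSecretString:
        SecretStringTemplate: '{"username": "admin"}'
        GenerateStringKey: password
        ExcludeCharacters: '"@/\'
  MyDB:
    Type: AWS::RDS::DBInstance
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: '20'
      MasterUsername: !Sub '{{resolve:secretsmanager:${MyDBSecret}:SecretString:username}}'
      MasterUserPassword: !Sub '{{resolve:secretsmanager:${MyDBSecret}:SecretString:password}}'
```

The password never appears in the template or in stack parameters; CloudFormation resolves it from Secrets Manager at deploy time.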
• Secrets in CodeBuild:
• Don’t store them as plaintext in environment variables
• Instead…
• Environment variables can reference parameter store parameters
• Environment variables can reference secrets manager secrets
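A buildspec sketch of these references (parameter names and the secret id are hypothetical); the `env.parameter-store` and `env.secrets-manager` mappings resolve the values at build time:

```yaml
version: 0.2
env:
  parameter-store:
    DB_URL: /my-app/dev/db-url            # resolved from SSM Parameter Store
  secrets-manager:
    DB_PASSWORD: prod/db-creds:password   # format: secret-id:json-key
phases:
  build:
    commands:
      - ./deploy.sh "$DB_URL"             # secrets never stored as plaintext in the buildspec
```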
AWS Nitro Enclaves
• Process highly sensitive data in an isolated compute environment
• Personally Identifiable Information (PII), healthcare, financial, …
• Fully isolated virtual machines, hardened, and highly constrained
• Not a container, not persistent storage, no interactive access, no external networking
• Helps reduce the attack surface for sensitive data processing apps
• Cryptographic Attestation – only authorized code can be running in your Enclave
• Only Enclaves can access sensitive data (integration with KMS)
• Use cases: securing private keys, processing credit cards, secure multi-party
computation…
AWS Nitro Enclaves
1. Launch a compatible Nitro-based EC2 instance with the 'EnclaveOptions' parameter set to 'true'
2. Use the Nitro CLI to convert your app to an Enclave Image File (EIF)
3. Using the EIF file as an input, use the Nitro CLI to create an Enclave
4. The Enclave is a separate virtual machine with its own kernel, memory, and CPU
(Diagram: on the EC2 host, Instance A talks to Enclave A over a secure local channel; the Nitro Hypervisor provides isolation between EC2 instances on the same host)
Other Services
Quick overview of other services that you might get questions on at the exam
Amazon Simple Email Service (Amazon SES)
• Fully managed service to send emails securely, globally, and at scale
• Allows inbound/outbound emails
• Reputation dashboard, performance insights, anti-spam feedback
• Provides statistics such as email deliveries, bounces, feedback loop results, email opens
• Supports DomainKeys Identified Mail (DKIM) and Sender Policy Framework (SPF)
• Flexible IP deployment: shared, dedicated, and customer-owned IPs
• Send emails from your application using the AWS Console, APIs, or SMTP
• Use cases: transactional, marketing, and bulk email communications
Amazon OpenSearch Service
• Amazon OpenSearch Service is the successor to Amazon Elasticsearch Service
• In DynamoDB, queries only exist by primary key or indexes…
• With OpenSearch, you can search any field, even partial matches
• It’s common to use OpenSearch as a complement to another database
• Two modes: managed cluster or serverless cluster
• Does not natively support SQL (can be enabled via a plugin)
• Ingestion from Kinesis Data Firehose, AWS IoT, and CloudWatch Logs
• Security through Cognito & IAM, KMS encryption, TLS
• Comes with OpenSearch Dashboards (visualization)
OpenSearch patterns (diagrams)
• DynamoDB: CRUD operations are propagated via a Lambda function into Amazon OpenSearch
• Streaming: Kinesis Data Firehose with data transformation (near real time), or a Lambda function (real time), feeding Amazon OpenSearch
Amazon Athena
• Serverless query service to analyze data stored in Amazon S3
• Uses standard SQL language to query the files (built on Presto)
• Supports CSV, JSON, ORC, Avro, and Parquet
• Pricing: $5.00 per TB of data scanned
• Commonly used with Amazon QuickSight for reporting/dashboards
• Use cases: business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc.
• Exam Tip: to analyze data in S3 using serverless SQL, use Athena
Amazon Athena – Performance Improvement
• Use columnar data for cost-savings (less scan)
• Apache Parquet or ORC is recommended
• Huge performance improvement
• Use Glue to convert your data to Parquet or ORC
• Compress data for smaller retrievals (bzip2, gzip, lz4, snappy, zlib, zstd…)
• Partition datasets in S3 for easy querying on virtual columns
• s3://yourBucket/pathToTable
/<PARTITION_COLUMN_NAME>=<VALUE>
/<PARTITION_COLUMN_NAME>=<VALUE>
/<PARTITION_COLUMN_NAME>=<VALUE>
/etc…
• Example: s3://athena-examples/flight/parquet/year=1991/month=1/day=1/
• Use larger files (> 128 MB) to minimize overhead
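The partition layout above is just a naming convention on S3 keys; a small helper makes the shape concrete (pure string building, no AWS call; bucket and column names follow the example on this slide):

```python
def partition_prefix(bucket, table_path, partitions):
    """Build the S3 prefix for one partition following the
    /<PARTITION_COLUMN_NAME>=<VALUE>/ layout (dict order = partition order)."""
    parts = "/".join(f"{col}={val}" for col, val in partitions.items())
    return f"s3://{bucket}/{table_path}/{parts}/"
```

For example, `partition_prefix("athena-examples", "flight/parquet", {"year": 1991, "month": 1, "day": 1})` reproduces the slide's example path, and a `WHERE year = 1991` query then only scans objects under that prefix.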
Amazon Athena – Federated Query
• Allows you to run SQL queries across data stored in relational, non-relational, object, and custom data sources (AWS or on-premises)
• Uses Data Source Connectors that run on AWS Lambda (e.g. to query Redshift, DynamoDB, on-premises databases…)
• Store the results back in an S3 Bucket
Amazon Managed Streaming for Apache Kafka
(Amazon MSK)
• Alternative to Amazon Kinesis
• Fully managed Apache Kafka on AWS
• Allows you to create, update, and delete clusters
• MSK creates & manages Kafka brokers nodes & Zookeeper nodes for you
• Deploy the MSK cluster in your VPC, multi-AZ (up to 3 for HA)
• Automatic recovery from common Apache Kafka failures
• Data is stored on EBS volumes for as long as you want
• MSK Serverless
• Run Apache Kafka on MSK without managing the capacity
• MSK automatically provisions resources and scales compute & storage
Apache Kafka at a high level (diagram)
• Producers (your code) write to a topic in the MSK cluster; data is replicated between brokers (e.g. Broker 1, 2, 3)
• Consumers (your code) poll from the topic
• Example sources: IoT, RDS, Kinesis, etc.; example destinations: EMR, S3, SageMaker, Kinesis, RDS, etc.
Kinesis Data Streams vs. Amazon MSK
• Message size: Kinesis – 1 MB limit; MSK – 1 MB default, configurable higher (ex: 10 MB)
• Data organization: Kinesis – Data Streams with Shards; MSK – Kafka Topics with Partitions
• Scaling: Kinesis – Shard Splitting & Merging; MSK – can only add partitions to a topic
• In-flight encryption: Kinesis – TLS; MSK – PLAINTEXT or TLS
• At-rest encryption: both – KMS
Amazon MSK Consumers
• Kinesis Data Analytics for Apache Flink
• AWS Glue – Streaming ETL jobs, powered by Apache Spark Streaming
• Lambda
• Applications running on …
• (Diagram also shows CloudWatch Logs and Amazon S3 as destinations)
Exam Review & Tips
State of learning checkpoint
• Let's look at how far we've come on our learning journey
• https://aws.amazon.com/certification/certified-developer-associate/
Practice makes perfect
• If you're new to AWS, get some hands-on AWS practice through this course before rushing to the exam
• The exam recommends one or more years of hands-on experience developing and maintaining AWS-based applications
• Practice makes perfect!
AWS Certification Paths – Architecture
Architecture
Application Architect
Design significant aspects of
application architecture including
user interface, middleware, and
infrastructure, and ensure
enterprise-wide scalable, reliable,
and manageable systems
Dive Deep
https://d1.awsstatic.com/training-and-
certification/docs/AWS_certification_paths.pdf
AWS Certification Paths – Operations
Operations
Systems Administrator
Install, upgrade, and maintain
computer components and
software, and integrate
automation processes
Dive Deep
Operations
Cloud Engineer
Implement and operate an
organization’s networked computing
infrastructure and Implement
security systems to maintain
data safety
Dive Deep
AWS Certification Paths – DevOps
DevOps
Test Engineer
Embed testing and quality
best practices for software
development from design to release,
throughout the product life cycle
DevOps
Cloud DevOps Engineer
Design, deployment, and operations
of large-scale global hybrid
cloud computing environment,
advocating for end-to-end
automated CI/CD DevOps pipelines Optional Dive Deep
DevOps
DevSecOps Engineer
Accelerate enterprise cloud adoption
while enabling rapid and stable delivery
of capabilities using CI/CD principles,
methodologies, and technologies
AWS Certification Paths – Security
Security
Cloud Security Engineer
Design computer security architecture
and develop detailed cyber security designs.
Develop, execute, and track performance
of security measures to protect information
Dive Deep
Security
Cloud Security Architect
Design and implement enterprise cloud
solutions applying governance to identify,
communicate, and minimize business and
technical risks
Dive Deep
AWS Certification Paths – Data Analytics &
Development
Data Analytics
Cloud Data Engineer
Automate collection and processing
of structured/semi-structured data
and monitor data pipeline performance
Dive Deep
Development
Software Development Engineer
Develop, construct, and maintain
software across platforms and devices
AWS Certification Paths – Networking & AI/ML
Networking
Network Engineer
Design and implement computer
and information networks, such as
local area networks (LAN),
wide area networks (WAN),
intranets, extranets, etc. Dive Deep
AI/ML
Machine Learning Engineer
Research, build, and design artificial
intelligence (AI) systems to automate
predictive models, and design machine
learning systems, models, and schemes
Congratulations & Next Steps!
Congratulations!
• Congrats on finishing the course!
• I hope you will pass the exam without a hitch!
• Overall, I hope you learned how to use AWS and that you will be a
tremendously good AWS Developer
Next Steps
• We’ve spent a lot of time getting an overview of each service
• Each service on its own deserves its own course and study time
• Find out what services you liked and get specialized in them!
• Happy learning!