AWS SAA Overview
ELB
ALB
- Works at application layer (layer 7)
- ALB target groups can be:
- EC2 instances
- ECS tasks
- Lambda functions
- Private IP addresses
- ALBs have listeners with specific protocols, and each listener can route
traffic to different target groups using listener rules
- health check is done at the target group level using HTTP and HTTPS protocols
- cross zone load balancing is enabled by default
- Cannot attach elastic IP to ALB
- An internet-facing ALB must be placed in public subnets (an internal ALB can use private subnets)
- Also supports gRPC protocol
- supports Weighted Target Groups routing
NLB
- Works at transport layer (layer 4)
- extreme performance (can handle millions of requests per second)
- TCP and UDP protocols
- has one static IP per AZ which can also be elastic IP
- NLB target groups can be:
- EC2 instances
- Private IP addresses
- ALBs
- health check can be done via TCP, HTTP, and HTTPS protocols
- cross zone load balancing is disabled by default
GWLB
- Works at network layer (layer 3)
- Routes traffic to third-party virtual appliances (for example, for security analysis) before
routing it to the servers
- Uses the GENEVE protocol on port 6081
- GWLB target groups can be:
- EC2 instances
- Private IP addresses
- Cross zone load balancing is disabled by default
Cross-zone Load Balancing
Distributes traffic evenly across all registered targets in every enabled AZ, instead of only across the targets in the load balancer node's own AZ
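A rough sketch of what cross-zone load balancing changes, assuming each AZ's load balancer node receives an equal share of client traffic and every AZ has at least one target. The function name and the AZ labels are illustrative only and are not part of any AWS API.

```python
def traffic_share_per_target(targets_per_az, cross_zone):
    """Return the fraction of total traffic that one target in each AZ receives.

    targets_per_az maps an AZ label to the number of registered targets in it.
    Assumption: client traffic is split equally between the per-AZ nodes.
    """
    total_targets = sum(targets_per_az.values())
    az_count = len(targets_per_az)
    shares = {}
    for az, count in targets_per_az.items():
        if cross_zone:
            # Every target competes in one shared pool.
            shares[az] = 1 / total_targets
        else:
            # Each AZ keeps its own slice and splits it among local targets.
            shares[az] = (1 / az_count) / count
    return shares


targets = {"az-a": 2, "az-b": 8}
print(traffic_share_per_target(targets, cross_zone=True))
print(traffic_share_per_target(targets, cross_zone=False))
```

With 2 targets in one AZ and 8 in the other, each target gets 10% of the traffic when cross-zone is on. When it is off, the 2 targets in the smaller AZ each carry 25% while the other 8 carry 6.25% each.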
ELBs have security groups too
ELBs are region bound
Sticky Sessions
- ensures the same client is always routed to the same instance
- supported by CLB, ALB, and NLB
- ALB uses cookies whose expiration date can be controlled
Cookies
- Application based cookies
- custom cookies: defined by application and name cannot be AWSALB, AWSALBAPP or
AWSALBTG
- application cookies: defined by load balancer and name is AWSALBAPP
- Have multi-AZ configuration
- Can have up to 5 read replicas across multiple AZs
Neptune
- Graph DB
DocumentDB
- AWS service for MongoDB
KeySpaces
- AWS service for Apache Cassandra
DNS
Route53
- A highly available, scalable, fully managed and Authoritative DNS
- The only AWS service which provides 100% availability SLA
Record Types
- A - maps to IPv4
- AAAA - maps to IPv6
- CNAME - maps to another domain name (cannot be used for the root domain, i.e. the zone apex)
- Alias - can map the zone apex to AWS resources (eg: ALB endpoints); an extension of the A or
AAAA type
- NS - name servers for the hosted zone (for DNS traffic routing)
Name Servers
- Physical servers that resolve the DNS requests by looking at the records stored in hosted zones
- The NS records in a hosted zone route DNS request traffic to the name servers
Cost
$0.50 per month per hosted zone
Hosted Zones
- Public
- Private (within VPC)
Routing Policies
- Simple
- Weighted
- Latency-based
- Failover
- Geolocation
- Geoproximity
- IP-based routing
- Multi-value
Failover
- active-active
- active-passive
active-active
Both systems are running and serving traffic, and either can take over if the other fails
active-passive
Only one system serves traffic; the other stays on standby and takes over when failover occurs
s3 static website routing
To route to an S3 static website using Route53, the name of the S3 bucket must be the same as the domain
name
Containerization
ECS
Launch Types
- EC2
- Fargate
EC2 Launch Type
- Must provision & maintain the infrastructure (the EC2 instances)
- Each EC2 Instance must run the ECS Agent to register in the ECS Cluster
Fargate Launch Type
- No need to provision the infrastructure (no EC2 instances to manage)
IAM Roles
- EC2 Instance Profile
- ECS Task Role
Data Volumes
- EBS volumes of each EC2 instance
- Can use EFS
- Fargate+EFS = Serverless
AWS Application Auto Scaling
Automatically increase/decrease the desired number of ECS tasks
Scaling Methods
- Target Tracking
- Step Scaling
- Scheduled Scaling
Cluster Capacity Auto Scaling
- Use an ECS Cluster Capacity Provider to automatically provision and scale the infrastructure for
your ECS tasks
- Capacity Provider paired with an Auto Scaling Group
ECR
- Store and manage Docker images on AWS
- Fully integrated with ECS, backed by Amazon S3
EKS
- EKS supports EC2 if you want to deploy worker nodes or Fargate to deploy serverless
containers
Node Types
- Managed Node Group
- Self-managed Nodes
- Fargate
Data Volumes
- EBS
- EFS
- FSx for Lustre
- FSx for NetApp
Karpenter
- Automatically adjusts the number of nodes in the EKS cluster when pods fail or are rescheduled
onto other nodes
Horizontal Pod Autoscaler
- automatically scales the number of Pods in a deployment, replication
controller, or replica set based on that resource’s CPU utilization
- requires the Kubernetes Metrics Server to be installed in the Amazon EKS cluster
ECS Anywhere and EKS Anywhere
- Extends AWS ECS and EKS functionality to run containers on any
infrastructure, including on-premises servers, edge devices, or virtual
machines outside AWS
- Allows organizations to use ECS and EKS as the orchestration layer for hybrid or multi-cloud
deployments
AWS App Runner
- Fully managed service designed to automatically deploy and scale web
applications and APIs from source code or a container image, with
minimal configuration
- No infrastructure experience required, just need source code or container image
- Automatic code building, deploying, scaling, highly available, load balancer, encryption
AWS Elastic Beanstalk
- Platform-as-a-Service (PaaS) that makes it easy to deploy, manage, and scale web applications
and services
- Manages the infrastructure (compute, storage, networking) but still allows customization if
needed
- Provides real-time monitoring of application health, resource usage, and logs
Serverless
Services
- Lambda
- DynamoDB
- Cognito
- API Gateway
- S3
- SNS and SQS
- Kinesis
- Aurora Serverless
- Step Functions
- Fargate
Lambda
- Pay per request and compute time
- Free tier of 1,000,000 AWS Lambda requests and 400,000 GB-seconds of compute time per month
- Outside of a VPC by default
- If assigned a VPC and subnet, lambda will create ENI in the subnet/VPC
- Can be invoked by using lambda function URL
Pricing
=====
Pay per call
- First 1,000,000 requests are free
- $0.20 per 1 million requests thereafter ($0.0000002 per request)
Pay per duration
- 400,000 GB-seconds of compute time per month for FREE
- 400,000 seconds if function is 1GB RAM
- 3,200,000 seconds if function is 128 MB RAM
- After that $1.00 for 600,000 GB-seconds
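The pricing figures above can be turned into a quick estimate. This is a back-of-the-envelope sketch that uses only the numbers in these notes; the function names are made up for illustration, and real bills vary by region and architecture.

```python
FREE_REQUESTS = 1_000_000             # free requests per month
FREE_GB_SECONDS = 400_000             # free compute per month
PRICE_PER_REQUEST = 0.20 / 1_000_000  # $0.20 per 1 million requests
PRICE_PER_GB_SECOND = 1.00 / 600_000  # $1.00 per 600,000 GB-seconds


def free_tier_seconds(memory_mb):
    """Seconds of execution covered by the free tier at a given memory size."""
    return FREE_GB_SECONDS / (memory_mb / 1024)


def monthly_cost(requests, avg_duration_s, memory_mb):
    """Estimated monthly cost in USD after subtracting the free tier."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    request_cost = max(0, requests - FREE_REQUESTS) * PRICE_PER_REQUEST
    duration_cost = max(0.0, gb_seconds - FREE_GB_SECONDS) * PRICE_PER_GB_SECOND
    return request_cost + duration_cost


print(free_tier_seconds(1024))  # 1 GB function
print(free_tier_seconds(128))   # 128 MB function
print(round(monthly_cost(3_000_000, 1.0, 1024), 2))
```

The two `free_tier_seconds` calls reproduce the 400,000 and 3,200,000 second figures listed above.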
Execution
- Memory allocation: 128 MB – 10GB (1 MB increments)
- Maximum execution time: 900 seconds (15 minutes)
- Environment variables (4 KB)
- Disk capacity in the “function container” (in /tmp): 512 MB to 10GB
- Concurrent executions: 1,000 per region (can be increased)
Deployment
- Lambda function deployment size (compressed .zip): 50 MB
- Size of uncompressed deployment (code + dependencies): 250 MB
- Can use the /tmp directory to load other files at startup
- Size of environment variables: 4 KB
Lambda SnapStart for Java
- Lambda initializes the function at publish time
- Takes a snapshot of memory and disk state of the initialized function
- Snapshot is cached for low-latency access
Running Container Images
- The container image must implement the Lambda Runtime API; AWS provides base images tailored for
Lambda that already do
API Gateway
Endpoint Types
- Edge-optimized
- Regional
- Private
Edge-optimized
- Requests are routed through the CloudFront Edge locations (improves latency)
- The API Gateway still lives in only one region
Regional
- For clients within the same region
- Could manually combine with CloudFront (more control over the caching strategies and the
distribution)
Private
- Can only be accessed from own VPC using an interface VPC endpoint (ENI)
- Have to use a resource policy to define access
User Authentication
- IAM Roles (useful for internal applications)
- Cognito (identity for external users, for example mobile users)
- Custom Authorizer (your own logic)
- Custom Domain Name: HTTPS security through integration with AWS Certificate Manager
(ACM)
Supports API Caching and Request Throttling too
Step Functions
Build serverless visual workflow to orchestrate your Lambda functions
AWS Cognito
- Give users an identity to interact with the web or mobile application on AWS
Cognito User Pool
- Sign-in functionality for app users
- Create a serverless database of users for the web & mobile apps
- Integrate with API Gateway & Application Load Balancer
Cognito Identity Pool (Federated Identity)
- Provide AWS credentials to users so they can access AWS resources directly
- Integrate with Cognito User Pools as an identity provider
- Get identities for “users” so they obtain temporary AWS credentials
Data Analytics
Amazon Athena
- Serverless query service to analyze data stored in Amazon S3
- Supports CSV, JSON, ORC, Avro, and Parquet
- $5.00 per TB of data scanned
- Commonly used with Amazon Quicksight for reporting/dashboards
Federated Query
- To run SQL queries across data stored in relational, non-relational, object, and custom data
sources (AWS or on-premises)
- Uses Data Source Connectors that run on AWS Lambda to run Federated Queries
- Store the results back in Amazon S3
Performance Improvement
- Use columnar data (Apache Parquet or ORC) for cost-savings
- Compress data for smaller retrievals
- Partition datasets in S3 for easy querying on virtual columns
- Use larger files (> 128 MB) to minimize overhead
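Because Athena bills by data scanned, the tips above translate directly into cost. The sketch below is a rough estimate only. It assumes 1 TB = 1024^4 bytes and ignores any per-query minimum or rounding, and `column_fraction` and `compression_ratio` are hypothetical knobs introduced for this example.

```python
PRICE_PER_TB_SCANNED = 5.00  # from the notes above
BYTES_PER_TB = 1024 ** 4     # assumption: binary terabyte


def estimated_query_cost(raw_bytes, column_fraction=1.0, compression_ratio=1.0):
    """Estimate the cost in USD of one query.

    column_fraction: share of the columns a columnar format lets the query read.
    compression_ratio: compressed size divided by raw size.
    """
    scanned = raw_bytes * column_fraction * compression_ratio
    return scanned / BYTES_PER_TB * PRICE_PER_TB_SCANNED


one_tb = 1024 ** 4
print(estimated_query_cost(one_tb))             # full scan of raw data
print(estimated_query_cost(one_tb, 0.1, 0.25))  # 10% of columns, 4:1 compression
```

Under these assumptions, reading a tenth of the columns from data compressed 4:1 cuts a full 1 TB scan from $5.00 to roughly $0.125.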
RedShift
- based on PostgreSQL but built for OLAP: online analytical processing (analytics and data warehousing)
- 10x better performance than other data warehouses, scale to PBs of data
- Columnar storage of data (instead of row based) & parallel query engine
Modes
- Provisioned Cluster
- Serverless Cluster
Provisioned Cluster
- Choose instance types in advance
- Can reserve instances for cost savings
Redshift Clusters
- Leader Node
- Compute Node
Leader Node
for query planning, results aggregation
Compute Node
for performing the queries, send results to leader
Snapshots and DR
- Snapshots are point-in-time backups of a cluster, stored internally in S3
- can restore a snapshot into a new cluster
- Automated snapshots are taken every 8 hours or every 5 GB of changes, or on a custom schedule
- Set retention between 1 to 35 days
- Can manually take snapshots too
- Can enable cross-region snapshots
Data Loading into RedShift
- with Kinesis Data Firehose
- s3 using copy command
- without enhanced VPC routing
- with enhanced VPC routing
- EC2 Instance JDBC driver
RedShift Spectrum
- to run query on data stored in s3 without loading the data
Amazon OpenSearch
- Successor to ElasticSearch
- common to use OpenSearch as a complement to another database as a database search API
- Ingestion from Kinesis Data Firehose, AWS IoT, and CloudWatch Logs
- Comes with OpenSearch Dashboards for visualization
Modes
- Managed Cluster
- Serverless Cluster
Amazon EMR
- Amazon Elastic MapReduce
- The clusters can be made of hundreds of EC2 instances with autoscaling and can be integrated
with spot instances
- EMR comes bundled with Apache Spark, HBase, Presto, Flink
- EMR takes care of all the provisioning and configuration
Node Types
- Master Node
- Core Node
- Task Node
Master Node
Manage the cluster, coordinate, manage health – long running
Core Node
Run tasks and store data – long running
Task Node
Just to run tasks – usually Spot
Purchasing Options
- On demand
- Reserved (min 1 yr)
- Spot Instances
Modes
- Long running cluster
- Transient cluster
Amazon QuickSight
- Serverless machine learning-powered BI service to create interactive dashboards
- In-memory computation using SPICE engine if data is imported into QuickSight
- Define Users and Groups (separate from IAM)
AWS Glue
- managed ETL service
Glue Job Bookmarks
prevent re-processing old data
Glue Elastic Views
- Combine and replicate data across multiple data stores using SQL
- No custom code, Glue monitors for changes in the source data, serverless
- Leverages a “virtual table” (materialized view)
Glue DataBrew
- Prebuilt transformations
Glue Studio
- GUI for ETL jobs
Glue Streaming ETL
- for streaming data
- built on Apache Spark Structured Streaming
- compatible with Kinesis Data Streams, Kafka, MSK
AWS Lake Formation
- Used to build data lakes
- The data lakes it creates are stored in S3
- Built on top of AWS Glue
- Can be used to consolidate data from multiple accounts into a single account as a central
datalake
MSK (Amazon Managed Streaming for Kafka)
- Alternative to Amazon Kinesis
MSK Serverless
- Run Apache Kafka on MSK without managing the capacity
- MSK automatically provisions resources and scales compute & storage
AWS Data Exchange
service that makes it easy to find, subscribe to, and use third-party data in the AWS cloud
AWS Data Pipeline
- enables you to automate the movement, transformation, and processing of
data across different AWS services and on-premises data sources
- useful for creating complex data workflows that involve scheduling, dependency management,
and data transformations
Monitoring
CloudWatch
CloudWatch Metrics
- CloudWatch provides metrics for every service in AWS
- Metrics belong to namespaces (eg: S3, ECS, EC2,...)
- Dimension is an attribute of a metric (eg: instance id, environment, etc...)
- Up to 30 dimensions per metric
- Can create CloudWatch Custom Metrics
Metric Streams
- Continually stream CloudWatch metrics to a destination of your choice,
with near-real-time delivery and low latency (to Kinesis Data Firehose,
3rd party service providers)
- Option to filter metrics to only stream a subset of them
Cloudwatch Logs
- organized into log groups and log streams
- Can define log expiration policies (never expire, 1 day to 10 years...)
- Logs are encrypted by default
- Can setup KMS-based encryption with your own keys
Can send logs to
- Amazon S3 (exports)
- Kinesis Data Streams
- Kinesis Data Firehose
- AWS Lambda
- OpenSearch
Log sources
- SDK, CloudWatch Logs Agent, CloudWatch Unified Agent
- Elastic Beanstalk: collection of logs from application
- ECS: collection from containers
- AWS Lambda: collection from function logs
- VPC Flow Logs: VPC specific logs
- API Gateway
- CloudTrail based on filter
- Route53: Log DNS queries
Log Insights
- Search and analyze log data stored in CloudWatch Logs
S3 Export
- Log data can take up to 12 hours to become available for export
- The API call is CreateExportTask
- Not real-time; for near-real-time delivery, use Log Subscriptions instead
Log Subscriptions
- Get real-time log events from CloudWatch Logs for processing and analysis
- Send to Kinesis Data Streams, Kinesis Data Firehose, or Lambda
- Subscription Filter: filter which log events are delivered to the destination
- Can do cross-account subscription
CloudWatch Agents
- To collect logs from EC2 instances or on-premise servers
Log Agents
- Older version
- Can only collect logs
Unified Agents
- Can collect logs and also the instance metrics (eg: CPU, RAM, Disk info, etc)
CloudWatch Alarms
Alarms are used to trigger notifications for any metric
Alarm States
- OK
- Insufficient Data
- In Alarm
Alarm Target Actions
- EC2 instances (stop, terminate, reboot, etc)
- EC2 Auto Scaling
- Amazon SNS
Composite Alarm
- Monitors the states of multiple other alarms in combination
- AND and OR conditions
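A minimal sketch of how AND and OR conditions combine child alarm states. The nested-tuple rule encoding and the alarm names are hypothetical, chosen only to keep the example short; they are not CloudWatch's actual rule syntax.

```python
def in_alarm(rule, states):
    """Evaluate a composite rule against the current child alarm states.

    rule is either an alarm name (str) or a tuple ("AND" | "OR", sub_rule, ...).
    states maps alarm names to "OK", "ALARM", or "INSUFFICIENT_DATA".
    """
    if isinstance(rule, str):
        return states[rule] == "ALARM"
    operator, *operands = rule
    results = [in_alarm(operand, states) for operand in operands]
    if operator == "AND":
        return all(results)
    if operator == "OR":
        return any(results)
    raise ValueError(f"unknown operator: {operator}")


# Fire only when CPU is high AND (disk OR network) is also in alarm.
rule = ("AND", "cpu-high", ("OR", "disk-high", "network-high"))
states = {"cpu-high": "ALARM", "disk-high": "OK", "network-high": "ALARM"}
print(in_alarm(rule, states))
```

Combining alarms this way is typically used to cut alarm noise: a single noisy child alarm does not notify anyone unless the combined condition holds.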
EC2 Recovery
- CloudWatch alarm can trigger the recovery of the Amazon EC2 instance, in case the instance
fails.
- The instance, however, should only be configured with an Amazon EBS volume
- Recovered instance is identical to the original instance, including the
instance ID, private IP addresses, Elastic IP addresses, and all
instance metadata
CloudWatch Insights
- CloudWatch Container Insights
- CloudWatch Lambda Insights
- CloudWatch Contributor Insights
- CloudWatch Application Insights
CloudWatch Container Insights
ECS, EKS, Kubernetes on EC2, Fargate, needs agent for Kubernetes
CloudWatch Lambda Insights
Detailed metrics to troubleshoot serverless applications
CloudWatch Contributor Insights
Find “Top-N” Contributors through CloudWatch Logs
CloudWatch Application Insights
Automatic dashboard to troubleshoot your application and related AWS services
CloudTrail
- Provides governance, compliance and audit for your AWS Account
- Can be integrated with EventBridge to trigger AWS services based on CloudTrail events
- Cloudtrail log files are encrypted by default
CloudTrail Events
- Management Events
- Data Events
- CloudTrail Insights Events
Management Events
- Operations that are performed on resources in your AWS account
- By default, trails are configured to log management events.
Data Events
- Granular data object activities such as Amazon S3 object-level activity and AWS Lambda function
execution activity
CloudTrail Insights Events
- Analyze anomalies in write events to detect unusual patterns
Events retention
- Events are stored for 90 days in CloudTrail
- To keep events beyond this period, log them to S3 and use Athena
AWS Config
- Helps with auditing and recording compliance of your AWS resources
- Helps record configurations and changes over time
- AWS Config is a per-region service
- Can be aggregated across regions and accounts
Config Rules
- Can use AWS managed config rules
- Can make custom config rules
- no free tier, $0.003 per configuration item recorded per region, $0.001 per config rule
evaluation per region
Config Resource
- View compliance of a resource over time
- View configuration of a resource over time
- View CloudTrail API calls of a resource over time
Remediation
- Automate remediation of non-compliant resources using SSM Automation Documents
- Use AWS-Managed Automation Documents or create custom Automation Documents
- Can set Remediation Retries if the resource is still non-compliant after auto-remediation
Notification
- Use EventBridge to trigger notifications when AWS resources are non-compliant
- Ability to send configuration changes and compliance state
notifications to SNS (all events – use SNS Filtering or filter at
client-side)
AWS Trusted Advisor
- optimize costs, increase performance, improve security and resilience, and operate at scale in
the cloud
- recommends actions to remediate any deviations from best practices
- can do service quota checks by writing an AWS Lambda function that
refreshes the AWS Trusted Advisor Service Limits checks and set it to
run every 24 hours
AWS X-Ray
X-Ray collects data about the requests and responses, tracks latency,
identifies performance bottlenecks, and detects errors, helping
developers and operations teams understand how their applications behave
in real-time
Service Map
X-Ray generates a service map that visualizes the relationships and
interactions between the services in your application. This map
highlights performance bottlenecks, latency issues, and error rates.
Disaster Recovery
RPO and RTO
- Recovery Point Objective (RPO): time between the last backup point and the disaster, i.e. how much data can be lost
- Recovery Time Objective (RTO): time between the disaster and system recovery, i.e. how long the downtime lasts
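A small worked example of the two definitions, using made-up timestamps. The function names and the 12-hour and 2-hour objectives are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_backup, disaster):
    """Data-loss window: everything written after the last backup is lost."""
    return disaster - last_backup


def achieved_rto(disaster, recovered):
    """Downtime: how long the system was unavailable."""
    return recovered - disaster


def meets_objective(achieved, objective):
    """An objective is met when the achieved value does not exceed it."""
    return achieved <= objective


last_backup = datetime(2024, 1, 1, 2, 0)
disaster = datetime(2024, 1, 1, 9, 30)
recovered = datetime(2024, 1, 1, 13, 30)

print(achieved_rpo(last_backup, disaster))   # data-loss window
print(achieved_rto(disaster, recovered))     # downtime
print(meets_objective(achieved_rpo(last_backup, disaster), timedelta(hours=12)))
print(meets_objective(achieved_rto(disaster, recovered), timedelta(hours=2)))
```

Here 7.5 hours of data is lost and the system is down for 4 hours, so a 12-hour RPO is met while a 2-hour RTO is missed.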
DR Strategies
- Backup and Restore
- Pilot Light
- Warm Standby
- Hot Site / Multi Site Approach
Backup and Restore
- Cheapest
- High RPO, High RTO
Pilot Light
A minimal version of the app is always running in the cloud
Warm Standby
A scaled-down version of the full system is always up and running
Hot Site/ Multi Site
Full production scale is running both on AWS and on premises
AWS Database Migration Service (DMS)
- Can migrate databases both heterogeneously and homogeneously from
different sources to targets (eg: from on-premises Oracle to AWS Aurora)
- Must create an EC2 instance to perform the replication tasks
- If the source and target DBs use different DB engines (eg: Oracle and PostgreSQL), the Schema
Conversion Tool (SCT) must be used
- AWS DMS supports multi-AZ deployment
- In addition to databases, s3 and kinesis can also be the source or target
- full load and change data capture (CDC) replication task can be used to migrate and also track
the on-going data changes
RDS and Aurora DB Migration
- MySQL
- PostgreSQL
MySQL
- RDS to Aurora:
1. DB Snapshots from RDS MySQL restored as MySQL Aurora DB
2. Create an Aurora Read Replica from your RDS MySQL, and when the replication lag is 0,
promote it as its own DB cluster
- External to Aurora:
1. Backup onto s3 and import from s3 to Aurora
2. Use mysqldump utility to directly migrate into Aurora
Can also use DMS
PostgreSQL
- RDS to Aurora:
1. DB Snapshots from RDS PostgreSQL restored as PostgreSQL Aurora DB
2. Create an Aurora Read Replica from your RDS PostgreSQL, and when
the replication lag is 0, promote it as its own DB cluster
- External to Aurora:
Create a backup, put it in Amazon S3 and import it using the aws_s3 Aurora extension
Can also use DMS
AWS Backup
- Centrally manage and automate backups across AWS services
- Supports cross-region backups
- Supports cross-account backups
Supported Services
- Amazon EC2 / Amazon EBS
- Amazon S3
- Amazon RDS (all DBs engines) / Amazon Aurora / Amazon DynamoDB
- Amazon DocumentDB / Amazon Neptune
- Amazon EFS / Amazon FSx (Lustre & Windows File Server)
- AWS Storage Gateway (Volume Gateway)
Features
- PITR for supported services
- On-demand and scheduled backups
- Tag based backup policies
- Backup Plans
- Backup Vault Lock
Backup Plans
Can configure:
- Backup frequency
- Backup window
- Transition to cold storage
- Retention period
Backup Vault Lock
- WORM (Write Once Read Many)
- Even the root user cannot delete backups inside the locked Vault
AWS ADS and MGN
- Application Discovery Service (ADS)
- Application Migration Service (MGN)
ADS
- Plan migration projects by gathering information about on-premises data
centers, such as server utilization data and dependency mapping
- Resulting data can be viewed within AWS Migration Hub
Agentless Discovery
- Uses AWS Agentless Discovery Connector
- Discover VM inventory, configuration, and performance history such as CPU, memory, and disk
usage
Agent-based Discovery
- Uses AWS Application Discovery Agent
- System configuration, system performance, running processes, and details of the network
connections between systems
MGN
- The "AWS evolution" of CloudEndure Migration, replacing AWS Server Migration Service (SMS)
- Lift-and-shift (rehost) solution
- Converts physical, virtual, and cloud-based servers to run natively on AWS
- Migrate data by installing AWS Replication Agent on source servers
Compute
EC2
Storage
- EBS
- EFS
- EC2 Instance Store
EBS
- bound to specific AZs
- by default, root volume is set to delete on termination
- Only gp2/gp3 and io1/io2 can be used as boot volumes
- EBS volumes support live configuration changes while in production,
which means that you can modify the volume type, volume size, and IOPS
capacity without service interruptions
EBS Volume Types
- gp2 (SSD)
- gp3 (SSD)
- io1 (SSD)
- io2 Block Express (SSD)
- st1 (HDD)
- sc1 (HDD)
gp2
- 1 GiB - 16TiB
- can burst IOPS to 3,000
- Size of the volume and IOPS are linked
- max IOPS is 16,000
- if 3 IOPS per GB, max IOPS at 5,334 GB
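The link between gp2 size and IOPS can be written as a one-line formula. This sketch uses the 3 IOPS per GiB ratio and the 16,000 cap from the notes; the 100 IOPS floor for very small volumes is not stated above and is added here as an assumption. The function name is illustrative.

```python
import math

GP2_IOPS_PER_GIB = 3
GP2_MAX_IOPS = 16_000
GP2_MIN_IOPS = 100  # assumption: floor for very small volumes


def gp2_baseline_iops(size_gib):
    """Baseline IOPS of a gp2 volume as a function of its size in GiB."""
    return min(GP2_MAX_IOPS, max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib))


# Smallest size at which the cap is reached.
size_at_cap = math.ceil(GP2_MAX_IOPS / GP2_IOPS_PER_GIB)

print(gp2_baseline_iops(1000))
print(size_at_cap, gp2_baseline_iops(size_at_cap))
```

The computed `size_at_cap` matches the 5,334 GB figure above; growing the volume past that point adds capacity but no further baseline IOPS.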
gp3
- 1 GiB - 16TiB
- Baseline of 3,000 IOPS and throughput of 125 MiB/s
- Can increase IOPS up to 16,000 and throughput up to 1000 MiB/s independently
io1
- 4 GiB - 16TiB
- Max IOPS: 64,000 for Nitro EC2 instances & 32,000 for other
- Can increase IOPS independently from storage size
io2 Block Express
- 4 GiB - 64 TiB
- Sub-millisecond latency
- Max IOPS: 256,000 with an IOPS:GiB ratio of 1,000:1
Snapshots
- snapshots can be copied across Regions (a snapshot can be restored to a volume in any AZ of its Region)
- snapshots can be moved to snapshot archives which is 75% cheaper but can take 24 to 72 hrs
to restore
- snapshots can be moved to recycle bins and retention period can be set from 1 day to 1 year
- fast snapshot restore: Force full initialization of snapshot to have no latency on the first use
- snapshots can be created automatically using Amazon Data Lifecycle Manager (DLM)
- The EBS volume can be used while the snapshot is in progress
EBS Encryption
- Copying an unencrypted snapshot allows encryption
- Snapshots of encrypted volumes are encrypted
Encrypt an Unencrypted EBS Volume
- Create an EBS snapshot of the volume
- Encrypt the EBS snapshot ( using copy )
- Create new EBS volume from the snapshot ( the volume will also be encrypted )
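The three steps above could be scripted roughly as follows. This is a sketch, not a production script. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`) with permission to create snapshots and volumes, and the function and parameter names are made up for this example. The client is passed in so the flow can be exercised with a stub.

```python
def encrypt_volume(ec2, volume_id, region, availability_zone, kms_key_id=None):
    """Create an encrypted copy of an unencrypted EBS volume.

    ec2: assumed to be a boto3 EC2 client for `region`.
    Returns the ID of the new, encrypted volume.
    """
    waiter = ec2.get_waiter("snapshot_completed")

    # Step 1: snapshot the unencrypted volume.
    snapshot = ec2.create_snapshot(
        VolumeId=volume_id, Description="pre-encryption snapshot"
    )
    waiter.wait(SnapshotIds=[snapshot["SnapshotId"]])

    # Step 2: copy the snapshot with encryption turned on.
    copy_args = {
        "SourceSnapshotId": snapshot["SnapshotId"],
        "SourceRegion": region,
        "Encrypted": True,
    }
    if kms_key_id:
        # Otherwise the account's default EBS encryption key is used.
        copy_args["KmsKeyId"] = kms_key_id
    encrypted = ec2.copy_snapshot(**copy_args)
    waiter.wait(SnapshotIds=[encrypted["SnapshotId"]])

    # Step 3: a volume created from an encrypted snapshot is encrypted too.
    volume = ec2.create_volume(
        SnapshotId=encrypted["SnapshotId"], AvailabilityZone=availability_zone
    )
    return volume["VolumeId"]
```

The original volume and the intermediate unencrypted snapshot are left in place; cleaning them up, and swapping the new volume onto the instance, are separate steps not shown here.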
Copying encrypted snapshots across regions
- Take snapshot of the encrypted volume
- Copy the snapshot and encrypt using key B in region B
- Restore the volume
Copying encrypted snapshots cross accounts
- Create snapshot encrypted with own KMS key
- Attach KMS key policy to authorize cross account decrypt access
- Share encrypted snapshot
- Encrypt the snapshot using KMS key B in account B
- Restore the volume
EBS Multi Attach
- only io1/io2 volume types can support multi attach
- one volume can be attached to multiple instances within same AZ
- up to 16 instances at the same time
EFS
- network file system (NFS) that can be mounted on many EC2 instances
- EFS can be attached to EC2 instances in multiple AZs
- have to use security group to control access to EFS
- can only be used with linux based AMIs
- pay per use, no capacity planning
Performance Modes
- General purpose
- Max I/O
Throughput Modes
- Bursting
- Provisioned
- Elastic
Bursting
- scales with storage
- burst up to 100MiB/s
Provisioned
- set the throughput regardless of storage size
Elastic
- automatically scales throughput up or down based on the workloads
- Up to 3GiB/s for reads and 1GiB/s for writes
Storage Tiers
- Standard
- IA
- Archive
Storage Life Cycle
- The longest transition period that can be configured in a storage lifecycle policy is 365 days
Availability Modes
- standard (Multi-AZ)
- one zone (Single-AZ)
EFS One Zone IA
- IA storage tier with one zone availability mode
Instance Store
- closely attached to EC2 instance
- better I/O than EBS
- destroyed when the instance is stopped
RAID 0 vs RAID 1
EBS and Instance Store support RAID 0 configuration
RAID 0
- Data are spread across multiple EBS or Instance store volumes and all volumes act as single
storage
- Increased throughput
RAID 1
- Data are duplicated in all the EBS and Instance store volumes
- For data redundancy
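The trade-off between the two RAID levels can be shown with simple arithmetic. This sketch assumes all volumes in the array are the same size and have the same throughput, and that writes are the operation of interest. The function name and the example figures are illustrative.

```python
def raid_summary(level, volume_count, size_gib, throughput_mib_s):
    """Return (usable capacity in GiB, aggregate write throughput in MiB/s,
    number of volume failures survived) for identical volumes."""
    if level == 0:
        # Striping: capacity and throughput add up, no redundancy.
        return volume_count * size_gib, volume_count * throughput_mib_s, 0
    if level == 1:
        # Mirroring: every volume holds the same data.
        return size_gib, throughput_mib_s, volume_count - 1
    raise ValueError("only RAID 0 and RAID 1 are covered here")


print(raid_summary(0, 2, 500, 250))
print(raid_summary(1, 2, 500, 250))
```

Two 500 GiB volumes give 1,000 GiB of striped space with no fault tolerance under RAID 0, or 500 GiB that survives the loss of one volume under RAID 1.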
Instance Types
- General Purpose (M, T)
- Compute optimized (C)
- Memory optimized (R)
- Accelerated (G, P)
- Storage optimized (I)
Compute Optimized (C)
- Batch processing
- HPC
- Media transcoding
- Scientific modeling
- Dedicated gaming servers
Memory Optimized (R)
- High performance databases
- Cache stores
- In memory BIs
- In memory big data processing
Storage Optimized (I)
- High performance OLTP
- For high sequential I/O
Tenancy
- default
- dedicated
- host
default
shared tenancy
dedicated
dedicated tenancy (eg: dedicated instances)
host
dedicated host
Security Group
- Control ins/outs of the instance
- VPC bound
- Can attach to multiple instances
- Only contains 'Allow' rules
- Can reference by IP or by other SGs
- Inbound traffics are blocked by default
- Outbound traffics are allowed by default
Purchasing Options
- On-demand Instances
- Reserved Instances
- Savings Plans
- Spot Instances
- Dedicated Hosts
- Dedicated Instances
- Capacity Reservation
On-demand Instances
- Billed per second after the first minute
Reserved Instances
- Reserved for 1 or 3 years
- Payments: upfront, no upfront, partial upfront
- Convertible reserved instance: can change instance attributes
Savings Plans
- Commit to a certain amount of usage ($/hr)
- Commitment term of 1 or 3 years
- Locked to an instance family and region
- Usage beyond the Savings Plan is charged at the on-demand price
Spot Instances
- Can get up to 90% discount
- Can lose the instance when the current spot price rises above the max price you are willing to pay
- have 2 mins grace period at termination time
- Cancelling a spot request does not terminate the instances
- First cancel the request and then terminate the instances
- Spot fleets: spot instances + optional on-demand instances
- Spot fleet allocation strategies:
- lowestPrice
- diversified
- capacityOptimized
- priceCapacityOptimized
Dedicated Host
- most expensive option
- book entire server
- visibility of the underlying sockets and physical cores
- can do instance placement
- options:
- on demand
- reserved
Dedicated Instances
- own hardware within account
- cannot do instance placement
Capacity Reservation
- Pay whether use the instances or not within reserved period
- Capacity Reservations enable you to reserve compute capacity for your
EC2 instances in a specific AZ for any duration (can also be in hourly
duration)
Elastic IP
- Can attach to one instance at a time
- Can only have 5 Elastic IPs per Region per account by default (can ask AWS to increase)
Placement Groups
- Cluster
- Spread
- Partition
Cluster
- Cluster instances into a low latency group within a single AZ
- It is recommended that you launch the number of instances that you need in the placement
group in a single launch request
- use the same instance type for all instances in the placement group
- If you try to add more instances to the placement group later, or if
you try to launch more than one instance type in the placement group,
you increase your chances of getting an insufficient capacity error
- Need to re-launch the cluster when insufficient capacity error occurs
Spread
- Spreads instances across distinct hardware, and can span multiple AZs
- Only 7 instances per group per AZ
Partition
- Many instances can share a partition (a rack of hardware) and partitions are distributed across
AZs
- Only 7 partitions per AZ
Elastic Network Interface (ENI)
- One instance can have multiple ENIs attached with one primary private IPv4 and many
secondary private IPv4s
- ENIs are bound to specific AZs
- Public IPv4 is assigned to an ENI according to ip assign rule of the subnet that the ENI belongs
to
- One elastic IP address per one private IP
EC2 Instance Stages
- Stop
- Terminate
- Hibernate
Stop
- Data on non-root EBS volume are preserved
- All data on the attached instance-store devices will be lost
- Underlying host can be changed when restarted
- Elastic IP and ENIs are still attached
Terminate
If the EBS volume is set to be destroyed, all the data are lost
Hibernate
- Data and states on RAM are saved on EBS and restart from the saved state
- Instance ram size must be less than 150GB
- Root volume must be EBS and encrypted
- An instance cannot be hibernated for more than 60 days
- It is not possible to enable or disable hibernation for an instance
after it has been launched; it has to be configured at launch time
AMI
- AMIs can be accessed using:
- AWS public AMIs
- Custom made AMIs
- AMIs found/sold on AWS marketplace
- AMIs can be used to copy instances across AZs, Regions and Accounts
- AMI includes one or more snapshots, so if AMI is copied, snapshots are copied along with it
- Copying an AMI backed by an encrypted snapshot cannot result in an unencrypted target
snapshot
EC2 Enhanced Networking
- Elastic Network Adapter (ENA)
- Elastic Fabric Adapter (EFA)
ENA
- up to 100 Gbps
- can support windows instances
EFA
- Improved ENA for HPC
- only works for Linux
Automation and Orchestration
- AWS Batch
- AWS ParallelCluster
AWS Batch
- Managed service that helps you efficiently run batch processing jobs at scale
- AWS Batch handles the provisioning, scaling, and management of compute resources required
for batch jobs
AWS ParallelCluster
- Open-source cluster management tool provided by AWS that simplifies the deployment,
configuration, and management of high-performance computing (HPC)
clusters on the AWS Cloud
There is a vCPU-based On-Demand Instance limit per region
EC2 Billing
- Pending: will not be billed
- Running: will be billed
- Stopping: will not be billed
- Terminated: will not be billed
- Stopping (to hibernate): will be billed
- Terminated (reserved instance): will be billed
AWS Outposts
- Fully managed service that extends AWS infrastructure, services, APIs,
and tools to your on-premises data center or edge location
- Brings AWS infrastructure (hardware and software) to your physical data center or
on-premises environment
- Supports core AWS services like Amazon EC2, ECS/EKS, RDS, S3, and EBS locally
AWS Wavelength
- Brings AWS compute and storage services to the edge of
telecommunications (telco) 5G networks, enabling developers to build
applications that require ultra-low latency for end users and devices
- AWS Wavelength extends AWS infrastructure into Wavelength Zones, which
are zones within telco provider data centers connected to 5G networks
- Applications deployed in these zones process data close to users,
reducing the latency introduced by routing to traditional AWS regions
Access Control
IAM
- IAM users can be grouped into IAM groups
- Permission policies can be assigned to IAM groups
- Policies can also be attached directly to a user by means of an inline policy
- Least privilege permission
- One user can belong to multiple different groups, thus can have multiple permission policies
- Groups can only contain users (cannot contain other groups)
- Admin can set password policy for IAM users
- AWS cloudshell is not available in every region
- AWS services can do actions on behalf of user by being assigned IAM roles which include one
or more IAM policies
- Access is allowed only if explicit "Allow" permission is defined
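The allow rule above can be sketched as a small evaluator. This is a simplified model, not the real IAM engine: it looks only at `Effect` and `Action`, ignores resources and conditions, and assumes that an explicit "Deny" always overrides an "Allow". The sample policy is hypothetical.

```python
from fnmatch import fnmatchcase

def evaluate(statements, action):
    """Return 'Allow' or 'Deny' for one action, denying by default."""
    allowed = False
    for stmt in statements:
        # Action names are compared case-insensitively; '*' acts as a wildcard.
        if any(fnmatchcase(action.lower(), p.lower()) for p in stmt["Action"]):
            if stmt["Effect"] == "Deny":
                return "Deny"  # an explicit Deny wins over any Allow
            allowed = True
    # Without a matching Allow the request is implicitly denied.
    return "Allow" if allowed else "Deny"

# Hypothetical policy used only for illustration.
policy = [
    {"Effect": "Allow", "Action": ["s3:Get*", "s3:ListBucket"]},
    {"Effect": "Deny", "Action": ["s3:GetObjectAcl"]},
]

for name in ("s3:GetObject", "s3:GetObjectAcl", "ec2:RunInstances"):
    print(name, "->", evaluate(policy, name))
```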
MFA Options
- Authenticator apps
- Universal 2nd Factor (U2F) security key
- Hardware key fob MFA device
- Hardware key fob MFA device for AWS GovCloud
IAM security tools
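- IAM credentials report: lists the IAM users of the account and the status of their credentials
(account level)
- IAM Access Advisor: shows the service permissions granted to a user and when they were last
used (user level)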
AWS Organizations
- Allows to manage multiple AWS accounts
- The main account is the management account
- Other accounts are member accounts
- Member accounts can only be part of one organization
Organization Units (OUs)
- Accounts in the organization are organized into OUs
- OUs can be nested
Service Control Policy (SCP)
- Policies applied to OUs or accounts to restrict what their users and roles can do
- They do not apply to the management account (full admin power)
- They do not affect the service-linked roles
Resource-based Policy vs IAM Roles
- Some services provide resource-based policy but some only IAM role
- Cross-account resource access can be done either by account A assuming role in account B or
by defining resource-based policy for the resource in account B
- Trust policy is also a type of resource-based policy
AWS Services with Resource-based Policy
- Lambda
- SNS
- SQS
- S3
- API Gateway
- KMS
AWS Services with IAM Roles
- Kinesis streams
- ECS tasks
IAM Permission Boundaries
- Advanced feature to use a managed policy to set the maximum permissions an IAM entity can
get
- IAM Permission Boundaries are supported for users and roles only (not groups)
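One way to picture how a permissions boundary (and an SCP) caps an identity policy is as a set intersection. The sketch below assumes each policy is reduced to a plain set of allowed action names, with no wildcards, denies, or conditions. The function and parameter names are made up for illustration.

```python
def effective_permissions(identity_policy, permissions_boundary=None, scp=None):
    """Intersect the allowed-action sets; None means that limit is not in use."""
    allowed = set(identity_policy)
    for limit in (permissions_boundary, scp):
        if limit is not None:
            # An action survives only if every applicable limit also allows it.
            allowed &= set(limit)
    return allowed

identity = {"s3:GetObject", "s3:PutObject", "ec2:RunInstances"}
boundary = {"s3:GetObject", "s3:PutObject"}
scp = {"s3:GetObject", "ec2:RunInstances"}

print(sorted(effective_permissions(identity, boundary, scp)))
```

The boundary and the SCP never grant anything on their own; they can only remove actions from what the identity policy allows.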
IAM Identity Center
- One login (single sign-on) for all AWS accounts in AWS Organizations,
business applications, and third-party applications (e.g., Salesforce,
Office 365, etc.)
- Users and groups in Identity Center can be assigned permission sets,
which define what they can access in specific accounts or OUs
- Can manage users and groups directly within IAM Identity Center or
integrate with external identity providers like Microsoft Active
Directory, Okta, or Azure AD
AWS Control Tower
- Easy way to set up and govern a secure and compliant multi-account AWS environment based
on best practices
- AWS Control Tower uses AWS Organizations to create accounts
Preventive Guardrail
- using SCPs (e.g., Restrict Regions across all your accounts)
Detective Guardrail
- using AWS Config (e.g., identify untagged resources)
AWS Resource Access Manager (RAM)
- Share your AWS resources easily and securely with other AWS accounts
AWS Directory Service (Active Directory)
- AWS Managed Microsoft AD
- AD Connector
- Simple AD
AWS Managed Microsoft AD
- Create your own AD in AWS to manage users
- Establish "trust" connections with your on-premises AD
AD Connector
- Proxy that forwards requests to your on-premises AD
Simple AD
- AWS managed
- Cannot be joined with on-prem ADs
AWS Federated Access
Federated Access in AWS refers to the ability to grant users from external
identity providers (IdPs) access to AWS resources without having to
create and manage AWS-specific IAM (Identity and Access Management)
users for each individual
Types
- Federation with IAM Identity Center
- Federation with IAM
- Federation with Amazon Cognito identity pools
Federation with IAM Identity Center
- Users in IAM Identity Center are granted short-term credentials to your AWS resources
- IAM Identity Center supports identity federation with SAML (Security
Assertion Markup Language) 2.0 to provide federated single sign-on
access for users who are authorized to use applications within the AWS
access portal
- Users can then single sign-on into services that support SAML,
including the AWS Management Console and third-party applications, such
as Microsoft 365, SAP Concur, and Salesforce
Federation with IAM Role
- For single, standalone AWS account
- User Logs In to IdP
- IdP Sends Authentication Token to AWS
- AWS Grants Temporary Credentials through STS
- User Accesses AWS Services
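For the flow above, the IAM role needs a trust policy that lets the IdP's users assume it. The sketch below assumes a SAML 2.0 IdP (the notes do not say which federation type is meant). The account ID and provider name are placeholders.

```python
import json

ACCOUNT_ID = "111122223333"   # placeholder account ID
PROVIDER_NAME = "ExampleIdP"  # hypothetical SAML provider registered in IAM

# Trust policy attached to the role that federated users assume through STS.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {
            "Federated": f"arn:aws:iam::{ACCOUNT_ID}:saml-provider/{PROVIDER_NAME}"
        },
        "Action": "sts:AssumeRoleWithSAML",
        "Condition": {
            "StringEquals": {"SAML:aud": "https://signin.aws.amazon.com/saml"}
        },
    }],
}

print(json.dumps(trust_policy, indent=2))
```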
CDN
Cloudfront
- Cloudfront is a CDN service that caches content at points of presence (POPs, 216 currently)
- Cloudfront origin can be:
- S3
- EC2
- ALB
- any HTTP endpoint
- Cloudfront can do geo restriction to allow or block users from specific countries using allowlist
and blocklist
- For uploads to S3 of objects smaller than 1GB, Cloudfront (PUT/POST) in front of S3 is
recommended rather than S3 Transfer Acceleration
- Can use field level encryption to protect sensitive data for specific content
- Can route to multiple origins based on the content type
- Can use an origin group with primary and secondary origins to configure for high-availability
and failover
- Can generate Signed URL and Signed cookies
Price Classes
- Price Class All: all regions, most expensive
- Price Class 200: excludes the most expensive regions
- Price Class 100: only the least expensive regions
Cache Invalidation
When content is updated at the origin, the cached copies in cloudfront can be
invalidated so that the next user request is fetched from the updated content in
the origin instead of being served from the stale cache
CloudFront Functions
- Used to change Viewer requests and responses
- Sub-ms startup times, millions of requests/second
- Native feature of CloudFront (manage code entirely within CloudFront)
- javascript only
Lambda@Edge
- Scales to 1000s of requests/second
- Used to change CloudFront requests and responses
- Author your functions in one AWS Region, then CloudFront replicates to its locations
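As an illustration of changing a CloudFront response, here is a minimal Lambda@Edge handler in Python. It assumes an origin-response trigger. The header and its value are arbitrary examples, and the last lines only simulate an event locally.

```python
def lambda_handler(event, context):
    """Add a security header to the response before CloudFront caches it."""
    # CloudFront passes the response under Records[0].cf.response.
    response = event["Records"][0]["cf"]["response"]
    # Header names are lower-cased keys; each value is a list of key/value pairs.
    response["headers"]["strict-transport-security"] = [
        {"key": "Strict-Transport-Security", "value": "max-age=63072000"}
    ]
    return response

# Local simulation with a hand-built event (not a real CloudFront event).
sample_event = {"Records": [{"cf": {"response": {"status": "200", "headers": {}}}}]}
print(lambda_handler(sample_event, None))
```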
Origin Access Identity (OAI)
- Serves as the identity of a cloudfront distribution
- Origins can use this OAI of the cloudfront distribution in their access control policies to give
access to the distribution
- Cannot set OAI if the S3 bucket is configured as a website endpoint
Origin Access Control (OAC)
- A more preferred way (compared with OAI) to restrict access to an Amazon S3 origin
- Enables CloudFront customers to easily secure their Amazon S3 Origins
by permitting only designated CloudFront distributions to access their
Amazon S3 buckets
DDoS Mitigation
- AWS services that operate at edge locations, such as Amazon CloudFront, AWS
Global Accelerator, and Amazon Route 53, can be used to mitigate DDoS
attacks
Global Accelerator
- 2 anycast IPs are created
- anycast IPs send the traffic to the edge locations and edge locations send the traffic to the
application endpoint
- Uses internal AWS network
- Can be used to distribute a portion of traffic to a particular deployment using endpoint weights
- Good for gaming, IoT or voice over IP services
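The endpoint weights mentioned above split traffic in proportion to each weight over the sum of all weights. A minimal sketch of that arithmetic follows. The endpoint names are hypothetical.

```python
def traffic_share(weights):
    """Map each endpoint to its fraction of traffic: weight / sum of weights."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one endpoint needs a non-zero weight")
    return {name: weight / total for name, weight in weights.items()}

# Hypothetical blue/green deployment: send 10% of traffic to the new version.
print(traffic_share({"blue": 90, "green": 10}))
```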
Cloudfront vs Global Accelerator
- Cloudfront caches content at the edge locations and serves it from there
- global accelerator uses TCP or UDP to route traffic through the edge location to the
application
- global accelerator doesn’t have cache service like cloudfront
- both have DDoS protection using AWS shield
Storage
S3
- max size of an object is 5TB
- if an object is more than 5GB, have to use multi-part upload
- blocking public access setting can be set at account level
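The multi-part upload rule above comes with a few more limits, taken here as assumptions from the S3 documentation rather than from these notes: at most 10,000 parts, and parts between 5 MiB and 5 GiB (the last part may be smaller). A sketch of picking a part size under those limits:

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB           # minimum part size (last part may be smaller)
MAX_PART = 5 * GIB           # maximum part size
MAX_PARTS = 10_000           # maximum number of parts per upload
MAX_OBJECT = 5 * 1024 ** 4   # maximum object size (5 TiB)

def ceil_div(a, b):
    """Integer ceiling division without floating point."""
    return -(-a // b)

def plan_multipart(object_size, preferred_part=100 * MIB):
    """Return (part_size, part_count) in bytes/parts for one object."""
    if not 0 < object_size <= MAX_OBJECT:
        raise ValueError("object size must be between 1 byte and 5 TiB")
    # Grow the part size if the preferred size would need more than 10,000 parts.
    part_size = max(preferred_part, MIN_PART, ceil_div(object_size, MAX_PARTS))
    part_size = min(part_size, MAX_PART)
    return part_size, ceil_div(object_size, part_size)

print(plan_multipart(10 * GIB))
```

The preferred part size of 100 MiB is an arbitrary default chosen for the example.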
Versioning
- if versioning is enabled for a bucket, previous versions of the object are preserved when
overwritten
- if an object is deleted, it is not truly deleted but marked with the
delete marker and then previous versions can be restored by deleting the
delete marker
- Once versioning is enabled for a bucket, it cannot be disabled, can only be suspended
Replication
- replication is done by creating replication rule at the source s3 bucket
- both source and destination bucket have to enable bucket versioning
- only new objects are replicated
- have to use S3 Batch Replication to replicate existing objects and objects that failed replication
- can replicate buckets in different regions
Storage Classes
- standard
- standard IA: good for once a month access
- one-zone IA: good for once a month access
- glacier instant retrieval: millisecond retrieval, good for data accessed once a quarter, min
storage duration of 90 days
- glacier flexible retrieval: expedited (1-5 mins), standard (3-5 hrs) or bulk (5-12 hrs)
retrieval, min storage duration of 90 days
- glacier deep archive: standard (12 hrs) or bulk (48 hrs) retrieval, min storage duration of
180 days
- intelligent tiering moves objects between these access tiers:
- frequent access
- infrequent access: objects not accessed for 30 days
- archive instant access: objects not accessed for 90 days
- archive access (optional): configurable from 90 to 700+ days
- deep archive access (optional): configurable from 180 to 700+ days
Provisioned Capacity (for Glacier Flexible Expedited Retrieval)
- ensures that your retrieval capacity for expedited retrievals is available when you need it
- each unit of capacity ensures that at least three expedited retrievals can
be performed every five minutes and provides up to 150 MB/s of retrieval
throughput
Lifecycle Rules / Lifecycle Policies
- Transition rule: to move objects from one class to another
- Expiration rule: to delete expired objects
- Object level rules
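A transition rule and an expiration rule can live in the same lifecycle rule. The dictionary below follows the shape of the S3 lifecycle configuration API as I understand it. The rule ID, prefix, and day counts are made up, and the helper is only a sanity check, not part of any AWS SDK.

```python
lifecycle_configuration = {
    "Rules": [{
        "ID": "archive-then-expire-logs",   # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": "logs/"},      # hypothetical prefix
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER"},        # flexible retrieval
            {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
        ],
        "Expiration": {"Days": 365},
    }]
}

def transitions_are_ordered(rule):
    """Check that transitions happen in increasing order and before expiration."""
    days = [t["Days"] for t in rule["Transitions"]]
    return days == sorted(days) and days[-1] < rule["Expiration"]["Days"]

print(transitions_are_ordered(lifecycle_configuration["Rules"][0]))
```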
Requester Pays
- requester of the object pays for the request and the data download costs
- requester has to be an authenticated IAM user of an AWS account (anonymous access is not
allowed)
- After a bucket is configured to be a Requester Pays bucket, requesters must include
x-amz-request-payer in their API request header (for DELETE, GET, HEAD, POST, and PUT
requests) or as a parameter in a REST request, to show that they
understand that they will be charged for the request and the data
download
Event Notifications
- send messages/events to SNS, SQS (only standard queue) or a Lambda
function when an object action is triggered (e.g. ObjectCreated:Put,
ObjectCreated:Post, …)
- receiving services have to allow S3 through their own resource-based policy (SNS topic policy,
SQS queue policy, Lambda function policy) to receive event notifications from s3
Performance
- each s3 prefix can achieve 3500 put/copy/post/delete requests/sec and 5500 get/head
requests/sec
- if objects are distributed across 4 prefix, user can have 22000
get/head requests/sec and 14000 put/copy/post/delete requests/sec
- how to further optimize s3 performance:
- multi-part upload
- s3 transfer acceleration
- s3 byte range fetches
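The per-prefix numbers above scale linearly with the number of prefixes, which is easy to check with a few lines of arithmetic. The function names are made up for this sketch.

```python
PUT_PER_PREFIX = 3500   # put/copy/post/delete requests per second per prefix
GET_PER_PREFIX = 5500   # get/head requests per second per prefix

def max_request_rates(prefix_count):
    """Aggregate request rates when load is spread evenly over the prefixes."""
    return {
        "write": PUT_PER_PREFIX * prefix_count,
        "read": GET_PER_PREFIX * prefix_count,
    }

def prefixes_needed(target_get_rps=0, target_put_rps=0):
    """Smallest number of prefixes that covers both target rates."""
    reads = -(-target_get_rps // GET_PER_PREFIX)    # ceiling division
    writes = -(-target_put_rps // PUT_PER_PREFIX)
    return max(1, reads, writes)

print(max_request_rates(4))
print(prefixes_needed(target_get_rps=30000))
```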
Batch Operations
- to perform bulk operations on existing s3 objects with a single request
- to build the list of objects:
- use s3 inventory
- filter it using s3 select
- then run s3 batch operations on the result
Encryption
- Server side encryption (SSE)
- SSE-S3: encrypt with aws managed key
- SSE-KMS: encrypt with KMS key
- SSE-C: encrypt with customer provided key
- Client side encryption (CSE)
CORS
Must be configured on the bucket when a web page served from a different origin requests its
objects from a browser
MFA Delete
Only the bucket owner (root account) can enable/disable MFA delete on an S3 bucket
Access Logs
- To capture detailed records of requests made to the S3 bucket
- Provide insights into who accessed the bucket, from where, and how they interacted with the
objects
Presigned URLs
Time-limited URL that grants temporary access to an S3 object
Glacier Vault Lock
- write once read many model
- glacier vault lock has policy and that policy cannot be changed after set once
- if an object is moved to glacier vault, it cannot be deleted anymore
S3 Object Lock
- write once read many model
- bucket versioning must be enabled
- block an object version deletion for a period of time
Retention Modes
- compliance - no one can delete the object or change the retention policy
- governance - some (admin) users can delete the object or change the retention policy
Legal Hold
- protect the object indefinitely
- independent from retention period
- legal hold can be placed and removed on an object by using s3:PutObjectLegalHold IAM
permission
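The interaction between retention modes and legal hold can be summarized as a small decision function. This is a simplified model of the rules above: the mode names are spelled as in the S3 API, and `can_bypass_governance` stands for holding the s3:BypassGovernanceRetention permission, which is my assumption of what "some (admin) users" means.

```python
from datetime import datetime, timezone

def can_delete_version(mode, retain_until, legal_hold, now, can_bypass_governance=False):
    """Decide whether a locked object version may be deleted at time `now`."""
    if legal_hold:
        return False   # legal hold blocks deletion regardless of retention
    if mode is None or retain_until is None or now >= retain_until:
        return True    # no retention set, or the retention period has ended
    if mode == "GOVERNANCE":
        return can_bypass_governance
    return False       # COMPLIANCE: nobody can delete before retain_until

now = datetime(2025, 1, 1, tzinfo=timezone.utc)
retain_until = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(can_delete_version("GOVERNANCE", retain_until, False, now, can_bypass_governance=True))
print(can_delete_version("COMPLIANCE", retain_until, False, now, can_bypass_governance=True))
```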
S3 Access Points
- each access point is attached to a single bucket
- each access point has its own DNS name
- network origin can be the internet or a VPC
- each access point can have its own policy, so the bucket policy can stay simple
S3 Object Lambda Access Points
- Object Lambda access points let users retrieve a modified version of an s3 object: the
access point invokes a lambda function that reads the original s3 object and
modifies it before it is returned to the requester
AWS Snow Family
- snowcone and snowball edge are devices used for offline data migration
- order a snowcone or snowball edge device from AWS, load the device
with data and ship it back to AWS; AWS then transfers the data
from the device to s3 buckets
- snowcone holds 8TB (hdd) or 14TB (ssd), migration size up to terabytes
- snowball edge holds 80TB - 210TB, migration size up to petabytes
- snowball edge supports storage clustering
- can do edge computing on snow devices by running lambda functions or ec2 instances at the
edge
- snowcone has 2 cpus and 4gb of ram
- snowball edge on the other hand comes in compute-optimized and storage-optimized variants
- snowball cannot transfer the data directly to s3 glacier
- snowmobile is used to move petabytes to exabytes of data, transfer data with container-sized
trucks
AWS FSx
- fully-managed high performance file systems on AWS
Types
- FSx for Lustre
- FSx for Windows file server
- FSx for NetApp ONTAP
- FSx for openZFS
AWS Storage Gateway
Bridge between on-premises data and cloud data
Types
- s3 file gateway
- FSx file gateway
- Volume gateway (cached or stored)
- Tape gateway
Volume Gateway Cached Mode
Only subset of data is stored in on-premise volume gateway
Volume Gateway Stored Mode
Full and redundant data is stored in on-premise volume gateway
AWS Transfer Family
A fully-managed service for file transfers into and out of Amazon S3 or Amazon EFS using the
FTP family of protocols
Supported Protocols
- AWS Transfer for FTP (File Transfer Protocol)
- AWS Transfer for FTPS (File Transfer Protocol over SSL)
- AWS Transfer for SFTP (Secure File Transfer Protocol)
AWS DataSync
- Move large amounts of data to and from AWS (transfer tasks can be scheduled; an agent is used
for sources outside AWS)
- On-premise/Other clouds to AWS
- AWS to AWS
- Only AWS data transfer service that can directly transfer the data to S3 Glacier
Supported Storage Services
- S3
- S3 Glacier
- EFS
- FSx
Machine Learning
Rekognition
- for computer vision (CV)
- labeling
- content moderation
- Face Detection and Analysis (gender, age range, emotions...)
- Face Search and Verification
- Celebrity Recognition
- Pathing (ex: for sports game analysis)
Amazon Transcribe
- Speech to text
Features
- Automatically remove Personally Identifiable Information (PII) using Redaction
- Automatic Language Identification for multi-lingual audio
Amazon Polly
- Text to speech
Features
- Lexicon upload for acronyms and stylized words
- Speech customization with Speech Synthesis Markup Language (SSML)
Amazon Translate
- Language translation
Amazon Lex
- Chatbots
- Call center bots
- Natural Language Understanding to recognize the intent of text, callers
Amazon Connect
- Cloud contact center
Amazon Comprehend
- Fully managed NLP service
Amazon Comprehend Medical
- Uses NLP to detect Protected Health Information (PHI)
Amazon SageMaker
- Fully managed service for developers / data scientists to label data, build and deploy ML
models
Amazon Forecast
- For timeseries analysis
Amazon Kendra
- Fully managed document search service powered by Machine Learning
- Sources can be text, pdf, HTML, PowerPoint, MS Word, databases
Amazon Personalize
- Recommendation system service
Amazon Textract
- For optical character recognition (OCR) and information extraction (IE)
Application Integration/Messaging
SQS
Producer/Consumer Model
Standard Queue
- Unlimited throughput, unlimited number of messages in queue
- Default retention of messages: 4 days, maximum of 14 days
- Low latency (<10 ms on publish and receive)
- Limitation of 256KB per message sent
- Can have duplicate messages
- Can have out of order messages
- Default visibility timeout of 30 sec
- Cannot set priority value to each message
FIFO Queue
- Limited throughput: 300 msg/s without batching, 3000 msg/s with batching
- Exactly-once send capability (by removing duplicates)
- Messages are processed in order by the consumer
- Deduplication ID removes duplicate sends (exactly-once), and message group ID keeps the
messages of one group in order
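How the two IDs work together can be shown with a toy in-memory queue. It is not the SQS API. The class and method names are invented, and the 5-minute deduplication interval is stated from memory of the SQS documentation rather than from these notes.

```python
from collections import deque

DEDUP_WINDOW_SECONDS = 300  # assumed SQS FIFO deduplication interval (5 minutes)

class ToyFifoQueue:
    """In-memory stand-in for a FIFO queue, for illustration only."""

    def __init__(self):
        self._groups = {}  # message group ID -> bodies in arrival order
        self._seen = {}    # deduplication ID -> time it was first accepted

    def send(self, body, group_id, dedup_id, now):
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False   # duplicate inside the window is dropped
        self._seen[dedup_id] = now
        self._groups.setdefault(group_id, deque()).append(body)
        return True

    def receive(self, group_id):
        queue = self._groups.get(group_id)
        return queue.popleft() if queue else None

q = ToyFifoQueue()
print(q.send("order-1", "customer-a", "id-1", now=0))   # accepted
print(q.send("order-1", "customer-a", "id-1", now=10))  # retry, dropped as duplicate
print(q.receive("customer-a"))
```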
Encryption
- In-flight encryption using HTTPS API
- At-rest encryption using KMS keys
- Client-side encryption if the client wants to perform encryption/decryption itself
Access Policy
Similar to s3 bucket policy to control the access to the queue
Long Polling
- When a consumer requests messages from the queue, it can optionally
“wait” for messages to arrive if there are none in the queue
- The wait time can be between 1 sec and 20 sec (20 sec preferable)
- Can configure by setting ReceiveMessageWaitTimeSeconds to a number greater than zero
Dead Letter Queues
Dead-letter queues can be used by other queues (source queues) as a target for
messages that can't be processed (consumed) successfully
Delay Queue
- Delay queues let you postpone the delivery of new messages to a queue for several seconds
- The default (minimum) delay for a queue is 0 sec
- The maximum is 15 minutes
SNS
Pub/Sub Model
Topics
- Publisher pushes events to a topic and each subscriber to the topic will get all the events
- Up to 12,500,000 subscriptions per topic
- 100,000 topics limit
FIFO SNS
- Similar features as SQS FIFO
- Can have SQS Standard and FIFO queues as subscribers
- same throughput as SQS FIFO
Encryption
- In-flight encryption using HTTPS API
- At-rest encryption using KMS keys
- Client-side encryption if the client wants to perform encryption/decryption itself
Access Policy
Similar to s3 bucket policy to control the access to the topic
Message Filtering
- JSON policy used to filter messages sent to SNS topic’s subscriptions
- If a subscription doesn’t have a filter policy, it receives every message
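A filter policy can be pictured as a set of allowed values per message attribute. The matcher below covers only exact string matching, which is a deliberate simplification. The attribute names and values are hypothetical.

```python
def matches(filter_policy, message_attributes):
    """True if every policy key is present with one of its allowed values."""
    return all(
        message_attributes.get(name) in allowed_values
        for name, allowed_values in filter_policy.items()
    )

# Hypothetical subscription that only wants order events from two stores.
policy = {"event_type": ["order_placed"], "store": ["berlin", "paris"]}

print(matches(policy, {"event_type": "order_placed", "store": "paris"}))
print(matches(policy, {"event_type": "order_cancelled", "store": "paris"}))
print(matches({}, {"event_type": "anything"}))  # no policy: everything is delivered
```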
Fan-out (SNS+SQS)
- Push once in SNS, receive in all SQS queues that are subscribers
- Cross-Region Delivery: works with SQS Queues in other regions
Kinesis
Producer/Consumer Model
Kinesis Data Streams
- Streaming service for ingest at scale
- each record contains a partition key and a data blob
- records with the same partition key always go into the same shard
- Once data is inserted in Kinesis, it can’t be deleted (immutability)
- Ability to reprocess (replay) data
- Retention between 1 day to 365 days, default of 1 day
- In provisioned mode, shards have to be provisioned in advance (no automatic scaling)
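Why the same partition key always lands in the same shard: the key is hashed with MD5 into a 128-bit number, and each shard owns a range of that number space. The sketch assumes the shards split the range evenly and in order, which is only true for a stream that has not been resharded unevenly.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key

def shard_for(partition_key, shard_count):
    """Index of the shard whose hash key range contains this partition key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    # With evenly split ranges, scaling the hash key gives the shard index.
    return hash_key * shard_count // HASH_SPACE

for key in ("user-1", "user-2", "user-1"):
    print(key, "-> shard", shard_for(key, 4))
```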
Capacity Modes
- Provisioned Mode
- On-demand Mode
Provisioned Mode
- choose the number of shards provisioned, scale manually or using API
- Each shard gets 1MB/s in (or 1000 records per second)
- Each shard gets 2MB/s out (classic or enhanced fan-out consumer)
- Pay per shard provisioned per hour
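The per-shard limits above translate directly into a sizing formula: take the largest of the three requirements. A small sketch with made-up workload numbers:

```python
import math

WRITE_MB_PER_SHARD = 1       # MB/s in per shard
RECORDS_PER_SHARD = 1000     # records/s in per shard
READ_MB_PER_SHARD = 2        # MB/s out per shard

def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(records_per_s / RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
    )

# 5 MB/s in, 3000 records/s, 8 MB/s out: the write bandwidth is the bottleneck.
print(shards_needed(5, 3000, 8))
```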
On-demand Mode
- Default capacity provisioned (4 MB/s in or 4000 records per second)
- Scales automatically based on observed throughput peak during the last 30 days
- Pay per stream per hour & data in/out per GB
Security
- In-flight encryption using HTTPS API
- At-rest encryption using KMS keys
- Client-side encryption if the client wants to perform encryption/decryption itself
- VPC Endpoints available for Kinesis to access from within the VPC
Enhanced Fan-out
- Standard: 2MB/s per shard (shared between multiple consumers)
- Enhanced fan-out: 2MB/s per shard per consumer
Kinesis Data Firehose
- Load streaming data into S3 / Redshift / OpenSearch / 3rd party / custom HTTP
- Fully Managed Service, no administration, automatic scaling, serverless
- Pay for data going through Firehose
- Near real-time
- Supports custom data transformations using AWS Lambda
- Doesn't guarantee the order of message delivery and processing
Kinesis Data Analytics
- Real-time analytics on Kinesis Data Streams & Firehose using SQL
- Add reference data from Amazon S3 to enrich streaming data
- Fully managed, no servers to provision
- Automatic scaling
Kinesis Video Streams
EventBridge
- Trigger AWS services based on events sent by other AWS services or 3rd party integrations
- Can archive and replay events for debugging purposes
Trigger Types
- Schedule
- Event Patterns
Schedule
Cron jobs (scheduled scripts)
Event Patterns
Event rules to react to a service doing something
Event Buses
- Default event bus (AWS services)
- Partner event bus (3rd parties)
- Custom event bus
Schema Registry
- The Schema Registry allows you to generate code for your application,
that will know in advance how data is structured in the event bus
- Schema can be versioned
Resource-based Policy
- Manage permissions for a specific Event Bus
- Allow/deny events from another AWS account or AWS region
- Aggregate all events from your AWS Organization in a single AWS account or AWS region
Amazon MQ
Managed message broker service for applications that use on-premises message broker protocols
such as: MQTT, AMQP, STOMP, OpenWire, WSS
Amazon Simple Workflow Service (SWF)
Amazon SWF is a web service that makes it easy to coordinate work across distributed
application components
AWS AppFlow
- To transfer and integrate data between AWS services and external SaaS platforms
- Keeping SaaS data synchronized with AWS resources
AWS AppSync
- A managed service for building real-time GraphQL APIs to power data-driven applications
- Simplifies building GraphQL APIs for querying, mutating, and subscribing to data
- Allows combining multiple data sources (e.g., DynamoDB, RDS, Lambda) into a single unified
API
Security
Encryption
- In-flight encryption
- Server-side encryption
- Client-side encryption
In-flight encryption
- Data is encrypted before sending and decrypted after receiving
- TLS certificate is used in HTTPS
Server-side encryption
- Data is encrypted after receiving by server and decrypted before sending to the client
Client-side encryption
- Data is encrypted by the client and never decrypted by the server
KMS
- Fully integrated with IAM for authorization
- Able to audit KMS Key usage using CloudTrail
- KMS Key Encryption also available through API calls (SDK, CLI)
- Have to pay for API call to KMS ($0.03 / 10,000 calls)
- A KMS key scheduled for deletion stays in the 'pending deletion' state for 7–30 days (default
30 days), during which the deletion can be cancelled and the key recovered
Asymmetric vs Symmetric Keys
- Symmetric Keys (AES-256)
- Asymmetric Keys (RSA & ECC key pair)
Symmetric Keys
- Single key for both encryption and decryption
- AWS services integrated with KMS use symmetric keys
- Never get access to the KMS Key unencrypted (must call KMS API to use)
Asymmetric Keys
- Public (Encrypt) and Private Key (Decrypt) pair
- The public key is downloadable, but the Private Key can't be accessed unencrypted
KMS Key Types
- AWS Owned Keys (free): SSE-S3, SSE-SQS, SSE-DDB (default key)
- AWS Managed Keys (free): (aws/service-name, example: aws/rds or aws/ebs)
- Customer managed keys created in KMS: $1 / month
- Customer managed keys imported: $1 / month
Automatic Key Rotation
- AWS-managed KMS Key: automatic every 1 year
- Customer-managed KMS Key: automatic (must be enabled) or on-demand
- Imported KMS Key: only manual rotation possible using alias
Key Policies
- Control access to KMS keys, “similar” to S3 bucket policies
Default Key Policy
- Created if you don’t provide a specific KMS Key Policy
- Complete access to the key to the root user
Custom Key Policy
- Define users, roles that can access the KMS key
- Define who can administer the key
Multi-region Keys
- A multi-region key (MRK) has a primary key in one region and replica keys, with the same key
ID and key material, in other regions
- Data encrypted in one region can be decrypted in a different region
- For the use cases of global client-side encryption, like client-side encryption for
global dynamodb tables and global aurora databases
Replicating encrypted S3 objects
- For objects encrypted with SSE-KMS:
- Specify which KMS Key to encrypt the objects within the target bucket
- Adapt the KMS Key Policy for the target key
- An IAM Role with kms:Decrypt for the source KMS Key and kms:Encrypt for the target KMS
Key
- You might get KMS throttling errors, in which case you can ask for a Service Quotas increase
AWS CloudHSM
- Fully managed service that provides customers with dedicated hardware
security modules to securely generate and use encryption keys
- AWS takes care of hardware maintenance, updates, and availability
- Customer retains full control over the cryptographic key management and security
configurations
AWS Systems Manager (SSM) Parameter Store
- Secure storage for configuration and secrets
- Optional Encryption using KMS
- Parameters can be stored in hierarchies
Tiers
- Standard
- Advanced
Parameter Policies
Allow assigning a TTL (expiration date) to a parameter to force updating or deleting sensitive
data such as passwords
AWS Secrets Manager
- Secure storage of secrets
- Capability to force rotation of secrets every X days
- Automate generation of secrets on rotation (uses Lambda)
- Integration with database services like RDS, Aurora, Redshift, DocumentDB
- Secrets are encrypted using KMS
Multi-region secrets
- Replicate Secrets across multiple AWS Regions
- Secrets Manager keeps read replicas in sync with the primary Secret
- Ability to promote a read replica Secret to a standalone Secret
AWS Certificate Manager
- Easily provision, manage, and deploy TLS Certificates
- Supports both public and private TLS certificates
- Free of charge for public TLS certificates
- Can generate certificates too
- Certificates generated with ACM are automatically renewed
Integrations
- ELB
- Cloudfront distributions
- APIs on API Gateway
- Cannot use from EC2
API Gateway
- Edge-optimized
- Regional
- Private (cannot use ACM)
Edge-optimized
- ACM is integrated with the CloudFront distribution
- The TLS Certificate must be in the same region as CloudFront (us-east-1)
Regional
- The TLS Certificate must be imported on API Gateway, in the same region as the API Stage
Web Application Firewall (WAF)
- Protects your web applications from common web exploits (Layer 7)
Integrations
- ALB
- API Gateway
- Cloudfront
- AppSync GraphQL API
- Cognito User Pool
Web Access Control List (Web ACL)
- IP Set: up to 10,000 IP addresses
- HTTP headers, HTTP body, or URI strings: protects from common attacks such as SQL injection
and Cross-Site Scripting (XSS)
- Size constraints
- geo-match (block countries)
- Rate-based rules (to count occurrences of events) – for DDoS protection
- Web ACL are Regional except for CloudFront
- A rule group is a reusable set of rules that can be added to a web ACL
AWS Shield
Protect from DDoS Attacks
Modes
- Standard
- Advanced
Standard
- Free service that is activated for every AWS customer
- Provides protection from attacks such as SYN/UDP Floods, Reflection attacks and other
layer3/4 attacks
Advanced
- Optional DDoS mitigation service
- $3,000 per month per organization
- 24/7 access to AWS DDoS response team (DRT)
- Shield Advanced automatic application layer DDoS mitigation
automatically creates, evaluates and deploys AWS WAF rules to mitigate
layer 7 attacks
Supported Services
- EC2
- ELB
- CloudFront
- Global Accelerator
- Route 53
- Elastic IP
AWS Network Firewall
- Detail in VPC section
AWS Firewall Manager
- Manage firewall rules in all accounts of an AWS Organization
- Rules are applied to new resources as they are created (good for
compliance) across all and future accounts in your Organization
Security Policies
- common set of security rules
- WAF rules (ALB, API Gateways, CloudFront)
- AWS Shield Advanced (ALB, CLB, NLB, Elastic IP, CloudFront)
- Security Groups for EC2, ALB and ENI resources in VPC
- AWS Network Firewall (VPC Level)
- Route 53 Resolver DNS Firewall
- Policies are created at the region level
AWS GuardDuty
- Managed threat detection service
- Analyze threat from input data like CloudTrail events, VPC flow logs, etc
- Notify the findings through EventBridge
Foundational Data Sources
- CloudTrail Events Logs
- VPC Flow Logs
- DNS Logs
Other Data Sources
- S3 data event logs
- EKS audit logs
- Lambda network activity logs
- RDS login activity logs
- EBS volume data
AWS Inspector
- Automated Security Assessments for:
- EC2
- Container Images pushed to Amazon ECR
- Lambda Functions
- Reporting & integration with AWS Security Hub
- Send findings to Amazon EventBridge
EC2
- Leveraging the AWS Systems Manager (SSM) agent
- Analyze against unintended network accessibility
- Analyze the running OS against known vulnerabilities
Container Images pushed to Amazon ECR
- Assessment of Container Images as they are pushed
Lambda Functions
- Identifies software vulnerabilities in function code and package dependencies
- Assessment of functions as they are deployed
AWS Macie
Find sensitive Personally Identifiable Information (PII) in data stored on S3
AWS Artifact
To view, assess and manage the security reports as well as other AWS compliance-related
information
AWS Security Hub
- Security service that provides a comprehensive view of your security posture across AWS
accounts
- Security Hub collects and aggregates security findings from multiple
AWS services such as Amazon GuardDuty, Amazon Macie, Amazon Inspector,
and AWS Config, as well as from third-party security solutions
AWS Security Token Service (STS)
- Service that you can use to create and provide trusted users with
temporary security credentials that can control access to your AWS
resources
- Temporary security credentials work almost identically to the long-term access key credentials
that your IAM users can use
VPC
Default VPC
- Default VPC has Internet connectivity through internet gateway and all EC2 instances inside it
have public IPv4 addresses
Own VPC
- Can create max 5 per region (but soft limit)
- Max CIDR per VPC is 5
CIDR size
Min: /28 (16 IP addresses)
Max: /16 (65536 IP addresses)
Allowed CIDR ranges (private)
- 10.0.0.0 – 10.255.255.255 (10.0.0.0/8)
- 172.16.0.0 – 172.31.255.255 (172.16.0.0/12)
- 192.168.0.0 – 192.168.255.255 (192.168.0.0/16)
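The size and range rules above can be checked with Python's standard `ipaddress` module. The function name and return shape are choices made for this sketch.

```python
import ipaddress

PRIVATE_BLOCKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def check_vpc_cidr(cidr):
    """Return (size_ok, is_private, address_count) for a candidate VPC CIDR."""
    net = ipaddress.ip_network(cidr)  # raises ValueError if host bits are set
    size_ok = 16 <= net.prefixlen <= 28  # between /16 and /28
    is_private = any(net.subnet_of(block) for block in PRIVATE_BLOCKS)
    return size_ok, is_private, net.num_addresses

for candidate in ("10.0.0.0/16", "192.168.1.0/28", "10.0.0.0/8"):
    print(candidate, check_vpc_cidr(candidate))
```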
Subnets
- AWS reserves 5 IP addresses (first 4 & last 1) in each subnet
- x.x.x.0 – Network Address
- x.x.x.1 – reserved by AWS for the VPC router
- x.x.x.2 – reserved by AWS for mapping to Amazon-provided DNS
- x.x.x.3 – reserved by AWS for future use
- x.x.x.255 – Network Broadcast Address. AWS does not support broadcast in a VPC, therefore
the address is reserved
- Each subnet maps to single AZ
- Every subnet created is automatically associated with the main route table for the VPC.
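The five reserved addresses can be listed for any subnet with the `ipaddress` module, which also gives the number of addresses left for instances. The subnet CIDR below is just an example.

```python
import ipaddress

def subnet_summary(cidr):
    """Return the reserved addresses of a subnet and its usable address count."""
    net = ipaddress.ip_network(cidr)
    reserved = {
        "network": str(net[0]),
        "vpc_router": str(net[1]),
        "dns": str(net[2]),
        "future_use": str(net[3]),
        "broadcast": str(net[-1]),
    }
    # AWS keeps the first four addresses and the last one.
    return reserved, net.num_addresses - 5

reserved, usable = subnet_summary("10.0.1.0/24")
print(reserved)
print("usable addresses:", usable)
```

A /28 subnet therefore leaves 11 usable addresses, which matters when sizing subnets for load balancers or many small instances.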
IPv6-only Subnet
- Can only support Nitro instances
Internet Gateway
- Allows resources (e.g. EC2 instances) in a VPC to connect to the Internet
- It scales horizontally and is highly available and redundant
- Must be created separately from a VPC and then attached to a VPC
- Subnet route tables must be configured to route the traffic to internet gateway to access the
internet
- Subnet becomes public subnet when it is connected to and routed through an internet
gateway
Bastion Host
- BH is an instance in a public subnet which has access to other instances in the private subnet
- Lets you ssh into private instances via the BH
- SG of the BH has to allow port 22 from the internet, and SG of private instances must allow ssh
from the SG of the bastion host
NAT Instance
- An instance in the public subnet through which the private instances can access the internet
- Must have Elastic IP attached to it
- Must disable EC2 setting: Source / destination Check
- An instance can be NAT instance by configuring using NAT AMIs
- Route tables of private subnets must be configured to route traffic from private subnets to the
NAT Instance
NAT Instance SG rules
- Inbound:
- Allow HTTP / HTTPS traffic coming from Private Subnets
- Allow SSH from source network (access is provided through Internet Gateway)
- Outbound:
- Allow HTTP / HTTPS traffic to the Internet
NAT Gateway
- AWS-managed NAT service (replaces a self-managed NAT instance)
- Higher bandwidth, high availability, no administration
- Pay per hour for usage and bandwidth
- NAT GW is AZ-bound
- Uses an Elastic IP
- Can’t be used by EC2 instance in the same subnet (only from other subnets)
- Private Subnet => NATGW => IGW
- 5 Gbps of bandwidth with automatic scaling up to 100 Gbps
SGs and NACLs
SGs
- Operates at instance level
- Stateful (always allow return traffic)
- Only support 'Allow' rules
- Evaluate all the rules before deciding to allow
- A newly created SG denies all inbound traffic and allows all outbound traffic
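To make the allow-only model concrete, here is a toy simulation of inbound SG evaluation, applied to the bastion host setup from earlier in these notes. The SecurityGroup class, its method names and the 203.0.113.0/24 admin range are all assumptions for illustration; this is not the AWS API:

```python
import ipaddress

class SecurityGroup:
    """Toy model: only 'allow' rules exist, anything unmatched is denied."""

    def __init__(self, name):
        self.name = name
        self.inbound = []  # list of (port, source); source is a CIDR or another SG

    def allow_inbound(self, port, source):
        self.inbound.append((port, source))

    def permits(self, port, src_ip, src_groups=()):
        src = ipaddress.ip_address(src_ip)
        # Every rule is considered; a single matching allow rule is enough.
        for rule_port, source in self.inbound:
            if rule_port != port:
                continue
            if isinstance(source, SecurityGroup):
                if any(source is group for group in src_groups):
                    return True
            elif src in ipaddress.ip_network(source):
                return True
        return False  # no deny rules needed: no match means denied

bastion_sg = SecurityGroup("bastion-sg")
bastion_sg.allow_inbound(22, "203.0.113.0/24")   # trusted admin range (example)

private_sg = SecurityGroup("private-sg")
private_sg.allow_inbound(22, bastion_sg)         # SSH only from the bastion's SG

print(bastion_sg.permits(22, "203.0.113.10"))                        # True
print(private_sg.permits(22, "10.0.0.5", src_groups=[bastion_sg]))   # True
print(private_sg.permits(22, "203.0.113.10"))                        # False
```

Because SGs are stateful, the return traffic of an allowed connection is not evaluated against the rules at all, which is why the sketch only models the inbound side.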
NACLs
- Operates at subnet level
- Stateless
- Supports both 'Allow' and 'Deny' rules
- One NACL per subnet, new subnets are assigned the Default NACL
- NACLs and subnets are decoupled and NACLs live in VPC
- Default NACL is "allow all"
- Newly created NACLs will deny everything (inbound or outbound)
- NACLs must be configured to allow inbound and outbound ephemeral ports because they are
stateless
NACL Rules
- Rules have a number (1-32766); a lower number means higher precedence
- First rule match will drive the decision
- The last rule is an asterisk (*) and denies a request in case of no rule match
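The rule ordering above can be sketched as a small evaluator. The tuple layout (rule number, port range, action) is an assumption made for this example and only looks at ports, not protocols or CIDRs:

```python
def evaluate_nacl(rules, port):
    """Return 'ALLOW' or 'DENY' for a port using first-match-wins ordering.

    rules: list of (rule_number, (low_port, high_port), action).
    """
    for _number, (low, high), action in sorted(rules, key=lambda rule: rule[0]):
        if low <= port <= high:
            return action  # first matching rule drives the decision
    return "DENY"          # the final '*' rule

rules = [
    (200, (22, 22), "ALLOW"),
    (100, (22, 22), "DENY"),         # lower number, so it is evaluated first
    (300, (1024, 65535), "ALLOW"),   # ephemeral ports for return traffic
]

print(evaluate_nacl(rules, 22))     # DENY  (rule 100 wins over rule 200)
print(evaluate_nacl(rules, 50000))  # ALLOW (rule 300)
print(evaluate_nacl(rules, 443))    # DENY  (no match, falls through to '*')
```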
VPC Peering
- Privately connect two VPCs using AWS network
- Peer VPCs must not have overlapping CIDRs
- VPC Peering connection is NOT transitive
- Route tables of the subnets in both VPCs must be updated to route traffic to the other VPC
through the peering connection
- Can create a VPC Peering connection between VPCs in different AWS accounts/regions
- Can reference a security group in a peered VPC (cross-account, but same region)
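Two of the peering constraints above are easy to check in code: CIDRs must not overlap, and peering is not transitive. A minimal sketch with hypothetical helper names and VPC labels:

```python
import ipaddress

def can_peer(cidr_a, cidr_b):
    """Peering requires non-overlapping CIDR ranges."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))

def can_communicate(peerings, vpc_a, vpc_b):
    """Only a direct peering connection counts; there is no transitive routing."""
    return (vpc_a, vpc_b) in peerings or (vpc_b, vpc_a) in peerings

print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # True
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # False (overlap)

peerings = {("A", "B"), ("B", "C")}
print(can_communicate(peerings, "A", "B"))  # True
print(can_communicate(peerings, "A", "C"))  # False: A-B and B-C do not give A-C
```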
VPC End Points
- VPC Endpoints (powered by AWS PrivateLink) let you connect to AWS
services over a private network instead of the public Internet
- Removes the need for an IGW, NATGW, ... to access AWS services
Types
- Interface Endpoint
- Gateway Endpoint
Interface Endpoint
- Provisions an ENI (private IP address) as an entry point (must attach a Security Group)
- Supports most AWS services
- $ per hour + $ per GB of data processed
- Can be used to connect to another VPC
- Uses AWS PrivateLink to connect the endpoint to services
Gateway Endpoint
- Provisions a gateway and must be used as a target in a route table (does not use security
groups)
- Free
- Supports S3 and DynamoDB
- If S3 or DynamoDB is not in the same region as the subnet, Gateway
Endpoint cannot be used since Gateway Endpoint is a regional service
(use NAT gateway or Interface Endpoint instead)
- Can attach an endpoint policy that controls access to the service you are connecting to
- Does not use AWS PrivateLink
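An endpoint policy is a JSON document in the usual IAM policy format. The sketch below builds one that only allows reads from a single bucket through the endpoint; the bucket name "example-bucket" and the chosen actions are placeholders, not a recommended policy:

```python
import json

# Hypothetical endpoint policy: restrict the endpoint to read-only access
# on one bucket. Bucket name and actions are illustrative assumptions.
endpoint_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
        }
    ],
}

policy_json = json.dumps(endpoint_policy, indent=2)
print(policy_json)
```

The endpoint policy does not replace IAM or bucket policies; a request has to be allowed by all of them.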
Flow Logs
- Capture information about IP traffic going to and from network interfaces (can be enabled at the VPC, subnet, or ENI level)
- Can query VPC flow logs using Athena on S3 or CloudWatch Logs Insights
Flow Logs data can go into:
- S3
- CloudWatch Logs
- Kinesis Data Firehose
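A flow log record in the default format is a single space-separated line. A small parsing sketch, assuming the default (version 2) field order; the sample record uses made-up account and interface IDs:

```python
# Field order of the default flow log format (version 2).
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

def parse_flow_log(line):
    """Split one default-format record into a dict keyed by field name."""
    return dict(zip(FIELDS, line.split()))

sample = ("2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
          "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")

record = parse_flow_log(sample)
print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"], record["action"])
```

Here protocol 6 is TCP and destination port 22 is SSH, so the record describes an accepted SSH connection.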
Site-to-site VPN Connection
- Connects a VPC with on-premises servers through an encrypted VPN connection over the public internet
- A Site-to-Site VPN connection can be used as a backup for a Dx connection
Needs 2 things:
- Virtual Private Gateway (VGW)
- Customer Gateway (CGW)
VGW
- VPN concentrator on the AWS side of the VPN connection
- VGW is created and attached to the VPC from which you want to create the Site-to-Site VPN
connection
- Need to enable Route Propagation for the VGW in the route table that is associated with the
subnets in the VPC
CGW
- Software application or physical device on the customer side of the VPN connection
- Needs a public, Internet-routable IP address for the Customer Gateway device
- If the CGW sits behind a NAT device (with NAT traversal enabled), use the public IP address of the NAT device
VPN CloudHub
- Provides secure communication between multiple sites when you have multiple VPN connections
- To set it up, connect multiple VPN connections to the same VGW, set up dynamic routing, and
configure route tables
Direct Connect (Dx)
- Provides a dedicated private connection from a remote network to your VPC
- A dedicated connection must be set up between the data center and an AWS Direct Connect
location
- Need to set up a VGW on the VPC side
- Lead times are often longer than 1 month to establish a new connection
Connection Flows
- Private VPC Connection
- Public Resources Connection
Private Connection Flow
VGW => Dx Connector in Dx locations => Customer router in Dx locations => Customer router in
customer network
Public Connection Flow
Public AWS resources (like S3) => Dx Connector in Dx locations => Customer router in Dx locations =>
Customer router in customer network
Direct Connect Gateway
- To set up Direct Connect to one or more VPCs in many
different regions (same account), you must use a Direct Connect Gateway
- Dx connection connects to Direct Connect Gateway and Direct Connect Gateway connects to
multiple VGWs
Connection Types
- Dedicated Connections
- Hosted Connections
Dedicated Connections
- 1 Gbps, 10 Gbps, and 100 Gbps capacity
- Physical ethernet port dedicated to a customer
- Request made to AWS first, then completed by "AWS Direct Connect Partners"
Hosted Connections
- 50 Mbps, 500 Mbps, up to 10 Gbps
- Connection requests are made via "AWS Direct Connect Partners"
- Capacity can be added or removed on demand
- 1, 2, 5, 10 Gbps available at select AWS Direct Connect Partners
Encryption
- Data in transit is not encrypted but is private
- AWS Direct Connect + VPN provides an IPsec-encrypted private connection
Resiliency
- High resiliency
- Max resiliency
High resiliency
- One connection at each of multiple Dx locations
Max resiliency
- Achieved by separate connections terminating on separate devices in
more than one Dx location
Transit Gateway
- Transit Gateway sits in the middle to connect multiple VPCs
transitively and can also connect to Dx Gateway and Site-to-site VPN
connections
- Regional resource
- Share cross-account using Resource Access Manager (RAM)
- Transit Gateways can be peered across regions
- Route Tables: limit which VPCs can talk to each other
- Supports IP Multicast
Site-to-site VPN ECMP (Equal Cost Multiple Paths)
- Routing strategy that forwards packets over multiple best paths
- Use case: create multiple Site-to-Site VPN connections to increase the bandwidth of your
connection to AWS
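The idea behind ECMP can be illustrated with a flow hash: each flow is pinned to one of several equal-cost tunnels, so total capacity grows with the number of tunnels while a single flow stays on one path. This is a conceptual sketch only; the hashing scheme, the tunnel names and the 1.25 Gbps per-tunnel figure are assumptions for the example, not how AWS implements it:

```python
import hashlib

def pick_tunnel(flow, tunnels):
    """Deterministically map a flow (5-tuple) to one of the equal-cost tunnels."""
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    return tunnels[int.from_bytes(digest[:4], "big") % len(tunnels)]

def aggregate_bandwidth_gbps(tunnel_count, per_tunnel_gbps=1.25):
    """Upper bound on combined throughput across all tunnels (assumed per-tunnel cap)."""
    return tunnel_count * per_tunnel_gbps

tunnels = ["vpn1-tunnel1", "vpn1-tunnel2", "vpn2-tunnel1", "vpn2-tunnel2"]
flow = ("10.0.1.5", "192.168.1.20", 44321, 443, "tcp")

print(pick_tunnel(flow, tunnels))         # the same flow always maps to the same tunnel
print(aggregate_bandwidth_gbps(len(tunnels)))  # 5.0
```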
VPC Traffic Mirroring
- Captures and mirrors network traffic and sends the copy to your own security appliances to
analyze, monitor, or troubleshoot
- Source and target can be in the same VPC or in different VPCs (VPC Peering)
Egress-only Internet Gateway
- Used for IPv6 only
- Similar to a NAT Gateway but for IPv6
- Must update the Route Tables
- Allows instances in your VPC to make outbound connections over IPv6 while
preventing the internet from initiating an IPv6 connection to your instances
AWS Network Firewall
- Protect entire VPC
- From Layer 3 to Layer 7 protection
- Internally uses AWS Gateway Load Balancer
- Rules can be centrally managed cross-account by AWS Firewall Manager to apply to many
VPCs
- Can send logs of rule matches to Amazon S3, CloudWatch Logs, Kinesis Data Firehose
Protected traffic directions
- VPC to VPC traffic
- Outbound to internet
- Inbound from internet
- To/from Direct Connect & Site-to-Site VPN
Fine-grained Controls
- IP & port - example: filtering tens of thousands of IPs
- Protocol – example: block the SMB protocol for outbound communications
- Stateful domain list rule groups: only allow outbound traffic to *.mycorp.com or a third-party
software repo
- General pattern matching using regex
- etc
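The domain allow-list idea can be sketched as a simple suffix match. This is a simplified illustration, not the Network Firewall rule engine; it assumes an entry with a leading dot (such as ".mycorp.com") stands for any subdomain and an entry without one is an exact match. The repo hostname is a placeholder:

```python
def domain_allowed(hostname, allow_list):
    """Return True when hostname matches an entry in the allow list."""
    hostname = hostname.lower().rstrip(".")
    for entry in allow_list:
        entry = entry.lower()
        if entry.startswith("."):
            # Leading dot: match any subdomain of the entry.
            if hostname.endswith(entry):
                return True
        elif hostname == entry:
            return True
    return False  # anything not on the list is blocked

allow_list = [".mycorp.com", "repo.example.org"]

print(domain_allowed("api.mycorp.com", allow_list))    # True
print(domain_allowed("repo.example.org", allow_list))  # True
print(domain_allowed("notmycorp.com", allow_list))     # False
```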
Cost
Cost Explorer
- Visualize, understand, and manage AWS costs and usage over time
- Create custom reports that analyze cost and usage data
- Monthly, hourly, and resource-level granularity
- Forecast usage up to 12 months based on previous usage
- Has API support with pagination
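Pagination in the Cost Explorer API works through a NextPageToken that is passed back on the next call. A sketch of the loop, assuming a client object with a get_cost_and_usage method like the one boto3.client("ce") exposes; the FakeCostExplorer class below is a stand-in invented for this example so the code runs without AWS credentials:

```python
def fetch_all_cost_results(ce_client, start, end):
    """Collect ResultsByTime from every page of get_cost_and_usage."""
    kwargs = {
        "TimePeriod": {"Start": start, "End": end},  # dates as YYYY-MM-DD
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
    }
    results = []
    while True:
        page = ce_client.get_cost_and_usage(**kwargs)
        results.extend(page.get("ResultsByTime", []))
        token = page.get("NextPageToken")
        if not token:
            return results
        kwargs["NextPageToken"] = token  # ask for the next page

class FakeCostExplorer:
    """Hypothetical stand-in that serves pre-built pages, for demonstration only."""

    def __init__(self, pages):
        self._pages = pages

    def get_cost_and_usage(self, **kwargs):
        index = int(kwargs.get("NextPageToken", 0))
        page = {"ResultsByTime": self._pages[index]}
        if index + 1 < len(self._pages):
            page["NextPageToken"] = str(index + 1)
        return page

fake = FakeCostExplorer([[{"month": "2024-01"}], [{"month": "2024-02"}]])
print(fetch_all_cost_results(fake, "2024-01-01", "2024-03-01"))
```

With real credentials, the same function could be called with a boto3 Cost Explorer client in place of the fake one.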
Cost Anomaly Detection
- Continuously monitor cost and usage using ML to detect unusual spends
- Monitor AWS services, member accounts, cost allocation tags, or cost categories
- Sends the anomaly detection report with root-cause analysis
- Get notified with individual alerts or daily/weekly summary (using SNS)