AWS Developer Notes

The document provides an overview of various AWS services including IAM, EBS, Elastic Load Balancing, RDS, Aurora, ElastiCache, Route 53, VPC, and S3, detailing their features, use cases, and configurations. Key topics include IAM security tools, EBS multi-attach capabilities, load balancer types, RDS management features, caching strategies, DNS routing policies, and S3 storage classes. It emphasizes important concepts for exam preparation, such as health checks, auto-scaling, and data replication methods.

Uploaded by

reoles
Copyright
© © All Rights Reserved

IAM

IAM Security Tools


IAM Credentials Report (account-level) – a report that lists all your account's users and the
status of their various credentials.
IAM Access Advisor (user-level) – shows the service permissions granted to a user and when
those services were last accessed. You can use this information to revise your policies.

EBS
EBS Multi-Attach – ability to attach the same EBS volume to multiple EC2 instances in the same
AZ (io1/io2 volume family only)
Each instance has full read & write permissions to the high-performance volume

Use case:
When you are trying to achieve higher application availability in clustered Linux applications (ex:
Teradata)
When applications must manage concurrent write operations
Up to 16 EC2 Instances at a time
Must use a file system that’s cluster-aware (not XFS, EXT4, etc…)

Elastic Load Balancing (ELB)


The EC2 instances' security group allows traffic only from the load balancer's security group,
while the load balancer has its own security group with the usual rules for client traffic.

Application Load Balancer (ALB)


Provides a fixed hostname.
The application servers don't see the IP of the client directly. Instead, the true IP of the
client is inserted in the X-Forwarded-For header. We also get the port (X-Forwarded-Port) and
the protocol (X-Forwarded-Proto).
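
As an illustration of reading these headers on the application side, here is a minimal sketch. It assumes the headers arrive as a plain dict with lower-cased names (web frameworks expose them differently), and the function name client_info is made up for this example.

```python
def client_info(headers):
    """Return (client_ip, port, proto) from load-balancer forwarding headers.

    Assumption: `headers` is a dict whose keys are lower-cased header names.
    """
    forwarded_for = headers.get("x-forwarded-for", "")
    # The header can hold a comma-separated chain of addresses; this sketch
    # takes the left-most entry as the originating client.
    client_ip = forwarded_for.split(",")[0].strip() or None
    port = headers.get("x-forwarded-port")
    proto = headers.get("x-forwarded-proto")
    return client_ip, int(port) if port else None, proto


example = {
    "x-forwarded-for": "203.0.113.7, 10.0.0.12",
    "x-forwarded-port": "443",
    "x-forwarded-proto": "https",
}
print(client_info(example))
```

Note that a client can send its own X-Forwarded-For value, so the left-most entry should not be trusted for security decisions without further checks.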

Support by protocol
Application Load Balancer – HTTP and HTTPS
Network Load Balancer – TCP, UDP, TLS; high performance, for when you need millions of
requests handled
Gateway Load Balancer – analyzing network traffic for security

Network Load Balancer (NLB)
NLB has one static IP per AZ. When the exam asks for an application that should only be
accessed through 1, 2 or 3 IPs, NLB is the solution (question from the exam)

Target groups can be:


EC2
Private IP
ALB

Important for exam - Health Checks support TCP, HTTP, HTTPS

Gateway Load Balancer


Used with 3rd party network virtual appliances. If you want to have your traffic inspected before
it is forwarded to your applications, use Gateway Load Balancer.

If we see the GENEVE protocol on port 6081 on the exam, it's Gateway Load Balancer.

Combines the functions of a Transparent Network Gateway (single entry/exit for all traffic) and
a Load Balancer (distributes traffic to virtual appliances).

Example: Firewalls, Intrusion Detection and Prevention Systems, Deep Packet Inspection
Systems, payload manipulation.

Target groups can be either EC2 instances or IP addresses

Elastic Load Balancer


Sticky sessions – the same client is always sent to the same target, so the user does not lose
session data.
Important to know there are 2 types of cookies: application-based and duration-based.
Application-based cookies
- Custom cookie: generated by the target, can include any custom attributes required by the
application. The cookie name must be specified individually for each target group and must not
be one of the reserved names AWSALB, AWSALBAPP, AWSALBTG.
- Application cookie: generated by the load balancer. Cookie name: AWSALBAPP
Duration-based cookies
Generated by the load balancer. Cookie name: AWSALB for ALB, AWSELB for CLB.

Cross-Zone Load Balancing: enabled by default on Application Load Balancer, with no additional
charges. Disabled by default on Network Load Balancer and Gateway Load Balancer.
Server Name Indication (SNI) – lets you load multiple SSL certificates onto one web server to
serve multiple websites

Connection Draining (CLB) or Deregistration Delay (ALB and NLB)


Gives the instance time to complete "in-flight" requests while it is de-registering or
unhealthy. While the instance is being drained, the ELB stops sending new requests to it.
Process: users who are already connected to the instance being drained are given enough time
(the draining period) to complete their transactions. New requests are forwarded by the ELB to
the other instances.
1-3600 seconds (default is 300). Can be disabled (set value to 0). Set a low value if your requests
are short, because it allows the EC2 instance to drain quickly.

Auto Scaling groups

Scale out - adding new instances in the case of increased demand.


Scale in - removing instances (lowering number of instances)

Good metrics to scale on


CPUUtilization - Average CPU utilization across your instances
RequestCountPerTarget - to make sure the number of requests per EC2 instance is stable
Average Network In / Out (if your application is network bound)
Any custom metric (that you push using CloudWatch)

3 types of dynamic scaling
These tell Auto Scaling when to scale:
Target Tracking - Example: I want the average ASG CPU to stay at around 40%
Simple/Step – Example: When a CloudWatch alarm is triggered (example CPU > 70%), then add
2 units
Scheduled Actions - Anticipate scaling based on known usage patterns
Example: increase the min capacity to 10 at 5 pm on Fridays

After a scaling activity happens, a cooldown period follows, during which the ASG does not
launch or terminate additional instances. Default cooldown period - 300 seconds.
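
To make the interaction between a simple scaling rule and the cooldown concrete, here is a small sketch. It is a simplified model, not how the Auto Scaling service is implemented: the function name, the max_size parameter, and the use of plain numbers of seconds for time are assumptions for illustration. The threshold, step, and cooldown values are the ones from the notes.

```python
COOLDOWN_SECONDS = 300     # default cooldown
CPU_ALARM_THRESHOLD = 70   # example alarm: CPU > 70%
SCALE_OUT_STEP = 2         # "add 2 units"


def next_capacity(capacity, avg_cpu, now, last_scaling_at, max_size):
    """Return (new_capacity, new_last_scaling_at) for one evaluation.

    `now` and `last_scaling_at` are timestamps in seconds;
    `last_scaling_at` is None if no scaling activity has happened yet.
    """
    in_cooldown = (
        last_scaling_at is not None
        and now - last_scaling_at < COOLDOWN_SECONDS
    )
    if in_cooldown or avg_cpu <= CPU_ALARM_THRESHOLD:
        return capacity, last_scaling_at
    new_capacity = min(capacity + SCALE_OUT_STEP, max_size)
    if new_capacity == capacity:
        # Already at the maximum size: nothing to do.
        return capacity, last_scaling_at
    return new_capacity, now
```

With a capacity of 2 and CPU at 85%, the first evaluation scales out to 4. A second alarm 100 seconds later is ignored because the group is still in cooldown.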

RDS
RDS is a managed service:
• Automated provisioning, OS patching
• Continuous backups and restore to specific timestamp (Point in Time Restore)
• Monitoring dashboards
• Read replicas for improved read performance
• Multi AZ setup for DR (Disaster Recovery)
• Maintenance windows for upgrades
• Scaling capability (vertical and horizontal)
• Storage backed by EBS (gp2 or io1)
• BUT you can’t SSH into your instances

Storage auto scaling


You set a Maximum Storage Threshold (the maximum limit for DB storage)
Storage is scaled automatically when all of the following hold:
Free storage is less than 10% of allocated storage
Low storage lasts at least 5 minutes
6 hours have passed since the last modification
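
The three conditions can be read as a single boolean check. The sketch below is illustrative only: the function and parameter names are invented here, and the extra check against the maximum storage threshold is an assumption about how the upper limit applies.

```python
def should_autoscale_storage(free_gib, allocated_gib, low_storage_minutes,
                             hours_since_last_modification,
                             max_storage_threshold_gib):
    """Return True when every storage auto scaling condition is met."""
    # Free storage is less than 10% of allocated storage
    # (integer-friendly form of free < 0.10 * allocated).
    below_10_percent = free_gib * 10 < allocated_gib
    # The low-storage condition has lasted at least 5 minutes.
    sustained = low_storage_minutes >= 5
    # At least 6 hours have passed since the last storage modification.
    modification_gap_ok = hours_since_last_modification >= 6
    # Assumption: no scaling once the maximum storage threshold is reached.
    room_to_grow = allocated_gib < max_storage_threshold_gib
    return (below_10_percent and sustained
            and modification_gap_ok and room_to_grow)
```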

RDS Read Replicas


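Up to 15 read replicas
Within AZ, Cross AZ or Cross Region
Replication is asynchronous, so reads are eventually consistent
Replicas can be promoted to their own DB
Applications must update their connection strings to use the read replicas
Read replicas within the same region – no fee for the replication traffic
Read replicas accept only SELECT statements – no writes (INSERT, UPDATE, DELETE)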

RDS Multi AZ
Sync Replication
One DNS Name – no need to update connection strings.
Increase availability
Failover in case of loss of AZ, loss of network.
Good for DR

From single AZ to Multi AZ


Zero downtime operation – no need to stop the DB
Process:
Snapshot is taken
A new DB is restored from the snapshot in a new AZ
Synchronization is established between two databases.

Aurora

• 6 copies of your data across 3 AZ:
• 4 copies out of 6 needed for writes
• 3 copies out of 6 needed for reads
• Self healing with peer-to-peer replication
• Storage is striped across 100s of volumes
• One Aurora Instance takes writes (master)
• Automated failover for master in less than 30 seconds
• Master + up to 15 Aurora Read Replicas serve reads
• Support for Cross Region Replication
1 master, multiple replicas, self healing
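
The quorum numbers above can be turned into a small worked example showing what survives a failure. This sketch assumes the 6 copies are spread evenly, 2 per AZ; the function name and return shape are made up for illustration.

```python
TOTAL_COPIES = 6
COPIES_PER_AZ = 2    # assumption: 6 copies spread evenly across 3 AZs
WRITE_QUORUM = 4     # 4 copies out of 6 needed for writes
READ_QUORUM = 3      # 3 copies out of 6 needed for reads


def availability(lost_azs=0, extra_lost_copies=0):
    """Report whether reads and writes are still possible after failures."""
    healthy = TOTAL_COPIES - lost_azs * COPIES_PER_AZ - extra_lost_copies
    healthy = max(healthy, 0)
    return {
        "healthy_copies": healthy,
        "writes": healthy >= WRITE_QUORUM,
        "reads": healthy >= READ_QUORUM,
    }
```

Losing a whole AZ leaves 4 copies, so both reads and writes continue. Losing an AZ plus one more copy leaves 3, so reads still work but writes do not.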

Features
• Automatic fail-over
• Backup and Recovery
• Isolation and security
• Industry compliance
• Push-button scaling
• Automated Patching with Zero Downtime
• Advanced Monitoring
• Routine Maintenance
• Backtrack: restore data at any point of time without using backups

RDS Proxy

Use RDS Proxy to improve database efficiency by reducing the stress on database resources and
minimizing open connections (important for exam)
RDS Proxy handles failover.
Allows apps to pool and share DB connections established with the database.
Supports both RDS and Aurora.
No code changes needed.
Never publicly accessible
Enforce IAM Authentication for DB, and securely store credentials in AWS Secrets Manager

ElastiCache

REDIS
Multi AZ with Auto-Failover
Read Replicas to scale reads and have high availability
Backup and restore features
Supports Sets and Sorted Sets – keywords for the exam

Memcached
Multi node for partitioning of data (sharding)
No high availability
Non persistent
No backup and restore
Multi-threaded architecture
REDIS – high availability, MEMCACHED – pure cache.

Lazy Loading / Cache-Aside / Lazy population


The application looks first in ElastiCache. On a cache hit, the data is returned. On a cache miss,
the application reads the data from RDS and then writes it to the cache.
PROS:
Only the requested data will be cached (very efficient)
Node failures are not fatal

CONS:
On a cache miss there is a read penalty: 3 calls have to be made (cache lookup, DB read, cache write)
It's possible to have stale data (updated in the DB but outdated in the cache)
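
A minimal sketch of lazy loading, using plain dicts as stand-ins for ElastiCache and RDS (the class name and the calls list are invented for this example, they are not part of any AWS API). The calls list records each round trip so the 3-call read penalty is visible.

```python
class LazyLoadingReader:
    def __init__(self, database):
        self.database = database   # stand-in for RDS: a plain dict
        self.cache = {}            # stand-in for ElastiCache
        self.calls = []            # every round trip, for illustration

    def get(self, key):
        self.calls.append("cache_get")
        if key in self.cache:            # cache hit: a single call
            return self.cache[key]
        self.calls.append("db_read")     # cache miss: read from the DB...
        value = self.database[key]
        self.calls.append("cache_set")   # ...then populate the cache
        self.cache[key] = value
        return value


db = {"user:1": "Ada"}
reader = LazyLoadingReader(db)
reader.get("user:1")          # miss: 3 calls
reader.get("user:1")          # hit: 1 call
db["user:1"] = "Grace"        # the DB changes behind the cache's back
print(reader.get("user:1"))   # still the cached value: stale data
```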

Write through – cache is updated every time RDS is updated


PROS
Data in the cache is never stale
The penalty moves from reads to writes (each write requires 2 calls)

CONS
Data is missing from the cache until it is added or updated in the DB.
Cache churn – a lot of the cached data will never be read.

So combine with lazy loading – write through on every update and, if data is not found in the
cache on a read, fall back to lazy loading.
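
The combination of write through with a lazy-loading fallback might look like the sketch below. Dicts stand in for RDS and ElastiCache, and the class name is hypothetical.

```python
class WriteThroughStore:
    def __init__(self):
        self.database = {}   # stand-in for RDS
        self.cache = {}      # stand-in for ElastiCache

    def put(self, key, value):
        # Write penalty: every update makes 2 calls (DB write + cache write).
        self.database[key] = value
        self.cache[key] = value

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        # Lazy-loading fallback for rows that were never written through,
        # for example data that existed before the cache was introduced.
        value = self.database[key]
        self.cache[key] = value
        return value
```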

Cache evictions and time to live
3 ways to evict cache:
Delete the item explicitly in the cache
The item is evicted because the memory is full and it's not recently used (LRU)
Set a TTL (time to live) on the item

TTL is good for leaderboards, comments, activity streams.
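
To illustrate TTL-based and explicit eviction, here is a small in-memory sketch. It is not how ElastiCache works internally: the class is invented for this example, expiry is checked lazily on read, and the clock is injectable only so the behaviour can be demonstrated without waiting.

```python
import time


class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock      # injectable clock, an assumption for testing
        self._items = {}        # key -> (value, expires_at)

    def set(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl_seconds)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]    # TTL reached: evict on read
            return default
        return value

    def delete(self, key):
        self._items.pop(key, None)  # explicit eviction
```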


Amazon MemoryDB for Redis

Route 53

A – maps hostname to IPv4


AAAA – maps hostname to IPv6
CNAME – maps a hostname to another hostname. Can't create a CNAME for the top node of a DNS namespace (zone apex)
NS – Name Servers for the Hosted Zone

Public Hosted Zones


Contains records that specify how to route traffic on the internet. Whenever you buy a public domain, you can manage its records in a public hosted zone.

Private Hosted Zones


Internal, not publicly visible, for traffic within one or more VPCs.

TTL – High TTL vs Low TTL


The DNS response tells the client to cache the record for a certain amount of time, defined by the TTL.
High TTL – less traffic on Route 53, but clients may hold outdated records.
Low TTL – more traffic on Route 53, but records are more up to date and it's easy to make
changes.

TTL is mandatory for each DNS record except for Alias


CNAME vs Alias
CNAME – points a hostname to another hostname. Only for non-root domains.
ALIAS – points a hostname to an AWS resource such as ALB, S3 etc. Works for root domains too. Can't set TTL on it.
Alias is always of type A/AAAA. It automatically recognizes changes in the resource's IP addresses
(for example, if the IP of an ALB changes, it is picked up automatically)

What can Alias point to? Possible targets:


• Elastic Load Balancers
• CloudFront Distributions
• API Gateway
• Elastic Beanstalk environments
• S3 Websites
• VPC Interface Endpoints
• Global Accelerator accelerator
• Route 53 record in the same hosted zone

You cannot set an ALIAS record for an EC2 DNS name


Important to know for the exam – an ALIAS record can be set at both root and non-root domains.
Alias records can evaluate the health of their target automatically (Evaluate Target Health).
It's always an A/AAAA record.

Routing policies
Simple
Route traffic to a single resource.
It's possible to specify multiple values in the same record, for example several IP addresses in
one A record. In that case all values are returned and a random one is chosen by the client.
When Alias is enabled, you can specify only one AWS resource.
Can't be associated with health checks.

Weighted
Control the percentage of the requests that go to each specific resource
Can be associated with health checks.
Use case: Load balancing between multiple regions, testing new application versions.
If you want to stop sending traffic to a resource, assign it a weight of 0. If all records have a
weight of 0, all records are returned equally.
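
The share of traffic a weighted record receives is its weight divided by the sum of all weights. A short sketch of that arithmetic follows; the function name is made up, and the record names in the usage are placeholders.

```python
def traffic_shares(weights):
    """Map record name -> fraction of responses for weighted routing."""
    total = sum(weights.values())
    if total == 0:
        # All weights are 0: every record is returned with equal probability.
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


print(traffic_shares({"blue": 70, "green": 20, "canary": 10}))
```

Weights do not need to add up to 100. Only the ratio between them matters.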

Latency Based
Redirect to the resource that has the lowest latency for the user.
Super helpful when latency is the priority.
Latency is based on traffic between users and AWS Regions.
Can be associated with Health Checks

Failover
Records are either primary or secondary. The primary receives the traffic while it is healthy
(a health check on the primary is mandatory). The secondary receives the traffic when the
primary becomes unhealthy.

Geolocation
Routing based on user location.
You should have default location in case there is no match on location
Use cases: website localization, restrict content distribution, load balancing, …
Can be associated with Health Checks

Geoproximity
Routes traffic based on the geographic location of users and resources.
Bias is used to shift more or less traffic to a resource.
To change the size of the geographic region, specify bias values:
• To expand (1 to 99) – more traffic to the resource
• To shrink (-1 to -99) – less traffic to the resource

Geoproximity is really helpful when you need to shift traffic from one region to another, by
increasing the bias - IMPORTANT FOR THE EXAM

Multivalue
Used when routing traffic to multiple resources; Route 53 returns multiple values.
Not a substitute for ELB.
Multivalue is client-side load balancing.

IP-based Routing
Routing is based on clients’ IP addresses
You provide a list of CIDRs for your clients and the corresponding endpoints/locations (user-IP-
to-endpoint mappings)
Use cases: Optimize performance, reduce network costs…
Example: route end users from a particular ISP to a specific endpoint

Health Checks
Monitor an endpoint: about 15 global health checkers, automated; supported protocols are HTTP,
HTTPS and TCP. If > 18% of health checkers report healthy, Route 53 considers the endpoint
healthy. A health check passes only when the endpoint responds with a 2xx or 3xx status code.
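
The 18% rule and the status-code rule can be combined in a short sketch. The names are invented for this example, and a missing response is modelled as None, which is an assumption of the sketch.

```python
HEALTHY_FRACTION = 0.18   # more than 18% of checkers must report healthy


def checker_passes(status_code):
    """A single checker passes on a 2xx or 3xx response."""
    return status_code is not None and 200 <= status_code <= 399


def endpoint_healthy(status_codes):
    """status_codes holds one entry per health checker (None = no response)."""
    if not status_codes:
        return False
    passing = sum(1 for code in status_codes if checker_passes(code))
    return passing / len(status_codes) > HEALTHY_FRACTION
```

With 15 checkers, 3 passing checkers (20%) are enough for the endpoint to count as healthy, while 2 (about 13%) are not.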

Health checks that monitor other health checks (calculated health checks): combines results of
multiple health checks into a single one.

Health checks that monitor CW Alarms (full control): You can create a CloudWatch Metric and
associate a CloudWatch Alarm, then create a Health Check that checks the alarm itself

To visually represent traffic flow and maintain complex decision trees, use Route 53 Traffic Flow.

VPC

VPC: private network to deploy your resources (regional resource)


Subnets allow you to partition your network inside your VPC (Availability Zone resource)
A public subnet is a subnet that is accessible from the internet
A private subnet is a subnet that is not accessible from the internet
To define access to the internet and between subnets, we use Route Tables.

Internet gateway connects VPC to internet. Public subnets have route to the internet gateway.

NAT Gateways and instances allow instances in your private subnets to access the internet while
remaining private.

NACL and Security Groups


NACL: firewall that controls traffic from and to subnet.
Can have ALLOW and DENY rules.
Attached at subnet level
Rules Only include IP addresses

Security Groups
Firewall that controls traffic to and from an ENI/EC2 instance
Can have Only ALLOW Rules

VPC Flow Logs data can go to S3, CloudWatch Logs and Kinesis Data Firehose.

VPC Endpoints
Any time the exam asks you to privately connect to an AWS service, a VPC Endpoint is the
way.

S3

S3 looks like a global service, but it is not – buckets are created in a region

Security
User-Based
IAM Policies – which API calls should be allowed for a specific IAM user

Resource-Based
Bucket Policies – bucket-wide rules set from the S3 console; allow cross-account access
Object Access Control List (ACL) – finer grain (can be disabled)
Bucket Access Control List (ACL) – less common (can be disabled)
Encryption: encrypt objects in Amazon S3 using encryption keys
If you get a 403 Forbidden error with S3 website hosting, make sure the bucket policy allows
public reads.

Versioning
Enabled on bucket level.

Replication
Cross-Region Replication (CRR) and Same-Region Replication (SRR)
Versioning must be enabled for it to work
Copying is asynchronous
Use cases:
• CRR – compliance, lower latency access, replication across accounts
• SRR – log aggregation, live replication between production and test accounts

S3 Batch Replication – replicates existing objects and objects that failed replication.

Delete marker replication – important for the exam!


Delete markers created by S3 delete operations can be replicated (optional setting). Delete
markers created by lifecycle rules are not replicated!

If I permanently delete an object version in the source bucket, it will not be deleted in the
destination (replicated) bucket

S3 Storage Classes
Amazon S3 Standard - General Purpose
Used for frequently accessed data. Low latency and high throughput.
Use Cases: Big data analytics, mobile and gaming applications, content distribution…

Amazon S3 Standard-Infrequent Access (IA)


For data that is less frequently accessed, but requires rapid access when needed.
Lower cost than S3 Standard
Use cases: disaster recovery, backups

Amazon S3 One Zone-Infrequent Access


High durability within a single AZ; data is lost if the AZ is destroyed. Use case: storing
secondary backup copies or data that you can recreate.

Amazon S3 Glacier Instant Retrieval


Millisecond retrieval, great for data accessed once a quarter
Minimum storage duration of 90 days

Amazon S3 Glacier Flexible Retrieval
Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours) – Bulk is free
Minimum storage duration of 90 days

Amazon S3 Glacier Deep Archive – long term storage


Standard (12 hours), Bulk (48 hours)
Minimum storage duration of 180 days

Amazon S3 Intelligent Tiering


Small monthly monitoring and auto-tiering fee
Moves objects automatically between Access Tiers based on usage
There are no retrieval charges in S3 Intelligent-Tiering

Transitioning
You can transition objects between storage classes. For infrequently accessed objects, move
them to Standard IA. For archive objects, move them to Glacier or Glacier Deep Archive.

Transition Action – configure object to transition from one storage class to another
Expiration action – configure object to expire (be deleted) after certain time. Can be used to
delete old version of file or to delete incomplete multi-part uploads.

S3 event notifications can be sent to SQS, SNS and Lambda, or to Amazon EventBridge to reach
more services.

S3 Performance

Multi-Part upload
Recommended for files greater than 100 MB, must be used for files greater than 5 GB. Can help
parallelize uploads.

S3 Transfer Acceleration
Increases transfer speed by transferring the file to an AWS edge location, which forwards the
data to the S3 bucket. Compatible with multi-part upload.

S3 Byte-Range Fetches
Parallelize GETs by requesting specific byte ranges, to speed up downloads.
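
As a sketch of how a download could be split into byte ranges, the function below produces the values for the HTTP Range header, one per chunk. The function name and the chunk size are assumptions for illustration; issuing the requests in parallel is left out.

```python
def byte_range_headers(object_size, chunk_size):
    """Return Range header values that together cover the whole object."""
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size > 0")
    headers = []
    for start in range(0, object_size, chunk_size):
        # Both ends of an HTTP byte range are inclusive.
        end = min(start + chunk_size, object_size) - 1
        headers.append(f"bytes={start}-{end}")
    return headers


print(byte_range_headers(10, 4))
```

Each value can be sent as the Range header of a separate GET, and the responses are then concatenated in order.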

Important for the exam is to know these performance options for speeding up download and
upload of the files

S3 Select & Glacier Select


Retrieve less data using SQL by performing server-side filtering

S3 Object Tags and metadata
If you want to attach your own metadata when uploading a file, metadata names must begin
with "x-amz-meta-"
Tags are usually used for fine-grained permissions or analytics purposes.
Important to remember: you cannot search object metadata or object tags.
Common exam question - how would you search? Answer: use an external DB such as DynamoDB as
a search index. Store the object metadata and object tags in that DB, query it for the matching
file names, then retrieve those files from S3.

S3 Object Encryption

Server-Side Encryption (SSE)


Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3) – enabled by default
Encrypts S3 objects using keys handled, managed, and owned by AWS
Must set header "x-amz-server-side-encryption": "AES256"
Encryption type is AES-256
Object is encrypted server side

Server-Side Encryption with KMS Keys stored in AWS KMS (SSE-KMS)
Leverage AWS Key Management Service (AWS KMS) to manage encryption keys
Object is encrypted server side
Must set header "x-amz-server-side-encryption": "aws:kms"
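
The two header values can be summarised in a small helper. This is only a sketch: the function name and the mode strings are invented here, and the optional key-ID header is not from these notes but is the S3 request header for naming a specific KMS key.

```python
def sse_headers(mode, kms_key_id=None):
    """Return the request headers that select a server-side encryption mode."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id is not None:
            # Optional: name the KMS key to use (not covered in the notes).
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    raise ValueError(f"unsupported mode: {mode!r}")
```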

IMPORTANT LIMITATION

If you use SSE-KMS, you may be impacted by the KMS limits


When you upload, it calls the GenerateDataKey KMS API
When you download, it calls the Decrypt KMS API
These calls count towards the KMS quota per second (5500, 10000, 30000 req/s based on region)
You can request a quota increase

Server-Side Encryption with Customer-Provided Keys (SSE-C)
When you want to manage your own encryption keys
HTTPS MUST be used.

Client-Side Encryption
The client must encrypt data before sending it to S3, and decrypt it after retrieving it from
S3.

Encryption in transit (SSL/TLS)


S3 exposes 2 endpoints:
HTTP Endpoint – not encrypted
HTTPS Endpoint – encrypted. This one is recommended and mandatory for SSE-C

How to force Encryption in transit
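
One common way to do this is a bucket policy that denies any request where the aws:SecureTransport condition key is false, that is, any request made over plain HTTP. The sketch below builds such a policy as a Python dict; the function name, the Sid, and the bucket name are placeholders.

```python
import json


def deny_insecure_transport_policy(bucket_name):
    """Build a bucket policy that rejects requests not sent over HTTPS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                # aws:SecureTransport is "false" for plain HTTP requests.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


print(json.dumps(deny_insecure_transport_policy("example-bucket"), indent=2))
```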


EC2 Instance Metadata

EC2 instances can learn about themselves without using an IAM Role for that purpose.

The URL is http://169.254.169.254/latest/meta-data


You can retrieve the IAM Role name from the metadata, but you CANNOT retrieve the IAM
Policy.
Metadata = Info about the EC2 instance
Userdata = launch script of the EC2 instance

IMDSv1 vs IMDSv2
IMDSv1 accesses the URL directly
IMDSv2 first gets a session token of limited validity, then sends it with each metadata call
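
The two-step IMDSv2 flow can be sketched with the standard library alone. The helper names are made up for this example, and fetch_metadata only works when run on an EC2 instance. The request builders can be inspected anywhere.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def token_request(ttl_seconds=21600):
    """Step 1: a PUT request that asks for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path, token):
    """Step 2: a GET request that presents the token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path, timeout=2):
    """Run both steps. Only reachable from inside an EC2 instance."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    request = metadata_request(path, token)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode()
```

On an instance, fetch_metadata("instance-id") would return the instance ID. With IMDSv1 the second request is sent on its own, without the token header.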

Question on the exam: how to use MFA with the CLI or SDK?


Create a temporary session with the STS GetSessionToken API call.

AWS SDK

Official SDKs are…


• Java
• .NET
• Node.js
• PHP
• Python (named boto3 / botocore)
• Go
• Ruby
• C++

If you don’t specify region, it’s us-east-1 by default.

API Rate Limits


DescribeInstances API for EC2 has a limit of 100 calls per second
GetObject on S3 has a limit of 5500 GET per second per prefix
For Intermittent Errors: implement Exponential Backoff
For Consistent Errors: request an API throttling limit increase

Service Quotas (Service Limits)


Running On-Demand Standard Instances: 1152 vCPU
You can request a service limit increase by opening a ticket
You can request a service quota increase by using the Service Quotas API
Exponential backoff – use it if you get a ThrottlingException intermittently.
Retries must only be implemented on 5xx server errors and throttling.
DO NOT IMPLEMENT retries on 4xx client errors.

EXAM QUESTION
Any time you get a ThrottlingException because you made too many API calls, use exponential
backoff.
Which kind of errors should you retry with exponential backoff?
Server errors, that is, responses with 5xx status codes.
You SHOULD NOT IMPLEMENT RETRY ON 4XX CLIENT ERRORS.
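
A minimal sketch of the retry rule follows. The ServiceError class is hypothetical (real SDKs raise their own exception types, and most already retry for you), and the added random jitter is a common refinement that these notes do not mention.

```python
import random
import time


class ServiceError(Exception):
    """Hypothetical error carrying an HTTP status and an error code."""

    def __init__(self, status, code=""):
        super().__init__(f"{status} {code}")
        self.status = status
        self.code = code


def is_retryable(error):
    # Retry on 5xx server errors and on throttling, never on other 4xx.
    return error.status >= 500 or error.code == "ThrottlingException"


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return operation()
        except ServiceError as error:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(error):
                raise
            # The delay ceiling doubles on every attempt (capped), and the
            # actual wait is drawn at random below it to spread out retries.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

The sleep parameter exists only so the function can be exercised without real waiting.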

Exam question!
The CLI looks for credentials in this order of priority:
1. Command line options – --region, --output, and --profile
2. Environment variables – AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and
AWS_SESSION_TOKEN
3. CLI credentials file – aws configure
~/.aws/credentials on Linux / macOS & C:\Users\USERNAME\.aws\credentials on Windows
4. CLI configuration file – aws configure
~/.aws/config on Linux / macOS & C:\Users\USERNAME\.aws\config on Windows
5. Container credentials – for ECS tasks
6. Instance profile credentials – for EC2 Instance Profiles
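
The idea of a priority chain is that the first source able to supply credentials wins. Here is a sketch of that lookup. The source names and the function are invented for illustration and do not correspond to real CLI internals.

```python
CHAIN_ORDER = [
    "command_line_options",
    "environment_variables",
    "cli_credentials_file",
    "cli_configuration_file",
    "container_credentials",
    "instance_profile_credentials",
]


def resolve_credentials(available):
    """Return (source, credentials) for the highest-priority source present.

    `available` maps a source name from CHAIN_ORDER to the credentials that
    source would supply. Missing or None entries are skipped.
    """
    for source in CHAIN_ORDER:
        credentials = available.get(source)
        if credentials is not None:
            return source, credentials
    raise LookupError("no credentials found in any source")
```

For example, if both environment variables and an instance profile are present, the environment variables are used, which is a frequent cause of surprise on EC2.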

When you make an API request to AWS, you need to sign it, and you sign it with Signature
Version 4 (SigV4).

There are 2 ways to transmit the signature:


HTTP header (the Authorization header)
Query string (X-Amz-Signature)
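
To give a feel for what signing involves, the sketch below shows the key-derivation step of SigV4: a chain of HMAC-SHA256 operations over the date, region, and service. Building the canonical request and the string to sign is omitted, the function names are made up, and in practice the SDKs and the CLI do all of this for you.

```python
import hashlib
import hmac


def _hmac_sha256(key, message):
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_access_key, date_stamp, region, service):
    """Derive the SigV4 signing key. date_stamp is YYYYMMDD."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"),
                          date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key, string_to_sign):
    """Return the hex signature for an already built string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

The resulting hex string is what ends up in the Authorization header or in the X-Amz-Signature query parameter.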
