AWS Amazon Interview Question and Answers
A VPC has IPv4 enabled. For outbound-only IPv4 traffic from a private subnet, you need a NAT instance or NAT Gateway. For IPv6, use an Egress-Only Internet Gateway for outbound requests from a private subnet.
A NAT gateway is deployed inside a subnet and can scale only inside that subnet. For fault tolerance, it is recommended that you deploy one NAT gateway per Availability Zone.
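As an illustration (not from the original notes), the routing decision can be modeled as a tiny lookup over a private subnet's route table; the CIDR blocks and NAT gateway ID below are made-up placeholders:

```python
# Hypothetical route table for a private subnet: local VPC traffic stays
# inside the VPC, and all other IPv4 traffic is sent to a NAT gateway.
# The CIDRs and the NAT gateway ID are made-up placeholders.
private_subnet_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},          # intra-VPC traffic
    {"destination": "0.0.0.0/0",   "target": "nat-0abc123def"}, # outbound IPv4 via NAT
]

def route_for(destination, routes):
    """Return the target configured for a given destination CIDR."""
    for route in routes:
        if route["destination"] == destination:
            return route["target"]
    return None

print(route_for("0.0.0.0/0", private_subnet_routes))  # nat-0abc123def
```

Without that `0.0.0.0/0` route to the NAT device, instances in the private subnet would have no outbound internet path at all.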
• Under shared responsibility model, customer has full control of the instance and
is responsible for OS level patching of the instance
In order to connect and log on to an EC2 instance, you need the private key that matches the public key specified when launching the instance.
EC2 instances have their own hourly charges, pro-rated to the nearest second (with a one-minute minimum). Data transfer charges depend on the destination: within the same region it is usually 1 cent per GB, to another region usually 2 cents per GB, and outgoing to the internet 9 cents per GB. Incoming data from the internet is free. EBS storage costs are separate, based on GB of storage allocated.
SSD instance storage is optimal for applications with very high random I/O requirements (high IOPS). Magnetic instance storage is optimal for applications with very high throughput requirements (sequential I/O). Since files are read in whole, magnetic instance storage offers very high sequential read throughput at a low cost. EBS volumes are not needed, as files are safely stored in S3.
• All these storage classes offer low-latency millisecond access to objects at different price points. The Glacier storage class may take minutes to several hours.
You have an Allow policy that grants permissions only when access is
made from a specific public IP address: 78.124.30.112
You have a Deny policy for all actions when access is not made from
85.154.120.55
What is the net effect if both these policies are attached to an IAM user and
the user is making a request from an EC2 instance with public
IP address 85.154.120.55? There is no other policy attached to the
IAM user.
Policy evaluation - Since the request is coming from 85.154.120.55, the deny policy
does not apply. However, no allow condition matches requests coming from
85.154.120.55, so the default deny applies. IAM Policy Evaluation Logic -
https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_evaluation-logic.html
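The evaluation order above can be sketched as a small simulation. This is a deliberately simplified model of IAM's logic (an explicit deny wins, then any matching allow, otherwise default deny); the policy representation is invented for illustration:

```python
def evaluate(policies, request_ip):
    """Simplified IAM evaluation: explicit deny wins, then any matching
    allow, otherwise default deny. Real IAM evaluation considers many
    more elements (principals, actions, resources, policy types)."""
    decision = "deny"  # start from the implicit default deny
    for p in policies:
        if p["condition"] == "from_ip":
            matches = (request_ip == p["ip"])
        else:  # "not_from_ip"
            matches = (request_ip != p["ip"])
        if matches:
            if p["effect"] == "Deny":
                return "deny"  # explicit deny always wins
            decision = "allow"
    return decision

# The two policies from the question above:
policies = [
    {"effect": "Allow", "condition": "from_ip",     "ip": "78.124.30.112"},
    {"effect": "Deny",  "condition": "not_from_ip", "ip": "85.154.120.55"},
]
# From 85.154.120.55: the Deny does not match, but no Allow matches either.
print(evaluate(policies, "85.154.120.55"))  # deny (default deny)
# From 78.124.30.112: the Allow matches, but the explicit Deny also matches.
print(evaluate(policies, "78.124.30.112"))  # deny (explicit deny wins)
```

The second call shows why the explanation stresses default deny: even the "allowed" IP is blocked here, because the Deny policy matches every request not coming from 85.154.120.55.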
You would like to revoke programmatic access for an IAM user. What steps
do you need to take?
A: Remove the access key credentials
You would like to version control your policy changes in IAM so that you
have the ability to roll back changes. What feature can you use to automatically
track versions?
Managed policies are automatically version controlled and maintain the previous five
versions
You have defined groups for managing IAM users. You need to grant permissions so that
IAM users can access an S3 bucket. Which one of these choices will NOT work?
Attach the necessary permissions in an S3 bucket resource-level policy with the group
specified as the Principal
A group cannot be specified as a principal. A group is not considered an identity and
is used merely for managing users.
In your web application, you allow users to register with their existing
identities from external internet identity providers like Amazon, Google,
Facebook and other OpenID Connect compatible identity providers. Once
authenticated, your users should be able to access specific AWS services
related to your application. Which one of these choices is recommended on
AWS?
Verify the identity of the user using the Cognito service. Authorized users are mapped to
a defined IAM role and gain the temporary privileges defined in that role
Your company has a corporate directory that is Security Assertion Markup
Language 2.0 compliant for maintaining identities of the employees.
If your employees need access to AWS Services, you need to:
Use Identity federation and configure your corporate directory to provide single
sign on access to AWS
SAML 2.0 based federation. This feature enables federated single sign-on
(SSO), so users can log into the AWS Management Console or call the AWS
APIs without you having to create an IAM user for everyone in your organization
Which one of these RDS databases automatically replicates data across
availability zones irrespective of multi-AZ configuration?
Aurora automatically replicates data six ways across three different availability
zones. For other database engines in RDS, to replicate data to a different AZ,
you would need to enable multi-az deployment to setup a standby instance in a
different availability zone
• Aurora Serverless has a pause and resume capability to automatically stop the
database compute capacity after a specified period of inactivity. When paused,
you are charged only for storage. It automatically resumes when new database
connections are requested
• In Aurora, Read Replica is promoted as a primary during a primary instance
failure. If you do not have an Aurora Read Replica, then Aurora would launch a
new instance and promote it to primary. In other RDS products, you would need
to use a multi-AZ deployment to configure a standby instance
When connecting to an Aurora database, the client application needs to use: existing
MySQL or PostgreSQL drivers and tools
• Aurora offers MySQL or PostgreSQL compatibility when launching an Aurora
database. This allows existing tools and clients to connect to Aurora without
requiring modification
You have configured an Aurora database with five read replicas. What is
the recommended mechanism for clients to connect to read replicas?
Use Aurora database reader endpoint
Each Aurora DB cluster has a reader endpoint. If there is more than one Aurora
Replica, the reader endpoint directs each connection request to one of the
Aurora Replicas.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Overview.Endpoints.html
The minimum charge computed for an object in Standard-IA is based on a 128KB size. If an
object is smaller than 128KB, it is still charged as 128KB. There is also a 30-day
minimum charge.
Amazon Route 53 does not have a default TTL for any record type
An application uses geolocation-based routing on Route 53. Route 53 receives a DNS
query and is unable to detect the requester's geolocation. How will Route 53 respond in this
case?
A: The default location is returned if a default record is configured. Otherwise, a No
Answer response is returned
Time To Live (TTL) value for a Route 53 alias record: Route 53 uses the TTL
settings configured in the service that the alias record points to.
When using an alias resource record set, Amazon Route 53 uses the CloudFront,
Elastic Beanstalk, Elastic Load Balancing, or Amazon S3 TTLs
Account B created a 100MB object in an S3 bucket that is owned by Account
A. Account B has full permissions to that bucket. IAM users belonging to
Account A have full access to the bucket as well.
There are no other permissions associated with the object. Who can read
the contents of that object?
Only Account B can read the contents of that object
If the object owner is a different account, it has to explicitly grant access to the
object for the bucket owner to access it
An application is deployed in multiple AWS regions, and Route 53 is
configured to route requests to the region that offers the lowest latency for the
client.
Due to unplanned downtime, the application is not available in one of the
regions. How will Route 53 handle this scenario?
If a health check is configured, Route 53 will direct requests to another region
that has the next-lowest latency
A health check needs to be configured for Route 53 to become aware of
application-down scenarios. It will then act on the routing configuration specified
• You want to grant cross account access to your S3 object using
ACL. How would you specify the grantee information?
S3 ACL allows you to specify the AWS account using email address or the
canonical user ID
You have enabled cross-region replication for your S3 bucket. If you delete
a specific object version in the source bucket, what is the behavior
observed in the replicated bucket?
When you delete a specific version of an object in the source bucket, cross-region
replication does not remove that object version in the replicated bucket. This
protects against accidental and malicious deletes. You need a separate
lifecycle management policy for the replicated bucket. If you delete an object (without
specifying a version) in the source bucket, a delete marker is created in the
source bucket. The delete marker is replicated to the destination bucket as well.
You would like to point your zone apex (example.com) to your web
application hosted in AWS. The web application uses Elastic Load Balancing.
In order to point a Route 53 zone apex record to the ELB endpoint, you can
use: a Route 53 Alias record
To point a zone apex record to another AWS supported endpoint, you need to use
an Alias resource record set.
You want to prioritize messages into High, Medium, and Low
categories. High priority messages need to be processed first, followed by
medium and then low. How can you implement this with SQS?
You need to have three separate queues one for each priority. Application logic
should process messages by priority. FIFO/Group ID can be used for separating
messages into different FIFO queues. However, there is no guarantee that
highest priority messages will be returned first
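The three-queue pattern above can be sketched with a minimal in-memory stand-in; real code would call SQS `ReceiveMessage` on each queue URL instead of popping local deques, and queue names here are illustrative:

```python
from collections import deque

# Hypothetical in-memory stand-ins for three SQS queues, one per priority.
queues = {
    "high":   deque(["h1"]),
    "medium": deque(["m1", "m2"]),
    "low":    deque(["l1"]),
}

def receive_next():
    """Poll queues in priority order, mirroring the application logic:
    drain high before medium, and medium before low."""
    for priority in ("high", "medium", "low"):
        if queues[priority]:
            return queues[priority].popleft()
    return None  # all queues empty

order = [receive_next() for _ in range(4)]
print(order)  # ['h1', 'm1', 'm2', 'l1']
```

The priority guarantee lives entirely in the application's polling order, which is exactly why a single FIFO queue with group IDs cannot provide it.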
A health care provider uses RDS to store medical records. Compliance
requirements call for storing the database backups for a period of 5
years. How can you do this with RDS?
Manual snapshots are retained until the customer explicitly deletes them
An application polls an SQS Standard queue for processing pending messages. The application
polls with a batch size set to 10 and a long polling wait time set to 10 seconds. There is only
one message currently available in the queue. What will happen when the application makes
a long polling receive request?
The call returns immediately with the 1 available message
The response to the ReceiveMessage request contains at least one of the available
messages and up to the maximum number of messages specified in the
ReceiveMessage action. Long polling also protects you from receiving empty
responses from a standard queue when a small number of messages are available.
This can happen with short polling due to the distributed queue implementation. With
short polling you have to simply repeat the request to receive messages.
An Application uses RDS as backend and has a RDS managed read replica to handle
analytics queries and reporting queries.
If the read replica experiences slowness due to heavy load on the EC2 compute instance,
what would be the impact to primary DB instance?
Transactions in primary are not impacted
RDS Read replica is created based on asynchronous replication technology and
does not impact primary db transactions. Read-replica may see a backlog build
up if there are momentary interruptions
An application reads pending messages from an SQS queue. Once the
request is processed, the message is deleted by the application.
However, there is an issue with one message: the application runs into an
error and fails to remove the message. You need the message to be
analyzed by the development team for fixing the issue. What is the best
way to handle this situation?
Configure a dead letter queue to automatically move the failed messages after a
configured number of delivery attempts
A dead letter queue allows you to capture poison pill messages that the application is
unable to process. When a dead letter queue is configured, the SQS service
automatically moves the message to the DLQ after the specified number of delivery
attempts
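This behavior is configured through a redrive policy on the source queue. A sketch of that policy document follows; the queue ARN and the `maxReceiveCount` value are illustrative placeholders:

```python
import json

# Sketch of the redrive policy attached to the SOURCE queue: after
# maxReceiveCount failed delivery attempts, SQS moves the message to the
# dead-letter queue. The ARN below is a made-up placeholder.
redrive_policy = {
    "deadLetterTargetArn": "arn:aws:sqs:us-east-1:123456789012:my-app-dlq",
    "maxReceiveCount": 5,
}

# This JSON string would be set as the RedrivePolicy attribute of the
# source queue (for example via SetQueueAttributes).
print(json.dumps(redrive_policy))
```

Note the policy is attached to the source queue, not the DLQ; the DLQ itself is just an ordinary queue the development team can read from.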
There are five messages M1, M2, M3, M4, M5 available in a Standard
SQS queue. Messages were sent to the queue in the above order. There
are two consumers A and B. Consumer A is currently processing
message M1 and is still inside the visibility timeout window. Consumer B makes
a receive request. What message will B receive?
Consumer B may receive any of the messages pending processing and, in
rare cases, it may also receive M1
Standard Queue offers best effort ordering and in rare cases, can send duplicate
messages
Your application uses two SQS Standard queues for receiving requests.
One queue contains critical messages and another queue contains normal
messages. Your application is single threaded. When polling for
messages, you should use: short polling. Since the application is single threaded, this is one
corner case where long polling will not work: it will block the thread and
prevent you from processing messages in the other queue. Short polling should be
used for this specific scenario. Otherwise, long polling is generally
recommended
• Network ACL rules are processed in sequence starting from smallest rule
number. First rule that matches the traffic is applied. In this case, first evaluated
is rule #10 and it would allow SSH traffic from anywhere
Your existing software license is tied to physical cores and sockets. Since
AWS provides virtual instances, what options does AWS have for existing
software licenses at the hardware level?
For licenses that are tied to hardware sockets and cores, you can use either
Dedicated Hosts or bare metal instances
You are managing an Elastic Map Reduce (EMR) cluster that maintains and
processes critical data in your organization. Your customers are asking to
speed up data processing while keeping the costs low. You are exploring
possibility of using spot instances to add additional processing capacity.
For this usage, spot instances are suitable for:
The master node controls and directs the cluster. When the master node terminates, it
ends the EMR cluster. Since the data is critical, we cannot use spot instances for
the master node. Core nodes process data and store it using HDFS. When you
terminate a core instance, there is a risk of data loss, so we cannot use spot
instances for core nodes, as the question mentions that the data is critical. Task nodes
process data but do not hold persistent data in HDFS. If a task node terminates,
there is no risk of data loss. Adding additional spot capacity to task nodes is a
great way to speed up data processing
You have a fleet of hundreds of EC2 instances. If there are problems with AWS managed
items like Physical host, network connectivity to host, and system power, how would you be
able to track it?
System status check would identify AWS infrastructure related issues. Instance
status check would also fail if a system status check fails. So, either one can be
used.
You want to monitor application log files and OS log files and create
appropriate alerts when there are critical messages logged in these files.
What capability can you use for this?
A: CloudWatch Logs
When you recover an EC2 instance using CloudWatch Alarm, what
happens to the instance?
Instance is moved to a different physical host. Instance has same metadata
including public IP Address and Private IP Address
You would like to monitor two different metrics in an AWS Service and take
automated action based on the metric value. How many
CloudWatch alarms would you need?
Alarm is associated with one metric. So, we need one alarm per metric.
We'd like to have CloudWatch Metrics for EC2 at a 1 minute rate. What should
we do?
Enable Detailed Monitoring
This is a paid offering and gives you EC2 metrics at a 1 minute rate
High Resolution Custom Metrics can have a minimum resolution of 1 second
An Alarm on a High Resolution Metric can be triggered as often as 10 seconds
You have made a configuration change and would like to evaluate the impact
of it on the performance of your application. Which service do you use?
CloudWatch is used to monitor the applications performance / metrics
Someone has terminated an EC2 instance in your account last week, which
was hosting a critical database. You would like to understand who did it and
when, how can you achieve that?
CloudTrail helps audit the API calls made within your account, so the database
deletion API call will appear here (regardless if made from the console, the CLI, or
an SDK)
Your CloudWatch alarm is triggered and controls an ASG. The alarm should
trigger 1 instance being deleted from your ASG, but your ASG has already 2
instances running and the minimum capacity is 2. What will happen?
The number of instances in an ASG cannot go below the minimum, even if the alarm
would in theory trigger an instance termination
An EBS volume can be used by a single EC2 instance at any point in time. You can cleanly
unmount and detach it from one instance and attach it to a different instance;
however, at any point in time, only one EC2 instance can use an EBS volume.
EFS is shared file storage meant to be used by multiple systems simultaneously.
S3 is internet cloud storage that can be accessed concurrently by several clients
Your application uses Elastic Load Balancer and has six EC2 instances
registered with three each in Availability Zones A and B.
You want to benchmark a newer generation high performance instance
with your application that will allow you to serve the traffic with
fewer instances.
You removed the 3 registered instances in Availability Zone A and replaced
them with a single new generation instance. When you were doing the load
test, you notice that only 25% of the traffic is reaching the new instance.
You want to ensure at least 50% of the traffic reaches the new instance.
What could you do to accomplish this? Disable cross-zone load balancing
If cross-zone load balancing is enabled, the load balancer distributes traffic
evenly across all registered instances in all enabled Availability Zones. In
this case Availability Zone A has one instance and B has three instances.
Each instance is receiving 25% of the traffic. When cross zone load
balancing is disabled, each availability zone would get 50% of the traffic.
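The arithmetic can be checked with a short helper; the fleet layout mirrors the scenario above (one new-generation instance in AZ A, three older instances in AZ B):

```python
def traffic_share(az_instances, cross_zone):
    """Per-instance share of total traffic, keyed by Availability Zone.
    az_instances maps AZ name -> number of registered instances."""
    shares = {}
    if cross_zone:
        # Traffic is spread evenly across ALL registered instances.
        total = sum(az_instances.values())
        for az in az_instances:
            shares[az] = 1 / total
    else:
        # Each enabled AZ gets an equal slice, split among its instances.
        zones = len(az_instances)
        for az, n in az_instances.items():
            shares[az] = (1 / zones) / n
    return shares

fleet = {"A": 1, "B": 3}
print(traffic_share(fleet, cross_zone=True))   # every instance gets 25%
print(traffic_share(fleet, cross_zone=False))  # A's instance gets 50%
```

With cross-zone enabled, the lone instance in AZ A is just one of four targets (25%); with it disabled, AZ A's half of the traffic all lands on that single instance (50%).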
Your application demand is stable, and you are not expecting any changes in
demand soon. There are six EC2 instances currently handling all the
requests. Given this scenario, would you still use auto scaling?
Auto scaling can monitor health of your instances and replace them if they
are not healthy
Auto Scaling can automatically maintain desired capacity. In this case,
auto scaling can ensure six instances are always available and replace
unhealthy instances
You are using Kinesis Firehose for your streaming data collection and
usage. You are expecting a 10x increase in data collection. What are your
options to increase scalability and throughput?
Firehose scalability is automatically managed. If needed, contact AWS Support
to increase the soft limit
You are designing an order processing application. Components of the
application are broken into containers and you have logically grouped the
components into two different ECS (Elastic Container Service) task
definitions: Web and App.
You want to grant least privilege needed for each task definition. What
approach is recommended for granting permissions?
Use ECS Task Role to grant permissions
The ECS Instance Role is used for granting permissions to the EC2 instance. All
containers running on that instance gain the privileges granted with that role.
The ECS Task Role allows you to grant fine-grained access based on task-specific
needs. All containers that are part of the task gain the privileges granted as part
of the role. An ECS Container Role is not supported. An IAM user access key/secret
access key could give control, but is not needed, as roles can do the job.
You would like your Kinesis streaming data to be available for 35 days.
What are your options?
Store the data in other AWS Services like S3 or Redshift or Elasticsearch
Kinesis Streams has a maximum retention of 7 days and
Kinesis Firehose has a retention of 1 day
An ELB Classic Load Balancer is used for routing traffic to your containers.
A building has several sensors to monitor air quality. These sensors publish data to AWS
Kinesis streams for analysis. You would like to analyze the daily air quality trend over the
past one year across all sensors. This analysis would be considered as:
Batch Processing and this particular analysis is not suitable for Kinesis
Deeper analysis over longer duration should be considered as a batch
processing use case. These are not stream processing use cases. Kinesis
Streams can store data for a maximum of 7 days. You can store Kinesis data in
other systems like RedShift for deeper analysis
An online gaming platform uses a DynamoDB table for keeping track of scores.
The primary key consists of PlayerID as the hash key and GameTitle as the sort key. There are no
other indexes defined for that table. To find the top 10 high-scoring players for a game:
DynamoDB has to scan the entire table to find the answer
Since there is no secondary index defined for game title and top score, it has to
scan the entire table. To speed-up, you can create a secondary index based on
these two attributes. Reference:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/BestPractices.html
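A sketch of such a secondary index definition follows; the index and attribute names are hypothetical, not from the original table. With this index, "top 10 for a game" becomes a single Query (descending sort, limit 10) instead of a full scan:

```python
# Hypothetical global secondary index that avoids the full-table scan:
# partition on GameTitle and sort by TopScore, so the top-10 query for a
# game is a Query with ScanIndexForward=False and Limit=10.
# The index and attribute names here are illustrative.
leaderboard_gsi = {
    "IndexName": "GameTitleTopScoreIndex",
    "KeySchema": [
        {"AttributeName": "GameTitle", "KeyType": "HASH"},   # partition key
        {"AttributeName": "TopScore", "KeyType": "RANGE"},   # sort key
    ],
    "Projection": {"ProjectionType": "KEYS_ONLY"},
}

print(leaderboard_gsi["IndexName"])
```

This dict has the same shape as the `GlobalSecondaryIndexes` entries passed to DynamoDB's table creation/update APIs.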
You are developing a mobile application that customizes the user experience based on
the logged-on user. Your application needs to scale to a very large number of concurrent
users with consistent millisecond latency. What backend store can you use for storing
user sessions?
DynamoDB delivers consistent single-digit millisecond latency at any scale, making
it a good fit for storing user sessions
A web application currently has four EC2 instances receiving traffic from
Elastic Load Balancer. Sticky session is enabled on the load balancer.
Each EC2 instance is capable of processing 1000 requests/second.
Depending on the request rate, your autoscaling policy maintains enough
instances.
Total requests suddenly spike to 6,000 requests per second (around 1500
per server). Autoscaling group responds to the situation by adding 2
additional servers. What will happen to the client requests?
Requests from existing clients may still be routed to only four servers
When sticky sessions are enabled, requests from a client are routed to the same
server. However, be aware that this can cause unpredictable behavior when the traffic
pattern shifts very quickly. Newly added capacity may not be used to handle
existing user workload. When sticky sessions are disabled, requests are evenly
distributed across all available instances
Your company has a Finance department user named Alice. You use IAM to
manage your users.
After Alice left the company, you deleted the IAM user account 'Alice'.
After a few months, another person named Alice joined the organization's IT
department.
Now you created a new IAM account with the same name Alice. S3 has a
finance bucket that granted access to original Alice using the ARN:
arn:aws:iam::123456789012:user/Alice; this permission still exists as part
of bucket level policy.
The newly joined Alice also has the same ARN; would she be able to
access the S3 finance bucket?
Newly added Alice would not gain access to finance bucket
ARN is transformed to the user's unique principal ID when the policy is saved.
This helps mitigate the risk of someone escalating their privileges by removing
and recreating the user. Each user has a unique ID.
False: When an IAM entity is deleted, the managed policy associated with it is
automatically deleted.
A managed policy lives independently of the attached entity. Managed policies are
reusable, with automatic versioning
Account B needs read access to a bucket owned by Account A. Which one of these options
would work for this scenario?
A: Create IAM Roles in Account A with Account B as a trusted entity. Attach
necessary read permissions to the role
Cross account access is performed in two steps: the resource owner account
needs to trust the requester account, and the requester needs to explicitly delegate
permissions to IAM users in the requester's account. If permissions are not
explicitly delegated, only the requester's root account and administrative users
can access resources in the other account. Account-to-account trust can be
established using IAM roles or by using resource-level policies
You want to put restrictions on an S3 bucket so that only your EC2 instances
can access the bucket. EC2 instances access the S3 bucket using a VPC
endpoint. What condition can you use in the S3 bucket policy to restrict
access?
A: Use the aws:sourceVpce condition key so that access is allowed only through
the VPC endpoint
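A sketch of a bucket policy using that condition follows; the bucket name and VPC endpoint ID are placeholders, and the deny-unless pattern shown is one common way to express the restriction:

```python
import json

# Sketch of a bucket policy that denies S3 access unless the request
# arrives through a specific VPC endpoint. Bucket name and endpoint ID
# are made-up placeholders.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAccessExceptFromVpce",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::my-app-bucket",
            "arn:aws:s3:::my-app-bucket/*",
        ],
        # Deny any request whose source VPC endpoint is not ours.
        "Condition": {"StringNotEquals": {"aws:sourceVpce": "vpce-0fedcba9876"}},
    }],
}

print(json.dumps(bucket_policy, indent=2))
```

Because the statement is a Deny with `StringNotEquals`, it blocks every access path (console, public internet) except requests that traverse the named endpoint.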
You need to upload several large files to S3. What is the recommended
process for uploading data? A: Multi-Part Upload
For objects larger than 100 megabytes, you should consider using the Multipart
Upload capability. Multi-part upload is now supported in AWS Command Line
tools, and SDKs
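The part-size arithmetic can be sanity-checked with a small helper; the 5 MiB minimum part size (for all parts except the last) and the 10,000-part cap are S3's documented multipart limits:

```python
import math

# Back-of-envelope helper for multipart upload planning.
def part_count(object_size, part_size):
    """Number of parts needed to upload object_size bytes in parts of
    part_size bytes, enforcing S3's multipart limits."""
    if part_size < 5 * 1024 * 1024:
        raise ValueError("parts (except the last) must be at least 5 MiB")
    parts = math.ceil(object_size / part_size)
    if parts > 10_000:
        raise ValueError("S3 allows at most 10,000 parts per upload")
    return parts

# A 1 GiB file uploaded in 16 MiB parts:
print(part_count(1024**3, 16 * 1024**2))  # 64
```

Beyond sidestepping restarts of a huge single PUT, multipart upload lets those 64 parts be uploaded in parallel, which is where most of the throughput gain comes from.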
How do you prove your identity to Amazon SES and ISPs when you are
sending emails from your application?
Both SPF and DKIM are recommended
SPF identifies email servers that are authorized to send emails on your
domain's behalf. This information is specified as part of your DNS resource
records. A recipient can query the DNS service to cross-check the server name and
detect if somebody is spoofing your email address. DKIM protects your
email messages from tampering. It is done using digital signing, and your public
key needs to be listed as part of your DNS resource records. The recipient can query
the DNS service to get the public key and cross-check the signature. The best practice is to
use both these methods
Your application has a Lambda function that needs access to both internet
services and a database hosted in private subnet of your VPC. What steps
are needed to accomplish this?
Configure Lambda function to run inside a private subnet of your VPC and
ensure route table has a path to NAT
Lambda functions, by default, are allowed access to internet resources. To
access databases and other resources in your VPC, you need to configure
Lambda function to run inside the context of a private subnet in your VPC. When
this is done: your lambda function gets a private IP address and can reach
resources in your VPC. In this mode, it can access internet services only if
private subnet has a route to a NAT device
You are using S3 and invoke a Lambda function every time an object is
added to an S3 bucket. What influences concurrency when a large number of
objects are added to the S3 bucket?
Each published event is a unit of concurrency
There is an upper limit on the number of concurrent Lambda function executions for
your account in each region. You can optionally specify a concurrent execution
limit at the function level to prevent too many concurrent executions; Lambda
executions that exceed the limit are throttled. When synchronously
invoked, the caller is responsible for retries. When asynchronously invoked, the Lambda
service automatically retries twice. You can configure a Dead Letter Queue where
failed events can be stored. S3 invokes Lambda asynchronously, and the unit of
concurrency is the number of configured events
You have multiple versions of a Lambda function. You would like to control
which version of your function is invoked when an event source calls the
Lambda function. What capability can you use in Lambda to achieve this?
Use a Lambda alias and point it to the desired version. Configure the event source to use
the alias ARN
Lambda supports versioning, and you can maintain one or more versions of your
Lambda function. Each Lambda function version has a unique ARN. Lambda also
supports aliases for each of your functions. A Lambda alias is a pointer to a specific
Lambda function version. Aliases enable you to promote new Lambda function
versions to production, and if you need to roll back a function, you can simply
update the alias to point to the desired version. The event source needs to use the alias
ARN for invoking the Lambda function.
https://docs.aws.amazon.com/lambda/latest/dg/aliases-intro.html
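The alias indirection can be modeled as a simple named pointer; the version numbers and payloads here are illustrative:

```python
# Toy model of Lambda aliases: an alias is a named pointer to a specific
# function version, and event sources are configured with the alias ARN,
# not a version ARN. Names and version numbers are illustrative.
versions = {"1": "stable code", "2": "new code"}
aliases = {"PROD": "2"}  # PROD currently points at version 2

def invoke(alias_name):
    """Resolve the alias to a version, then run that version's code."""
    return versions[aliases[alias_name]]

print(invoke("PROD"))    # currently runs version 2 ("new code")
aliases["PROD"] = "1"    # rollback: repoint the alias; callers are unchanged
print(invoke("PROD"))    # now runs version 1 ("stable code")
```

The key point the notes make is visible in the last two lines: rollback touches only the alias, never the event source configuration.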
You like the efficiency gains of the columnar storage Redshift offering by AWS.
You are developing an online transaction processing system that has a very
large number of attributes that need to be collected. Once the rows are
created, there will be frequent updates to a few columns.
Is Redshift a good fit for this use case?
No
Redshift is optimized for batched write operations and for reading high volumes
of data. Columnar storage minimizes I/O and maximize data throughput by
retrieving only the blocks that contain data for the selected columns. It is not
meant for high frequency update use cases typically seen in OLTP systems
You are noticing performance issues with RDS MySQL database. Which
one of these is not a viable option to address the issue?
Scale out cluster to handle more write throughput
You can address performance issues in an RDS MySQL database in several ways:
upgrade the server instance size, create a read replica to offload read
queries, or use ElastiCache to cache data that can be reused across
multiple requests. You can also optimize your database and tables. Scale-out is
not supported in RDS MySQL for increasing write throughput.
A company has on-premises data center. For disaster recovery and archiving,
data is backed up to tapes and stored in an offsite location. You are asked to
provide suggestions for migrating to AWS cloud.
Which one of these is not correct?
Snowball is used for one-time migration to cloud or for moving large amount of data
to cloud. It is not meant for continuous backup
Benefit from 99.999999999% durability for long-term storage at very low cost
Avoid Manual Tape management Process using Storage Gateway Virtual Tape
Backup Solution
Store your data in a region of your choice.
Your business is transitioning from on-premises systems to cloud-based systems. Your
existing data size is over 100 TB. You need to transfer all this data in a timely and
cost-effective way to the AWS Cloud.
What options do you have for this initial transfer?
Snowball offers convenient way to transfer large amount of data by physically
shipping the data using secure snowball appliance
In the mapping 'RegionMap', first do a lookup based on the region where the script
is running. Then filter based on the key '64'.
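The two-level lookup behaves like this toy equivalent of CloudFormation's Fn::FindInMap; the region names and AMI IDs are made-up placeholders:

```python
# Toy equivalent of CloudFormation's Fn::FindInMap against a 'RegionMap':
# the first key is the running region, the second key ('64') picks the
# architecture-specific value. Region names and AMI IDs are placeholders.
region_map = {
    "us-east-1": {"64": "ami-0aaa111"},
    "eu-west-1": {"64": "ami-0bbb222"},
}

def find_in_map(mapping, top_key, second_key):
    """Two-level lookup, mirroring Fn::FindInMap's three arguments."""
    return mapping[top_key][second_key]

print(find_in_map(region_map, "us-east-1", "64"))  # ami-0aaa111
```

In a template, the same lookup would be written as `!FindInMap [RegionMap, !Ref "AWS::Region", "64"]`.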
CloudFormation Template can be scripted in JSON or YAML format
In order to release a new version of your application software that is managed
using Elastic Beanstalk, you can:
You can upgrade an existing environment or create a brand new environment for the
application version
You can use GetAtt function to query the value of an attribute from a resource in the
template. Reference:
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/intrinsic-function-reference-getatt.html
Swap Environment URL option in Elastic Beanstalk is convenient for handling
blue/green deployment scenarios. It allows you to:
Change a new environment to a production environment
Change a production environment into non-production environment
Revert back to old production environment
CloudFormation does not check for account limits. So, it is possible that your stack
creation may fail if it exceeds account limits
TRUE: Before deleting your resource in your existing stack, you can optionally
snapshot the supported resources
When managing an application with Elastic Beanstalk, which one of these
choices is NOT recommended?
Deploying RDS instances with Elastic Beanstalk is not recommended. When you
delete an environment, you will lose the database. In addition, deploying the database
with the application forces you to update both at the same time. This is not recommended
for production, as you need the flexibility to update the database and application at their own
cadence
You need to deploy a load balancer in the EC2-Classic Network. Which load
balancing product supports EC2-Classic Network?
Only the Classic Load Balancer can be deployed in an EC2-Classic network
You are hired to help with migrating a customer's solution to the AWS Cloud.
The customer wants to perform a phased migration strategy and would like to
have the complete stack running in the on-premises data center and a couple of new
web servers deployed in AWS that point to the on-premises database.
Client requests should be routed across web servers in AWS as well as web
servers located on-premises.
Which load balancing product can be used for this requirement?
Both Application and Network Load Balancers allow you to add targets by IP
address. You can use this capability to register instances located on-premises and
in a VPC to the same load balancer. Do note that instances can only be added using a
private IP address, and the on-premises data center needs a VPN connection or a
Direct Connect link to your AWS VPC
You want to ensure only certain IP ranges are able to access your elastic load
balancer.
You are using Network Load Balancer that distributes traffic to a set of EC2
instances. Where can you enforce this policy?
Security Group of the EC2 instances
Network Load Balancer currently does not support Security Groups, so you can
enforce the filtering in the Security Group of the EC2 instances. The Network Load
Balancer forwards requests to the EC2 instances while preserving the caller's source IP.
(The source IP is automatically preserved when EC2 instances are registered
using the Instance target type. If you register instances using IP address as the target
type, then you need to enable Proxy Protocol to forward the source IP to the EC2
instances)
You have an application for archiving images and videos related to sport
events. A game is divided into 5-minute intervals for storage, and a single
archive file contains all images and videos for that 5-minute window. The archive
is stored in Glacier for long-term storage. Occasionally, you get requests for
specific photos in an archive.
Range Retrieval allows you to retrieve only specified byte ranges. You pay only for
the actual data retrieved
Retrieve only the requested image from Glacier using Range Retrieval capability
Your team is using Glacier Archive as a low-cost backup and archiving
solution. Occasionally, you get requests for retrieving files (typically less than
100 MB) and they need to be retrieved in under 30 minutes. What retrieval
option can you use?
Expedited Retrieval can be used for occasional requests; data is typically
retrieved in 1-5 minutes (for files < 250 MB). Standard would take 3-5 hours
and Bulk would take 5-12 hours
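The three tiers above can be captured in a tiny decision helper; this is a sketch using the typical latency figures quoted above (they are not hard guarantees), and the function name is made up for illustration:

```python
def pick_retrieval_tier(deadline_minutes, archive_mb):
    """Pick the cheapest Glacier retrieval tier that typically meets a deadline.

    Approximate latencies: Expedited 1-5 min (best for archives < 250 MB),
    Standard 3-5 hours, Bulk 5-12 hours.
    """
    if deadline_minutes < 60:
        # Only Expedited is fast enough, and it targets smaller archives
        return "Expedited" if archive_mb < 250 else None
    if deadline_minutes >= 12 * 60:
        return "Bulk"        # cheapest option, 5-12 hours
    if deadline_minutes >= 5 * 60:
        return "Standard"    # 3-5 hours
    return "Expedited"
```

For the question above (files under 100 MB, needed in under 30 minutes), this picks Expedited.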
Your team is using Glacier for low-cost storage. You get frequent
requests for retrieving a small portion of an archive (a few tens of MBs). These
requests need to be met in under 30 minutes. What retrieval option can you
use?
Expedited with Provisioned Capacity
Expedited retrieval would meet the requirement. However, expedited retrieval
request is accepted by Glacier only if there is capacity available. If capacity is not
available, Glacier will reject the request. To guarantee expedited retrieval
availability, you can purchase provisioned capacity
You want to track the memory utilization metrics for your EC2 instances.
Which one of these options can you use?
Memory utilization is not among the default EC2 CloudWatch metrics; you need to
publish it as a custom CloudWatch metric (for example with the CloudWatch agent)
A media company has documents, videos, images and other paid content that
is currently hosted in their own data center. Content is personalized for the
signed-on user. Users are located worldwide and they are complaining about
timeouts and poor performance. What options do you have to address this
situation? (Choose Two)
https://aws.amazon.com/premiumsupport/knowledge-center/?nc2=h_m_ma
You have an ASG that scales on demand based on the traffic going to your
new website: TriangleSunglasses.Com. You would like to optimise for cost, so
you have selected an ASG that scales based on the demand going through your
ELB. Still, you want your solution to be highly available, so you have set
the minimum capacity to 2 instances. How can you further optimize the cost while
respecting the requirements?
Reserve two EC2 instances
This is the way to save further costs as we know we will run 2 EC2 instances no
matter what.
What is NOT helping to make our application tier stateless?
2. Packaging the software updates as an EBS snapshot and creating EBS volumes for
each new software update would work, but it is too heavy operationally.
A: EFS is a network file system (NFS) and allows you to mount the same file system on
100s of EC2 instances. Publishing the software updates there allows each EC2 instance
to access them.
A Golden AMI is a standard way to save the state after installing software or
pulling dependencies, so that future instances can boot up from that AMI quickly.
If you're interested in reading more about disaster recovery, the whitepaper is here:
https://d1.awsstatic.com/asset-
repository/products/CloudEndure/CloudEndure_Affordable_Enterprise-
Grade_Disaster_Recovery_Using_AWS.pdf
You would like to get the DR strategy with the lowest RTO and RPO,
regardless of the cost, which one do you recommend?
A: Multi site
Which of the following strategies has a potentially high RPO and RTO?
Backup and restore
You have provisioned an 8TB gp2 EBS volume and you are running out of IOPS. What is
NOT a way to increase performance?
A: Increase the EBS volume size
Change to an io1 volume type
Mount EBS Volume in RAID 0
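The gp2 arithmetic makes the first answer concrete: baseline performance is 3 IOPS per GiB with a 16,000 IOPS ceiling, so an 8 TiB volume is already past the cap and growing it buys nothing. A minimal sketch using those published limits:

```python
def gp2_baseline_iops(size_gib):
    # gp2: 3 IOPS per GiB, floor of 100 IOPS, hard ceiling of 16,000 IOPS
    return max(100, min(3 * size_gib, 16_000))

# An 8 TiB volume is already at the 16,000 IOPS ceiling...
assert gp2_baseline_iops(8 * 1024) == 16_000
# ...so doubling its size does not increase performance.
assert gp2_baseline_iops(16 * 1024) == 16_000
```

That is why switching to io1 or striping volumes in RAID 0 are the remaining ways to get more IOPS.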
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#i
nstance-store-lifetime The data in an instance store persists only during the
lifetime of its associated instance. If an instance reboots (intentionally or
unintentionally), data in the instance store persists. However, data in the instance
store is lost under any of the following circumstances: The underlying disk drive
fails The instance stops The instance terminates
If you restart the instance, no data will be lost. If you stop the instance, data will be
lost
You would like to leverage EBS volumes in parallel to linearly increase
performance, while accepting greater failure risks. Which RAID mode helps
you in achieving that?
RAID 0
See https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/raid-config.html
EFS is a network file system (NFS) and allows you to mount the same file system
on EC2 instances that are in different AZs
You would like to have a high-performance cache for your application that
mustn't be shared. You don't mind losing the cache upon termination of your
instance. Which storage mechanism do you recommend as a
Solution Architect?
A: Instance Store — it is local to the instance, very fast, and its data is lost
when the instance terminates
Application Load Balancers and Network Load Balancers have static host
names. That means we get a URL, and that URL is what our application will use
for the rest of time. We should not resolve that URL
and use the underlying IP. That is also a very popular question.
And a very common exam question: what is the feature
on the Load Balancer that allows our application to have multiple hostnames and
clients to connect using SSL? The answer is SNI (Server Name Indication).
SSL Certificates = ACM (AWS Certificate Manager) = ability to support multiple
domains = SNI, provided the client supports the Server Name Indication feature.
But the question is: if the ASG were to do a scale-in (scale-in means remove an
instance, terminate an instance), which instance in this whole diagram would it
terminate? That is a quite popular exam question. First you need to find
the AZ which has the most instances.
That's the default policy — you can always change it, but the default is to find the
AZ that has the most instances. In our case, this is Availability Zone A,
because it has four EC2 instances. So it's definitely going to be one of those four
EC2 instances that gets terminated.
Load Balancers provide a static DNS name we can use in our application
The reason being that AWS wants your load balancer to be accessible using a static
endpoint, even if the underlying infrastructure that AWS manages changes
You are running a website with a load balancer and 10 EC2 instances. Your
users are complaining about the fact that your website always asks them to
re-authenticate when they switch pages. You are puzzled, because it's working
just fine on your machine and in the dev environment with 1 server. What
could be the reason?
Stickiness ensures traffic is sent to the same backend instance for a given client. This
helps maintain session data
This header (X-Forwarded-For) is created by your load balancer and passed on to your backend
application
You quickly created an ELB and it turns out your users are complaining about
the fact that sometimes, the servers just don't work. You realise that indeed,
your servers do crash from time to time. How to protect your users from
seeing these crashes?
Health checks ensure your ELB won't send traffic to unhealthy (crashed) instances
You are designing a high performance application that will require millions of
connections to be handled, as well as low latency. The best Load Balancer for
this is
The Application Load Balancer can route to different target groups based on
Hostname and Request Path, but not based on Client IP
You are running at desired capacity of 3 and the maximum capacity of 3. You
have alarms set at 60% CPU to scale out your application. Your application is
now running at 80% capacity. What will happen?
The capacity of your ASG cannot go over the maximum capacity you have allocated
during scale out events
I have an ASG and an ALB, and I setup my ASG to get health status of
instances thanks to my ALB. One instance has just been reported unhealthy.
What will happen?
Because the ASG has been configured to leverage the ALB health checks,
unhealthy instances will be terminated
Your boss wants to scale your ASG based on the number of requests per
minute your application makes to your database.
The metric "requests per minute" is not an AWS metric, hence it needs to be a
custom metric
You would like to expose a fixed static IP to your end users for compliance
purposes, so they can write firewall rules that will be stable and approved by
regulators. Which Load Balancer should you use?
A: Network Load Balancer — it provides one static IP per AZ (and supports Elastic IPs)
SNI (Server Name Indication) is a feature allowing you to expose multiple SSL certs
if the client supports it. Read more here: https://aws.amazon.com/blogs/aws/new-
application-load-balancer-sni/
An ASG spans 2 availability zones. AZ-A has 3 EC2 instances and
AZ-B has 4 EC2 instances. The ASG is about to go into a scale-in event. What
will happen?
The AZ-B will terminate the instance with the oldest launch configuration
Make sure you remember the Default Termination Policy for ASG. It tries to balance
across AZ first, and then delete based on the age of the launch configuration.
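The default termination policy described above can be sketched in a few lines; the field names (`az`, `launch_time`) are made up for illustration:

```python
from collections import Counter

def pick_instance_to_terminate(instances):
    """Sketch of the ASG default termination policy: first find the AZ with
    the most instances, then pick the instance with the oldest launch
    configuration (oldest launch_time) within that AZ."""
    az_counts = Counter(i["az"] for i in instances)
    busiest_az = max(az_counts, key=az_counts.get)
    in_busiest = [i for i in instances if i["az"] == busiest_az]
    return min(in_busiest, key=lambda i: i["launch_time"])  # oldest first

# 3 instances in AZ-A, 4 in AZ-B: AZ-B wins, and its oldest instance goes.
fleet = (
    [{"az": "AZ-A", "launch_time": t} for t in (10, 20, 30)]
    + [{"az": "AZ-B", "launch_time": t} for t in (5, 40, 50, 60)]
)
victim = pick_instance_to_terminate(fleet)
```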
So TTL is basically a way for web browsers and clients to cache the response of a
DNS query, and the reason we do this is to avoid overloading the DNS server
(Route 53, in our case).
Your company wants data to be encrypted in S3, and maintain control of the
rotation policy for the encryption keys, but not know the encryption keys
values. You recommend
With SSE-KMS you let AWS manage the encryption keys but you have full control of
the key rotation policy
With Client Side Encryption you fully manage the keys and perform the encryption
yourself, which is against the requirements of the question
Your company does not trust S3 for encryption and wants it to happen on the
application. You recommend
With Client Side Encryption you perform the encryption yourself and send the
encrypted data to AWS directly. AWS does not know your encryption keys and
cannot decrypt your data.
The bucket policy allows our users to read / write files in the bucket, yet we
were not able to perform a PutObject API call.
Explicit DENY in an IAM policy will take precedence over a bucket policy permission
You have a website that loads files from another S3 bucket. When you try the
URL of the files directly in your Chrome browser it works, but when the
website you're visiting tries to load these files it doesn't. What's the problem?
Cross-origin resource sharing (CORS) defines a way for client web applications that
are loaded in one domain to interact with resources in a different domain. To learn
more about CORS, go here:
https://docs.aws.amazon.com/AmazonS3/latest/dev/cors.html
Basically, when you attach an EC2 instance role and query the instance's security
credentials endpoint, you get the role name (e.g. my-first-ec2-role) along with an
access key ID, a secret access key and a token. Behind the scenes, when you attach
an IAM role to an EC2 instance, the way for it to perform API calls is to query that
URL, which returns these credentials. And it turns out these are short-lived
credentials: there is an expiration date, usually something like one hour. So the
idea is that your EC2 instance gets temporary credentials through the IAM role
attached to it. This is basically how IAM roles work
My EC2 Instance does not have the permissions to perform an API call
PutObject on S3. What should I do?
IAM roles are the right way to provide credentials and permissions to an EC2
instance
I should ask an administrator to attach a Policy to the IAM Role on my EC2 Instance
that authorises it to do the API call
I have an on-premises personal server that I'd like to use to perform
AWS API calls
I should run `aws configure` and put my credentials there, and invalidate them when
I'm done. Even better would be to create an IAM user specifically for that one
on-premises server
I need my colleagues help to debug my code. When he runs the application on
his machine, it's working fine, whereas I get API authorisation exceptions.
What should I do?
Compare his IAM policy and my IAM policy in the policy simulator to understand the
differences
To get the instance id of my EC2 machine from the EC2 machine, the best
thing is to...
Query the meta data at http://169.254.169.254/latest/meta-data
Encryption in flight = HTTPS, and HTTPS cannot be enabled without an SSL
certificate
Server-side encryption means the server will encrypt the data for us. We don't need
to encrypt it beforehand
In server side encryption, the decryption also happens on the server (in AWS, we
wouldn't be able to decrypt the data ourselves as we can't have access to the
corresponding encryption key)
With client side encryption, the server does not need to know any information about
the encryption being used, as the server won't perform any encryption or decryption
tasks
We need to create User Keys in KMS before using the encryption features for
EBS, S3, etc. — FALSE:
we can use the AWS-managed keys in KMS, therefore we don't need to
create our own keys
SSM Parameter Store has versioning and audit of values built-in directly
We need to gain access to a Role in another AWS account. How is it done?
STS will allow us to get cross account access through the creation of a role in our
account authorized to access a role in another account. See more here:
https://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_cross-account-with-
roles.html
OS patching is Amazon's responsibility for RDS. But if we use EC2, it is our
responsibility
Under the shared responsibility model, what are you responsible for in RDS?
Security Group Rules
You have a mobile application and would like to give your users access to
their own personal space in Amazon S3. How do you achieve that?
Cognito is made to federate mobile user accounts and provide each user with their own
IAM policy. Thanks to that policy, they can access their own
personal space in Amazon S3.
Never make an S3 bucket public to your mobile application users. This would result
in data leaks! Read horror stories here:
https://businessinsights.bitdefender.com/worst-amazon-breaches
Read Replicas have asynchronous replication and therefore it's likely our users will
only observe eventual consistency
Which RDS feature does not require us to change our SQL connection string ?
Multi AZ keeps the same connection string regardless of which database is up.
Read Replicas imply we need to reference them individually in our application as
each read replica will have its own DNS name
Read Replicas will help as our analytics application can now perform queries against
it, and these queries won't impact the main production database.
You have a requirement to use TDE (Transparent Data Encryption) on top of
KMS. Which database technology does NOT support TDE on RDS?
PostgreSQL
Which RDS database technology does NOT support IAM authentication?
Oracle
You would like to ensure you have a database available in another region if a
disaster happens to your main region. Which database do you recommend?
The DNS protocol does not allow you to create a CNAME record for the top node of
a DNS namespace (mycoolcompany.com), also known as the zone apex
After updating a Route 53 record to point "myapp.mydomain.com" from an old
Load Balancer to a new load balancer, it looks like the users are still not
redirected to your new load balancer. You are wondering why...
DNS records have a TTL (Time To Live) so that clients know how long to
cache these values and don't overload the DNS server with requests. The TTL should
be set to strike a balance between how long the value should be cached and how much
pressure goes on the DNS server.
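The client-side behaviour can be sketched as a toy cache that only re-queries DNS once the TTL has elapsed; the class and the example IP are illustrative, not a real resolver:

```python
import time

class DnsCache:
    """Toy client-side DNS cache that honours a record's TTL (in seconds)."""

    def __init__(self):
        self._cache = {}  # name -> (ip, expiry timestamp)

    def resolve(self, name, lookup, ttl):
        entry = self._cache.get(name)
        now = time.monotonic()
        if entry and now < entry[1]:
            return entry[0]                  # still fresh: served from cache
        ip = lookup(name)                    # TTL expired: query the DNS server
        self._cache[name] = (ip, now + ttl)
        return ip
```

This also explains the stale-record question above: until the TTL expires, clients keep using the cached (old) value even after the Route 53 record has changed.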
You want your users to get the best possible user experience and that means
minimizing the response time from your servers to your users. Which routing
policy will help?
Latency will evaluate the latency results and help your users get a DNS response
that will minimize their latency (e.g. response time)
You have purchased a domain on Godaddy and would like to use it with Route
53. What do you need to change to make this work?
Private hosted zones are meant to be used for internal network queries and are not
publicly accessible. Public Hosted Zones are meant to be used for people
requesting your website through the public internet. Finally, NS records must be
updated on the 3rd party registrar.
So the AWS Glue Crawler is one component of Glue. Basically it scans your data in
S3 and will often infer a schema automatically, just based on the
structure of the data it finds in your S3 buckets. So if you have some CSV
or TSV data sitting in S3, it will automatically break out those columns for you.
Maybe sometimes you need a more consistent network experience, because you're
experiencing data drops or connection shutdowns, or you want real-time data feeds
for your application. Direct Connect is a great option for this.
Or maybe you just want a hybrid environment (on-premises + cloud).
security groups are stateful and if traffic can go out, then it can go back in
CIDRs should not overlap, and the max CIDR size in AWS is /16
Route tables must be updated in both VPC that are peered to communicate
Which are the only two services that have a Gateway Endpoint instead of an
Interface Endpoint as a VPC endpoint? ANS: S3 and DynamoDB; all the other
services have an Interface Endpoint (powered by PrivateLink — meaning a private IP)
With SSE-S3 you let go of the management of the encryption keys
Client-side encryption: here you have full control over the encryption keys, and
you must do the encryption yourself
SSE-C: here you have full control over the encryption keys, but you let AWS do the
encryption
https://atom.io/ — a text editor; YAML support: https://atom.io/packages/language-yaml
NoEcho will ensure your parameter does not appear in any log — useful for passwords!
MFA Delete forces users to use MFA tokens before deleting objects. It's an extra
level of security to prevent accidental deletes
You are preparing for the biggest day of sale of the year, where your traffic will
increase by 100x. You have already setup SQS standard queue. What should you
do? A: SQS scales automatically
Delay queues let you postpone the delivery of new messages to a queue for a
number of seconds. If you create a delay queue, any messages that you send to the
queue remain invisible to consumers for the duration of the delay period. The
default (minimum) delay for a queue is 0 seconds. The maximum is 15 minutes
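Those limits can be captured in a small validation helper; the SQS API would reject an out-of-range value server-side anyway, so this is just a sketch of the range check (the function name is made up):

```python
def validate_delay_seconds(delay):
    """SQS DelaySeconds must be between 0 (the default) and 900 s (15 min)."""
    if not 0 <= delay <= 900:
        raise ValueError("DelaySeconds must be between 0 and 900 seconds")
    return delay
```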
You need to move hundreds of Terabytes into the cloud in S3, and after that pre-
process it using many EC2 instances in order to clean the data. You have a 1
Gbit/s broadband and would like to optimise the process of moving the data and
pre-processing it, in order to save time. What do you recommend?
Your SQS costs are extremely high. Upon closer look, you notice that your
consumers are polling SQS too often and getting empty data as a result. What
should you do?
Long polling helps reduce the cost of using Amazon SQS by eliminating the number
of empty responses (when there are no messages available for a ReceiveMessage
request) and false empty responses (when messages are available but aren't
included in a response)
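The cost effect is easy to see with back-of-the-envelope math, since every ReceiveMessage call is billed even when it returns nothing. The per-million price below is an assumed illustrative figure, not current AWS pricing:

```python
def monthly_receive_cost(calls_per_second, price_per_million=0.40):
    """Monthly cost of ReceiveMessage calls; every call is billed, even
    empty ones. price_per_million is an assumed illustrative price."""
    calls = calls_per_second * 60 * 60 * 24 * 30
    return calls * price_per_million / 1_000_000

# Short polling: e.g. a tight loop issuing 10 (mostly empty) receives per second
short_polling = monthly_receive_cost(10)
# Long polling (WaitTimeSeconds=20): at most one call every 20 s when idle
long_polling = monthly_receive_cost(1 / 20)
```

With these assumptions, short polling costs 200x more than long polling on an idle queue, which is why enabling long polling is the fix.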
ANS: Snowball Edge is the right answer, as it comes with computing capabilities
and allows us to pre-process the data while it's being moved into Snowball, so we
save time on the pre-processing side as well.
Q: Which feature allows us to distribute paid content from S3 securely and
globally, if the S3 bucket is configured to only exchange data with CloudFront?
A: CloudFront Signed URLs are commonly used to distribute paid content, through
dynamic Signed URL generation.
S3 CRR allows you to replicate the data from one bucket in a region to another
bucket in another region
• Geo Restriction allows you to specify a whitelist or blacklist of countries
in your CloudFront distribution. Q: How can you ensure that only users who access
our website from Canada are authorized in CloudFront?
You'd like to send a message to 3 different applications all using SQS. You should
send the message to SNS and subscribe the 3 SQS queues to the topic.
This is a common pattern as only one message is sent to SNS and then "fan out" to
multiple SQS queues
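The fan-out pattern can be sketched with toy in-memory queues; the real implementation subscribes SQS queues to an SNS topic, and the class and queue names here are illustrative only:

```python
class Topic:
    """Toy SNS topic: one publish delivers a copy to every subscribed queue."""

    def __init__(self):
        self._queues = []

    def subscribe(self, queue):
        self._queues.append(queue)

    def publish(self, message):
        for queue in self._queues:       # fan out: each subscriber gets a copy
            queue.append(message)

# Three consuming applications, each with its own queue
orders, billing, analytics = [], [], []
topic = Topic()
for q in (orders, billing, analytics):
    topic.subscribe(q)
topic.publish("order-created")           # one publish, three deliveries
```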
You have a Kinesis stream usually receiving 5MB/s of data and sending out 8 MB/s
of data. You have provisioned 6 shards. Some days, your traffic spikes up to 2
times and you get a throughput exception. You should add more shards
Each shard allows for 1MB/s incoming and 2MB/s outgoing of data
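The shard arithmetic behind that answer, using the per-shard limits just stated:

```python
import math

def shards_needed(ingest_mb_s, egress_mb_s):
    # Each Kinesis shard supports 1 MB/s ingest and 2 MB/s egress
    return max(math.ceil(ingest_mb_s / 1), math.ceil(egress_mb_s / 2))

# Normal day: 5 MB/s in, 8 MB/s out -> 5 shards needed; 6 provisioned is fine
assert shards_needed(5, 8) == 5
# 2x spike: 10 MB/s in, 16 MB/s out -> 10 shards needed; 6 throws exceptions
assert shards_needed(10, 16) == 10
```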
You are sending a clickstream for your users navigating your website, all the way
to Kinesis. It seems that the user data is not ordered in Kinesis, and the data for
one individual user is spread across many shards. How to fix that problem?
By providing a partition key we ensure the data is ordered for our users
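A sketch of why a partition key fixes the ordering: records with the same key hash to the same shard, and ordering is preserved within a shard. (Real Kinesis maps an MD5 hash of the key into shard hash-key ranges; the modulo here is a simplification.)

```python
import hashlib

def shard_for(partition_key, num_shards):
    # Hash the partition key so every record for the same key lands on the
    # same shard; within a shard, records stay in order.
    digest = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    return digest % num_shards      # simplification of real hash-key ranges
```

Using the user ID as the partition key means all of one user's clickstream events land on the same shard, in order.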
• Kinesis Analytics is the product to use, with Kinesis Streams as the underlying
source of data
Kinesis Streams + Firehose is a perfect combo of technologies for loading data in near
real-time into S3 and Redshift
You want to send email notifications to your users. You should use SNS —
it has that feature (email subscriptions) by default
You have many microservices running on-premise and they currently
communicate using a message broker that supports the MQTT protocol. You would
like to migrate these applications and the message broker to the cloud without
changing the application logic. Which technology allows you to get a managed
message broker that supports the MQTT protocol? A: Amazon MQ — it supports JMS,
NMS, AMQP, STOMP, MQTT, and WebSocket
You'd like to have a dynamic DB_URL variable loaded in your Lambda code
Environment variables allow for your Lambda to have dynamic variables from
within
A DynamoDB table has been provisioned with 10 RCU and 10 WCU. You would
like to increase the RCU to sustain more read traffic. What is true about RCU and
WCU? A: RCU and WCU are decoupled, so you can increase the RCU independently
You are about to enter the Christmas sale and you know a few items in your
website are very popular and will be read often. Last year you had a
ProvisionedThroughputExceededException. What should you do this year?
A: Create a DAX cluster — a cache will absorb the repeated reads on the hot items
You would like to automate sending welcome emails to the users who subscribe to
the Users table in DynamoDB. How can you achieve that? A: Enable DynamoDB
Streams and have a Lambda function receive the events in real time
Amazon Cognito Sync is an AWS service and client library that enables cross-
device syncing of application-related user data
This would work but would require a lot more manual work — a DB user table with a
Lambda authorizer
As a solutions architect, you have been tasked to implement a fully Serverless REST
API. Which technology choices do you recommend? API gateway + lambda
Lambda does not have an out of the box caching feature (it's often paired with API
gateway for that)
Which service allows to federate mobile users and generate temporary credentials
so that they can access their own S3 bucket sub-folder? cognito in combination
with STS
You would like to distribute your static content which currently lives in Amazon
S3 to multiple regions around the world, such as the US, France and Australia.
What do you recommend? cloudfront
You would like to create a micro service whose sole purpose is to encode video files
with your specific algorithm from S3 back into S3. You would like to make that
micro-service reliable and retry upon failure. Processing a video may take over 25
minutes. The service is asynchronous and it should be possible for the service to be
stopped for a day and resume the next day from the videos that haven't been
encoded yet. Which of the following service would you recommend to implement
this service?
SQS allows you to retain messages for days and process them later, while we take
down our EC2 instances
You would like to distribute paid software installation files globally for your
customers that have indeed purchased the content. The software may be
purchased by different users, and you want to protect the download URL with
security including IP restriction. Which solution do you recommend? A:
CloudFront Signed URLs — these support security including IP restriction
You are a photo hosting service and publish every month a master pack of
beautiful mountains images, that are over 50 GB in size and downloaded from all
around the world. The content is currently hosted on EFS and distributed by ELB
and EC2 instances. You are experiencing high load each month and very high
network costs. What can you recommend that won't force an application refactor
and reduce network costs and EC2 load dramatically?
A: CloudFront can be used in front of an ELB
You would like to deliver big data streams in real time to multiple consuming
applications, with replay features. Which technology do you recommend? A: Kinesis
Data Streams
https://github.com/awslabs/aws-cloudformation-templates
https://github.com/awslabs/aws-cloudformation-
templates/blob/master/aws/solutions/WordPress_Single_Instance.yaml
https://github.com/cloudtools/troposphere