Cantrill Review - Serverless and Application Services
• Monolithic Architecture
• Fails together
• Scales together
• Billed together
• Tiered Architecture
• Each tier can be scaled independently
• Tiers connect to each other
• Enables horizontal scaling of individual tiers since a single
endpoint (LB) is used to direct traffic between the tiers
• Each tier has to be running something for the app to function
• Ex. Upload expects and REQUIRES Processing to respond
• Event Driven Architecture
• Producers, Consumers, or BOTH
• No constant running or waiting for things
• Producers generate events when something happens
• Clicks, errors, criteria met, uploads, actions
• Events are delivered to consumers
• Generally via Event Router
• Actions are taken and the system returns to waiting
• Mature event-driven architecture only consumes resources while
handling events
• serverless
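The producer, event router, and consumer flow described above can be sketched in a few lines of Python. This is a toy in-memory model for illustration only, not an AWS API; the `EventRouter` class and the event names are assumptions made up for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventRouter:
    """Toy in-memory event router: producers publish, consumers subscribe."""

    def __init__(self) -> None:
        self._consumers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, consumer: Callable[[dict], None]) -> None:
        # Register a consumer for one type of event.
        self._consumers[event_type].append(consumer)

    def publish(self, event_type: str, detail: dict) -> int:
        """Deliver an event to every interested consumer; return how many ran."""
        consumers = self._consumers.get(event_type, [])
        for consumer in consumers:
            consumer(detail)
        return len(consumers)


processed: List[str] = []
router = EventRouter()
router.subscribe("upload", lambda detail: processed.append(detail["key"]))

# A producer emits events; work happens only when an event arrives.
delivered = router.publish("upload", {"key": "video.mp4"})
ignored = router.publish("click", {"button": "play"})  # no consumer registered
```

Nothing runs between events here, which mirrors the idea that a mature event-driven system only consumes resources while handling events.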
Serverless Architecture –
• Serverless isn’t one single thing
• You manage few, if any servers – low overhead
• Applications are a collection of small & specialized functions
• Functions run in Stateless and Ephemeral environments – duration billing
• Event-driven – consumption only when being used
• FaaS is used where possible for compute functionality
• No persistent use of compute
• Managed services are used where possible
• S3, DynamoDB, third-party identity providers (OIDC)
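A small, stateless function of the kind described above might look like the sketch below. It assumes a Lambda-style `handler(event, context)` entry point and the record layout of an S3 object-created notification; the bucket and key values are hypothetical.

```python
import json


def handler(event, context):
    """Hypothetical handler for S3 'object created' notifications."""
    # Everything the function needs arrives in the event; nothing is kept
    # in memory between invocations (stateless, ephemeral environment).
    uploaded = [
        {"bucket": r["s3"]["bucket"]["name"], "key": r["s3"]["object"]["key"]}
        for r in event.get("Records", [])
    ]
    return {"statusCode": 200, "body": json.dumps({"processed": uploaded})}


# Local invocation with a hand-built sample event (made-up names).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "uploads-example"}, "object": {"key": "cat.jpg"}}}
    ]
}
result = handler(sample_event, None)
```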
Step Functions –
• Long-running, serverless, workflow-based applications
• Some problems with Lambda
• Lambda is FaaS
• 15 minute max execution time
• Can be chained together – gets messy at scale
• Runtime Environments are stateless
• Step Function State Machines
• Serverless workflow: START -> STATES -> END
• States are THINGS that occur
• Maximum Duration – 1 year
• Standard Workflow (Default) – 1 year execution limit
• Express Workflow – 5 minute execution limit
• High volume event processing workloads
• IoT, Streaming, Mobile app backend
• Highly transactional
• Started via API Gateway, IoT Rule, EventBridge, Lambda
• Generally used for backend processing
• Amazon States Language (ASL) – JSON Template
• IAM Role is used for permissions
• States –
• Succeed & Fail –
• Wait – waits for certain period of time or specific date and
time
• Choice – allows the state to change based on the input
• Email Only
• Email and SMS
• SMS Only
• Parallel – runs multiple branches at the same time
• Map – runs the same set of steps for each item in an input list
• Task – single unit of work performed by a State Machine –
integrates with:
• Lambda
• Batch
• DynamoDB
• ECS
• SNS
• SQS
• Glue
• SageMaker
• EMR
• Step Functions
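To make the state types above concrete, here is a sketch of an Amazon States Language definition built as a Python dict and serialized to JSON. It covers Wait, Choice (email only, SMS only, or both), Parallel, Task, Succeed, and Fail. The Lambda ARNs, state names, and the `$.preference` input field are all hypothetical, and `missing_targets` is a small helper written for this example, not part of any AWS SDK.

```python
import json

# Hypothetical Lambda ARNs; replace with real functions.
SEND_EMAIL = "arn:aws:lambda:us-east-1:123456789012:function:send-email"
SEND_SMS = "arn:aws:lambda:us-east-1:123456789012:function:send-sms"

definition = {
    "Comment": "Wait, then notify by email, SMS, or both",
    "StartAt": "WaitForReminder",
    "States": {
        "WaitForReminder": {"Type": "Wait", "Seconds": 10, "Next": "ChooseChannel"},
        "ChooseChannel": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.preference", "StringEquals": "email", "Next": "EmailOnly"},
                {"Variable": "$.preference", "StringEquals": "sms", "Next": "SmsOnly"},
                {"Variable": "$.preference", "StringEquals": "both", "Next": "EmailAndSms"},
            ],
            "Default": "NoChannel",
        },
        "EmailOnly": {"Type": "Task", "Resource": SEND_EMAIL, "Next": "Done"},
        "SmsOnly": {"Type": "Task", "Resource": SEND_SMS, "Next": "Done"},
        "EmailAndSms": {
            "Type": "Parallel",
            "Branches": [
                {"StartAt": "Email",
                 "States": {"Email": {"Type": "Task", "Resource": SEND_EMAIL, "End": True}}},
                {"StartAt": "Sms",
                 "States": {"Sms": {"Type": "Task", "Resource": SEND_SMS, "End": True}}},
            ],
            "Next": "Done",
        },
        "NoChannel": {"Type": "Fail", "Error": "NoChannel", "Cause": "Unknown preference"},
        "Done": {"Type": "Succeed"},
    },
}


def missing_targets(machine):
    """Return names referenced by StartAt/Next/Default that are not defined."""
    states = machine["States"]
    targets = {machine["StartAt"]}
    for state in states.values():
        targets.update(t for t in (state.get("Next"), state.get("Default")) if t)
        targets.update(choice["Next"] for choice in state.get("Choices", []))
    return sorted(targets - set(states))


asl_json = json.dumps(definition, indent=2)  # the JSON document a state machine would use
```

Checking that every transition target exists before deploying is a cheap sanity check; it only inspects top-level states and does not replace validation by the service itself.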
API Gateway 101 –
• Create and Manage APIs
• Endpoint/entry-point for applications
• Sits between applications & integrations (services)
• Highly available and scalable; handles authorization, throttling, caching,
CORS (secure cross-domain requests), transformations, and the OpenAPI spec
(third-party API integration); integrates directly with a range of AWS services
• Public Service
• Can connect to services/endpoints in AWS or on-premises
• Supported types:
• HTTP APIs
• REST APIs
• WebSocket APIs
• Phases –
• Request
• Authorize, Transform, Validate
• Integration
• Action taken via integrated service – DDB, SNS, Step
Functions, HTTP Endpoints, Lambda
• Response
• Transform, Prepare, Return
• CloudWatch Logs can store and manage full stage request and response
logs
• CloudWatch can store metrics for client and integration sides
• API Gateway cache can be used to reduce the number of calls made to
backend integrations and improve client performance
• API Gateway Authentication
• Cognito User Pools
• Client authenticates with Cognito and, if successful, receives a
token, which is passed with requests to API Gateway
• Lambda Based Authorization
• Formerly known as custom authorizers
• Client calls API Gateway with a bearer token (ID)
• Lambda authorizer called – checks local user store or
external ID provider
• Authorizer returns an IAM policy and a principal identifier
• If the policy allows the call, API Gateway passes the request to the
integration (e.g., Lambda); otherwise it returns 403 ACCESS_DENIED to the client
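The Lambda authorizer flow above can be sketched as follows. This assumes a token-based authorizer, where the event carries `authorizationToken` and `methodArn` and the response contains `principalId` plus a `policyDocument`. The `KNOWN_TOKENS` dict is a hypothetical stand-in for a local user store or an external identity provider, and the ARN is made up.

```python
# Hypothetical token store standing in for a user store or external ID provider.
KNOWN_TOKENS = {"token-alice": "alice"}


def build_policy(principal_id, effect, method_arn):
    """Build the authorizer response: principal identifier plus IAM policy."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {"Action": "execute-api:Invoke", "Effect": effect, "Resource": method_arn}
            ],
        },
    }


def authorizer_handler(event, context):
    raw = event.get("authorizationToken", "")
    # Strip an optional "Bearer " prefix from the header value.
    token = raw[len("Bearer "):] if raw.startswith("Bearer ") else raw
    user = KNOWN_TOKENS.get(token)
    if user is None:
        # An explicit Deny leads API Gateway to reject the call with 403.
        return build_policy("anonymous", "Deny", event["methodArn"])
    return build_policy(user, "Allow", event["methodArn"])


ARN = "arn:aws:execute-api:us-east-1:123456789012:abc123/prod/GET/items"  # made up
allowed = authorizer_handler({"authorizationToken": "Bearer token-alice", "methodArn": ARN}, None)
denied = authorizer_handler({"authorizationToken": "Bearer nope", "methodArn": ARN}, None)
```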
• API Gateway Endpoint Types
• Edge Optimized
• Incoming requests are routed to the nearest CloudFront
POP (Point of Presence)
• Regional
• Clients in the same region
• Doesn’t use CF network
• Low overhead
• Private
• Only accessible within a VPC via an interface endpoint
• API Gateway Stages
• APIs are deployed to stages – each stage has one deployment
• Rollback supported
• Canary deployments
• If enabled – deployments are made to the canary not the
stage
• Canary = sub part of a Stage
• Stages enabled for canary deployments can be configured so
a percentage of traffic is sent to the canary.
• This can be adjusted over time or the canary can be
promoted to make it the new base stage
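The percentage-based split between a stage and its canary can be pictured with a short simulation. This is only a model of the idea, assuming each request is independently sent to the canary with the configured probability; the `route` function and the 10% setting are made up for illustration and are not API Gateway code.

```python
import random


def route(canary_percent, rng):
    """Return 'canary' for roughly canary_percent of calls, else 'stage'."""
    return "canary" if rng.random() * 100 < canary_percent else "stage"


rng = random.Random(42)  # fixed seed so the sketch is repeatable
counts = {"stage": 0, "canary": 0}
for _ in range(10_000):
    counts[route(10, rng)] += 1  # 10% of traffic configured for the canary

# Raising the percentage over time shifts more traffic to the canary;
# promotion corresponds to the canary deployment becoming the base stage.
```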
• API Gateway – ERRORS
• 4XX – Client Error – Invalid request on client side
• 5XX – Server Error – Valid request, backend issue
• 400 – Bad Request – Generic
• 403 – Access Denied – Authorizer denies OR WAF Filtered
• 429 – Too Many Requests – API Gateway is throttling; the configured
limit has been exceeded
• 502 – Bad Gateway Exception – bad output returned by Lambda
(backend compute error)
• 503 – Service Unavailable – backing endpoint offline or major
service issues
• 504 – Integration Failure/Timeout – 29-second default integration
timeout (an API Gateway limit, not a Lambda limit)
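The status codes above can be captured in a small lookup that a client or a log parser might use to decide which side of the call to investigate. The `describe` helper and the wording of the descriptions are assumptions written for this example.

```python
# Descriptions paraphrased from the list above.
API_GATEWAY_ERRORS = {
    400: "Bad Request (generic)",
    403: "Access Denied (authorizer denied or WAF filtered)",
    429: "Throttled (configured limit exceeded)",
    502: "Bad Gateway (bad output returned by the backend)",
    503: "Service Unavailable (backing endpoint offline)",
    504: "Integration Failure/Timeout",
}


def describe(status):
    """Return (side, description) for an HTTP status code."""
    if 400 <= status < 500:
        side = "client"   # 4XX: invalid request on the client side
    elif 500 <= status < 600:
        side = "server"   # 5XX: valid request, backend issue
    else:
        side = "none"     # not an error class covered here
    return side, API_GATEWAY_ERRORS.get(status, "Unlisted status code")


throttled = describe(429)
timeout = describe(504)
```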
AWS Glue
• Fully managed serverless extract, transform, and load (ETL) service that
makes it easy for customers to prepare and load data for analytics.
• Point AWS Glue at your data stored on AWS; Glue discovers the data and
stores the associated metadata (e.g., table definition and schema) in the AWS
Glue Data Catalog. Once cataloged, the data is immediately searchable, queryable, and
available for ETL.
• Serverless ETL (Extract, Transform, & Load)
• Vs. Data Pipeline, which can also do ETL but uses servers (EMR)
• Moves and transforms data between source and destination
• Crawls data sources and generates the AWS Glue Data Catalog
• Data Sources
• Stores
• S3, RDS, JDBC Compatible & DynamoDB
• Streams
• Kinesis Data Stream & Apache Kafka
• Data Targets
• S3, RDS, JDBC (Java Database Connectivity) Databases
• Data Catalog
• Collection of metadata
• Persistent metadata about data sources in a region
• One catalog per region per account
• Avoids data silos – improves visibility
• Amazon Athena, Redshift Spectrum, EMR & AWS Lake Formation
all use Data Catalog
• Configure crawlers for data sources – give credentials + point at
sources
• Crawlers connect to datastores, determine schema and create metadata in
the data catalog
• When resources are required, Glue allocates from an AWS warm pool to
perform the ETL processes
• Glue jobs can be initiated manually or via events using EventBridge
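What a crawler does (connect to a data store, determine the schema, and record metadata in the catalog) can be illustrated with a toy model. This is not the Glue API: `infer_schema`, `crawl`, the in-memory `data_catalog` dict, and the sample records are all assumptions invented for this sketch.

```python
def infer_schema(records):
    """Infer column name -> type name from sample records (toy crawler logic)."""
    schema = {}
    for record in records:
        for column, value in record.items():
            # Keep the first type seen for each column.
            schema.setdefault(column, type(value).__name__)
    return schema


data_catalog = {}  # table name -> metadata, standing in for the Data Catalog


def crawl(table_name, location, records):
    """Store metadata about a data source; the data itself stays where it is."""
    data_catalog[table_name] = {"location": location, "columns": infer_schema(records)}
    return data_catalog[table_name]


orders = [
    {"order_id": 1, "total": 19.99, "customer": "ana"},
    {"order_id": 2, "total": 5.00, "customer": "bo"},
]
crawl("orders", "s3://example-bucket/orders/", orders)
```

The point of the sketch is that the catalog holds only metadata (location and schema), which is what lets other services query the data without each keeping its own copy of the definitions.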
Amazon MQ 101
• SNS and SQS are AWS Services – using AWS APIs
• SNS provides topics and SQS provides queues
• Public Services, Highly Scalable, AWS Integrated
• Many organizations already use topics and queues
• And want to migrate into AWS
• SNS and SQS won’t work out of the box
• We need a standards compliant solution for migration
• Amazon MQ
• Open-source message broker
• Based on managed Apache ActiveMQ
• Supports the JMS API and protocols such as:
• AMQP
• MQTT
• OpenWire
• STOMP
• Provides QUEUES and TOPICS
• One-to-one OR One-to-many
• Provides message broker servers
• Single Instance – Test, Dev, Cheap
• HA Pair – Active/Standby
• VPC Based – NOT A PUBLIC Service – Private networking
required
• Requires VPN or Direct Connect for access from on-premises
• No AWS native integration – delivers ActiveMQ product which
you manage
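The one-to-one versus one-to-many distinction between queues and topics noted above can be shown with a minimal in-memory sketch. The `Queue` and `Topic` classes are hypothetical teaching aids, not the ActiveMQ, SQS, or SNS APIs.

```python
from collections import deque


class Queue:
    """One-to-one: each message is received by a single consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Receiving removes the message, so no other consumer gets it.
        return self._messages.popleft() if self._messages else None


class Topic:
    """One-to-many: each message is copied to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, inbox):
        self._subscribers.append(inbox)

    def publish(self, message):
        for inbox in self._subscribers:
            inbox.append(message)


queue = Queue()
queue.send("job-1")
first = queue.receive()    # one consumer gets the message
second = queue.receive()   # nothing left for anyone else

billing, shipping = [], []
topic = Topic()
topic.subscribe(billing)
topic.subscribe(shipping)
topic.publish("order-created")  # both subscribers receive a copy
```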
• Amazon MQ vs SNS+SQS
• SNS or SQS for most new implementations (default to this)
• SNS or SQS if AWS integration is required (logging, permissions,
encryption, service integration)
• Amazon MQ if you need to migrate from an existing system with
little to no application change
• Amazon MQ if APIs such as JMS or protocols such as AMQP,
MQTT, OpenWire, and STOMP are needed
• Remember you need private networking for Amazon MQ – VPN or
Direct Connect
Amazon AppFlow
• Fully Managed Integration Service
• Exchange data between applications (connectors) using flows
• Flow consists of Source and Destination connectors
• Sync data across applications
• Aggregate data from different sources
• Public Endpoints, but works with PrivateLink (Privacy)
• AppFlow Custom Connector SDK (build your own)
• Use cases –
• Contact records from Salesforce => Redshift
• Support Tickets from Zendesk => S3
• Connections store configuration & credentials to access applications
• Connections can be reused across many flows – they are defined separately
• Configure Source to Destination field Mapping
• Optional – Data Transformation
• Optional – Configure Filters and Validation
• Frequency –
• Schedule
• In response to a business event
• On-demand
• Automatically encrypts data in flight, and allows users to restrict data from
flowing over the public internet for SaaS applications that are integrated with AWS
PrivateLink
• Reduces exposure to security threats
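The pieces of a flow described above (source records, field mapping, optional filter, optional transformation) can be sketched as a plain function. This is an illustrative model only, not the AppFlow API; `run_flow`, the field names, and the sample contact records are assumptions.

```python
def run_flow(records, field_mapping, record_filter=None, transform=None):
    """Copy records from a source to a destination shape.

    field_mapping maps source field names to destination field names.
    record_filter and transform are optional, as in the notes above.
    """
    output = []
    for record in records:
        if record_filter and not record_filter(record):
            continue  # filtered out before reaching the destination
        mapped = {dest: record[src] for src, dest in field_mapping.items()}
        output.append(transform(mapped) if transform else mapped)
    return output


# Made-up source records, e.g. contacts exported from a CRM.
contacts = [
    {"Name": "Ana Diaz", "Email": "ANA@EXAMPLE.COM", "Active": True},
    {"Name": "Bo Chen", "Email": "bo@example.com", "Active": False},
]

rows = run_flow(
    contacts,
    field_mapping={"Name": "full_name", "Email": "email"},
    record_filter=lambda r: r["Active"],                    # optional filter
    transform=lambda r: {**r, "email": r["email"].lower()},  # optional transformation
)
```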