Handout Accelerate To Production With Serverless Compute For Generative AI Applications
Handout Accelerate To Production With Serverless Compute For Generative AI Applications
Handout Accelerate To Production With Serverless Compute For Generative AI Applications
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Accelerate to production with serverless
compute for generative AI applications
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How do I prioritize my projects?
The Year of How I can I scale this? Which models should I use?
Production
Should I train my own model? How do I manage risks?
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How foundation models differ from other ML models
Tasks Tasks
Labeled
ML model Text generation Text generation
data
Labeled
ML model Summarization Summarization
data
Labeled
ML model Q&A Q&A
data
Labeled
ML model Chatbot Chatbot
data
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
What is serverless?
Compute
Physical
infrastructure
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Generative AI personas
Consumers
Tuners
Builders
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
RACQ - Claim Research Assistant
Business objectives
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
RACQ - continued
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Generative AI on AWS
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Generative AI stack
Applica tions that leverage LLMs and other foundation models (FMs)
Amazon Bedrock
Amazon EC2 Elastic Fabric Amazon EC2 AWS Nitro AWS Neuron
UltraClusters Adapter (EFA) Capacity Blocks
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Generative AI stack
Applica tions that leverage LLMs and other foundation models (FMs)
Amazon Bedrock
Amazon EC2 Elastic Fabric Amazon EC2 AWS Nitro AWS Neuron
UltraClusters Adapter (EFA) Capacity Blocks
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Choice of leading FMs through a single API
Model customization
Amazon Bedrock
Retrieval Augmented Generation (RAG)
The easiest way to build and scale
generative AI applications with
foundation models (FMs) Agents that execute multistep tasks
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Bedrock
simplifies
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Bedrock
Broad choice of models
Contextual answers, Text summarization, Summarization, Text generation, Q&A and reading Text summarization, High-quality images
summarization, generation, complex reasoning, search, classification comprehension text classification, and art
paraphrasing Q&A, search, writing, coding text completion,
image generation code generation, Q&A
Jamba-Instruct Amazon Titan Claude 3.5 Sonnet Command Llama 3 8B Mistral Small Stable Diffusion XL1.0
Text Premier
Jurassic-2 Ultra Claude 3 Opus Command Light Llama 3 70B Mistral Large Stable Diffusion
Amazon Titan XL 0.8
Jurassic-2 Mid Claude 3 Sonnet Embed English Llama 2 13B Mistral 7B
Text Lite
Claude 3 Haiku Embed Multilingual Llama 2 70B Mixtral 8x7B
Amazon Titan
Text Express Claude 2.1 Command R+
Amazon Titan Text Claude 2 Command R
Embeddings Claude Instant
Amazon Titan Text
Embeddings V2
Amazon Titan
Multimodal
Embeddings
Amazon Titan
Image Generator
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Anthropic Claude Models on Amazon Bedrock
Choose the exact combination of intelligence, speed, and cost to suit your needs
Vision ✓ ✓ ✓ ✓
Input: $0.003 $0.00025 $0.003 $0.015
Cost*
Output: $0.015 $0.00125 $0.015 $0.075
*Per 1K tokens
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
RACQ – Claim Research Assistant –
Technical objectives
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Architecture
AWS Cloud
Disclaimer: The following content is for information purposes only. RACQ provides no warranties as to completeness or accuracy and accepts no liability for any reliance on this content.
AWS Step Functions – Gather data
Disclaimer: The following content is for information purposes only. RACQ provides no warranties as to completeness or accuracy and accepts no liability for any reliance on this content.
AWS Step Functions – Generate “Prompt Chunks”
Disclaimer: The following content is for information purposes only. RACQ provides no warranties as to completeness or accuracy and accepts no liability for any reliance on this content.
AWS Step Functions – Process “Prompt Chunks”
Disclaimer: The following content is for information purposes only. RACQ provides no warranties as to completeness or accuracy and accepts no liability for any reliance on this content.
AWS Step Functions – Process Eval “Prompt
Chunks”
Disclaimer: The following content is for information purposes only. RACQ provides no warranties as to completeness or accuracy and accepts no liability for any reliance on this content.
AWS Step Functions – Generate summary
Disclaimer: The following content is for information purposes only. RACQ provides no warranties as to completeness or accuracy and accepts no liability for any reliance on this content.
AWS Step Functions – Notify the user
Disclaimer: The following content is for information purposes only. RACQ provides no warranties as to completeness or accuracy and accepts no liability for any reliance on this content.
Why build generative AI
applications?
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Combining speed and power
Rapid delivery
of smarter
Power of generative applications and
Speed of serverless features with
AI
focus on
innovation
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS Serverless spectrum
AWS offers a wide portfolio of serverless services
Compute Storage
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Hosting and serving generative AI
applications with serverless
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Hosting and serving generative AI on serverless - Architecture
Source: Build and scale generative AI applications with Amazon Bedrock workshop
Generative AI layer
AWS Lambda
Send mail Amazon SNS
Front-end layer
User
Amazon CloudFront Application Load
Balancer
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Focus on AI code
serverless
compute for Cost-effective
generative
AI? Agility with rapidly
evolving FMs
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Focus on AI code Built-in auto scaling, high
availability, and fault
tolerance to ensure
Why Flexible integrations developer productivity,
serverless alleviating teams from the
complexity of infrastructure
compute for planning and management
generative Cost-effective for high throughput
inference workloads.
AI?
Agility with rapidly
evolving FMs
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
What is a serverless
operational model?
Business logic
Customer
Serverless services simplify the API
management and scaling of cloud
applications by shifting undifferentiated Messaging & orchestration
operational tasks to the cloud provider so
development teams can focus on writing
Storage & databases
code that solve business problems
AWS
Compute
Physical infrastructure
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Focus on AI code
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Focus on AI code
Flexible integrations
Why serverless
compute for Cost-effective
Cost-effectively scale
infrastructure to train and run
generative AI? FMs containing hundreds of
billions of parameters with pay-
by-request, ensuring that you
Agility with rapidly only pay for the duration and
evolving FMs quantity of inferences.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Focus on AI code
Flexible integrations
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Emerging generative AI patterns
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Generative AI patterns for consumers and tuners
Building Generative AI applications with Serverless Solutions
Inference enrichment
Transform image text
Model fine-tuning
Fine-tuning to
enhance generative
AWS AWS Step AI performance
Lambda Functions
Amazon ECS
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Serverless and generative AI = Faster path to
production
1 2 3 4
Enhance reliability Reduced infrastructure Cost-effective Rapid development and
management pricing model composability
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Get started with serverless and generative AI
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Visit the Migrate. Modernize. Build. resource hub
Dive deeper into these resources:
… and more!
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS Training and Certification
Access 600+ free digital courses with AWS Skill Builder
Focus on the cloud skills and services that are most relevant to you across
30+ AWS solutions, including digital self-paced learning plans and ramp-up
guides
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Thank you for attending AWS Innovate – Migrate. Modernize. Build.
twitter.com/AWSCloud
facebook.com/AmazonWebServices
youtube.com/user/AmazonWebServices
linkedin.com/company/amazon-web-services
twitch.tv/aws
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Thank you!
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.