Architecture Patterns For Building Generative AI Applications


ACOT104

Praveen Jayakumar
Head of AI/ML Solutions Architecture
AWS India

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Confidential and Trademark.

Broad choice of models

• Jurassic-2: Jurassic-2 Ultra, Jurassic-2 Mid
• Amazon Titan: Titan Text Embeddings, Titan Multimodal Embeddings, Titan Text Lite, Titan Text Express, Titan Image Generator
• Claude: Claude 2, Claude 2.1, Claude Instant
• Command + Embed: Cohere Command Light, Cohere Embed English, Cohere Embed Multilingual
• Llama 2: Llama 2 13B, Llama 2 70B
• Stable Diffusion XL1.0

Why customize?

• Customize to specific business needs
  E.g. Healthcare – Understand medical terminology and provide accurate responses related to patient's health
• Adapt to domain-specific language
  E.g. Finance – Teach financial & accounting terms to provide good analysis for earnings reports
• Enhance performance for specific tasks
  E.g. Customer Service – Improve ability to understand and respond to customer's inquiries and complaints
• Improve context-awareness in responses
  E.g. Legal Services – Better understand case facts and law to provide useful insights for attorneys

Common approaches for customizing foundation models (FMs)

Approaches, in order of increasing complexity, quality, cost, and time:

• Prompt Engineering and Retrieval Augmented Generation (RAG): augment knowledge without changing pre-trained model weights
• Customize: adjust behavior of a pre-trained FM
• Train FM from scratch

Customize vs. augment
Task information: external or up-to-date, or consolidated or historical?

External data sources or up-to-date info (real-time required?)
• Relatively static information (e.g. docs, FAQs) → Augment with RAG → Amazon Bedrock Knowledge Bases
• Dynamic information (e.g. DBs, APIs) → Augment with agents and tools → Amazon Bedrock Agents

Consolidated or historical info (simple task?)
• Complex or specific → Customize → Amazon Bedrock Custom Models
• Simple or generic → Prompt engineering → Amazon Bedrock FMs
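
The decision tree on this slide can be read as a small function. The sketch below is only an illustration of that reading: the function name and its three boolean parameters are hypothetical, and the returned strings simply repeat the slide's labels.

```python
def choose_approach(needs_external_or_fresh_data: bool,
                    data_is_dynamic: bool = False,
                    task_is_complex: bool = False) -> str:
    """Walk the slide's decision tree and return the suggested approach.

    All parameter names are illustrative, not part of any AWS API.
    """
    if needs_external_or_fresh_data:
        # External or up-to-date info: augment instead of changing the model.
        if data_is_dynamic:
            return "Augment with agents and tools (Amazon Bedrock Agents)"
        return "Augment with RAG (Amazon Bedrock Knowledge Bases)"
    # Consolidated or historical info: decide by task complexity.
    if task_is_complex:
        return "Customize (Amazon Bedrock Custom Models)"
    return "Prompt engineering (Amazon Bedrock FMs)"
```

For example, a FAQ assistant over mostly static documents would call `choose_approach(True)` and land on RAG.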
What is Retrieval Augmented Generation?

• Retrieval: Fetches the relevant content from the external knowledge base or data sources based on a user query.
• Augmentation: Adds the retrieved relevant context to the user prompt, which goes as an input to the foundation model.
• Generation: Returns the response from the foundation model based on the augmented prompt.
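
The three stages can be sketched end to end in a few lines. This is a toy illustration under stated assumptions: retrieval here is plain word overlap rather than a real knowledge base, the prompt template is invented, and `model` is any callable standing in for a foundation model.

```python
from typing import Callable, List


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Retrieval: rank documents by how many query words they share."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc)
              for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Keep only documents that share at least one word with the query.
    return [doc for score, doc in ranked[:top_k] if score > 0]


def augment(query: str, context: List[str]) -> str:
    """Augmentation: add the retrieved context to the user prompt."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return f"Use the context to answer.\n\nContext:\n{joined}\n\nQuestion: {query}"


def generate(prompt: str, model: Callable[[str], str]) -> str:
    """Generation: pass the augmented prompt to the (stand-in) model."""
    return model(prompt)
```

Keeping the stages as separate functions mirrors the slide: each one can later be swapped for a managed service without touching the other two.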

RAG use cases

• Improved content quality
  E.g., helps in reducing hallucinations and connecting with recent knowledge including enterprise data
• Contextual chatbots and question answering
  E.g., enhance chatbot capabilities by integrating with real-time data
• Personalized search
  E.g., searching based on user previous search history and persona
• Real-time data summarization
  E.g., retrieving and summarizing transactional data from databases, or API calls

Types of retrieval

• Rule Based: Fetches unstructured data like documents
  e.g., keyword searches
• Structured data: Transactional retrieval from database or API
  e.g., Select customers from All_orders where order == 'XYZ'
• Semantic Search: Get relevant documents based on text embeddings (such as 0.89 -0.02 -0.53 0.95 0.17 -0.38)
  e.g., "New York" matches Subway, Statue of Liberty, Tall buildings

What are embeddings?
• Numerical representation of text (vectors) that captures semantics and relationships between words.
• Embedding models capture features and nuances of the text.
• Rich embeddings can be used to compare text similarity.
• Multilingual Text Embeddings can identify meaning in different languages.

Human Text → EMBEDDING MODEL → Vector Embeddings

New York    0.027  -0.011  …  -0.023
Paris       0.025  -0.009  …  -0.025
Animal     -0.011   0.021  …   0.013
Horse      -0.009   0.019  …   0.015
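
One common way to compare text similarity with embeddings is cosine similarity. The sketch below applies it to the example vectors from this slide, using only the three components that are visible (the elided middle components are dropped, so the numbers are illustrative only). The function name is an assumption, not an API from any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Visible components of the slide's example vectors (truncated on purpose).
vectors = {
    "New York": [0.027, -0.011, -0.023],
    "Paris": [0.025, -0.009, -0.025],
    "Animal": [-0.011, 0.021, 0.013],
    "Horse": [-0.009, 0.019, 0.015],
}

# Cities score close to each other, and far from animals.
city_pair = cosine_similarity(vectors["New York"], vectors["Paris"])
mixed_pair = cosine_similarity(vectors["New York"], vectors["Horse"])
```

With these truncated vectors, `city_pair` comes out well above `mixed_pair`, which is the behaviour semantic search relies on.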

Why are embeddings important for RAG?

• Powers text retrieval based on semantic meaning.
• Used to augment prompts with more accurate context from vector stores using Retrieval Augmented Generation (RAG).
• High-accuracy embeddings lead to improved context and higher quality LLM-generated responses to a user query.
Titan text embeddings model

Amazon Titan Text Embeddings V2.0

Translates text inputs (words, phrases) into numerical representations (embeddings). Comparing embeddings produces more relevant and contextual responses than word matching.

Max Tokens: 8,000
Output Vectors: 1,536
Language: Multilingual (25 languages)
Model ID: amazon.titan-embed-g1-text-02

Highlights
• Titan Text Embeddings offers fast, cost effective, high-performance, accurate embeddings in 25 languages.
• Optimized for text retrieval tasks, semantic similarity and clustering.
• Applications of this model include semantic search and personalization.
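
A minimal sketch of requesting an embedding from this model through the Bedrock runtime `invoke_model` call. It assumes the Titan text embeddings request body is `{"inputText": ...}` and that the response body carries an `embedding` list; the helper name `embed_text` is hypothetical, and the client is passed in so the function can be exercised without AWS credentials.

```python
import json

MODEL_ID = "amazon.titan-embed-g1-text-02"  # model ID as listed on the slide


def embed_text(client, text: str, model_id: str = MODEL_ID) -> list:
    """Return the embedding vector for `text`.

    `client` is expected to behave like boto3.client("bedrock-runtime").
    """
    response = client.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    # The response body is a stream; read it and parse the JSON payload.
    payload = json.loads(response["body"].read())
    return payload["embedding"]
```

With real credentials and model access, `embed_text(boto3.client("bedrock-runtime"), "Tall buildings in New York")` would be the typical call.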

RAG in Action

Text Generation Workflow
User input → Embeddings model → Embedding (0.89 -0.02 -0.53 0.95 0.17 -0.38) → Semantic search on vector store → Context → Prompt augmentation → Large Language Model → Response → User

Data Ingestion Workflow
Data source → Document chunks → Embeddings model → Vector store

However, when it comes to implementing RAG, there are challenges…

• Managing multiple data sources
• Creating vector embeddings for large volumes of data
• Incremental updates to vector store
• Coding effort
• Scaling retrieval mechanism
• Orchestration
Fully managed support for end-to-end RAG workflow

Knowledge Bases for Amazon Bedrock

Gives FMs and agents contextual information from your private data sources for Retrieval Augmented Generation (RAG) to deliver more relevant, accurate, and customized responses.

• Securely connect FMs and agents to data sources
• Easily retrieve relevant data and augment prompts
• Provide source attribution

Data Ingestion Workflow
KNOWLEDGE BASES FOR AMAZON BEDROCK

Fully managed data ingestion workflow: New data → Data source → Document chunks → Embeddings model → Vector store

• Choose your data source (Amazon S3)
  • Support for incremental updates
  • Multiple data file formats supported
• Choose your chunking strategy
  • Fixed chunks
  • No chunking
  • Default (200 tokens)
• Choose your embedding model
  • Amazon Titan
• Choose your vector store
  • OpenSearch Serverless
  • Pinecone
  • Redis
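
To make the chunking choice concrete, here is a rough sketch of fixed-size chunking. It is not how Knowledge Bases implements chunking internally: whitespace-separated words stand in for model tokens, the `overlap` parameter is an added assumption, and only the 200-token default comes from the slide.

```python
from typing import List


def fixed_size_chunks(text: str, max_tokens: int = 200, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `max_tokens` whitespace tokens.

    Consecutive chunks share `overlap` tokens (0 means no overlap).
    """
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")
    tokens = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Smaller chunks tend to give more precise retrieval hits, while larger chunks keep more surrounding context per hit, so the size is usually tuned per data set.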

Fully managed data ingestion
KNOWLEDGE BASES FOR AMAZON BEDROCK

Automated and fully managed data ingestion using Knowledge Bases for Amazon Bedrock

RetrieveAndGenerate API
KNOWLEDGE BASES FOR AMAZON BEDROCK

Fully managed RAG: User query → RetrieveAndGenerate API → Generated response

Steps handled by the API:
• Generate query embedding
• Retrieve similar documents from knowledge bases
• Augment query with retrieved documents
• Generate response from LLM
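
A minimal sketch of calling the RetrieveAndGenerate API from Python. It assumes a client that behaves like `boto3.client("bedrock-agent-runtime")`; the helper name, the knowledge base ID, and the model ARN are placeholders supplied by the caller, not values from the slides.

```python
def ask_knowledge_base(client, question: str,
                       knowledge_base_id: str, model_arn: str) -> dict:
    """Run the fully managed RAG flow and return the answer with citations."""
    response = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,  # placeholder ID
                "modelArn": model_arn,                 # FM used for generation
            },
        },
    )
    return {
        "answer": response["output"]["text"],
        # Citations carry the source attribution for the generated text.
        "citations": response.get("citations", []),
    }
```

All four steps listed above happen inside the single call, which is the trade-off: less code, but less control over the prompt than with the Retrieve API.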

Customize RAG workflows using Retrieve API
KNOWLEDGE BASES FOR AMAZON BEDROCK

Customized RAG workflow: User query → Retrieve API → Retrieved documents → Context → Prompt augmentation → Large Language Model → Response

Steps handled by the Retrieve API:
• Generate query embedding
• Retrieve similar documents from knowledge bases
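
A sketch of the customized flow, where only retrieval is delegated and the prompt is built by hand. It assumes a client that behaves like `boto3.client("bedrock-agent-runtime")`; the helper names and the prompt template are illustrative assumptions.

```python
from typing import List


def retrieve_context(client, query: str, knowledge_base_id: str,
                     top_k: int = 3) -> List[str]:
    """Return the text of the passages most similar to the query."""
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [result["content"]["text"] for result in response["retrievalResults"]]


def build_prompt(query: str, passages: List[str]) -> str:
    """Custom prompt augmentation: this template is an assumption."""
    context = "\n\n".join(passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The resulting prompt can then be sent to any foundation model, which is what makes this path suitable for custom prompts, re-ranking, or filtering between retrieval and generation.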

Knowledge Bases integration with Agents

Query → Agent → Search → Knowledge Bases → Retrieval → Agent
Agent → Query + Retrieval → Large Language Model → Response Generation

Customize

Instruction fine-tuning and continued pre-training

Instruction fine-tuning
• Instruction training dataset is available?
• Specific style, behavior required?

Continued pre-training
• Raw dataset (e.g. PDFs)
• Additional knowledge through domain adaptation

Chart: domain adaptation (e.g. extend knowledge) on one axis, task specialization (e.g. behavior, style) on the other.
• Domain adaptation: Continued pre-training
• Task specialization: Instruction fine-tuning
• Both: Continued pre-training + Instruction fine-tuning
Datasets for instruction fine-tuning and continued pre-training

Instruction dataset (e.g. question-answer) → Instruction fine-tuning → Amazon Bedrock Custom Models

{"prompt": "<prompt text>", "completion": "<expected generated text>"}
{"prompt": "<prompt text>", "completion": "<expected generated text>"}
{"prompt": "<prompt text>", "completion": "<expected generated text>"}

Raw data (e.g. PDFs) → Continued pre-training → Amazon Bedrock Custom Models

{"input": "<raw text>"}
{"input": "<raw text>"}
{"input": "<raw text>"}
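
Both dataset shapes are one JSON object per line. A small sketch of producing them from Python data follows; the helper names are hypothetical and only the `prompt`, `completion`, and `input` keys come from the slide.

```python
import json
from typing import Iterable, List, Tuple


def instruction_records(pairs: Iterable[Tuple[str, str]]) -> List[dict]:
    """Build prompt/completion records for instruction fine-tuning."""
    return [{"prompt": prompt, "completion": completion}
            for prompt, completion in pairs]


def pretraining_records(texts: Iterable[str]) -> List[dict]:
    """Build raw-text records for continued pre-training."""
    return [{"input": text} for text in texts]


def to_jsonl(records: Iterable[dict]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(record, ensure_ascii=False) + "\n"
                   for record in records)
```

The string returned by `to_jsonl` can be written to a file and uploaded to the training data S3 bucket.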

Example of Instruction fine-tuning dataset

Amazon Bedrock custom models New!

• Create custom models using the console or APIs
• Maximize accuracy of FMs by providing labeled or raw unlabeled data
• Once deployed, custom models are invoked the same way as base models (playground or API)
• Customizations now supported for Amazon Titan and some third-party FMs

Components of a customization job

• Inputs: Base FM, Hyperparameters, Input Data
• Outputs: Metrics and Logs, Output Model
• Storage: Custom Models Stored Securely by Amazon Bedrock
• Inferencing: Playground, API
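
The inputs above map fairly directly onto the arguments of the Bedrock `create_model_customization_job` call. The sketch below only assembles the request; the helper name is hypothetical, every ARN and S3 URI is a caller-supplied placeholder, and valid hyperparameter names depend on the base model, so they are passed through unchanged.

```python
def customization_job_request(job_name: str, model_name: str, role_arn: str,
                              base_model_id: str, training_s3_uri: str,
                              output_s3_uri: str, hyperparameters: dict,
                              customization_type: str = "FINE_TUNING") -> dict:
    """Assemble keyword arguments for a model customization job."""
    if customization_type not in ("FINE_TUNING", "CONTINUED_PRE_TRAINING"):
        raise ValueError(f"unsupported customization type: {customization_type}")
    return {
        "jobName": job_name,
        "customModelName": model_name,
        "roleArn": role_arn,                                 # IAM role for S3 access
        "baseModelIdentifier": base_model_id,                # input: base FM
        "customizationType": customization_type,
        "trainingDataConfig": {"s3Uri": training_s3_uri},    # input: data
        "outputDataConfig": {"s3Uri": output_s3_uri},        # output: metrics
        # The API expects hyperparameter values as strings.
        "hyperParameters": {k: str(v) for k, v in hyperparameters.items()},
    }
```

With credentials in place, the request would be submitted as `boto3.client("bedrock").create_model_customization_job(**request)`.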

Customization architecture overview
All incoming network traffic arrives via the console, SDKs, and API.

Amazon Bedrock service account
• Amazon Bedrock service: API endpoint, Training orchestration, Runtime inference

Model deployment account (AWS owned and operated)
• Base model S3 bucket
• Custom Job Compute
• Provisioned Capacity
• Fine-tuned model S3 bucket

Customer account
• Virtual private cloud
• Training data S3 bucket
• Identity & Access, monitoring & logging: AWS CloudTrail, Amazon CloudWatch, AWS IAM
Security and privacy
You are always in control of your data

✓ Data not used to improve models, and not shared with model providers

✓ Customer data remain in Region

✓ Support for AWS PrivateLink and VPC configurations

✓ Integration with AWS IAM

✓ API monitoring in AWS CloudTrail, logging & metrics in Amazon CloudWatch

✓ Custom models encrypted and stored with Service or Customer Managed Keys (CMK) - Only you have access to your models

Recap

• Customization vs augmentation
• RAG concepts
• Knowledge Bases for Amazon Bedrock
• Customization concepts
• Fine tuning and Continued Pre-training

Thank you!
Praveen Jayakumar
[email protected]
