Architecture Patterns For Building Generative AI Applications

Uploaded by

soumya paul

ACOT104

Praveen Jayakumar
Head of AI/ML Solutions Architecture
AWS India

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Confidential and Trademark.
Broad choice of models

• Jurassic-2 Ultra, Jurassic-2 Mid
• Titan Text Embeddings, Titan Multimodal Embeddings, Titan Text Lite, Titan Text Express, Titan Image Generator
• Claude 2, Claude 2.1, Claude Instant
• Command + Embed: Cohere Command Light, Cohere Embed English, Cohere Embed Multilingual
• Llama 2: Llama 2 13B, Llama 2 70B
• Stable Diffusion XL1.0

Why customize?

• Customize to specific business needs. E.g. Healthcare: understand medical terminology and provide accurate responses related to patient's health
• Adapt to domain-specific language. E.g. Finance: teach financial & accounting terms to provide good analysis for earnings reports
• Enhance performance for specific tasks. E.g. Customer Service: improve ability to understand and respond to customer's inquiries and complaints
• Improve context-awareness in responses. E.g. Legal Services: better understand case facts and law to provide useful insights for attorneys

Common approaches for customizing foundation models (FMs)

Approaches, in order of increasing complexity, quality, cost, and time:
• Prompt engineering and Retrieval Augmented Generation (RAG): augment knowledge without changing pre-trained model weights
• Customize: adjust behavior of a pre-trained FM
• Train FM from scratch

Customize vs. augment
Task requires external data sources or up-to-date information (real-time required?):
• Relatively static information (e.g. docs, FAQs): augment with RAG (Amazon Bedrock Knowledge Bases)
• Dynamic information (e.g. DBs, APIs): augment with agents and tools (Amazon Bedrock Agents)
Task relies on consolidated or historical information (complex or simple task?):
• Complex or specific task: customize (Amazon Bedrock Custom Models)
• Simple or generic task: prompt engineering (Amazon Bedrock FMs)
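The decision tree on this slide can be read as a small function. The sketch below is only an illustration: the function name and its boolean parameters are hypothetical, and the returned strings simply repeat the slide's labels.

```python
def choose_approach(needs_external_or_fresh_data: bool,
                    data_is_dynamic: bool = False,
                    task_is_complex_or_specific: bool = False) -> str:
    """Walk the slide's decision tree and return the suggested approach."""
    if needs_external_or_fresh_data:
        # External or up-to-date information is required: augment the model.
        if data_is_dynamic:
            return "Augment with agents and tools (Amazon Bedrock Agents)"
        return "Augment with RAG (Amazon Bedrock Knowledge Bases)"
    # Consolidated or historical information is enough: look at the task.
    if task_is_complex_or_specific:
        return "Customize (Amazon Bedrock Custom Models)"
    return "Prompt engineering (Amazon Bedrock FMs)"


# Example: static documents such as FAQs point to RAG.
print(choose_approach(needs_external_or_fresh_data=True, data_is_dynamic=False))
```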
What is Retrieval Augmented Generation?

• Retrieval: fetches the relevant content from the external knowledge base or data sources based on a user query
• Augmentation: adding the retrieved relevant context to the user prompt, which goes as an input to the foundation model
• Generation: response from the foundation model based on the augmented prompt

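The three steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: retrieval here is plain word overlap rather than a real search service, and `stub_model` is a hypothetical stand-in for a foundation model call.

```python
from typing import Callable, List


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Retrieval: rank documents by shared words with the query (toy scoring)."""
    query_words = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(query_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:top_k]


def augment(query: str, context: List[str]) -> str:
    """Augmentation: add the retrieved context to the user prompt."""
    context_block = "\n".join(f"- {c}" for c in context)
    return f"Answer using the context.\nContext:\n{context_block}\nQuestion: {query}"


def generate(prompt: str, model: Callable[[str], str]) -> str:
    """Generation: the model answers based on the augmented prompt."""
    return model(prompt)


def stub_model(prompt: str) -> str:
    # Hypothetical placeholder; a real system would call a foundation model.
    return f"(model output for a prompt of {len(prompt)} characters)"


docs = [
    "knowledge bases store document chunks as vectors",
    "agents call tools and apis",
    "the cafeteria opens at nine",
]
question = "how do knowledge bases store vectors"
prompt = augment(question, retrieve(question, docs, top_k=1))
answer = generate(prompt, stub_model)
```

Keeping the three steps as separate functions makes it easy to swap the toy retriever for a vector search later.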
RAG use cases

• Improved content quality: e.g., helps in reducing hallucinations and connecting with recent knowledge including enterprise data
• Contextual chatbots and question answering: e.g., enhance chatbot capabilities by integrating with real-time data
• Personalized search: e.g., searching based on user previous search history and persona
• Real-time data summarization: e.g., retrieving and summarizing transactional data from databases, or API calls

Types of retrieval

• Rule Based: fetches unstructured data like documents. E.g., key word searches
• Structured data: transactional retrieval from database or API. E.g., Select customers from All_orders where order == 'XYZ'
• Semantic Search: get relevant documents based on text embeddings. E.g., "New York" is close to "Subway", "Statue of Liberty", and "Tall buildings"

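The first two retrieval types can be illustrated with the Python standard library alone. The sample documents, table schema, and column names below are hypothetical; the table is loosely modeled on the slide's `All_orders` example, with `order_id` used because `order` is a reserved word in SQL.

```python
import sqlite3

# Rule-based retrieval: a simple keyword match over unstructured text.
documents = {
    "doc1": "New York subway map and schedules",
    "doc2": "Visiting the Statue of Liberty",
    "doc3": "Quarterly earnings report",
}


def keyword_search(keyword, docs):
    """Return the IDs of documents whose text contains the keyword."""
    return [doc_id for doc_id, text in docs.items()
            if keyword.lower() in text.lower()]


# Structured retrieval: a transactional lookup against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE all_orders (customer TEXT, order_id TEXT)")
conn.executemany("INSERT INTO all_orders VALUES (?, ?)",
                 [("alice", "XYZ"), ("bob", "ABC")])
customers = [row[0] for row in conn.execute(
    "SELECT customer FROM all_orders WHERE order_id = ?", ("XYZ",))]

matches = keyword_search("subway", documents)
```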
What are embeddings?
• Numerical representation of text (vectors) that captures semantics and relationships between words.
• Embedding models capture features and nuances of the text.
• Rich embeddings can be used to compare text similarity.
• Multilingual Text Embeddings can identify meaning in different languages.

[Figure: human text → embedding model → vector embeddings]
New York: 0.027 -0.011 … -0.023
Paris: 0.025 -0.009 … -0.025
Animal: -0.011 0.021 … 0.013
Horse: -0.009 0.019 … 0.015

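Comparing text similarity with embeddings usually means computing cosine similarity between vectors. The sketch below uses only the three components visible on the slide for each word; the elided middle components are not known, so these short vectors are an assumption made purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Only the components shown on the slide; the real vectors are much longer.
vectors = {
    "New York": [0.027, -0.011, -0.023],
    "Paris": [0.025, -0.009, -0.025],
    "Animal": [-0.011, 0.021, 0.013],
    "Horse": [-0.009, 0.019, 0.015],
}

city_pair = cosine_similarity(vectors["New York"], vectors["Paris"])
mixed_pair = cosine_similarity(vectors["New York"], vectors["Horse"])
```

With these truncated vectors the two cities score much closer to each other than a city and an animal do, which is the behavior the slide describes.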
Why are embeddings important for RAG?

• Powers text retrieval based on semantic meaning.
• Used to augment prompts with more accurate context from vector stores using Retrieval Augmented Generation (RAG).
• High-accuracy embeddings lead to improved context and higher quality LLM-generated responses to a user query.
Titan text embeddings model

Amazon Titan Text Embeddings V2.0

Translates text inputs (words, phrases) into numerical representations (embeddings). Comparing embeddings produces more relevant and contextual responses than word matching.

Max Tokens: 8,000
Output Vectors: 1,536
Language: Multilingual (25 languages)
Model ID: amazon.titan-embed-g1-text-02

Highlights

• Titan Text Embeddings offers fast, cost-effective, high-performance, accurate embeddings in 25 languages.
• Optimized for text retrieval tasks, semantic similarity and clustering.
• Applications of this model include semantic search and personalization.

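A minimal sketch of requesting an embedding through the Bedrock runtime `invoke_model` call, assuming the Titan request body uses `inputText` and the response carries an `embedding` list. The model ID is the one shown on the slide and may differ by Region or model version. `FakeBedrockRuntime` is a hypothetical stand-in so the example runs without AWS credentials.

```python
import io
import json

MODEL_ID = "amazon.titan-embed-g1-text-02"  # as shown on the slide; may vary


def embed_text(client, text, model_id=MODEL_ID):
    """Send text to the embeddings model and return the embedding vector."""
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    return payload["embedding"]


class FakeBedrockRuntime:
    """Hypothetical stand-in that mimics the shape of the real response."""

    def invoke_model(self, modelId, body, contentType, accept):
        text = json.loads(body)["inputText"]
        fake = {"embedding": [float(len(text)), 0.0],
                "inputTextTokenCount": len(text.split())}
        return {"body": io.BytesIO(json.dumps(fake).encode("utf-8"))}


# With real credentials, the client would come from boto3 instead:
#   client = boto3.client("bedrock-runtime")
vector = embed_text(FakeBedrockRuntime(), "New York")
```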
RAG in Action

[Diagram: two workflows]
Text generation workflow: User input → Embeddings model → Embedding → Semantic search (vector store) → Context → Prompt augmentation → Large language model → Response → User
Data ingestion workflow: Data source → Document chunks → Embeddings model → Vector store

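Both workflows meet at the vector store: ingestion writes embedded chunks into it, and text generation searches it with the embedded query. The sketch below is a toy version under stated assumptions: `embed` is a word-count stand-in for a real embeddings model, and `VectorStore` is a hypothetical in-memory class, not a product API.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real system would call an embeddings model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class VectorStore:
    """Hypothetical in-memory store of (embedding, chunk) pairs."""

    def __init__(self):
        self.items = []

    def ingest(self, chunks):
        # Data ingestion workflow: embed each chunk and store it.
        for chunk in chunks:
            self.items.append((embed(chunk), chunk))

    def search(self, query, top_k=1):
        # Semantic search: rank stored chunks by similarity to the query.
        query_vector = embed(query)
        ranked = sorted(self.items,
                        key=lambda item: cosine(query_vector, item[0]),
                        reverse=True)
        return [chunk for _, chunk in ranked[:top_k]]


store = VectorStore()
store.ingest(["the subway runs all night", "horses eat hay"])
context = store.search("when does the subway run")
```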
However, when it comes to implementing RAG,
there are challenges…

• Managing multiple data sources
• Creating vector embeddings for large volumes of data
• Incremental updates to vector store
• Coding effort
• Scaling retrieval mechanism
• Orchestration
Fully managed support for end-to-end
RAG workflow

Knowledge Bases for Amazon Bedrock

Gives FMs and agents contextual information from your private data sources for Retrieval Augmented Generation (RAG) to deliver more relevant, accurate, and customized responses.

• Securely connect FMs and agents to data sources
• Easily retrieve relevant data and augment prompts
• Provide source attribution

Data Ingestion Workflow
KNOWLEDGE BASES FOR AMAZON BEDROCK

Fully managed data ingestion workflow: Data source (new data) → Document chunks → Embeddings model → Vector store

• Choose your data source (Amazon S3): support for incremental updates; multiple data file formats supported
• Choose your chunking strategy: fixed chunks; no chunking; default (200 tokens)
• Choose your embedding model: Amazon Titan
• Choose your vector store: OpenSearch Serverless; Pinecone; Redis

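Fixed-size chunking can be sketched as splitting a document into runs of at most N tokens. The example below assumes whitespace-separated words as a rough proxy for tokens, which is not how a model tokenizer counts them, and reuses the slide's 200-token figure as the default. The function name is hypothetical.

```python
def fixed_size_chunks(text, max_tokens=200):
    """Split text into chunks of at most max_tokens whitespace-separated words."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = text.split()
    return [" ".join(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), max_tokens)]


# A synthetic 450-word document yields two full chunks and one partial chunk.
document = " ".join(f"w{i}" for i in range(450))
chunks = fixed_size_chunks(document)
```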
Fully managed data ingestion
KNOWLEDGE BASES FOR AMAZON BEDROCK

Automated and fully managed data ingestion using Knowledge Bases for Amazon Bedrock. The data source, chunking strategy, embedding model, and vector store choices are the same as in the data ingestion workflow above.

RetrieveAndGenerate API
KNOWLEDGE BASES FOR AMAZON BEDROCK

Fully managed RAG: User query → RetrieveAndGenerate API → Generated response

Steps handled by the API:
• Generate query embedding
• Retrieve similar documents from knowledge bases
• Augment query with retrieved documents
• Generate response from LLM

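A minimal sketch of calling the API through boto3's `bedrock-agent-runtime` client, assuming the `retrieve_and_generate` request shape shown below. The knowledge base ID and model ARN are placeholders, and `FakeAgentRuntime` is a hypothetical stand-in so the example runs without AWS credentials.

```python
def ask_knowledge_base(client, question, knowledge_base_id, model_arn):
    """Run the managed RAG flow and return (answer text, citations)."""
    response = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    )
    return response["output"]["text"], response.get("citations", [])


class FakeAgentRuntime:
    """Hypothetical stand-in that mimics the shape of the real response."""

    def retrieve_and_generate(self, input, retrieveAndGenerateConfiguration):
        return {"output": {"text": f"Answer to: {input['text']}"},
                "citations": []}


# With real credentials, the client would come from boto3 instead:
#   client = boto3.client("bedrock-agent-runtime")
answer, citations = ask_knowledge_base(
    FakeAgentRuntime(),
    "What is our refund policy?",
    knowledge_base_id="KB_ID_PLACEHOLDER",
    model_arn="MODEL_ARN_PLACEHOLDER",
)
```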
Customize RAG workflows using Retrieve API
KNOWLEDGE BASES FOR AMAZON BEDROCK

Customized RAG workflow: User query → Retrieve API → Retrieved documents → Context → Prompt augmentation → Large language model → Response → User

Steps handled by the Retrieve API:
• Generate query embedding
• Retrieve similar documents from knowledge bases

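With the Retrieve API, the application keeps control of prompt augmentation and generation. The sketch below assumes the `retrieve` request and response shapes of boto3's `bedrock-agent-runtime` client; the prompt template, knowledge base ID, and `FakeAgentRuntime` stand-in are hypothetical.

```python
def retrieve_chunks(client, query, knowledge_base_id, top_k=3):
    """Return the text of the most similar chunks from the knowledge base."""
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [result["content"]["text"] for result in response["retrievalResults"]]


def build_prompt(query, chunks):
    """Custom prompt augmentation: the application decides the template."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


class FakeAgentRuntime:
    """Hypothetical stand-in that mimics the shape of the real response."""

    def retrieve(self, knowledgeBaseId, retrievalQuery, retrievalConfiguration):
        return {"retrievalResults": [
            {"content": {"text": "Refunds are issued within 30 days."},
             "score": 0.9},
        ]}


chunks = retrieve_chunks(FakeAgentRuntime(), "What is the refund window?",
                         knowledge_base_id="KB_ID_PLACEHOLDER")
prompt = build_prompt("What is the refund window?", chunks)
```

The resulting prompt can then be sent to any foundation model, which is the part the managed RetrieveAndGenerate flow would otherwise handle.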
Knowledge Bases integration with Agents

[Diagram: the agent sends the query to Knowledge Bases for retrieval, then passes the query plus the retrieved results to the large language model for response generation.]

[Figure: chart with axes "Domain adaptation (e.g. extend knowledge)" and "Task specialization (e.g. behavior, style)", placing continued pre-training, instruction fine-tuning, and continued pre-training + instruction fine-tuning]

Instruction fine-tuning
• Instruction training dataset is available?
• Specific style, behavior required?

Continued pre-training
• Raw dataset (e.g. PDFs)
• Additional knowledge through domain adaptation
Datasets for instruction fine-tuning and continued pre-training

Instruction dataset (e.g. question-answer) → Instruction fine-tuning → Amazon Bedrock Custom Models
Dataset format, one JSON object per line:
{"prompt": "<prompt text>", "completion": "<expected generated text>"}
{"prompt": "<prompt text>", "completion": "<expected generated text>"}
{"prompt": "<prompt text>", "completion": "<expected generated text>"}

Raw data (e.g. PDFs) → Continued pre-training → Amazon Bedrock Custom Models
Dataset format, one JSON object per line:
{"input": "<raw text>"}
{"input": "<raw text>"}
{"input": "<raw text>"}
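Both dataset formats are JSON Lines: one JSON object per line. A small sketch of producing them with the standard library follows; the sample records and the helper name are hypothetical.

```python
import json


def to_jsonl(records):
    """Serialize a list of dicts as JSON Lines (one object per line)."""
    return "".join(json.dumps(record) + "\n" for record in records)


# Instruction fine-tuning: labeled prompt/completion pairs.
instruction_records = [
    {"prompt": "What is RAG?",
     "completion": "Retrieval Augmented Generation."},
]

# Continued pre-training: raw, unlabeled text.
raw_records = [
    {"input": "Text extracted from an internal document."},
]

instruction_jsonl = to_jsonl(instruction_records)
raw_jsonl = to_jsonl(raw_records)
```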

• Create custom models using the console or APIs
• Maximize accuracy of FMs by providing labeled or raw unlabeled data
• Once deployed, custom models are invoked the same way as base models (playground or API)
• Customizations now supported for Amazon Titan and some third party FMs

[Diagram: customization flow]
Inputs: Base FM, Hyperparameters, Input Data
Outputs: Metrics and Logs, Output Model
Storage: Custom Models Stored Securely by Amazon Bedrock
Inferencing: Playground, API

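Creating a custom model through the API can be sketched with the `create_model_customization_job` call of boto3's `bedrock` client. The inputs mirror the diagram: a base FM, hyperparameters, and input data in Amazon S3. All names, ARNs, and S3 URIs below are placeholders, the hyperparameter keys are model-specific (the one shown is an assumption), and `FakeBedrock` is a hypothetical stand-in so the example runs without AWS credentials.

```python
def start_customization_job(client, job_name, model_name, role_arn,
                            base_model_id, training_s3_uri, output_s3_uri,
                            hyperparameters,
                            customization_type="FINE_TUNING"):
    """Submit a customization job and return its ARN."""
    response = client.create_model_customization_job(
        jobName=job_name,
        customModelName=model_name,
        roleArn=role_arn,
        baseModelIdentifier=base_model_id,
        # "FINE_TUNING" or "CONTINUED_PRE_TRAINING"
        customizationType=customization_type,
        trainingDataConfig={"s3Uri": training_s3_uri},
        outputDataConfig={"s3Uri": output_s3_uri},
        hyperParameters=hyperparameters,
    )
    return response["jobArn"]


class FakeBedrock:
    """Hypothetical stand-in that records the request it receives."""

    def create_model_customization_job(self, **kwargs):
        self.last_request = kwargs
        return {"jobArn": "FAKE_JOB_ARN"}


# With real credentials, the client would come from boto3 instead:
#   client = boto3.client("bedrock")
fake_client = FakeBedrock()
job_arn = start_customization_job(
    fake_client,
    job_name="example-job",
    model_name="example-custom-model",
    role_arn="ROLE_ARN_PLACEHOLDER",
    base_model_id="BASE_MODEL_ID_PLACEHOLDER",
    training_s3_uri="s3://EXAMPLE_BUCKET/train.jsonl",
    output_s3_uri="s3://EXAMPLE_BUCKET/output/",
    hyperparameters={"epochCount": "1"},  # keys depend on the base model
)
```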
Customization architecture overview
[Architecture diagram]
Accounts shown: Amazon Bedrock service account; Model deployment account (AWS owned and operated); Customer account
Components shown:
• All incoming network traffic via the console, SDKs, and API
• Amazon Bedrock service: API endpoint, runtime inference, training orchestration
• Base model S3 bucket
• Provisioned Capacity
• Custom Job Compute
• Fine-tuned model S3 bucket
• Virtual private cloud
• Training data S3 bucket
• Identity & access, monitoring & logging: AWS CloudTrail, Amazon CloudWatch, AWS IAM
Security and privacy
You are always in control of your data

✓ Data not used to improve models, and not shared with model providers
✓ Customer data remain in Region
✓ Support for AWS PrivateLink and VPC configurations
✓ Integration with AWS IAM
✓ API monitoring in AWS CloudTrail, logging & metrics in Amazon CloudWatch
✓ Custom models encrypted and stored with Service or Customer Managed Keys (CMK). Only you have access to your models

• Customization vs augmentation
• RAG concepts
• Knowledge Bases for Amazon Bedrock
• Customization concepts
• Fine tuning and Continued Pre-training
