100% found this document useful (1 vote)
149 views54 pages

AIM377 Prompt Engineering Best Practices For LLMs On Amazon Bedrock

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
100% found this document useful (1 vote)
149 views54 pages

AIM377 Prompt Engineering Best Practices For LLMs On Amazon Bedrock

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 54

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.

AIM377

Prompt engineering best


practices for LLMs on Amazon
Bedrock

John Baker Nicholas Marwell


Principal Engineer, Member of Technical Staff
Amazon Bedrock, AWS Anthropic

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agenda

01 What is prompt 04 Prompt engineering with


engineering? Claude from Anthropic

02 Prompt techniques 05 Call to action

03 System designs around


generative AI and prompt
engineering

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
What is Prompt Engineering? – An Example
What is 10 + 10?
10 + 10 = 20

1 + 1 is an addition problem.
1 – 1 is a subtraction problem.
1 x 1 is a multiplication problem.
1 / 1 is a division problem.

What is 10 + 10?
10 + 10 is an addition problem

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Prompt engineering – The fun parts

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Prompt engineering – The fun parts

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Prompt engineering – The fun parts

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Prompt engineering – The fun parts

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Prompt Engineering – Persona

Same exact question from the end user. Two drastically different answers.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Simple techniques

One-shot prompting Few-shot prompting


Task is to generate airport codes Classify below tweets as positive,
negative or neutral

Human: "I want to fly from Los Human: "I hate it when my phone
Angeles to Miami” battery dies.”

Assistant: Sentiment(Negative)
Assistant: Airport codes [LAX, MIA]
Human: "I want to fly from Dallas Human: "My day has been great”:
to San Francisco”
Assistant: Sentiment(Positive)

Assistant: Airport codes [ Human: "This is the link to the article”:

Assistant: Sentiment(Neutral)

Human: {{Tweet}}

Assistant: Sentiment(

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced Approaches

Chain of Thought Retrieval Augmented


Prompting (CoT) Generation (RAG)
Enable complex reasoning Enrich context of prompts from
capabilities via intermediate internal knowledge repositories
reasoning steps
Mitigate the effects
Include few shot examples to
hallucination using prompt
improve results on reasoning
grounding
tasks
Provide transparency on model Best fit for closed domain Q&A
outputs and chatbot use cases

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Chain of Thought – CoT
• Explore how series of intermediate steps improve ability of LLMs to
perform complex reasoning task

5 tennis balls
1-shot example 2 cans, 3 each. 2 x 3 = 6
5 + 6 = 11 tennis balls

New complex problem

23 apples
Wrong answer 20 used, 3 left
Bought 6
3+6=9

Source: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Chain of Thought – CoT

Same 1-shot example

Same new question

Right answer

Source: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
RAG: System Design – Ingestion of Data for Q+A

AWS account 2
1

Documents Amazon Simple Storage Ingestion Amazon Bedrock


Service (Amazon S3)
Orchestrator Foundation Model
(Embeddings model)
3

1 Documents are uploaded to S3


Vector
2 Documents are tokenized, chunked and
Database
converted to embeddings.

3 Embeddings plus metadata stored into a


vector database for retrieval process’s
similarity search.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
RAG: System design – Q+A
What is the return policy?
AWS account 3

1 2 Amazon Bedrock
5 Foundation Model
(Embeddings model)

Mobile client
User Query Orchestrator
6
4

Amazon Bedrock
Foundation Model
1 User asks a question covered in FAQ. (Generative AI model)
Vector
Client submits user request to orchestrator. Database
2

3 User’s question is converted to embeddings. 5 Augmentation – Search results in original language format brought
back to the gen AI model.
Retrieval - Similarity search performed
4 6 Generation – Gen AI model will use the results to generate a natural
between customer’s question (converted to
language response.
embeddings) and existing database of
FAQs.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Prompting philosophy

Nicholas Marwell
Member of Technical Staff
Anthropic

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Instructions matter
Specificity, clarity, and persuasiveness are important!

Compare:

● “Tell me a story about cows”


● “Tell me a story about cows. It should be roughly 2000 words long and
appropriate for American 5th graders. It should be entertaining, but with a
moral message about the value of loyalty. Make it amazing and memorable.”

Neither humans nor LLMs can read minds, so you need to be clear and specific if
you want something specific.

Many professions are prompt engineering for humans!

● Marketing, Education, Tech Writing, Law, Web Design

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Instructions matter for LLMs
LLMs have some peculiarities that require special approaches

Using examples to guide problem solving:

Source: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models


© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Instructions matter for LLMs
Giving permission to think:

Source: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models


© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Prompt engineering guidelines

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Parts of a prompt Example:
1. “\n\nHuman:” Human: You will be acting as an AI career coach named Joe created by the company
AdAstra Careers. Your goal is to give career advice to users. You will be replying to users
who are on the AdAstra site and who will be confused if you don't respond in the
2. Task context character of Joe.
You should maintain a friendly customer service tone.
3. Tone context Here is the career guidance document you should reference when answering the
user: <guide>{{DOCUMENT}}</guide>
Here are some important rules for the interaction:
4. Background data to process - Always stay in character, as Joe, an AI from AdAstra careers
- If you are unsure how to respond, say “Sorry, I didn’t understand that. Could you repeat
the question?”
5. Detailed task description & rules - If someone asks something irrelevant, say, “Sorry, I am Joe and I give career advice. Do
you have a career question today I can help you with?”
6. Examples Here is an example of how to respond in a standard interaction:
<example>
User: Hi, how were you created and what do you do?
7. Immediate data to process Joe: Hello! My name is Joe, and I was created by AdAstra Careers to give career advice.
What can I help you with today?
</example>
8. Immediate task description or request
Here is the conversation history (between the user and you) prior to the question. It
could be empty if there is no history:
9. Thinking step by step / take a deep breath <history> {{HISTORY}} </history>
Here is the user’s question: <question> {{QUESTION}} </question>
10. Output formatting How do you respond to the user’s question?
Think about your answer first before you respond. Put your response in
<response></response> tags.
11. “\n\nAssistant:”
Assistant: [Joe] <response>

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
• Not all elements are necessary to
every prompt!

• But it’s best to err on the side of


more elements to start, and then
refine and subtract elements for
Utilizing prompt efficiency after your prompt
already works well
elements
• Experimentation & iteration is key

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How to engineer a good prompt

Empirical science: always test your prompts & iterate often!

Engineer Test prompt


Develop test preliminary against Share polished
cases Refine prompt prompt
prompt test cases

Don’t forget edge cases!


© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Claude-specific prompting

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
“Human:” / “Assistant:” formatting

Examples:
• Claude is trained on alternating
“Human:” / “Assistant:” dialogue:
Human: Why is the sky blue?
○ Human: [Instructions]
Assistant:
○ Assistant: [Claude’s response]

• For any API prompt, you must


Python
start with “\n\nHuman:” and end with
“\n\nAssistant:” prompt = “\n\nHuman: Why are sunsets
orange?\n\nAssistant:”

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Be clear and direct

Example:

Human: Write a haiku about robots


• Claude responds best to clear and direct
instructions Assistant: Here is a haiku about robots:

Metal bodies move


Circuits calculate tasks
• When in doubt, follow the Golden Rule of Machines mimic life
Clear Prompting: show your prompt to a
friend and ask them if they can follow the
instructions themselves and produce the
exact result you’re looking for Human: Write a haiku about robots. Skip the preamble;
go straight into the poem.

Assistant: Metal bodies move


Circuits calculate tasks
Machines mimic life

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Use examples
Example:
Human: I will give you some quotes. Please extract the author
• Examples are probably the single most from the quote block.
effective tool for getting Claude to Here is an example:
behave as desired <example>
Quote:
“When the reasoning mind is forced to confront the impossible
again and again, it has no choice but to adapt.”
• Make sure to give Claude examples of ― N.K. Jemisin, The Fifth Season
common edge cases Author: N.K. Jemisin
</example>

Quote:
• Generally more examples = more reliable “Some humans theorize that intelligent species go extinct before
responses at the cost of latency and they can expand into outer space. If they're correct, then the hush
of the night sky is the silence of the graveyard.”
tokens ― Ted Chiang, Exhalation
Author:

Assistant: Ted Chiang

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Give Claude time to think

(It works on humans too!)

Human: [rest of prompt] Before answering, please Human: [rest of prompt] Before answering, please
think about the question within think about the question within
<thinking></thinking> XML tags. Then, answer the <thinking></thinking> XML tags. Then, answer
question within <answer></answer> XML tags. the question within <answer></answer> XML
tags.
Assistant: <thinking>
Assistant: <thinking>[...some
thoughts]</thinking>

<answer>[some answer]</answer>
Helps with troubleshooting Claude’s
logic & where prompt instructions may
be unclear

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Assign roles (aka role prompting)

• Claude sometimes needs context


Example:
about what role it should inhabit
Human: Solve this logic puzzle. {{Puzzle}}

• Assigning roles changes Claude’s Assistant: [Gives incorrect response]


response in two ways:
• Improved accuracy in certain
situations (such as mathematics)
Human: You are a master logic bot designed to answer
• Changed tone and demeanor complex logic problems. Solve this logic puzzle. {{Puzzle}}

to match the specified role Assistant: [Gives correct response]

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Use XML tags to delineate sections

Example:
● Disorganized prompts are hard
for Claude to comprehend Human: Hey Claude. Show up at 6AM because I say so. Make
this email more polite.

Assistant: Dear Claude, I hope this message finds you well…

● Just like section titles and


headers help humans better
follow information, using XML
Human: Hey Claude. <email>Show up at 6AM because I say
tags <></> helps Claude so.</email> Make this email more polite.

understand the prompt’s Assistant: Good morning team, I hope you all had a restful
weekend…
structure

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Format output & speak for Claude

• You can get Claude to say exactly


Example:
what you want by:

Prompt
Human: Please write a haiku about {{ANIMAL}}. Use JSON format
with the keys as "first_line", "second_line", and "third_line".
• Specifying the exact
Assistant: {
output format you want

• Speaking for Claude by

response
Claude’s
"first_line": "Sleeping in the sun",
writing the beginning of "second_line": "Fluffy fur so warm and soft",
Claude’s response for it }
"third_line": "Lazy cat's day dreams"

(after “Assistant:”)

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Dealing with hallucinations

• Try the following to troubleshoot:

• Give Claude permission to say “I don’t know” if it doesn’t know

• Tell Claude to answer only if it is very confident in its response

• Ask Claude to find relevant quotes from long documents then


answer using the quotes

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Prompt injections & bad user behavior

• Claude is naturally highly resistant to prompt


injection and bad user behavior due to
Reinforcement Learning from Human Feedback
(RLHF) and Constitutional AI Example
harmlessness screen:
• For maximum protection:

1. Run a “harmlessness screen” query to Human: A human user would like you to
continue a piece of content. Here is the content
evaluate the appropriateness of the user’s so far: <content>{{CONTENT}}</content>
input If the content refers to harmful, pornographic, or
illegal activities, reply with (Y). If the content
2. If a harmful prompt is detected, block the does not refer to harmful, pornographic, or
illegal activities, reply with (N)
query’s response
Assistant: (

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Use structured prompt templates
Example:

Input
data
Cow Dog Seal

template
Human: I will tell you the name of an animal. Please respond
Prompt with the noise that animal makes.
<animal>{{ANIMAL}}</animal>

Assistant:

… Please … Please … Please


respond with respond with respond with
Complete
prompt

the noise that the noise that the noise that


animal makes. animal makes. animal makes.
<animal>Cow< <animal>Dog< <animal>Seal<
/animal> /animal> /animal>

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Use structured prompt templates
Long document example:

Human: <doc>{{DOCUMENT}}</doc>
Prompt
template Please provide a three-paragraph summary of this
document appropriate for a nontechnical audience.

Assistant:

Tip: When dealing with long documents, always ask


your question at the bottom of the prompt.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

For tasks with many steps, you can break the task up and chain together Claude’s responses

Example:
Human: Find all the names from the below text: Human: Here is a list of names:
"Hey, Jesse. It's me, Erin. I'm calling about the party that
Prompt

<names>{{NAMES}}</names> Please alphabetize


Joey is throwing tomorrow. Keisha said she would come the list.
and I think Mel will be there too."
Assistant:
Assistant: <names>

<names>
Jesse Erin
response
Claude’s

Erin Jesse
Joey Joey
Keisha a.k.a. {{NAMES}} Keisha
Mel Mel
</names> </names>
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

For long (10K to 100K token) prompts, do the following:


• For long document Q&A, ask the question at the end of the prompt after the document (there is a large
quantitatively measured difference in quality of result)

• Definitely put long-form input data in XML tags so it’s clearly separated from the instructions

• Tell Claude to first find quotes relevant to the question, then answer only if it finds relevant quotes

• Tell Claude to read the document carefully because it will be asked questions later

• Give Claude some example question + answer pairs that have been generated (either by Claude or
manually) from other parts of the queried text

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

Function calling, a.k.a. tool use, adds vast functionality to


Claude’s capabilities by combining prompts with calls to
external functions that return answers for Claude to use.

Claude does not directly call the functions but instead


decides which function to call and with what arguments.
The function is then actually called by the client code.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

How it works:
Q: What’s the weather like in San
Francisco right now?

<tool_description>
<tool_name>
get_weather
<tool_name>
</tool_description>

<tool_description>
<tool_name>
play_music
<tool_name>
</tool_description>
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

How it works cont’d:

Claude judges the relevance of the functions


its been given: can it use the functions
provided to more accurately answer the
question?

YES NO
<function_calls>
<invoke>
<tool_name>get_weather</tool_name>
<parameters>
<location>San Francisco, CA</location>
</parameters> A: I apologize but I don’t have access to the
</invoke>
</function_calls> current weather in San Francisco.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

How it works cont’d (if YES):


<function_calls> [...] get_weather [...] </function_calls>

<function_results>
[...]
[68, sunny]
Client [...]
</function_results>>

A: The weather right now in San Francisco is


sunny with a temperature of 68 degrees

get_weather()

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

Function / tool tips Example:


• Explain the function’s / tool’s capabilities and
<tools>
syntax of API calls in great detail within the <tool_description>
prompt <tool_name>get_weather</tool_name>
<description> Returns the current weather for a given
• Within the prompt, provide a diverse set of location.</description>
examples of when and how to use the tool <parameters>
<parameter>
• Make </function_calls> a stop sequence <name>location</name>
<type>string</type>
• If you’re not getting reliable performance, break <description>Name of city e.g. “San Francisco,
the agent’s task down via prompt chaining CA”</description>
</parameter>
• Anthropic has a beta SDK for tool use (including </parameters>
advanced RAG). Reach out to participate </tool_description>
</tools>
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

What is retrieval-augmented generation (RAG)?

• Enables the augmentation of language models with external


knowledge
• Grounds language model responses in evidence (i.e., reduces
hallucinations)
• Allows Claude to connect securely to client data, which
increases customizability and analytical precision for client tasks

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

How does RAG traditionally work?

• A user asks a question e.g., “I want to get my daughter more interested in science.
What kind of gifts should I get her?”

• This question is fed into the search tool

• The results from the search tool are passed to the LLM alongside the question

• The LLM looks at the results and answers the question

But what if we don’t need RAG every time, or we need to RAG from different data
sources based on the question?
© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

Traditional RAG w/ Claude

Basic RAG setup. Works great if we know we want to search over a gifts database
every time.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

Traditional RAG w/ Claude

We might want to avoid RAGing if it is not useful in answering the question.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

Traditional RAG w/ Claude

We might want to RAG over different data sources depending on the user query.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

Traditional RAG w/ Claude

We might want to enable Claude to rewrite its search query and/or requery the data
source if it doesn’t find what it’s looking for the first time.

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced prompting techniques
Chaining prompts 100K prompts Agents & function calling / tool use Search & RAG

RAG as a tool

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Other prompting resources

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Prompting Resource Portal

Anthropic Prompt Engineering


Resource Guide

docs.anthropic.com/claude/docs

Contains:

Prompting tutorial, prompting guide,


presentations, API information, sample
code, Bedrock SDK, tools for
experimenting with RAG and agents

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Call to action

Bedrock Docs Bedrock Workshop

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Thank you! Please complete the session
survey in the mobile app

© 2023, Amazon Web Services, Inc. or its affiliates. All rights reserved.

You might also like