Essentials of Prompt Engineering
Prompt Basics
Improving the way that you prompt a foundation model is the fastest way to
harness the power of generative artificial intelligence (generative AI). By interacting
with a model through a series of questions, statements, or instructions, you can
adjust model output behavior based on the specific context of the output that you
want to achieve.
Using effective prompt strategies can offer you the following benefits:
Enhance the model's capabilities and bolster its safety measures.
Equip the model with domain-specific knowledge and external tools
without modifying its parameters or undergoing fine-tuning.
Interact with language models to fully comprehend their potential.
Obtain higher-quality outputs by providing higher-quality inputs.
Understanding Prompts
Elements of a Prompt
A prompt's form depends on the task that you are giving to a model. As you explore
prompt engineering examples, you will review prompts containing some or all of the
following elements:
Instructions: This is a task for the large language model to do. It
provides a task description or instruction for how the model should
perform.
Context: This is external information to guide the model.
Input data: This is the input for which you want a response.
Output indicator: This is the output type or format.
Example prompt
Prompt
Given a list of customer orders and available inventory, determine which orders can be
fulfilled and which items have to be restocked.
This task is essential for inventory management and order fulfillment processes in ecommerce
or retail businesses.
Orders:
Order 1: Product A (5 units), Product B (3 units)
Order 2: Product C (2 units), Product B (2 units)
Inventory:
Product A: 8 units
Product B: 4 units
Product C: 1 unit
Fulfillment status:
The previous prompt includes all four elements of a prompt. You can break the
prompt into the following elements:
Instructions: Given a list of customer orders and available inventory,
determine which orders can be fulfilled and which items have to be
restocked.
Context: This task is essential for inventory management and order
fulfillment processes in ecommerce or retail businesses.
Input data:
Orders:
Order 1: Product A (5 units), Product B (3 units)
Order 2: Product C (2 units), Product B (2 units)
Inventory:
Product A: 8 units
Product B: 4 units
Product C: 1 unit
Output indicator: Fulfillment status:
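These elements can also be assembled programmatically before being sent to a model. The following Python sketch rebuilds the example prompt from its four elements; the build_prompt helper and variable names are illustrative only and are not part of any particular SDK.

def build_prompt(instructions, context, input_data, output_indicator):
    # Join the elements that are provided, separated by blank lines.
    parts = [instructions, context, input_data, output_indicator]
    return "\n\n".join(part for part in parts if part)

prompt = build_prompt(
    instructions=("Given a list of customer orders and available inventory, determine "
                  "which orders can be fulfilled and which items have to be restocked."),
    context=("This task is essential for inventory management and order fulfillment "
             "processes in ecommerce or retail businesses."),
    input_data=("Orders:\n"
                "Order 1: Product A (5 units), Product B (3 units)\n"
                "Order 2: Product C (2 units), Product B (2 units)\n"
                "Inventory:\n"
                "Product A: 8 units\n"
                "Product B: 4 units\n"
                "Product C: 1 unit"),
    output_indicator="Fulfillment status:",
)
print(prompt)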
Scenario
Modifying Prompts
Inference Parameters
When interacting with foundation models (FMs), you can often configure inference
parameters to limit or influence the model response. The parameters available to
you will vary based on the model that you are using. Inference parameters fit into a
range of categories, with the most common being randomness and diversity, and length.
Randomness and Diversity
This is the most common category of inference parameter. Randomness and
diversity parameters influence the variation in generated responses by limiting the
outputs to more likely outcomes or by changing the shape of the probability
distribution of outputs. Three of the more common parameters are temperature, top
k, and top p.
Temperature
This parameter controls the randomness or creativity of the model's output. A
higher temperature makes the output more diverse and unpredictable, and a
lower temperature makes it more focused and predictable. Temperature is set
between 0 and 1. The following are examples of different temperature
settings.
Low temperature (for example, 0.2): Outputs are more conservative, repetitive, and focused on the most likely responses.
High temperature (for example, 1.0): Outputs are more diverse, creative, and unpredictable, but might be less coherent or relevant.
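Conceptually, temperature rescales the model's raw next-token scores before they are converted into probabilities. The following sketch uses a made-up set of scores to show how a low temperature concentrates probability on the most likely token, whereas a high temperature spreads it out; it illustrates the idea rather than a model's actual implementation.

import math

def softmax_with_temperature(scores, temperature):
    # Scale the raw scores by the temperature, then normalize into probabilities.
    scaled = [score / temperature for score in scores]
    total = sum(math.exp(s) for s in scaled)
    return [round(math.exp(s) / total, 3) for s in scaled]

scores = [2.0, 1.0, 0.5, 0.1]  # hypothetical next-token scores
print(softmax_with_temperature(scores, 0.2))  # low temperature: one token dominates
print(softmax_with_temperature(scores, 1.0))  # high temperature: probability spreads across more tokens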
Top P
Top p is a setting that controls the diversity of the text by limiting the words that
the model can choose from to the smallest set whose combined probability reaches
the top p threshold. Top p is also set on a scale from 0 to 1. The following are
examples of different top p settings.
Low top p (for example, 0.250): With a low top p setting, like 0.250, the model will only consider words that make up the top 25 percent of the total probability distribution. This can help the output be more focused and coherent, because the model is limited to choosing from the most probable words given the context.
High top p (for example, 0.990): With a high top p setting, like 0.990, the model will consider a broad range of possible words for the next word in the sequence, because it will include words that make up the top 99 percent of the total probability distribution. This can lead to more diverse and creative output, because the model has a wider pool of words to choose from.
Top K
Top k limits the number of words to the top k most probable words, regardless
of their percent probabilities. For instance, if top k is set to 50, the model will
only consider the 50 most likely words for the next word in the sequence,
even if those 50 words only make up a small portion of the total probability
distribution.
Low top k (for example, 10): With a low setting, like 10, the model will only consider the 10 most probable words for the next word in the sequence. This can help the output be more focused and coherent, because the model is limited to choosing from the most probable words given the context.
High top k (for example, 500): With a high top k setting, like 500, the model will consider the 500 most probable words for the next word in the sequence, regardless of their individual probabilities. This can lead to more diverse and creative output, because the model has a larger pool of potential words to choose from.
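Both top p and top k can be thought of as filters applied to the next-token probability distribution before a token is sampled. The following sketch applies both filters to a small hypothetical distribution; real implementations operate over the model's full vocabulary.

def top_k_filter(probs, k):
    # Keep only the k most probable candidate words.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])

def top_p_filter(probs, p):
    # Keep the smallest set of candidates whose cumulative probability reaches p.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for word, prob in ranked:
        kept[word] = prob
        cumulative += prob
        if cumulative >= p:
            break
    return kept

probs = {"the": 0.5, "a": 0.25, "this": 0.125, "one": 0.0625, "zebra": 0.0625}
print(top_k_filter(probs, 2))    # only the 2 most probable words survive
print(top_p_filter(probs, 0.7))  # the words covering the top 70 percent of probability survive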
Adjusting these inference parameters can significantly impact the model's output,
so you can fine-tune the level of creativity, diversity, and coherence to suit your
specific needs.
Length
The length inference parameter category refers to the settings that control the
maximum length of the generated output and specify the stop sequences that
signal the end of the generation process.
Maximum length
The maximum length setting determines the maximum number of tokens
that the model can generate during the inference process. This parameter
helps to prevent the model from generating excessive or infinite output,
which could lead to resource exhaustion or undesirable behavior. The
appropriate value for this setting depends on the specific task and the
desired output length. For instance, in natural language generation tasks like
text summarization or translation, the maximum length can be set based on
the typical length of the target text. In open-ended generation tasks, such as
creative writing or dialogue systems, a higher maximum length might be
desirable to allow for more extended outputs.
Stop sequences
Stop sequences are special tokens or sequences of tokens that signal the
model to stop generating further output. When the model encounters a stop
sequence during the inference process, it will terminate the generation
regardless of the maximum length setting. Stop sequences are particularly
useful in tasks where the desired output length is variable or difficult to
predict in advance. For example, in conversational artificial intelligence (AI)
systems, the stop sequence could be an end-of-conversation token or a
specific phrase that indicates the end of the response.
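The interaction between the two length controls can be pictured as a simple generation loop: the token limit caps the output, and a stop sequence can end it earlier. The following toy sketch illustrates that behavior with a fake model that emits a canned reply; it is not how a real inference engine is implemented.

def generate(next_token, max_tokens, stop_sequences):
    # Generate until a stop sequence appears or the token limit is reached.
    output = []
    for _ in range(max_tokens):          # maximum length caps the number of generated tokens
        output.append(next_token())
        text = "".join(output)
        if any(text.endswith(stop) for stop in stop_sequences):
            return text                  # a stop sequence ends generation early
    return "".join(output)               # otherwise generation stops at the token limit

canned_reply = iter(["The ", "order ", "can ", "be ", "fulfilled.", " END", " extra"])
print(generate(lambda: next(canned_reply), max_tokens=50, stop_sequences=[" END"]))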
Compute the sum total of the subsequent sequence of numerals: 4, 8, 12, 16.
Good prompt
Provide a summary of this article to be used in a blog post: [insert article text]
Good prompt
What is the capital of New York? Provide the answer in a full sentence.
Good prompt
Calculate the area of a circle with a radius of 3 inches (7.5 cm). Round your answer to the
nearest integer.
Good prompt
Determine the sentiment of the following social media post using these examples:
post: "great pen" => Positive
post: "I hate when my phone battery dies" => Negative
[insert social media post] =>
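Few-shot prompts like the sentiment example above are often built from a list of labeled examples. The following sketch assembles that prompt in Python; the new post at the end is a made-up input, not part of the course material.

def few_shot_sentiment_prompt(examples, new_post):
    # Build the prompt from labeled example posts, ending with the unlabeled input.
    lines = ["Determine the sentiment of the following social media post using these examples:"]
    lines += [f'post: "{post}" => {label}' for post, label in examples]
    lines.append(f'post: "{new_post}" =>')
    return "\n".join(lines)

examples = [
    ("great pen", "Positive"),
    ("I hate when my phone battery dies", "Negative"),
]
print(few_shot_sentiment_prompt(examples, "my flight was delayed again"))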
Updated prompt
Parameters
Temperature: 0.9
Top p: 0.999
Maximum length: 5,000
Prompt
Generate a comprehensive market analysis report for a new product category in the finance
industry for an audience of small and medium-sized businesses (SMBs). Structure the report
with the following sections:
1. Executive Summary
2. Industry Overview
3. Target Audience Analysis
4. Competitive Landscape
5. Product Opportunity and Recommendations
6. Financial Projections
The tone should be professional and tailored to the target audience of SMBs.
This updated prompt incorporates the following parameter settings and best
practices:
1. Parameters – The updated prompt has the parameters for temperature
and top p set high. This will encourage the model to produce a more
creative output that might include some points that you wouldn't
necessarily think of. The maximum length parameter is also set at 5,000.
2. Include context – The updated prompt clarifies that the company is in
the finance industry, which helps the model tailor the analysis accordingly.
3. Use directives for the appropriate response type – The prompt
breaks down the market analysis report into specific sections, making it
easier for the model to structure the output.
By incorporating some of these best practices, the updated prompt provides more
specific guidance to the generative model, increasing the likelihood of generating a
high-quality, relevant, and well-structured market analysis report tailored to the
finance industry.
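If you send this prompt through an SDK, the parameter settings above are passed with the request. The following is a minimal sketch that assumes the Amazon Bedrock Converse API through boto3 (this course section does not name a specific service) and a placeholder model ID; other providers expose equivalent settings under different parameter names, and some models cap the maximum token count below 5,000.

import boto3

bedrock = boto3.client("bedrock-runtime")  # assumes AWS credentials and Region are configured

scenario_prompt = "..."  # the updated market analysis prompt shown above

response = bedrock.converse(
    modelId="MODEL_ID_PLACEHOLDER",        # substitute a model that you have access to
    messages=[{"role": "user", "content": [{"text": scenario_prompt}]}],
    inferenceConfig={
        "temperature": 0.9,                # high temperature for more creative output
        "topP": 0.999,                     # high top p to keep a wide pool of candidate words
        "maxTokens": 5000,                 # maximum length for the full report
    },
)
print(response["output"]["message"]["content"][0]["text"])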
Prompt Engineering Techniques
Zero-Shot Prompting
Zero-shot prompting is a technique where a user presents a task to a generative
model without providing any examples or explicit training for that specific task. In
this approach, the user relies on the model's general knowledge and capabilities to
understand and carry out the task without any prior exposure, or shots, of similar
tasks. Remarkably, modern FMs have demonstrated impressive zero-shot
performance, effectively tackling tasks that they were not explicitly trained for.
To optimize zero-shot prompting, consider the following tips:
The larger and more capable the FM, the higher the likelihood of obtaining
effective results from zero-shot prompts.
Instruction tuning, a process of fine-tuning models to better align with
human preferences, can enhance zero-shot learning capabilities. One
approach to scale instruction tuning is through reinforcement learning
from human feedback (RLHF), where the model is iteratively trained based
on human evaluations of its outputs.
The following is an example of a zero-shot prompt and resulting output.
Zero-shot prompt
Note: This prompt did not provide any examples to the model. However, the model
was still effective in deciphering the task.
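Because the original example is not reproduced here, the following stand-in shows what a zero-shot prompt looks like in code: the task is described, but no solved examples are supplied. The sentiment task is an assumed illustration rather than the course's original example.

# Zero-shot: the task description alone, with no demonstrations.
zero_shot_prompt = (
    "Classify the sentiment of the following social media post as Positive or Negative.\n"
    'post: "I hate when my phone battery dies" =>'
)
print(zero_shot_prompt)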
Few-Shot Prompting
Few-shot prompting is a technique that involves providing a language model with
contextual examples to guide its understanding and expected output for a specific
task. In this approach, you supplement the prompt with sample inputs and their
corresponding desired outputs, effectively giving the model a few shots or
demonstrations to condition it for the requested task. Although few-shot prompting
provides a model with multiple examples, you can also use single-shot or one-shot
prompting by providing just one example.
When employing a few-shot prompting technique, consider the following tips:
Make sure to select examples that are representative of the task that you
want the model to perform and cover a diverse range of inputs and
outputs. Additionally, aim to use clear and concise examples that
accurately demonstrate the desired behavior.
Experiment with the number of examples. The optimal number of
examples to include in a few-shot prompt can vary depending on the task,
the model, and the complexity of the examples themselves. Generally,
providing more examples can help the model better understand the task.
But too many examples might introduce noise or confusion.
The following is an example of a few-shot prompt and resulting output.
Few-shot prompt
Chain-of-Thought Prompting
Chain-of-thought (CoT) prompting is a technique that divides intricate reasoning
tasks into smaller, intermediary steps. This approach can be employed using either
zero-shot or few-shot prompting techniques. CoT prompts are tailored to specific
problem types. To initiate the chain-of-thought reasoning process in a machine
learning model, you can use the phrase "Think step by step." It is recommended to
use CoT prompting when the task requires multiple steps or a series of logical
reasoning steps.
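A zero-shot CoT prompt can be produced by simply appending the cue phrase to an existing prompt, as in the following sketch; the deposit figures are hypothetical and are not taken from the course example.

def chain_of_thought(prompt):
    # Append the cue that asks the model to reason through intermediate steps.
    return prompt + "\nThink step by step."

question = (
    "Service A requires a 30 percent deposit on a $50,000 contract, and service B "
    "requires a 40 percent deposit on a $40,000 contract. Which service requires a "
    "larger deposit?"
)
print(chain_of_thought(question))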
The following are examples of CoT prompts using both zero-shot and few-shot
techniques.
CoT using zero-shot
Prompt:
Which service requires a larger deposit based on the following information? [...]
Output:
The deposit for service A is 30 percent of $50,000, which is 0.3 * 50,000 = $15,000 [...]
Scenario
Consider the scenario used throughout this course. Suppose that you have a market
analysis report template. You also have a few market analysis reports for other new
products that your organization has launched. You can use the few-shot prompt
technique by including your organization's template and example market analysis
reports.
The resulting prompt might look something like this:
Updated scenario prompt using few-shot prompting
Prompt
Generate a comprehensive market analysis report for a new product category in the finance
industry. The target audience is small and medium-sized businesses (SMBs). Use the attached
template to structure the report into categories. [attach report template]
The following examples are market analysis reports for previously released products.
Example 1: [insert example market analysis report]
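The same prompt can be assembled from stored assets. The following sketch combines a report template string and prior example reports into the few-shot prompt; the template and report contents are placeholders.

def template_few_shot_prompt(task, template, example_reports):
    # Combine the task, the report template, and prior example reports into one prompt.
    parts = [task, "Report template:\n" + template,
             "The following examples are market analysis reports for previously released products."]
    parts += [f"Example {i}: {report}" for i, report in enumerate(example_reports, start=1)]
    return "\n\n".join(parts)

task = ("Generate a comprehensive market analysis report for a new product category in the "
        "finance industry. The target audience is small and medium-sized businesses (SMBs). "
        "Use the attached template to structure the report into categories.")
print(template_few_shot_prompt(task, "[report template]", ["[example market analysis report]"]))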
It's important to note that prompt injection can also be employed for
nonmalicious purposes, such as overriding or customizing the responses
from models to suit specific needs. Examples include preserving product
names in translations or tailoring the model's outputs to align with
particular preferences or requirements.
Prompt leaking
Prompt leaking refers to the unintentional disclosure or leakage of the prompts or
inputs used within a model, regardless of whether they contain protected data.
Prompt leaking does not necessarily expose protected data, but it can expose other
data that the model uses, which can reveal how the model works and can be used
against it.
The following example illustrates prompt leaking.
Prompt leaking example
Jailbreaking
Jailbreaking refers to the practice of modifying or circumventing the constraints and
safety measures implemented in a generative model or AI assistant to gain
unauthorized access or functionality.
When an AI model is developed, it is typically trained with certain ethical and safety
constraints in place to prevent misuse or harmful outputs. These constraints can
include filtering out explicit or offensive content, restricting access to sensitive
information, or limiting the ability to carry out certain actions or commands.
Jailbreaking attempts involve crafting carefully constructed prompts or input
sequences that aim to bypass or exploit vulnerabilities in the AI system's filtering
mechanisms or constraints. The goal is to "break out" of the intended model
limitations.
The following example illustrates jailbreaking by asking the model to act as a
character.
Jailbreaking example
Initial prompt
Updated prompt