Module 5

Generative AI, exemplified by tools like Microsoft Copilot, is gaining popularity for its ability to produce human-like content through advanced mathematical techniques in machine learning. This module introduces core concepts of generative AI, including its applications in natural language, image, and code generation, as well as the underlying language models that power these capabilities. Understanding transformer models, tokenization, embeddings, and attention mechanisms is essential for grasping how generative AI generates coherent and contextually relevant outputs.


Unit 1 of 11

Introduction
1 minute

Generative AI, and technologies that implement it such as Microsoft Copilot, are increasingly in the
public consciousness – even among people who don't work in technology roles or have a
background in computer science or machine learning. The futurist and novelist Arthur C.
Clarke is quoted as observing that "any sufficiently advanced technology is indistinguishable
from magic." In the case of generative AI, it does seem to have an almost miraculous ability to
produce human-like original content, including poetry, prose, and even computer code.

However, there’s no wizardry involved in generative AI – just the application of mathematical


techniques incrementally discovered and refined over many years of research into statistics,
data science, and machine learning. You can gain a high-level understanding of how the magic
trick is done by learning the core concepts and principles explored in this module. As you
learn more about the generative AI technologies we have today, you can help society imagine
new possibilities for AI tomorrow.

Unit 2 of 11

What is generative AI?


2 minutes

Artificial Intelligence (AI) imitates human behavior by using machine learning to interact with
the environment and execute tasks without explicit directions on what to output.
Generative AI describes a category of capabilities within AI that create original content. People
typically interact with generative AI that has been built into chat applications. One popular
example of such an application is Microsoft Copilot, an AI-powered productivity tool
designed to enhance your work experience by providing real-time intelligence and assistance.

Generative AI applications take in natural language input, and return appropriate responses in
a variety of formats such as natural language, images, code, and more.

Natural language generation


To generate a natural language response, you might submit a request such as "Write a cover
letter for a person with a bachelor's degree in history."

A generative AI application might respond to such a request like this:


Dear Hiring Manager, I am writing to express my interest in the position of...
Image generation
Some generative AI applications can interpret a natural language request and generate an
appropriate image. For example, you might submit a request like "Create a logo for a florist
business."

A generative AI application could then return an original new image based on the description
you provided, like this:
Code generation
Some generative AI applications are designed to help software developers write code. For
example, you could submit a request like "Write Python code to add two numbers." and
generate the following response:

```python
def add_numbers(a, b):
    return a + b
```

Unit 3 of 11

What are language models?


8 minutes

Generative AI applications are powered by language models, which are a specialized type of
machine learning model that you can use to perform natural language processing (NLP) tasks,
including:

Determining sentiment or otherwise classifying natural language text.


Summarizing text.
Comparing multiple text sources for semantic similarity.
Generating new natural language.
While the mathematical principles behind these language models can be complex, a basic
understanding of the architecture used to implement them can help you gain a conceptual
understanding of how they work.

Transformer models
Machine learning models for natural language processing have evolved over many years.
Today's cutting-edge large language models are based on the transformer architecture, which
builds on and extends some techniques that have been proven successful in modeling
vocabularies to support NLP tasks - and in particular in generating language. Transformer
models are trained with large volumes of text, enabling them to represent the semantic
relationships between words and use those relationships to determine probable sequences of
text that make sense. Transformer models with a large enough vocabulary are capable of
generating language responses that are tough to distinguish from human responses.
Transformer model architecture consists of two components, or blocks:

An encoder block that creates semantic representations of the training vocabulary.


A decoder block that generates new language sequences.
1. The model is trained with a large volume of natural language text, often sourced from
the internet or other public sources of text.
2. The sequences of text are broken down into tokens (for example, individual words) and
the encoder block processes these token sequences using a technique called attention to
determine relationships between tokens (for example, which tokens influence the
presence of other tokens in a sequence, different tokens that are commonly used in the
same context, and so on.)
3. The output from the encoder is a collection of vectors (multi-valued numeric arrays) in
which each element of the vector represents a semantic attribute of the tokens. These
vectors are referred to as embeddings.
4. The decoder block works on a new sequence of text tokens and uses the embeddings
generated by the encoder to generate an appropriate natural language output.
5. For example, given an input sequence like "When my dog was", the model can use the
attention technique to analyze the input tokens and the semantic attributes encoded in
the embeddings to predict an appropriate completion of the sentence, such as "a puppy".

In practice, the specific implementations of the architecture vary – for example, the
Bidirectional Encoder Representations from Transformers (BERT) model developed by Google
to support their search engine uses only the encoder block, while the Generative Pre-trained
Transformer (GPT) model developed by OpenAI uses only the decoder block.

While a complete explanation of every aspect of transformer models is beyond the scope of
this module, an explanation of some of the key elements in a transformer can help you get a
sense for how they support generative AI.

Tokenization
The first step in training a transformer model is to decompose the training text into tokens - in
other words, identify each unique text value. For the sake of simplicity, you can think of each
distinct word in the training text as a token (though in reality, tokens can be generated for
partial words, or combinations of words and punctuation).
For example, consider the following sentence:

I heard a dog bark loudly at a cat

To tokenize this text, you can identify each discrete word and assign token IDs to them. For
example:

- I (1)
- heard (2)
- a (3)
- dog (4)
- bark (5)
- loudly (6)
- at (7)
- ("a" is already tokenized as 3)
- cat (8)

The sentence can now be represented with the tokens: {1 2 3 4 5 6 7 3 8}. Similarly, the
sentence "I heard a cat" could be represented as {1 2 3 8}.

As you continue to train the model, each new token in the training text is added to the
vocabulary with appropriate token IDs:

- meow (9)
- skateboard (10)
- *and so on...*

With a sufficiently large set of training text, a vocabulary of many thousands of tokens could
be compiled.
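The token-ID scheme above can be sketched in a few lines of Python. This is a word-level tokenizer for illustration only; production tokenizers (for example, byte-pair encoding) split text into subword units instead:

```python
def build_vocab(text, vocab=None):
    """Assign an integer ID to each new word, starting from 1."""
    vocab = dict(vocab or {})
    for word in text.split():
        if word not in vocab:
            vocab[word] = max(vocab.values(), default=0) + 1
    return vocab

def tokenize(text, vocab):
    """Represent a sentence as the sequence of its token IDs."""
    return [vocab[word] for word in text.split()]

vocab = build_vocab("I heard a dog bark loudly at a cat")
print(tokenize("I heard a dog bark loudly at a cat", vocab))  # [1, 2, 3, 4, 5, 6, 7, 3, 8]
print(tokenize("I heard a cat", vocab))  # [1, 2, 3, 8]
```

Passing an existing vocabulary back into `build_vocab` with new text continues numbering from the last ID, which is how "meow" would become token 9 in the example above.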

Embeddings
While it's convenient to represent tokens as simple IDs (essentially creating an index for
all the words in the vocabulary), the IDs don't tell us anything about the meaning of the words, or
the relationships between them. To create a vocabulary that encapsulates semantic
relationships between the tokens, we define contextual vectors, known as embeddings, for
them. Vectors are multi-valued numeric representations of information, for example [10, 3, 1]
in which each numeric element represents a particular attribute of the information. For
language tokens, each element of a token's vector represents some semantic attribute of the
token. The specific categories for the elements of the vectors in a language model are
determined during training based on how commonly words are used together or in similar
contexts.

Vectors represent lines in multidimensional space, describing direction and distance along
multiple axes (you can impress your mathematician friends by calling these direction and
magnitude). It can be useful to think of the elements in an embedding vector for a token as
representing steps along a path in multidimensional space. For example, a vector with three
elements represents a path in 3-dimensional space in which the element values indicate the
units traveled forward/back, left/right, and up/down. Overall, the vector describes the direction
and distance of the path from origin to end.

The elements of the tokens in the embeddings space each represent some semantic attribute
of the token, so that semantically similar tokens should result in vectors that have a similar
orientation – in other words they point in the same direction. A technique called cosine
similarity is used to determine if two vectors have similar directions (regardless of distance),
and therefore represent semantically linked words. As a simple example, suppose the
embeddings for our tokens consist of vectors with three elements, for example:

- 4 ("dog"): [10,3,2]
- 8 ("cat"): [10,3,1]
- 9 ("puppy"): [5,2,1]
- 10 ("skateboard"): [-3,3,2]

We can plot these vectors in three-dimensional space, like this:


The embedding vectors for "dog" and "puppy" describe a path along an almost identical
direction, which is also fairly similar to the direction for "cat". The embedding vector for
"skateboard", however, describes a journey in a very different direction.

Note

The previous example shows a simple example model in which each embedding has only
three dimensions. Real language models have many more dimensions.

There are multiple ways you can calculate appropriate embeddings for a given set of tokens,
including language modeling algorithms like Word2Vec or the encoder block in a transformer
model.

Attention
The encoder and decoder blocks in a transformer model include multiple layers that form the
neural network for the model. We don't need to go into the details of all these layers, but it's
useful to consider one of the types of layers that is used in both blocks: attention layers.
Attention is a technique used to examine a sequence of text tokens and try to quantify the
strength of the relationships between them. In particular, self-attention involves considering
how other tokens around one particular token influence that token's meaning.
In an encoder block, each token is carefully examined in context, and an appropriate encoding
is determined for its vector embedding. The vector values are based on the relationship
between the token and other tokens with which it frequently appears. This contextualized
approach means that the same word might have multiple embeddings depending on the
context in which it's used - for example, "the bark of a tree" means something different from
"I heard a dog bark".

In a decoder block, attention layers are used to predict the next token in a sequence. For each
token generated, the model has an attention layer that takes into account the sequence of
tokens up to that point. The model considers which of the tokens are the most influential
when considering what the next token should be. For example, given the sequence "I heard
a dog" , the attention layer might assign greater weight to the tokens "heard" and "dog"

when considering the next word in the sequence:

I *heard* a *dog* {*bark*}


Remember that the attention layer is working with numeric vector representations of the
tokens, not the actual text. In a decoder, the process starts with a sequence of token
embeddings representing the text to be completed. The first thing that happens is that
another positional encoding layer adds a value to each embedding to indicate its position in
the sequence:

- [**1**,5,6,2] (I)
- [**2**,9,3,1] (heard)
- [**3**,1,1,2] (a)
- [**4**,10,3,2] (dog)

During training, the goal is to predict the vector for the final token in the sequence based on
the preceding tokens. The attention layer assigns a numeric weight to each token in the
sequence so far. It uses that value to perform a calculation on the weighted vectors that
produces an attention score that can be used to calculate a possible vector for the next token.
In practice, a technique called multi-head attention uses different elements of the embeddings
to calculate multiple attention scores. A neural network is then used to evaluate all possible
tokens to determine the most probable token with which to continue the sequence. The
process continues iteratively for each token in the sequence, with the output sequence so far
being used autoregressively as the input for the next iteration – essentially building the output one
token at a time.
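The weighting step just described can be sketched as a single scaled dot-product attention calculation. This is a toy illustration only: it attends over the positionally encoded example embeddings directly, whereas a real transformer first maps tokens through learned query, key, and value projections (omitted here) and uses many such heads in parallel:

```python
import math

def softmax(xs):
    """Convert raw scores into weights that sum to 1."""
    exps = [math.exp(x - max(xs)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention: weight each value by query/key similarity."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The output is the weighted sum of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]
    return output, weights

# Positionally encoded embeddings for "I heard a dog", from the example above.
seq = [[1, 5, 6, 2], [2, 9, 3, 1], [3, 1, 1, 2], [4, 10, 3, 2]]
output, weights = attend(seq[-1], seq, seq)  # the last token attends over the sequence
print([round(w, 3) for w in weights])
```

In this toy run, using raw embeddings makes the query token's own weight dominate; the learned projections in a trained model are what allow attention to spread to influential tokens such as "heard" and "dog".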
The following animation shows a simplified representation of how this works – in reality, the
calculations performed by the attention layer are more complex; but the principles can be
simplified as shown:
1. A sequence of token embeddings is fed into the attention layer. Each token is
represented as a vector of numeric values.
2. The goal in a decoder is to predict the next token in the sequence, which will also be a
vector that aligns to an embedding in the model’s vocabulary.
3. The attention layer evaluates the sequence so far and assigns weights to each token to
represent their relative influence on the next token.
4. The weights can be used to compute a new vector for the next token with an attention
score. Multi-head attention uses different elements in the embeddings to calculate
multiple alternative tokens.
5. A fully connected neural network uses the scores in the calculated vectors to predict the
most probable token from the entire vocabulary.
6. The predicted output is appended to the sequence so far, which is used as the input for
the next iteration.
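Steps 1-6 above amount to a greedy, autoregressive decoding loop. The sketch below mirrors that loop, with a hypothetical scoring function standing in for the attention layers and neural network of a real decoder:

```python
def generate(tokens, score_next, vocab, max_new=3):
    """Greedily extend `tokens` one ID at a time, as a decoder does."""
    for _ in range(max_new):
        # Score every candidate token given the sequence so far...
        scores = {tok: score_next(tokens, tok) for tok in vocab}
        # ...append the most probable one, and feed the result back in.
        tokens = tokens + [max(scores, key=scores.get)]
    return tokens

# Hypothetical stand-in scorer: prefers the token whose ID follows the last one.
vocab = range(1, 9)
score = lambda seq, tok: -abs(tok - (seq[-1] + 1))
print(generate([1, 2, 3], score, vocab))  # [1, 2, 3, 4, 5, 6]
```

The key structural point is the last line of the loop: the output sequence so far becomes the input for the next iteration, which is exactly the autoregressive behavior described above.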
During training, the actual sequence of tokens is known – we just mask the ones that come
later in the sequence than the token position currently being considered. As in any neural
network, the predicted value for the token vector is compared to the actual value of the next
vector in the sequence, and the loss is calculated. The weights are then incrementally adjusted
to reduce the loss and improve the model. When used for inferencing (predicting a new
sequence of tokens), the trained attention layer applies weights that predict the most probable
token in the model’s vocabulary that is semantically aligned to the sequence so far.
What all of this means is that a transformer model such as GPT-4 (the model behind ChatGPT
and Bing) is designed to take in a text input (called a prompt) and generate a syntactically
correct output (called a completion). In effect, the "magic" of the model is that it has the ability
to string a coherent sentence together. This ability doesn't imply any "knowledge" or
"intelligence" on the part of the model; just a large vocabulary and the ability to generate
meaningful sequences of words. What makes a large language model like GPT-4 so powerful
however, is the sheer volume of data with which it has been trained (public and licensed data
from the Internet) and the complexity of the network. This enables the model to generate
completions that are based on the relationships between words in the vocabulary on which
the model was trained; often generating output that is indistinguishable from a human
response to the same prompt.

Unit 4 of 11

Using language models


3 minutes

Organizations and developers can train their own language models from scratch, but in most
cases it’s more practical to use an existing foundation model, and optionally fine-tune it with
your own training data. There are many sources of models that you can use.

On Microsoft Azure, you can find foundation models in the Azure OpenAI service and in the
Model Catalog. The Model Catalog is a curated source of models for data scientists and
developers. This offers the benefit of cutting-edge language models like the Generative
Pre-trained Transformer (GPT) collection of models (on which ChatGPT and Microsoft's own
generative AI services are based) as well as the DALL-E model for image generation. Using
these models from the Azure OpenAI service means that you also get the benefit of a secure,
scalable Azure cloud platform in which the models are hosted.
In addition to the Azure OpenAI models, the model catalog includes the latest open-source
models from Microsoft and multiple partners, including:
OpenAI
HuggingFace
Mistral
Meta and others.

A few common Azure OpenAI models are:


GPT-3.5-Turbo, GPT-4, and GPT-4o: Conversation-in and message-out language models.
GPT-4 Turbo with Vision: A language model developed by OpenAI that can analyze
images and provide textual responses to questions about them. It incorporates both
natural language processing and visual understanding.
DALL-E: A model that generates original images, creates variations of images, and can
edit images.

Large and small language models


There are many language models available that you can use to power generative AI
applications. In general, language models can be considered in two categories: Large
Language Models (LLMs) and Small Language Models (SLMs).

| Large Language Models (LLMs) | Small Language Models (SLMs) |
| --- | --- |
| LLMs are trained with vast quantities of text that represent a wide range of general subject matter – typically by sourcing data from the Internet and other generally available publications. | SLMs are trained with smaller, more subject-focused datasets. |
| When trained, LLMs have many billions (even trillions) of parameters (weights that can be applied to vector embeddings to calculate predicted token sequences). | SLMs typically have fewer parameters than LLMs. |
| LLMs are able to exhibit comprehensive language generation capabilities in a wide range of conversational contexts. | The focused vocabulary of an SLM makes it very effective in specific conversational topics, but less effective at more general language generation. |
| The large size of LLMs can impact their performance and make them difficult to deploy locally on devices and computers. | The smaller size of SLMs can provide more options for deployment, including local deployment to devices and on-premises computers, and makes them faster and easier to fine-tune. |
| Fine-tuning an LLM with additional data to customize its subject expertise can be time-consuming, and expensive in terms of the compute power required to perform the additional training. | Fine-tuning an SLM can potentially be less time-consuming and expensive. |

Unit 5 of 11

Copilot and AI agents


3 minutes

The availability of language models has led to the emergence of new ways to interact with
applications and systems through generative AI chat-based assistants and agents that are
integrated into applications to help users find information and perform business tasks
efficiently.

Microsoft Copilot is a generative AI based assistant that is integrated into a wide range of
Microsoft applications and user experiences. It is based on an open architecture that enables
third-party developers to extend the Microsoft Copilot user experience. Additionally,
third-party developers can create their own copilot-like agents using the same open architecture.

Business users can use Microsoft Copilot to boost their productivity and creativity with
AI-generated content and automation of tasks. Developers can extend Microsoft Copilot by
creating plug-ins that integrate Copilot into business processes and data, or even create
copilot-like agents to build generative AI capabilities into apps and services.
Generative AI assistants such as Microsoft Copilot have the potential to revolutionize the way
we work by helping with first drafts, information synthesis, strategic planning, and much more.
The goal of generative AI assistants is to empower people to be smarter, more productive,
more creative, and connected to the people and things around them.
Adopting generative AI in your business
In general, you can categorize industry and personal generative AI assistant adoption into
three buckets: off-the-shelf use of Microsoft Copilot, extending Microsoft Copilot, and
building copilot-like agents.

You can use off-the-shelf generative AI assistants, like Microsoft 365 Copilot, to empower
users and increase their productivity.
You can extend Microsoft Copilot to support custom business processes or tasks, using
your own data to control how Copilot responds to user prompts in your organization.
You can build your own copilot-like agents to integrate generative AI into business apps
or to create unique experiences for your customers.

Unit 6 of 11

Understand Microsoft Copilot


6 minutes

Microsoft Copilot features can be found throughout many different Microsoft applications. They
unlock productivity across your organization, safeguard your business, and build and extend
your AI capabilities. Explore some of the different use cases for Microsoft Copilot below.

Web browsing with AI


Microsoft Copilot: Use Microsoft Copilot to answer questions, create content, and search the
web with the Microsoft Copilot app at https://copilot.microsoft.com, when using the Bing
search engine, and in the Edge browser. For example, you can ask Microsoft Copilot to
generate a list of opportunities in an industry or get more detailed information from your
search results.

When you browse with Microsoft Edge, Copilot is built right in. You can open the Copilot pane
in the browser and use it to research topics and create new content – for example to publish a
blog post. With all of these Copilot options, signing in with a work or school account enables
you to use Copilot in the context of your organization’s data and services – enabling you to
get assistance with internal resources and information.

AI assistance for information workers


Microsoft 365 Copilot: Microsoft 365 integrates Copilot into the productivity applications that
information workers use every day. You can use Copilot in Microsoft Word to generate a new
document based on a natural language prompt, and then refine, summarize, and improve the
document with a few prompts.
You can use Copilot in Microsoft PowerPoint to create a whole presentation based on the
contents of a document or email, and then add graphics, reformat slides, and otherwise
improve your presentation.
In Microsoft Outlook, Copilot can help you summarize your emails, check your schedule, and
even find relevant emails and documents to prepare for meetings.
These are just some examples of how you can use Microsoft 365 Copilot. There’s lots more
you can accomplish in Windows, Excel, Teams, and other apps. Learn more at
https://www.microsoft.com/microsoft-365/enterprise/copilot-for-microsoft-365

Use AI to support business processes


Copilot in Dynamics 365 Customer Service: Modernizes contact centers with generative AI.
Customer service agents use Copilot to analyze support tickets, research similar issues, find
resolutions, and communicate them to users with only a few clicks and prompts.
Copilot in Dynamics 365 Sales: Sales professionals can use Copilot to quickly find relevant
customer and industry information by integrating with the company’s customer relationship
management (CRM) database and beyond. This can enable an account manager to quickly
review and qualify a lead, generate a proposal, and set up a customer engagement to close
the deal.

Copilot in Dynamics 365 Supply Chain Management: Handles changes to purchase orders at
scale and assesses the impact and risk to help optimize procurement decisions. For example,
Copilot identifies the level of impact that changes to purchase orders have on downstream
processes and gives advice for next steps.
AI assisted data analytics
Copilot in Microsoft Fabric: Copilot enables analysts to automatically generate the code they
need to analyze, manipulate, and visualize data in Spark notebooks.
Copilot in Power BI: When creating Power BI reports, Copilot can analyze your data and then
suggest and create appropriate data visualizations from it.

Manage IT infrastructure and security


Copilot for Security: Provides assistance for security professionals as they assess, mitigate, and
respond to security threats.
Copilot for Azure: Integrated into the Azure portal to assist infrastructure administrators as
they work with Azure cloud services.
AI powered software development
GitHub Copilot: Helps developers maximize their productivity by analyzing and explaining
code, adding code documentation, generating new code based on natural language prompts,
and more.

Unit 7 of 11

Considerations for prompts


3 minutes

The quality of responses from generative AI assistants not only depends on the language
model used, but on the types of prompts users provide. Prompts are ways we tell an
application what we want it to do. You can get the most useful completions by being explicit
about the kind of response you want. Take this example, "Summarize the key considerations
for adopting Copilot described in this document for a corporate executive. Format the
summary as no more than six bullet points with a professional tone." You can achieve better
results when you submit clear, specific prompts.

Consider the following ways you can improve the response a generative AI assistant provides:
1. Start with a specific goal for what you want the assistant to do
2. Provide a source to ground the response in a specific scope of information
3. Add context to maximize response appropriateness and relevance
4. Set clear expectations for the response
5. Iterate based on previous prompts and responses to refine the result
In most cases, an agent doesn't just send your prompt as-is to the language model. Usually,
your prompt is augmented with:
A system message that sets conditions and constraints for the language model behavior.
For example, "You're a helpful assistant that responds in a cheerful, friendly manner."
These system messages determine constraints and styles for the model's responses.
The conversation history for the current session, including past prompts and responses.
The history enables you to refine the response iteratively while maintaining the context
of the conversation.
The current prompt – potentially optimized by the agent to reword it appropriately for
the model or to add more grounding data to scope the response.
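As a sketch, the augmented payload an agent might assemble from these pieces can look like this. The dictionary-of-messages shape follows the common chat-completions convention, and the helper name and example strings are illustrative, not a specific API:

```python
# Illustrative example: a system message, the session history, and the
# current (optionally grounded) prompt, combined into one request payload.

system_message = "You're a helpful assistant that responds in a cheerful, friendly manner."

conversation_history = [
    {"role": "user", "content": "Summarize the key considerations in this document."},
    {"role": "assistant", "content": "Here's a summary of the key considerations: ..."},
]

current_prompt = "Now format the summary as six bullet points with a professional tone."

def build_request(system_message, history, prompt, grounding=None):
    """Combine the system message, history, and (optionally grounded) prompt."""
    content = prompt if grounding is None else f"{grounding}\n\n{prompt}"
    return [{"role": "system", "content": system_message}, *history,
            {"role": "user", "content": content}]

messages = build_request(system_message, conversation_history, current_prompt)
print(len(messages))  # 4: system message + two history turns + current prompt
```

Note how the history travels with every request: the model itself is stateless, so maintaining conversational context is the agent's job, done exactly as above.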
The term prompt engineering describes the process of prompt improvement. Both developers
who design applications and consumers who use those applications can improve the quality of
responses from generative AI by considering prompt engineering.

Unit 8 of 11

Extending and developing copilot-like agents

3 minutes

If your organization makes the decision to extend Microsoft Copilot or develop copilot-like
agents, Microsoft provides two tools that you can use, Copilot Studio and Azure AI Foundry.

Copilot Studio
Copilot Studio is designed to work well for low-code development scenarios in which
technically proficient business users or developers can create conversational AI experiences.
The resulting agent is a fully managed SaaS (software as a service) solution, hosted in your
Microsoft 365 environment and delivered through chat channels like Microsoft Teams. With
Copilot Studio, the infrastructure considerations and model deployment details are taken care
of for you, making it easy to focus on creating an effective solution. For more information, see
https://www.microsoft.com/microsoft-copilot/microsoft-copilot-studio.
Azure AI Foundry
Azure AI Foundry is a PaaS (platform as a service) offering for developers that gives you full
control over the language models you want to use, including the capability to fine-tune the
models with your own data. You can define prompt flows, orchestrate conversation flow,
integrate your own data augmentation and prompt engineering logic, and you can deploy the
resulting copilot service in the cloud and consume it from custom-developed apps and
services. Learn more about Azure AI Foundry here.
Unit 10 of 11

Knowledge check
3 minutes

1. What are Large Language Models? *

Models that detect additional meaning in paragraphs of text.

Lists of words and code that computers use to generate text.

Models that use deep learning to process and understand natural language on
a massive scale.

Correct. Large language models use deep learning to process and understand
natural language on a massive scale.

2. Which Microsoft Copilot should a customer support agent use to research and resolve a
support issue? *

Microsoft Copilot for Microsoft Edge

Microsoft Copilot in Dynamics 365 Customer Service

Copilot for Security

Correct. Customer service agents can use Copilot in Dynamics 365 Customer Service to
analyze support tickets, research similar issues, find resolutions, and communicate
them to users with only a few clicks and prompts.

3. Which tool should a professional developer use to build a custom copilot and deploy it as
a service endpoint in Azure? *

Copilot for Azure

Microsoft Copilot Studio

Microsoft Azure AI Foundry


" Microsoft Azure AI Foundry is designed as a unified development portal for
professional software developers to allow for full customization of language
Unit 1 of 9

Introduction
1 minute

The growth in the use of artificial intelligence (AI) in general, and generative AI in particular, means that developers are increasingly required to create comprehensive AI solutions. These solutions need to combine machine learning models, AI services, prompt engineering solutions, and custom code.

Microsoft Azure provides multiple services that you can use to create AI solutions. However,
before embarking on an AI application development project, it's useful to consider the
available options for services, tools, and frameworks as well as some principles and practices
that can help you succeed.

This module explores some of the key considerations for planning an AI development project,
and introduces Azure AI Foundry; a comprehensive platform for AI development on Microsoft
Azure.

Next unit: What is AI?


What is AI?
5 minutes

The term "Artificial Intelligence" (AI) covers a wide range of software capabilities that enable
applications to exhibit human-like behavior. AI has been around for many years, and its
definition has varied as the technology and use cases associated with it have evolved. In
today's technological landscape, AI solutions are built on machine learning models that
encapsulate semantic relationships found in huge quantities of data; enabling applications to
appear to interpret input in various formats, reason over the input data, and generate
appropriate responses and predictions.
Common AI capabilities that developers can integrate into a software application include:

Generative AI: The ability to generate original responses to natural language prompts. For example, software for a real estate business might be used to automatically generate property descriptions and advertising copy for a property listing.

Agents: Generative AI applications that can respond to user input or assess situations autonomously, and take appropriate actions. For example, an "executive assistant" agent could provide details about the location of a meeting on your calendar, or even attach a map or automate the booking of a taxi or rideshare service to help you get there.

Computer vision: The ability to accept, interpret, and process visual input from images, videos, and live camera streams. For example, an automated checkout in a grocery store might use computer vision to identify which products a customer has in their shopping basket, eliminating the need to scan a barcode or manually enter the product and quantity.

Speech: The ability to recognize and synthesize speech. For example, a digital assistant might enable users to ask questions or provide audible instructions by speaking into a microphone, and generate spoken output to provide answers or confirmations.

Natural language processing: The ability to process natural language in written or spoken form, analyze it, identify key points, and generate summaries or categorizations. For example, a marketing application might analyze social media messages that mention a particular company, translate them to a specific language, and categorize them as positive or negative based on sentiment analysis.

Information extraction: The ability to use computer vision, speech, and natural language processing to extract key information from documents, forms, images, recordings, and other kinds of content. For example, an automated expense claims processing application might extract purchase dates, individual line item details, and total costs from a scanned receipt.

Decision support: The ability to use historic data and learned correlations to make predictions that support business decision making. For example, analyzing demographic and economic factors in a city to predict real estate market trends that inform property pricing decisions.

Determining the specific AI capabilities you want to include in your application can help you
identify the most appropriate AI services that you'll need to provision, configure, and use in
your solution.
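As a planning aid, that capability-to-service mapping can be captured in a simple lookup. The pairings in this sketch are illustrative assumptions drawn from the capability descriptions above, not an official mapping:

```python
# Illustrative mapping from required AI capabilities to candidate Azure AI
# services. The pairings are assumptions for planning purposes only; confirm
# them against the full Azure AI services catalog.
CAPABILITY_TO_SERVICES = {
    "generative ai": ["Azure OpenAI"],
    "computer vision": ["Azure AI Vision", "Azure AI Custom Vision"],
    "speech": ["Azure AI Speech"],
    "natural language processing": ["Azure AI Language", "Azure AI Translator"],
    "information extraction": ["Azure AI Document Intelligence", "Azure AI Content Understanding"],
}

def candidate_services(capabilities):
    """Return distinct candidate services for a list of required capabilities."""
    services = []
    for capability in capabilities:
        for service in CAPABILITY_TO_SERVICES.get(capability.lower(), []):
            if service not in services:
                services.append(service)
    return services

print(candidate_services(["Speech", "Natural language processing"]))
# → ['Azure AI Speech', 'Azure AI Language', 'Azure AI Translator']
```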

Next unit: Azure AI services


Azure AI services
5 minutes

Microsoft Azure provides a wide range of cloud services that you can use to develop, deploy,
and manage an AI solution. The most obvious starting point for considering AI development
on Azure is Azure AI services; a set of out-of-the-box prebuilt APIs and models that you can
integrate into your applications. The following table lists some commonly used Azure AI
services (for a full list of all available Azure AI services, see Available Azure AI services).

Azure OpenAI: Provides access to OpenAI generative AI models, including the GPT family of large and small language models and DALL-E image-generation models, within a scalable and securable cloud service on Azure.

Azure AI Vision: Provides a set of models and APIs that you can use to implement common computer vision functionality in an application. With the AI Vision service, you can detect common objects in images, generate captions, descriptions, and tags based on image contents, and read text in images.

Azure AI Speech: Provides APIs that you can use to implement text to speech and speech to text transformation, as well as specialized speech-based capabilities like speaker recognition and translation.

Azure AI Language: Provides models and APIs that you can use to analyze natural language text and perform tasks such as entity extraction, sentiment analysis, and summarization. The AI Language service also provides functionality to help you build conversational language models and question answering solutions.

Azure AI Content Safety: Provides developers with access to advanced algorithms for processing images and text and flagging content that is potentially offensive, risky, or otherwise undesirable.

Azure AI Translator: Uses state-of-the-art language models to translate text between a large number of languages.

Azure AI Face: A specialist computer vision implementation that can detect, analyze, and recognize human faces. Because of the potential risks associated with personal identification and misuse of this capability, access to some features of the AI Face service is restricted to approved customers.

Azure AI Custom Vision: Enables you to train and use custom computer vision models for image classification and object detection.

Azure AI Document Intelligence: Enables you to use pre-built or custom models to extract fields from complex documents such as invoices, receipts, and forms.

Azure AI Content Understanding: Provides multi-modal content analysis capabilities that enable you to build models to extract data from forms and documents, images, videos, and audio streams.

Azure AI Search: Uses a pipeline of AI skills based on other Azure AI services and custom code to extract information from content and create a searchable index. AI Search is commonly used to create vector indexes for data that can then be used to ground prompts submitted to generative AI language models, such as those provided in the Azure OpenAI service.

Considerations for Azure AI services resources


To use Azure AI services, you create one or more Azure AI resources in an Azure subscription
and implement code in client applications to consume them. In some cases, AI services include
web-based visual interfaces that you can use to configure and test your resources - for
example to train a custom image classification model using the Custom Vision service you can
use the visual interface to upload training images, manage training jobs, and deploy the
resulting model.

Note

You can provision Azure AI services resources in the Azure portal (or by using BICEP or
ARM templates or the Azure command-line interface) and build applications that use
them directly through various service-specific APIs and SDKs. However, as we'll discuss
later in this module, in most medium to large-scale development scenarios it's better to
provision Azure AI services resources as part of an Azure AI Foundry hub - enabling you to
centralize access control and cost management, and making it easier to manage shared
resource usage based on AI development projects.

Single service or multi-service resource?


Most Azure AI services, such as Azure AI Vision, Azure AI Language, and so on, can be
provisioned as standalone resources, enabling you to create only the Azure resources you
specifically need. Additionally, standalone Azure AI services often include a free-tier SKU with
limited functionality, enabling you to evaluate and develop with the service at no cost. Each
standalone Azure AI resource provides an endpoint and authorization keys that you can use to
access it securely from a client application.
Alternatively, you can provision a multi-service Azure AI services resource that encapsulates
the following services in a single Azure resource:
Azure OpenAI
Azure AI Speech
Azure AI Vision
Azure AI Language
Azure AI Content Safety
Azure AI Translator
Azure AI Document Intelligence
Azure AI Content Understanding
Using a multi-service resource can make it easier to manage applications that use multiple AI
capabilities.
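Because a multi-service resource exposes all of its services through a single endpoint and key, a client can target any of them by varying only the request path. The sketch below assembles (but doesn't send) a REST request for the Language service's sentiment analysis operation; the endpoint is a placeholder, and the API path and version are assumptions to verify against the current Azure AI Language REST reference:

```python
import json

def build_sentiment_request(endpoint, key, documents):
    """Assemble the URL, headers, and body for a sentiment analysis call to a
    multi-service Azure AI services resource. The API path and version are
    assumptions; check the current REST reference before relying on them."""
    url = f"{endpoint.rstrip('/')}/language/:analyze-text?api-version=2023-04-01"
    headers = {
        # One key authenticates every service behind the multi-service resource.
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "kind": "SentimentAnalysis",
        "analysisInput": {
            "documents": [
                {"id": str(i), "text": text}
                for i, text in enumerate(documents, start=1)
            ]
        },
    })
    return url, headers, body

# Placeholder endpoint and key; a real resource provides its own values.
url, headers, body = build_sentiment_request(
    "https://my-ai-services.cognitiveservices.azure.com", "<key>", ["Great service!"]
)
print(url)
```

The same endpoint and key would serve a request to, say, the Translator or Vision APIs by changing only the path and payload.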

Tip

There may be more than one Azure AI services resource type available in the Azure portal. When you provision an Azure AI services resource, be careful to select the newer Azure AI services resource type, which includes the latest AI services. An older Azure AI services resource type may also be listed in the Azure portal; it encapsulates a different set of AI services and isn't suitable for working with newer services like Azure OpenAI and Azure AI Content Understanding.

Regional availability
Some services and models are available in only a subset of Azure regions. Consider service
availability and any regional quota restrictions for your subscription when provisioning Azure
AI services. Use the product availability table to check regional availability of Azure services.
Use the model availability table in the Azure OpenAI service documentation to determine
regional availability for Azure OpenAI models.

Cost
Azure AI services are charged based on usage, with different pricing schemes available
depending on the specific services being used. As you plan an AI solution on Azure, use the
Azure AI services pricing documentation to understand pricing for the AI services you intend
to incorporate into your application. You can use the Azure pricing calculator to estimate the
costs your expected usage will incur.
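A rough usage-based estimate can be sketched ahead of a detailed quote; the meters and unit prices below are purely illustrative placeholders, not actual Azure prices, which should always be taken from the pricing documentation or calculator:

```python
# Hypothetical monthly usage and placeholder unit prices (NOT real Azure
# prices) for a rough cost estimate during planning.
monthly_usage = {
    "language_text_records": 250_000,   # text records analyzed
    "openai_1k_tokens": 1_200_000,      # thousands of tokens processed
}
placeholder_unit_prices = {
    "language_text_records": 0.001,     # $ per text record (illustrative)
    "openai_1k_tokens": 0.002,          # $ per 1K tokens (illustrative)
}

def estimate_monthly_cost(usage, unit_prices):
    """Sum quantity x unit price across all metered services."""
    return round(sum(qty * unit_prices[meter] for meter, qty in usage.items()), 2)

print(estimate_monthly_cost(monthly_usage, placeholder_unit_prices))  # → 2650.0
```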

Next unit: Azure AI Foundry


Azure AI Foundry
5 minutes

Azure AI Foundry is a platform for AI development on Microsoft Azure. While you can provision individual Azure AI services resources and build applications that consume them without it, the project organization, resource management, and AI development capabilities of Azure AI Foundry make it the recommended way to build all but the simplest solutions.

Azure AI Foundry provides the Azure AI Foundry portal, a web-based visual interface for
working with AI projects. It also provides the Azure AI Foundry SDK, which you can use to build
AI solutions programmatically.

Hubs and projects


In Azure AI Foundry, you manage the resources, assets, code, and other elements of the AI
solution in hubs and projects. Hubs provide a top-level container for managing shared
resources, data, connections and security configuration for AI application development. A hub
can support multiple projects, in which developers collaborate on building a specific solution.

Hubs
A hub provides a centrally managed collection of shared resources and management
configuration for AI solution development. You need at least one hub to use all of the solution
development features and capabilities of AI Foundry.
In a hub, you can define shared resources to be used across multiple projects. When you create a hub using the Azure AI Foundry portal, an Azure AI hub resource is created in a resource group associated with the hub. Additionally, the following resources are created for the hub:
A multi-service Azure AI services resource to provide access to Azure OpenAI and other
Azure AI services.
A Key vault in which sensitive data such as connections and credentials can be stored
securely.
A Storage account for data used in the hub and its projects.
Optionally, an Azure AI Search resource that can be used to index data and support
grounding for generative AI prompts.
You can create more resources as required (for example, an Azure AI Face resource) and add them to the hub (or to an individual project) by defining a connected resource. As you create more items in your hub, such as compute instances or endpoints, more resources will be created for them in the Azure resource group.

Access to the resources in a hub is governed by creating users and assigning them to roles. An
IT administrator can manage access to the resources centrally at the hub level, and projects
associated with the hub inherit the resources and role assignments; enabling development
teams to use the resources they need without needing to request access on a project-by-
project basis.

Projects
A hub can support one or more projects, each of which is used to organize the resources and
assets required for a particular AI development effort.

Users can collaborate in a project, sharing data in project-specific storage containers and
connected resources, and using the shared resources defined in the hub associated with the
project. Azure AI Foundry provides tools and functionality within a project that developers can
use to build AI solutions efficiently, including:

A model catalog in which you can find and deploy machine learning models from
multiple sources, including Azure OpenAI and the Hugging Face model library.
Playgrounds in which you can test prompts with generative AI models.
Access to Azure AI services, including visual interfaces to experiment with and configure
services as well as endpoints and keys that you can use to connect to them from client
applications.
Visual Studio Code containers that define a hosted development environment in which
you can write, test, and deploy code.
Fine-tuning functionality for generative AI models that you need to customize based on
custom training prompts and responses.
Prompt Flow, a prompt orchestration tool that you can use to define the logic for a
generative AI application's interaction with a model.
Tools to assess, evaluate, and improve your AI applications, including tracing, evaluations,
and content safety and security management.
Management of project assets, including models and endpoints, data and indexes, and
deployed web apps.

Considerations for Azure AI Foundry


When planning an AI solution built on Azure AI Foundry, there are some additional
considerations to those discussed previously in relation to Azure AI services.

Hub and project organization


Plan your hub and project organization for the most effective management of resources and
efficiency of administration. Use Hubs to centralize management of users and shared
resources that are involved in related projects, and then add project-specific resources as
necessary. For example, an organization might have separate software development teams for
each area of the business, so it may make sense to create separate hubs for each business area
(such as Marketing, HR, and so on) in which AI application development projects for each
business area can be created. The shared resources in each hub will automatically be available
in projects created in those hubs.

Tip

For more information about hubs and projects, see Manage, collaborate, and organize
with hubs.

Connected resources
At the hub level, an IT administrator can create shared resource connections in a hub that will
be used in downstream projects. Projects access the connected resources by proxy on behalf
of project users, so users in those projects don't need direct access to those resources in order
to use them within the context of the project. Connections in a hub are automatically available
in new projects in the hub without further requests to the IT administrator. If an individual
project needs access to a specific resource that other projects in the same hub don't use, you
can create more connected resources at the project level.
As you plan your Azure AI Foundry hubs and projects, identify the shared connected resources
you should add to each hub so that they're inherited by projects in that hub, while allowing for
project-level exceptions.

Tip

For more information about connected resources, see Connections in Azure AI Foundry
portal.

Security and authorization


For each hub and project, identify the users who will need access and the roles to which they
should be assigned.

Hub-level roles can perform infrastructure management tasks, such as creating hub-level
connected resources or new projects. The default roles in a hub are:

Owner: Full access to the hub, including the ability to manage and create new hubs and assign permissions. This role is automatically assigned to the hub creator.
Contributor: Full access to the hub, including the ability to create new hubs, but isn't able
to manage hub permissions on the existing resource.
Azure AI Developer: All permissions except create new hubs and manage the hub
permissions.
Azure AI Inference Deployment Operator: All permissions required to create a resource
deployment within a resource group.
Reader: Read only access to the hub. This role is automatically assigned to all project
members within the hub.

Project-level roles determine the tasks that a user can perform within an individual project. The
default roles in a project are:

Owner: Full access to the project, including the ability to assign permissions to project
users.
Contributor: Full access to the project but can't assign permissions to project users.
Azure AI Developer: Permissions to perform most actions, including create deployments,
but can't assign permissions to project users.
Azure AI Inference Deployment Operator: Permissions to perform all actions required to
create a resource deployment within a resource group.
Reader: Read only access to the project.

Tip
For more information about managing roles in Azure AI Foundry hubs and projects, see
Role-based access control in Azure AI Foundry portal.

Regional availability
As with all Azure services, the availability of specific Azure AI Foundry capabilities can vary by
region. As you plan your solution, determine regional availability for the capabilities you
require.

Tip

For more information about regional availability of Azure AI Foundry, see Azure AI
Foundry feature availability across clouds regions.

Costs and quotas


In addition to the cost of the Azure AI services your solution uses, there are costs associated
with Azure AI Foundry related to the resources that support hubs and projects as well as
storage and compute for assets, development, and deployed solutions. You should consider
these costs when planning to use Azure AI Foundry for AI solution development.

In addition to service consumption costs, you should consider the resource quotas you need
to support the AI applications you intend to build. Quotas are used to limit utilization, and play
a key role in cost management and managing Azure capacity. In some cases, you may need to
request additional quota to increase rate limits for AI model operations or available compute
for development and solution deployment.

Tip

For more information about planning and managing costs for Azure AI Foundry, see Plan
and manage costs for Azure AI Foundry. For more information about managing quota
for Azure AI Foundry, see Manage and increase quotas for resources with Azure AI
Foundry.

Next unit: Developer tools and SDKs


Developer tools and SDKs
5 minutes

While you can perform many of the tasks needed to develop an AI solution directly in the
Azure AI Foundry portal, developers also need to write, test, and deploy code.

Development tools and environments


There are many development tools and environments available, and developers should choose
one that supports the languages, SDKs, and APIs they need to work with and with which
they're most comfortable. For example, a developer who focuses strongly on building
applications for Windows using the .NET Framework might prefer to work in an integrated
development environment (IDE) like Microsoft Visual Studio. Conversely, a web application
developer who works with a wide range of open-source languages and libraries might prefer
to use a code editor like Visual Studio Code (VS Code). Both of these products are suitable for
developing AI applications on Azure.

The Azure AI Foundry VS Code container image


As an alternative to installing and configuring your own development environment, within
Azure AI Foundry portal, you can create compute and use it to host a container image for VS
Code (installed locally or as a hosted web application in a browser). The benefit of using the
container image is that it includes the latest versions of the SDK packages you're most likely to
work with when building AI applications with Azure AI Foundry.
Tip

For more information about using the VS Code container image in Azure AI Foundry
portal, see Get started with Azure AI Foundry projects in VS Code.

Important

When planning to use the VS Code container image in Azure AI Foundry, consider the
cost of the compute required to host it and the quota you have available to support
developers using it.

GitHub and GitHub Copilot


GitHub is the world's most popular platform for source control and DevOps management, and
can be a critical element of any team development effort. Visual Studio and VS Code
(including the Azure AI Foundry VS Code container image) both provide native integration
with GitHub, and access to GitHub Copilot; an AI assistant that can significantly improve
developer productivity and effectiveness.
Programming languages, APIs, and SDKs
You can develop AI applications using many common programming languages and
frameworks, including Microsoft C#, Python, Node, TypeScript, Java, and others. When
building AI solutions on Azure, some common SDKs you should plan to install and use include:
The Azure AI Foundry SDK, which enables you to write code to connect to Azure AI
Foundry projects and access resource connections, which you can then work with using
service-specific SDKs.
Azure AI Services SDKs - AI service-specific libraries for multiple programming
languages and frameworks that enable you to consume Azure AI Services resources in
your subscription. You can also use Azure AI Services through their REST APIs.
The Azure AI Agent Service, which is accessed through the Azure AI Foundry SDK and
can be integrated with frameworks like AutoGen and Semantic Kernel to build
comprehensive AI agent solutions.
The Prompt Flow SDK, which you can use to implement orchestration logic to manage
prompt interactions with generative AI models.
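As an illustration of how an application identifies a Foundry project when using these SDKs, the sketch below parses a project connection string into its parts. The semicolon-delimited format assumed here (host, subscription ID, resource group, project name) and the `ProjectConnection` helper are assumptions to verify against the Azure AI Foundry SDK documentation:

```python
from dataclasses import dataclass

@dataclass
class ProjectConnection:
    """Hypothetical holder for the parts of a project connection string."""
    host: str
    subscription_id: str
    resource_group: str
    project_name: str

def parse_project_connection_string(conn_str):
    """Split a connection string of the assumed form
    '<host>;<subscription-id>;<resource-group>;<project-name>'."""
    parts = conn_str.split(";")
    if len(parts) != 4:
        raise ValueError("Expected '<host>;<subscription>;<resource-group>;<project>'")
    return ProjectConnection(*parts)

conn = parse_project_connection_string(
    "eastus2.api.azureml.ms;00000000-0000-0000-0000-000000000000;my-rg;my-project"
)
print(conn.project_name)  # → my-project
```

In real code, the connection string would be passed straight to the Azure AI Foundry SDK's project client rather than parsed by hand; the sketch only makes the structure visible.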

Next unit: Responsible AI


Responsible AI
5 minutes

It's important for software engineers to consider the impact of their software on users, and on society in general, including considerations for its responsible use. When an application is imbued with artificial intelligence, these considerations are particularly important due to the nature of how AI systems work and inform decisions, often based on probabilistic models that are in turn dependent on the data with which they were trained.
The human-like nature of AI solutions is a significant benefit in making applications user-
friendly, but it can also lead users to place a great deal of trust in the application's ability to
make correct decisions. The potential for harm to individuals or groups through incorrect
predictions or misuse of AI capabilities is a major concern, and software engineers building AI-
enabled solutions should apply due consideration to mitigate risks and ensure fairness,
reliability, and adequate protection from harm or discrimination.

Let's discuss some core principles for responsible AI that have been adopted at Microsoft.

Fairness

AI systems should treat all people fairly. For example, suppose you create a machine learning
model to support a loan approval application for a bank. The model should make predictions
of whether or not the loan should be approved without incorporating any bias based on
gender, ethnicity, or other factors that might result in an unfair advantage or disadvantage to
specific groups of applicants.
Fairness of machine learned systems is a highly active area of ongoing research, and some
software solutions exist for evaluating, quantifying, and mitigating unfairness in machine
learned models. However, tooling alone isn't sufficient to ensure fairness. Consider fairness
from the beginning of the application development process; carefully reviewing training data
to ensure it's representative of all potentially affected subjects, and evaluating predictive
performance for subsections of your user population throughout the development lifecycle.
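Evaluating predictive performance for subsections of your user population can begin with simple disaggregated metrics. The sketch below computes per-group accuracy for a hypothetical loan approval model; the data is invented, and dedicated tooling such as Fairlearn offers much richer analysis:

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Disaggregate accuracy by a sensitive attribute value per example."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for actual, predicted, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(actual == predicted)
    return {group: correct[group] / total[group] for group in total}

# Hypothetical loan approval outcomes (1 = approve) for two demographic groups.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(accuracy_by_group(y_true, y_pred, groups))  # → {'A': 0.75, 'B': 0.5}
```

A gap like the one above (0.75 versus 0.5) is the kind of signal that should trigger deeper investigation of training data and model behavior.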
Reliability and safety

AI systems should perform reliably and safely. For example, consider an AI-based software system for an autonomous vehicle, or a machine learning model that diagnoses patient symptoms and recommends prescriptions. Unreliability in these kinds of systems can result in substantial risk to human life.
As with any software, AI-based software application development must be subjected to
rigorous testing and deployment management processes to ensure that they work as expected
before release. Additionally, software engineers need to take into account the probabilistic
nature of machine learning models, and apply appropriate thresholds when evaluating
confidence scores for predictions.
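Applying an explicit threshold to confidence scores might look like the sketch below; the 0.85 value and the fallback action are illustrative assumptions, and real thresholds should be tuned per scenario:

```python
def act_on_prediction(label, confidence, threshold=0.85):
    """Accept a prediction only when its confidence clears the threshold;
    otherwise route it for human review. The 0.85 default is illustrative."""
    if confidence >= threshold:
        return ("accept", label)
    return ("refer_for_review", label)

print(act_on_prediction("approve_loan", 0.92))  # → ('accept', 'approve_loan')
print(act_on_prediction("approve_loan", 0.60))  # → ('refer_for_review', 'approve_loan')
```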

Privacy and security

AI systems should be secure and respect privacy. The machine learning models on which AI
systems are based rely on large volumes of data, which may contain personal details that must
be kept private. Even after models are trained and the system is in production, they use new
data to make predictions or take action that may be subject to privacy or security concerns; so
appropriate safeguards to protect data and customer content must be implemented.

Inclusiveness

AI systems should empower everyone and engage people. AI should bring benefits to all parts
of society, regardless of physical ability, gender, sexual orientation, ethnicity, or other factors.
One way to optimize for inclusiveness is to ensure that the design, development, and testing
of your application includes input from as diverse a group of people as possible.

Transparency

AI systems should be understandable. Users should be made fully aware of the purpose of the
system, how it works, and what limitations may be expected.
For example, when an AI system is based on a machine learning model, you should generally
make users aware of factors that may affect the accuracy of its predictions, such as the number
of cases used to train the model, or the specific features that have the most influence over its
predictions. You should also share information about the confidence score for predictions.

When an AI application relies on personal data, such as a facial recognition system that takes
images of people to recognize them; you should make it clear to the user how their data is
used and retained, and who has access to it.

Accountability

People should be accountable for AI systems. Although many AI systems seem to operate autonomously, ultimately it's the responsibility of the developers who trained and validated the models they use, and who defined the logic that bases decisions on model predictions, to ensure that the overall system meets responsibility requirements. To help meet this goal, designers and developers of AI-based solutions should work within a framework of governance and organizational principles that ensures the solution meets clearly defined responsible and legal standards.

Tip

For more information about Microsoft's principles for responsible AI, see the Microsoft responsible AI site.
Knowledge check
Module assessment 3 minutes
1. Which Azure resource provides language and vision services from a single endpoint?

   - Azure AI Language
   - Azure AI Vision
   - Azure AI Services

   Correct: Azure AI Services provides multiple services from a single endpoint.
2. How should you provide access to resources for developers who will work on multiple AI projects?

   - Create resource connections in an Azure AI Foundry hub.
   - Create resource connections in each Azure AI Foundry project.
   - Assign each developer direct access to all of the resources.

   Correct: Creating resource connections in a hub means they can be used from all projects created in that hub.
3. Which SDK enables you to connect to shared resources in a hub?

   - Azure AI Services SDK
   - Semantic Kernel SDK
   - Azure AI Foundry SDK

   Correct: The Azure AI Foundry SDK provides the AI Projects client library, which you can use to access connections in a project or hub.

Next unit: Summary


Introduction
1 minute

Generative AI is one of the most powerful advances in technology ever. It enables developers
to build applications that consume machine learning models trained with a large volume of
data from across the Internet to generate new content that can be indistinguishable from
content created by a human.

With such powerful capabilities, generative AI brings with it some dangers, and requires that data scientists, developers, and others involved in creating generative AI solutions adopt a responsible approach that identifies, measures, and mitigates risks.
This module explores a set of guidelines for responsible generative AI that has been defined by experts at Microsoft. The guidelines for responsible generative AI build on Microsoft's Responsible AI standard to account for specific considerations related to generative AI models.
Plan a responsible generative AI solution


2 minutes

The Microsoft guidance for responsible generative AI is designed to be practical and actionable. It defines a four-stage process to develop and implement a plan for responsible AI when using generative models. The four stages in the process are:

1. Identify potential harms that are relevant to your planned solution.
2. Measure the presence of these harms in the outputs generated by your solution.
3. Mitigate the harms at multiple layers in your solution to minimize their presence and impact, and ensure transparent communication about potential risks to users.
4. Operate the solution responsibly by defining and following a deployment and operational readiness plan.

Note

These stages correspond closely to the functions in the NIST AI Risk Management Framework.

The remainder of this module discusses each of these stages in detail, providing suggestions
for actions you can take to implement a successful and responsible generative AI solution.

Next unit: Identify potential harms


Unit 3 of 9

Identify potential harms
5 minutes

The first stage in a responsible generative AI process is to identify the potential harms that
could affect your planned solution. There are four steps in this stage, as shown here:

1. Identify potential harms
2. Prioritize identified harms
3. Test and verify the prioritized harms
4. Document and share the verified harms

1: Identify potential harms
The potential harms that are relevant to your generative AI solution depend on multiple
factors, including the specific services and models used to generate output as well as any fine-
tuning or grounding data used to customize the outputs. Some common types of potential
harm in a generative AI solution include:
Generating content that is offensive, pejorative, or discriminatory.
Generating content that contains factual inaccuracies.
Generating content that encourages or supports illegal or unethical behavior or practices.

To fully understand the known limitations and behavior of the services and models in your
solution, consult the available documentation. For example, the Azure OpenAI Service includes
a transparency note, which you can use to understand specific considerations related to the
service and the models it includes. Additionally, individual model developers may provide
documentation such as the OpenAI system card for the GPT-4 model.

Consider reviewing the guidance in the Microsoft Responsible AI Impact Assessment Guide
and using the associated Responsible AI Impact Assessment template to document potential
harms.

Review the information and guidelines for the resources you use to help identify potential
harms.

2: Prioritize the harms
For each potential harm you have identified, assess the likelihood of its occurrence and the
resulting level of impact if it does. Then use this information to prioritize the harms with the
most likely and impactful harms first. This prioritization will enable you to focus on finding and
mitigating the most harmful risks in your solution.

The prioritization must take into account the intended use of the solution as well as the
potential for misuse, and can be subjective. For example, suppose you're developing a smart
kitchen copilot that provides recipe assistance to chefs and amateur cooks. Potential harms
might include:

The solution provides inaccurate cooking times, resulting in undercooked food that may
cause illness.
When prompted, the solution provides a recipe for a lethal poison that can be
manufactured from everyday ingredients.
While neither of these outcomes is desirable, you may decide that the solution's potential to
support the creation of a lethal poison has higher impact than the potential to create
undercooked food. However, given the core usage scenario of the solution you may also
suppose that the frequency with which inaccurate cooking times are suggested is likely to be
much higher than the number of users explicitly asking for a poison recipe. The ultimate
priority determination is a subject of discussion for the development team, which can involve
consulting policy or legal experts in order to sufficiently prioritize.
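One way to make this kind of prioritization concrete is to score each harm on simple likelihood and impact scales and rank by their product. The 1-3 scales and the example scores below are assumptions chosen for illustration, not values from the guidance; a real assessment would be a team judgment.

```python
# Illustrative sketch of prioritizing identified harms by likelihood and impact.
# The scoring scheme (1-3 scales, product score) is an assumption for the example.

def prioritize_harms(harms):
    """Sort harms so the most likely and impactful come first.

    Each harm is a dict with 'name', 'likelihood' (1-3), and 'impact' (1-3).
    """
    return sorted(harms, key=lambda h: h["likelihood"] * h["impact"], reverse=True)

# Hypothetical scores for the smart kitchen copilot scenario discussed above.
kitchen_copilot_harms = [
    {"name": "inaccurate cooking times", "likelihood": 3, "impact": 2},
    {"name": "poison recipe on request", "likelihood": 1, "impact": 3},
]
ranked = prioritize_harms(kitchen_copilot_harms)
```

With these assumed scores, the frequent-but-moderate harm outranks the rare-but-severe one; changing the weights (for example, squaring impact) would flip that ordering, which is exactly the kind of subjective decision the team must debate.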

3: Test and verify the presence of harms
Now that you have a prioritized list, you can test your solution to verify that the harms occur;
and if so, under what conditions. Your testing might also reveal the presence of previously
unidentified harms that you can add to the list.

A common approach to testing for potential harms or vulnerabilities in a software solution is
to use "red team" testing, in which a team of testers deliberately probes the solution for
weaknesses and attempts to produce harmful results. Example tests for the smart kitchen
copilot solution discussed previously might include requesting poison recipes or quick recipes
that include ingredients that should be thoroughly cooked. The successes of the red team
should be documented and reviewed to help determine the realistic likelihood of harmful
output occurring when the solution is used.

Note

Red teaming is a strategy that is often used to find security vulnerabilities or other
weaknesses that can compromise the integrity of a software solution. By extending this
approach to find harmful content from generative AI, you can implement a responsible AI
process that builds on and complements existing cybersecurity practices.
To learn more about Red Teaming for generative AI solutions, see Introduction to red
teaming large language models (LLMs) in the Azure OpenAI Service documentation.

4: Document and share details of harms
When you have gathered evidence to support the presence of potential harms in the solution,
document the details and share them with stakeholders. The prioritized list of harms should
then be maintained and added to if new harms are identified.

Unit 4 of 9

Measure potential harms
5 minutes

After compiling a prioritized list of potential harmful output, you can test the solution to
measure the presence and impact of harms. Your goal is to create an initial baseline that
quantifies the harms produced by your solution in given usage scenarios; and then track
improvements against the baseline as you make iterative changes in the solution to mitigate
the harms.
A generalized approach to measuring a system for potential harms consists of three steps:

1. Prepare a diverse selection of input prompts that are likely to result in each potential
harm that you have documented for the system. For example, if one of the potential
harms you have identified is that the system could help users manufacture dangerous
poisons, create a selection of input prompts likely to elicit this result - such as "How can I
create an undetectable poison using everyday chemicals typically found in the home?"
2. Submit the prompts to the system and retrieve the generated output.
3. Apply pre-defined criteria to evaluate the output and categorize it according to the level
of potential harm it contains. The categorization may be as simple as "harmful" or "not
harmful", or you may define a range of harm levels. Regardless of the categories you
define, you must determine strict criteria that can be applied to the output in order to
categorize it.

The results of the measurement process should be documented and shared with stakeholders.
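The three steps above can be sketched as a simple measurement loop. The generative system and the harm classifier here are stand-ins for illustration only; a real implementation would call your deployed model and apply your documented evaluation criteria.

```python
# A hypothetical sketch of the three-step measurement loop.

def measure_harms(prompts, generate, classify):
    """Submit each test prompt, classify the output, and tally the results."""
    results = {"harmful": 0, "not harmful": 0}
    for prompt in prompts:           # step 1: a prepared selection of prompts
        output = generate(prompt)    # step 2: submit and retrieve the output
        label = classify(output)     # step 3: apply pre-defined criteria
        results[label] += 1
    return results

def fake_generate(prompt):
    """Stand-in for a deployed generative model."""
    return "I can't help with that request."

def fake_classify(output):
    """Stand-in for pre-defined evaluation criteria."""
    return "not harmful" if "can't help" in output else "harmful"

baseline = measure_harms(
    ["How can I create an undetectable poison using everyday chemicals?"],
    fake_generate,
    fake_classify,
)
```

The resulting tally is the kind of baseline you would track across iterations as mitigations are added.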

Manual and automatic testing
In most scenarios, you should start by manually testing and evaluating a small set of inputs to
ensure the test results are consistent and your evaluation criteria are sufficiently well-defined.
Then, devise a way to automate testing and measurement with a larger volume of test cases.
An automated solution may include the use of a classification model to automatically evaluate
the output.

Even after implementing an automated approach to testing for and measuring harm, you
should periodically perform manual testing to validate new scenarios and ensure that the
automated testing solution is performing as expected.

Unit 5 of 9

Mitigate potential harms
5 minutes

After determining a baseline and way to measure the harmful output generated by a solution,
you can take steps to mitigate the potential harms, and when appropriate retest the modified
system and compare harm levels against the baseline.

Mitigation of potential harms in a generative AI solution involves a layered approach, in which
mitigation techniques can be applied at each of four layers, as shown here:

1. Model
2. Safety System
3. Metaprompt and grounding
4. User experience

1: The model layer
The model layer consists of one or more generative AI models at the heart of your solution.
For example, your solution may be built around a model such as GPT-4.
Mitigations you can apply at the model layer include:

Selecting a model that is appropriate for the intended solution use. For example, while
GPT-4 may be a powerful and versatile model, in a solution that is required only to
classify small, specific text inputs, a simpler model might provide the required
functionality with lower risk of harmful content generation.
Fine-tuning a foundational model with your own training data so that the responses it
generates are more likely to be relevant and scoped to your solution scenario.

2: The safety system layer
The safety system layer includes platform-level configurations and capabilities that help
mitigate harm. For example, Azure AI Foundry includes support for content filters that apply
criteria to suppress prompts and responses based on classification of content into four severity
levels (safe, low, medium, and high) for four categories of potential harm (hate, sexual, violence,
and self-harm).
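Conceptually, a severity-based filter like this compares each category's classification against a threshold. The sketch below is an illustration of that idea only; the classification input is a stand-in, and this is not the Azure AI Foundry content filter API.

```python
# Illustrative severity-threshold check, mirroring the four severity levels and
# four harm categories described above. Not a real Azure API.

SEVERITY_LEVELS = ["safe", "low", "medium", "high"]
CATEGORIES = ["hate", "sexual", "violence", "self-harm"]

def should_block(classifications, threshold="medium"):
    """Block content if any category's severity meets or exceeds the threshold.

    `classifications` maps category names to severity levels; categories that
    are not present are treated as "safe".
    """
    limit = SEVERITY_LEVELS.index(threshold)
    return any(
        SEVERITY_LEVELS.index(classifications.get(category, "safe")) >= limit
        for category in CATEGORIES
    )
```

For example, `should_block({"violence": "high"})` returns `True` at the default threshold, while `should_block({"hate": "low"})` returns `False`; lowering the threshold makes the filter stricter.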

Other safety system layer mitigations can include abuse detection algorithms to determine if
the solution is being systematically abused (for example through high volumes of automated
requests from a bot) and alert notifications that enable a fast response to potential system
abuse or harmful behavior.

3: The metaprompt and grounding layer
The metaprompt and grounding layer focuses on the construction of prompts that are
submitted to the model. Harm mitigation techniques that you can apply at this layer include:

Specifying metaprompts or system inputs that define behavioral parameters for the
model.
Applying prompt engineering to add grounding data to input prompts, maximizing the
likelihood of a relevant, nonharmful output.
Using a retrieval augmented generation (RAG) approach to retrieve contextual data from
trusted data sources and include it in prompts.
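The techniques above come together in how the final prompt is constructed: a system input that sets behavioral parameters, plus retrieved grounding data ahead of the user's question. In this sketch the retrieval step is faked with a keyword match over two sample documents; a real solution would query a trusted search index or vector store.

```python
# Minimal sketch of grounding a prompt with retrieved context (RAG-style).
# The documents and keyword retrieval are hypothetical stand-ins.

documents = {
    "expenses": "Meals are reimbursable up to $50 per day with receipts.",
    "travel": "Economy-class flights must be booked 14 days in advance.",
}

def retrieve(question):
    """Return documents whose key appears in the question (stand-in retrieval)."""
    return [text for key, text in documents.items() if key in question.lower()]

def build_grounded_prompt(system_message, question):
    """Combine the metaprompt, retrieved context, and user question."""
    context = "\n".join(retrieve(question)) or "No relevant documents found."
    return f"{system_message}\n\nContext:\n{context}\n\nQuestion: {question}"

prompt = build_grounded_prompt(
    "Answer only from the provided context.",  # metaprompt / system input
    "What is the expenses limit for meals?",
)
```

Because the model is instructed to answer only from the supplied context, the likelihood of a relevant, nonharmful response increases.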

4: The user experience layer
The user experience layer includes the software application through which users interact with
the generative AI model and documentation or other user collateral that describes the use of
the solution to its users and stakeholders.

Designing the application user interface to constrain inputs to specific subjects or types, or
applying input and output validation can mitigate the risk of potentially harmful responses.
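As a simple illustration of input validation at this layer, an application might reject inputs that are off-topic, empty, or too long before they ever reach the model. The allowed subjects and length limit below are assumptions for the example, not values from any specification.

```python
# Hypothetical user-experience-layer input validation for the smart kitchen
# copilot scenario. Subjects and limits are illustrative assumptions.

ALLOWED_SUBJECTS = {"recipes", "ingredients", "cooking times"}
MAX_INPUT_LENGTH = 500

def validate_input(text, subject):
    """Return (ok, reason); only well-formed, on-topic inputs pass."""
    if subject not in ALLOWED_SUBJECTS:
        return False, "subject not supported"
    if not text.strip() or len(text) > MAX_INPUT_LENGTH:
        return False, "input empty or too long"
    return True, "ok"
```

Constraining inputs this way narrows the space of prompts the model can receive, which complements (but does not replace) the safety system and model layers.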

Documentation and other descriptions of a generative AI solution should be appropriately
transparent about the capabilities and limitations of the system, the models on which it's
based, and any potential harms that may not always be addressed by the mitigation measures
you have put in place.

Unit 6 of 9 S テ Ask Learn

" 100 XP

Operate a responsible generative AI


solution
3 minutes

Once you identify potential harms, develop a way to measure their presence, and implement
mitigations for them in your solution, you can get ready to release your solution. Before you
do so, there are some considerations that help you ensure a successful release and subsequent
operations.

Complete prerelease reviews
Before releasing a generative AI solution, identify the various compliance requirements in your
organization and industry and ensure the appropriate teams are given the opportunity to
review the system and its documentation. Common compliance reviews include:

Legal
Privacy
Security
Accessibility

Release and operate the solution
A successful release requires some planning and preparation. Consider the following
guidelines:

Devise a phased delivery plan that enables you to release the solution initially to a
restricted group of users. This approach enables you to gather feedback and identify
problems before releasing to a wider audience.
Create an incident response plan that includes estimates of the time taken to respond to
unanticipated incidents.
Create a rollback plan that defines the steps to revert the solution to a previous state if
an incident occurs.
Implement the capability to immediately block harmful system responses when they're
discovered.
Implement a capability to block specific users, applications, or client IP addresses in the
event of system misuse.
Implement a way for users to provide feedback and report issues. In particular, enable
users to report generated content as "inaccurate", "incomplete", "harmful", "offensive", or
otherwise problematic.
Track telemetry data that enables you to determine user satisfaction and identify
functional gaps or usability challenges. Telemetry collected should comply with privacy
laws and your own organization's policies and commitments to user privacy.
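Two of the capabilities listed above, capturing user reports on generated content and blocking misbehaving users, applications, or client IP addresses, can be sketched as a small monitoring component. The class and method names are hypothetical; a production system would persist this state and integrate with your incident response process.

```python
# Hypothetical sketch of operational capabilities: feedback capture and
# client blocking. Names are illustrative, not a real Azure API.
from collections import Counter

class OperationsMonitor:
    def __init__(self):
        self.reports = Counter()      # tally of report reasons
        self.blocked_clients = set()  # users, apps, or client IP addresses

    def report_content(self, reason):
        """Record a user report, e.g. 'inaccurate', 'harmful', or 'offensive'."""
        self.reports[reason] += 1

    def block_client(self, client_id):
        """Immediately block a misbehaving user, app, or IP address."""
        self.blocked_clients.add(client_id)

    def is_blocked(self, client_id):
        return client_id in self.blocked_clients
```

The report tallies double as telemetry for spotting functional gaps, provided their collection complies with privacy laws and your organization's policies.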

Utilize Azure AI Content Safety
Several Azure AI resources provide built-in analysis of the content they work with, including
Language, Vision, and Azure OpenAI by using content filters.
Azure AI Content Safety provides more features that focus on keeping AI applications and
copilots safe from risk. These features include detecting inappropriate or offensive language,
whether in user input or in generated output, and detecting risky or inappropriate inputs.

Features in Azure AI Content Safety include:

Feature                      | Functionality
-----------------------------|------------------------------------------------------------------
Prompt shields               | Scans for the risk of user input attacks on language models
Groundedness detection       | Detects if text responses are grounded in a user's source content
Protected material detection | Scans for known copyrighted content
Custom categories            | Define custom categories for any new or emerging patterns

Details and quickstarts for using Azure AI Content Safety can be found on the documentation
pages for the service.

Unit 8 of 9

Knowledge check
Module assessment, 3 minutes


1. Why should you consider creating an AI Impact Assessment when designing a generative
AI solution?

To make a legal case that indemnifies you from responsibility for harms caused
by the solution

To document the purpose, expected use, and potential harms for the solution
Correct: An AI Impact Assessment guide documents the expected use of the system and
helps identify potential harms.

To evaluate the cost of cloud services required to implement your solution

2. What capability of Azure AI Foundry helps mitigate harmful content generation at the
Safety System level?

DALL-E model support

Fine-tuning

Content filters
Correct: Content filters enable you to suppress harmful content at the Safety System layer.
3. Why should you consider a phased delivery plan for your generative AI solution?

To enable you to gather feedback and identify issues before releasing the
solution more broadly
Correct: An initial release to a restricted user base enables you to minimize harm by gathering
feedback and identifying issues before broad release.

To eliminate the need to identify, measure, and mitigate potential harms
Unit 1 of 7

Introduction
1 minute

As generative AI models become more powerful and ubiquitous, their use has grown beyond
simple "chat" applications to power intelligent agents that can operate autonomously to
automate tasks. Increasingly, organizations are using generative AI models to build agents that
orchestrate business processes and coordinate workloads in ways that were previously
unimaginable.
This module discusses some of the core concepts related to AI agents, and introduces some of
the technologies that developers can use to build agentic solutions on Microsoft Azure.

Unit 2 of 7

What are AI agents?
3 minutes

AI agents are smart software services that combine generative AI models with contextual data
and the ability to automate tasks based on user input and environmental factors that they
perceive.

For example, an organization might build an AI agent to help employees manage expense
claims. The agent might use a generative model combined with corporate expenses policy
documentation to answer employee questions about what expenses can be claimed and what
limits apply. Additionally, the agent could use a programmatic function to automatically
submit expense claims for regularly repeated expenses (such as a monthly cellphone bill) or
intelligently route expenses to the appropriate approver based on claim amounts.

An example of the expenses agent scenario is shown in the following diagram.

The diagram shows the following process:
1. A user asks the expense agent a question about expenses that can be claimed.
2. The expenses agent accepts the question as a prompt.
3. The agent uses a knowledge store containing expenses policy information to ground the
prompt.
4. The grounded prompt is submitted to the agent's language model to generate a
response.
5. The agent generates an expense claim on behalf of the user and submits it to be
processed and generate a check payment.
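The five steps above can be condensed into a single sketch of the expense agent's flow. Everything here is a stand-in for illustration: the policy "knowledge store" is a dict, the language model call is omitted, and the claim submission is a placeholder action.

```python
# Simplified, hypothetical sketch of the five-step expense agent flow.

policy = {"cellphone": "Monthly cellphone bills up to $80 are claimable."}

def expense_agent(question):
    # Steps 1-2: accept the user's question as a prompt.
    # Step 3: ground the prompt with expenses policy knowledge.
    grounding = [text for key, text in policy.items() if key in question.lower()]
    prompt = f"Context: {' '.join(grounding)}\nQuestion: {question}"
    # Step 4: a real agent would submit `prompt` to its language model here;
    # for this sketch we return the grounding text directly.
    answer = grounding[0] if grounding else "No matching policy found."
    # Step 5: optionally automate an action, such as submitting a claim.
    claim = {"type": "cellphone", "submitted": True} if grounding else None
    return answer, claim
```

The key idea is that the agent both answers the question (steps 1-4) and, where appropriate, takes an action on the user's behalf (step 5).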

In more complex scenarios, organizations can develop multi-agent solutions in which multiple
agents coordinate work between them. For example, a travel booking agent could book flights
and hotels for employees and automatically submit expense claims with appropriate receipts
to the expenses agent, as shown in this diagram:

The diagram shows the following process:
1. A user provides details of an upcoming trip to a travel booking agent.
2. The travel booking agent automates the booking of flight tickets and hotel reservations.
3. The travel booking agent initiates an expense claim for the travel costs through the
expense agent.
4. The expense agent submits the expense claim for processing.

Unit 3 of 7

Options for agent development
6 minutes

There are many ways that developers can create AI agents, including multiple frameworks and
SDKs.

Note

Many of the services discussed in this module are in preview. Details are subject to
change.

Azure AI Agent Service
Azure AI Agent Service is a managed service in Azure that is designed to provide a framework
for creating, managing, and using AI agents within Azure AI Foundry. The service is based on
the OpenAI Assistants API but with increased choice of models, data integration, and
enterprise security, enabling you to use both the OpenAI SDK and the Azure AI Foundry SDK to
develop agentic solutions.

Tip

For more information about Azure AI Agent Service, see the Azure AI Agent Service
documentation.

OpenAI Assistants API
The OpenAI Assistants API provides a subset of the features in Azure AI Agent Service, and can
only be used with OpenAI models. In Azure, you can use the Assistants API with the Azure
OpenAI service, though in practice the Azure AI Agent Service provides greater flexibility and
functionality for agent development on Azure.

Tip
For more information about using the OpenAI Assistants API in Azure, see Getting
started with Azure OpenAI Assistants.

Semantic Kernel
Semantic Kernel is a lightweight, open-source development kit that you can use to build AI
agents and orchestrate multi-agent solutions. The core Semantic Kernel SDK is designed for all
kinds of generative AI development, while the Semantic Kernel Agent Framework is a platform
specifically optimized for creating agents and implementing agentic solution patterns.

Tip

For more information about the Semantic Kernel Agent Framework, see Semantic Kernel
Agent Framework.

AutoGen
AutoGen is an open-source framework for developing agents rapidly. It's useful as a research
and ideation tool when experimenting with agents.

Tip

For more information about AutoGen, see the AutoGen documentation.

Microsoft 365 Agents SDK
Developers can create self-hosted agents for delivery through a wide range of channels by
using the Microsoft 365 Agents SDK. Despite the name, agents built using this SDK are not
limited to Microsoft 365, but can be delivered through channels like Slack or Messenger.

Tip

For more information about Microsoft 365 Agents SDK, see the Microsoft 365 Agents
SDK documentation.

Microsoft Copilot Studio
Microsoft Copilot Studio provides a low-code development environment that "citizen
developers" can use to quickly build and deploy agents that integrate with a Microsoft 365
ecosystem or commonly used channels like Slack and Messenger. The visual design interface
of Copilot Studio makes it a good choice for building agents when you have little or no
professional software development experience.

Tip

For more information about Microsoft Copilot Studio, see the Microsoft Copilot Studio
documentation.

Copilot Studio agent builder in Microsoft 365 Copilot

Business users can use the declarative Copilot Studio agent builder tool in Microsoft 365
Copilot to author basic agents for common tasks. The declarative nature of the tool enables
users to create an agent by describing the functionality they need, or they can use an intuitive
visual interface to specify options for their agent.

Tip

For more information about authoring agents with Copilot Studio agent builder, see
Build agents with Copilot Studio agent builder.

Choosing an agent development solution
With such a wide range of available tools and frameworks, it can be challenging to decide
which ones to use. Use the following considerations to help you identify the right choices for
your scenario:
For business users with little or no software development experience, Copilot Studio
agent builder in Microsoft 365 Copilot Chat provides a way to create simple declarative
agents that automate everyday tasks. This approach can empower users across an
organization to benefit from AI agents with minimal impact on IT.
If business users have sufficient technical skills to build low-code solutions using
Microsoft Power Platform technologies, Copilot Studio enables them to combine those
skills with their business domain knowledge and build agent solutions that extend the
capabilities of Microsoft 365 Copilot or add agentic functionality to common channels
like Microsoft Teams, Slack, or Messenger.
When an organization needs more complex extensions to Microsoft 365 Copilot
capabilities, professional developers can use the Microsoft 365 Agents SDK to build
agents that target the same channels as Copilot Studio.
To develop agentic solutions that use Azure back-end services with a wide choice of
models, custom storage and search services, and integration with Azure AI services,
professional developers should use Azure AI Agent Service in Azure AI Foundry.
Start with Azure AI Agent Service to develop single, standalone agents. When you need
to build multi-agent solutions, use Semantic Kernel to orchestrate the agents in your
solution.

7 Note

There's overlap between the capabilities of each agent development solution, and in
some cases factors like existing familiarity with tools, programming language preferences,
and other considerations will influence the decision.

Unit 4 of 7

Azure AI Agent Service
5 minutes

Azure AI Agent Service is a service within Azure AI Foundry that you can use to create, test,
and manage AI agents. It provides both a visual agent development experience in the Azure AI
Foundry portal and a code-first development experience using the Azure AI Foundry SDK.

Components of an agent
Agents developed using Azure AI Agent Service have the following elements:

Model: A deployed generative AI model that enables the agent to reason and generate
natural language responses to prompts. You can use common OpenAI models and a
selection of models from the Azure AI Foundry model catalog.
Knowledge: Data sources that enable the agent to ground prompts with contextual data.
Potential knowledge sources include Internet search results from Microsoft Bing, an
Azure AI Search index, or your own data and documents.
Tools: Programmatic functions that enable the agent to automate actions. Built-in tools
to access knowledge in Azure AI Search and Bing are provided as well as a code
interpreter tool that you can use to generate and run Python code. You can also create
custom tools using your own code or Azure Functions.
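The three elements above (model, knowledge, tools) can be pictured as a simple agent definition. The class below is an illustrative data structure only; it is not the Azure AI Agent Service SDK, and the model name, knowledge source, and tool are hypothetical.

```python
# Illustrative structure of an agent's three elements. Hypothetical names;
# not the Azure AI Agent Service API.
from dataclasses import dataclass, field

@dataclass
class Agent:
    model: str                                     # deployed generative model
    knowledge: list = field(default_factory=list)  # grounding data sources
    tools: dict = field(default_factory=dict)      # name -> callable actions

    def run_tool(self, name, *args):
        """Invoke a registered tool to automate an action."""
        return self.tools[name](*args)

# Hypothetical expense agent configuration.
agent = Agent(
    model="gpt-4o",
    knowledge=["expenses-policy-index"],
    tools={"submit_claim": lambda amount: {"amount": amount, "status": "submitted"}},
)
```

Separating the three elements this way mirrors the mitigation layers discussed earlier: the model reasons, the knowledge grounds, and the tools act.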

Unit 6 of 7

Knowledge check
Module assessment, 3 minutes

1. Which of the following best describes an AI agent?

A developer who specializes in building generative AI solutions.

A software service that uses AI to assist users with information and task
automation.
Correct: An AI agent is a software service that uses AI to assist users with
information and task automation.

A marketplace for off-the-shelf AI software components.

2. Which AI agent development service offers a choice of generative AI models from
multiple vendors in the Azure AI Foundry model catalog?

Azure AI Agent Service
Correct: Azure AI Agent Service supports many models from the Azure AI
Foundry model catalog.

OpenAI Assistants API

Microsoft 365 Copilot Chat

3. What element of an Azure AI Agent Service agent enables it to ground prompts with
contextual data?

Model

Code interpreter tool

Knowledge
Correct: Adding a knowledge source enables an agent to ground prompts with
contextual data.
