
05 Build A Document Intelligence Custom Skill For Azure AI Search

Getting started with building a Document Intelligence custom skill for Azure AI Search

Introduction

• Azure AI Search can index content in many different formats to ensure users can locate
the information they need to do their jobs.

• It's also extensible - skills can be added to add extra processing during indexing.

• If you add a skill that calls Azure AI Document Intelligence, you can use your Azure AI
Document Intelligence models to enrich your AI Search index.

• You work for a company that conducts polls for private companies and political parties.

• Participants submit their responses as paper forms or as online PDFs.

• Using Azure AI Document Intelligence, you've built a successful response analysis service
that can obtain data from poll responses, which your users scan with their mobile
devices.

• You've also deployed an Azure AI Search service to help your users locate documents.

• Now, you'd like to ensure that your search solution can extract the key-value pairs that
your polling models are trained to recognize.

• In this module, you'll learn how to create an AI Search custom skill that calls a model in
Azure AI Document Intelligence to help index documents.
Learning objectives
• Describe how a custom skill can enrich content passed through an Azure AI
Search pipeline.

• Build a custom skill that calls an Azure AI Document Intelligence solution to
index data from forms.
Understand Azure AI Search enrichment
pipelines
• If you integrate AI Search with an Azure AI Document Intelligence solution, you
can enrich your index with fields that your Azure AI Document Intelligence
models are trained to extract.

• In your polling company, users can submit queries to your search service,
which is built on Azure AI Search.

• However, users need to be able to locate a completed polling form by
searching for a voter ID.

• You've already trained an Azure AI Document Intelligence model to extract the
voter ID from various polling forms.

• Now you want to ensure that the voter ID is included in your AI Search index
so users can locate the forms they need.

• In this unit, you'll learn how to integrate an Azure AI Document Intelligence
model by calling it from an AI Search custom skill.
Indexing content in AI Search
• In a search service, a corpus of content is indexed to determine the words it
contains and the documents that contain them.

• When users search for a word, their query is checked against the index to
determine the documents that contain that word.

• In this way, relevant documents can be returned to the user much more
quickly than if every document were scanned for the word at query time.

• Azure AI Search is a search service hosted in Azure that can index content on
your premises or in a cloud location.
• During the indexing process, AI Search crawls your content, processes it, and
creates a list of words that will be added to the index, together with their location.

• There are five stages to the indexing process:

1. Document cracking. In document cracking, the indexer opens the content
files and extracts their content.
2. Field Mappings. Fields such as titles, names, dates, and more are extracted
from the content. You can use field mappings to control how they're stored in
the index.
3. Skillset Execution. In the optional skillset execution stage, custom AI
processing is done on the content to enrich the final index.
4. Output field mappings. If you're using a custom skillset, its output is mapped
to index fields in this stage.
5. Push to index. The results of the indexing process are stored in the index in
Azure AI Search.
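
The order of these stages matters when troubleshooting: a stage only runs if every earlier stage has completed. The following is a toy sketch of that ordering, not the Azure AI Search indexer itself. The stage names come from the list above; `run_pipeline`, the handler functions, and the sample voter ID are hypothetical names introduced purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Stage names are taken from the list above. Everything else in this block is
# a hypothetical toy model, not how the real indexer is implemented.
STAGES: List[str] = [
    "document cracking",
    "field mappings",
    "skillset execution",
    "output field mappings",
    "push to index",
]


def run_pipeline(
    document: Dict, handlers: Dict[str, Callable[[Dict], Dict]]
) -> Tuple[Dict, List[str]]:
    """Run the stages in order; return the document and the stages that completed.

    A stage with no handler is a no-op (skillset execution is optional).
    If a handler raises, no later stage runs.
    """
    completed: List[str] = []
    for stage in STAGES:
        handler = handlers.get(stage, lambda doc: doc)
        try:
            document = handler(document)
        except Exception:
            break
        completed.append(stage)
    return document, completed


def failing_crack(doc: Dict) -> Dict:
    # Simulates a file that the indexer cannot open.
    raise ValueError("unsupported file format")


def add_voter_id(doc: Dict) -> Dict:
    # Stands in for a custom skill that enriches the document.
    return {**doc, "voterId": "hypothetical-123"}


ok_doc, ok_stages = run_pipeline(
    {"content": "poll form"}, {"skillset execution": add_voter_id}
)
_, failed_stages = run_pipeline(
    {"content": "poll form"},
    {"document cracking": failing_crack, "skillset execution": add_voter_id},
)
```

In the second call the skill handler is never invoked, because the simulated cracking failure stops the run before skillset execution is reached.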
What is an AI Search skillset?
• You can create a customized list of skills that will be executed as a skillset in the
third stage of indexing. Each skill is a call to an AI process that enriches the index.

• For example, an AI skill might translate words into different languages or extract
words from a binary image.

• AI Search has a range of built-in skills that you can include in your pipeline. Built-in
skills use pretrained AI models provided by Microsoft. These include:
• Key phrase extraction. Detects important phrases in text based on the
placement of terms, linguistic rules, and proximity to other terms.

• Language detection. Detects the predominantly used language in text.

• Merge. Merges text from several fields into a single field.

• Sentiment. Detects sentiments such as positive, negative, and neutral in
text.

• Translation. Translates text from the original language into another to create
a multilingual index.

• Image analysis. Detects the contents of an image and generates a
description.

• Optical character recognition. Recognizes printed and handwritten text in
an image.
• To create a custom skillset, you must call the Create Skillset REST API, and
send it the appropriate JSON definition code. The call to the API looks like this:

PUT https://[service name].search.windows.net/skillsets/[skillset name]?api-version=[api version]
Content-Type: application/json
api-key: [admin key]
• In the above call:

• [service name] is the name of your AI Search service in Azure.

• [skillset name] is a name for the skillset you're creating.

• [api version] is the version of the Search REST API.

• [admin key] is the API key for the Search service. You can obtain this key
from the Azure portal.
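
As a minimal sketch, the pieces of that call can be assembled in Python as follows. The function name `build_create_skillset_request` and the sample service name, skillset name, and API version are assumptions chosen for illustration; substitute your own values.

```python
from typing import Dict, Tuple


def build_create_skillset_request(
    service_name: str, skillset_name: str, api_version: str, admin_key: str
) -> Tuple[str, str, Dict[str, str]]:
    """Return the HTTP method, URL, and headers for a Create Skillset call."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/skillsets/{skillset_name}?api-version={api_version}"
    )
    headers = {"Content-Type": "application/json", "api-key": admin_key}
    return "PUT", url, headers


# Hypothetical values: replace them with your own service details.
method, url, headers = build_create_skillset_request(
    "contoso-search", "polling-skillset", "2024-07-01", "<admin key>"
)
print(method, url)
```

The function only builds the request; any HTTP client can then send it with the skillset definition as the JSON body.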

• The JSON code that defines the skillset looks like this:
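
The overall shape of such a definition can be sketched as a Python dictionary serialized to JSON. This is an illustrative skeleton rather than a complete definition. Every value is a placeholder, and the sub-field names inside `knowledgeStore` and `encryptionKey` are assumptions that should be checked against the Search REST API reference for the API version you use.

```python
import json

# Illustrative skeleton only; all values are placeholders (assumptions).
skillset_definition = {
    "name": "polling-skillset",
    "description": "Enriches polling forms during indexing",
    # One or more built-in or custom skill definitions go here.
    "skills": [],
    "cognitiveServices": {
        "@odata.type": "#Microsoft.Azure.Search.CognitiveServicesByKey",
        "description": "Azure AI Services multiservice resource",
        "key": "<azure-ai-services-key>",
    },
    "knowledgeStore": {
        "storageConnectionString": "<storage-connection-string>",
        "projections": [],
    },
    "encryptionKey": {
        "keyVaultKeyName": "<key-name>",
        "keyVaultKeyVersion": "<key-version>",
        "keyVaultUri": "<key-vault-uri>",
    },
}

print(json.dumps(skillset_definition, indent=2))
```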
• In the above JSON code:

• cognitiveServices is required if you're using billable Azure AI Services
APIs in your skillset. Provide the API key for your Azure AI Services
multiservice resource.

• knowledgeStore specifies an Azure Storage Account where the output
from skills can be stored.

• encryptionKey specifies keys from the Azure Key Vault that will be
used to encrypt sensitive content in the pipeline.

• The skills section defines one or more built-in or custom skills that will
analyze the content. For example:
"skills":[
    {
        "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
        "name": "Entity recognition",
        "context": "/document",
        "categories": [ "Organization" ],
        "inputs": [
            {
                "name": "text",
                "source": "/document/content"
            }
        ],
        "outputs": [
            {
                "name": "organizations",
                "targetName": "orgs"
            }
        ]
    },
    {
        "@odata.type": "#Microsoft.Skills.Vision.ImageAnalysisSkill",
        "name": "Image analysis",
        "context": "/document/normalized_images/*",
        "visualFeatures": [
            "brands"
        ],
        "inputs": [
            {
                "name": "image",
                "source": "/document/normalized_images/*"
            }
        ],
        "outputs": [
            {
                "name": "brands"
            }
        ]
    }
]
• In this example, there are two built-in skills in the skillset: an entity recognition skill
that detects organization names in text, and an image analysis skill that detects
brand logos in image files.
What is a custom skill?

• Custom skills can be used for two reasons:

• The list of built-in skills doesn't include the type of AI enrichment you
need.

• You want to train your own model to analyze the data.

• There are two types of custom skill that you can create:

• Azure Machine Learning (AML) custom skills. You can use this
custom skill type to enrich your index by calling an AML model.

• Custom Web API skills. You can use this custom skill type to enrich
your index by calling a web service. Such web services can include
Azure applied AI services, such as Azure AI Document Intelligence.
Integrate AI Search and Azure AI Document Intelligence
• If you've developed an Azure AI Document Intelligence solution, you may be
using it to accept scanned or photographed forms or documents from users,
perhaps from an app on their mobile device.

• Azure AI Document Intelligence can use either a built-in model or a custom
model to analyze the content of these images and return text, structural
information, languages used, key-value pairs, and other data.

• That's the kind of data that may be useful in an AI Search index. For example, if
the content that you index includes scanned sales invoices, Azure AI Document
Intelligence can identify fields such as currency amounts, retailer names, and tax
information by using its prebuilt Invoice model.

• When users search for a retailer, you'd like them to receive a link to invoices from
that retailer in their results.
• To integrate Azure AI Document Intelligence into the AI Search indexing pipeline,
you must:

• Create an Azure AI Document Intelligence resource in your Azure
subscription.

• Configure one or more models in Azure AI Document Intelligence. You can
either select prebuilt models, such as Invoice or Business Card, or train your
own model for unusual or unique form types.

• Develop and deploy a web service that can call your Azure AI Document
Intelligence resource. In this module, you'll use an Azure Function to host this
service.

• Add a custom Web API skill with the correct configuration to the AI Search
skillset. This skill should be configured to send requests to the web service.
Build an Azure AI Document Intelligence custom
skill
• To integrate Azure AI Document Intelligence into the AI Search indexing process,
you must write a web service that implements the custom skill interface.

• In your polling company, you've decided to implement a custom skill that sends
completed polling forms to Azure AI Document Intelligence to extract the voter ID
and other values.

• You want to store these values in your index to ensure that users can search by
voter ID and find the polling forms they need.

• In this unit, you'll learn how to create and host a custom skill that calls Azure AI
Document Intelligence.
Custom skill interface and requirements
• A custom Web API skill has to integrate with other skills in your skillset and with
the rest of the AI Search indexing pipeline.

• Therefore it must accept input data and return output data in compatible
formats.

• When you write a custom skill, including one that integrates Azure AI Document
Intelligence, one of your primary concerns is to implement the custom skill
interface to ensure this compatibility.
• Your code should handle the following input values in the JSON body of the REST
request:

• values. The JSON body will include a collection named values. Each item in
this collection represents a form to analyze.

o recordId. Each item in the values collection has a recordId. You must
include this ID in the output JSON so that AI Search can match input
forms with their results.

o data. Each item in the values collection includes a data collection with two values:

    o formUrl. This is the location of the form to analyze.

    o formSasToken. If the form is stored in Azure Storage, this token
    enables your code to authenticate with that account.
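
A minimal sketch of reading these inputs might look like the following. The function name `extract_form_requests` and the sample URLs are hypothetical. The sketch also assumes that the SAS token already begins with "?" and can simply be appended to the form URL; verify this against the way your tokens are generated.

```python
from typing import Dict, List, Tuple


def extract_form_requests(body: Dict) -> List[Tuple[str, str]]:
    """Return (recordId, authenticated form URL) pairs from a skill request body."""
    pairs: List[Tuple[str, str]] = []
    for item in body.get("values", []):
        data = item.get("data", {})
        form_url = data.get("formUrl", "")
        # Assumption: the token starts with "?" so it can be appended as-is.
        sas_token = data.get("formSasToken") or ""
        pairs.append((item["recordId"], form_url + sas_token))
    return pairs


# Hypothetical sample input in the shape described above.
sample_body = {
    "values": [
        {
            "recordId": "r1",
            "data": {
                "formUrl": "https://example.com/forms/a.pdf",
                "formSasToken": "?sv=1",
            },
        },
        {"recordId": "r2", "data": {"formUrl": "https://example.com/forms/b.pdf"}},
    ]
}
pairs = extract_form_requests(sample_body)
```

Keeping the recordId next to each URL makes it straightforward to echo the same ID back in the response.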
• From the input data, your code can formulate requests to send to Azure AI
Document Intelligence. You'll need the following connection information for these
requests:

• The Azure AI Document Intelligence endpoint.

• The Azure AI Document Intelligence API key.


• You can obtain both these values from the Azure AI Document Intelligence
resource in the Azure portal.
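
As a hedged sketch, a request to Azure AI Document Intelligence could be built with the Python standard library as shown below. The environment variable names, the function name `build_analyze_request`, and the model ID are hypothetical. The URL path and the `2023-07-31` API version follow the v3.1 REST API as an assumption; newer API versions use a different path, so check the reference for the version you target.

```python
import json
import os
import urllib.request
from typing import Optional


def build_analyze_request(
    form_url: str,
    model_id: str,
    endpoint: Optional[str] = None,
    api_key: Optional[str] = None,
    api_version: str = "2023-07-31",  # assumption: v3.1 REST API
) -> urllib.request.Request:
    """Build (but do not send) an analyze request for a form at a URL."""
    # Hypothetical environment variable names for the two connection values.
    endpoint = (endpoint or os.environ["DOCUMENT_INTELLIGENCE_ENDPOINT"]).rstrip("/")
    api_key = api_key or os.environ["DOCUMENT_INTELLIGENCE_KEY"]
    url = (
        f"{endpoint}/formrecognizer/documentModels/{model_id}:analyze"
        f"?api-version={api_version}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({"urlSource": form_url}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": api_key,
        },
        method="POST",
    )


# Hypothetical endpoint, key, and model ID passed explicitly for illustration.
request = build_analyze_request(
    "https://example.com/forms/a.pdf",
    "polling-model",
    endpoint="https://contoso.cognitiveservices.azure.com/",
    api_key="<api key>",
)
```

In the v3.x REST API the analyze operation is asynchronous: the initial response carries an Operation-Location header that your code polls until the result is ready.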
• Your code should formulate a REST response that includes a JSON body. The AI
Search service expects this response to include:

• values. A collection where each item is one of the submitted forms.

o recordId. AI Search uses this value to match results to one of the input
forms.

o data. Use the data collection to return the fields that Azure AI Document
Intelligence has extracted from each input form.

o errors. If you couldn't obtain the analysis for a form, use the errors
collection to indicate why.

o warnings. If you have obtained results but some noncritical problem
has arisen, use the warnings collection to report the issue.
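
Putting the input and output contracts together, the core of a custom skill can be sketched as below. `run_skill` and the `analyze` callback are hypothetical names. In a real skill the callback would call Azure AI Document Intelligence, whereas here a stand-in function is used so the example runs on its own.

```python
from typing import Any, Callable, Dict, List


def run_skill(
    body: Dict[str, Any], analyze: Callable[[Dict[str, Any]], Dict[str, Any]]
) -> Dict[str, Any]:
    """Produce a custom skill response with one output record per input record."""
    results: List[Dict[str, Any]] = []
    for item in body.get("values", []):
        record: Dict[str, Any] = {
            "recordId": item["recordId"],  # echoed back so results can be matched
            "data": {},
            "errors": None,
            "warnings": None,
        }
        try:
            record["data"] = analyze(item.get("data", {}))
        except Exception as exc:
            # Report the failure for this record without failing the whole batch.
            record["errors"] = [{"message": f"Could not analyze form: {exc}"}]
        results.append(record)
    return {"values": results}


def fake_analyze(data: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for the call to Azure AI Document Intelligence.
    if not data.get("formUrl"):
        raise ValueError("formUrl is missing")
    return {"voterId": "hypothetical-123"}


response = run_skill(
    {
        "values": [
            {"recordId": "r1", "data": {"formUrl": "https://example.com/a.pdf"}},
            {"recordId": "r2", "data": {}},
        ]
    },
    fake_analyze,
)
```

Handling errors per record means that one unreadable form does not prevent the other forms in the same request from being indexed.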
Testing the custom skill
• During development, you'll need to test your custom skill by sending it REST
requests and observing its responses.

• REST developers often use the Postman tool to help with this process, but any
tool that helps you to formulate and submit REST requests with JSON message
bodies can be used.

• You can also use the Code + Test tool in the Azure portal to formulate and submit
test REST requests.

• In Visual Studio, deploy the function locally by pressing F5. Then, you can submit
requests to the function by sending them to this URL:
POST https://localhost:7071/api/analyze-form
• The request specifies a form to analyze by its URL included in the JSON body:

{
"values": [
{
"recordId": "record1",
"data": {
"formUrl": "<your-form-url>",
"formSasToken": "<your-sas-token>"
}
}
]
}
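
If you prefer a script to a GUI tool, the same test request can be sketched with the Python standard library. The function name `build_test_request` is hypothetical, and the default URL is the local address given above; adjust the scheme and port to match your local Functions host.

```python
import json
import urllib.request


def build_test_request(
    form_url: str,
    sas_token: str,
    skill_url: str = "https://localhost:7071/api/analyze-form",
) -> urllib.request.Request:
    """Build (but do not send) a test request for the locally running skill."""
    body = {
        "values": [
            {
                "recordId": "record1",
                "data": {"formUrl": form_url, "formSasToken": sas_token},
            }
        ]
    }
    return urllib.request.Request(
        skill_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


test_request = build_test_request("<your-form-url>", "<your-sas-token>")
```

Once the function is running locally, the request can be sent with `urllib.request.urlopen(test_request)` and the JSON response inspected.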
• If your form is stored in an Azure Storage Account, you can use the formSasToken
property to authenticate with that storage account.

• To obtain the correct SAS token, open Azure Storage Explorer, browse to the form,
then right-click it and select Get Shared Access Signature.

• The response to such a request should look like this:

{
    "values": [
        {
            "recordId": "record1",
            "data": {
                "address": "1111 8th st. Bellevue, WA 99501 ",
                "recipient": "Southridge Video 1060 Main St. Atlanta, GA 65024"
            },
            "errors": null,
            "warnings": null
        }
    ]
}

• The keys and values returned in the data object depend on the model you're
calling in Azure AI Document Intelligence.
Hosting a custom skill
• The custom skill is a Web API service and so you have many choices on how to
host it.

• For example, if you want to host the skill within Azure, you could
host it as:

• An Azure Function.

• A container in Azure Container Instances (ACI).

• A container in Azure Kubernetes Service (AKS).


Add the custom skill to a skillset
• Once the custom skill is completed, tested, and hosted, you must configure AI Search
to call it.

• In the previous unit, you saw sample definition JSON code for built-in skills.

• The equivalent definition code for a custom skill is:

{
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "description": "A custom skill that calls Azure AI Document Intelligence",
    "uri": "https://contoso.com/formrecognizer",
    "batchSize": 1,
    "context": "/document",
    "inputs": [
        {
            "name": "formUrl",
            "source": "/document/metadata_storage_path"
        }
    ],
    "outputs": [
        {
            "name": "address",
            "targetName": "address"
        },
        {
            "name": "recipient",
            "targetName": "recipient"
        }
    ]
}
• In this code:

• Microsoft.Skills.Custom.WebApiSkill is required to define this as a Web API skill.

• uri is the location of the web service. In this module, the web service is
implemented as an Azure Function. The Function is the interface between the
search pipeline and Azure AI Document Intelligence.

• inputs determines the data that is sent to the skill for analysis.

• outputs determines the data that is returned from the skill.
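
When the set of extracted fields changes with the model, it can be convenient to generate this definition rather than write it by hand. The following is a minimal sketch that reproduces the definition shown above. The helper name `make_web_api_skill` is hypothetical, and it assumes each output is mapped to a target of the same name.

```python
import json
from typing import Any, Dict, List


def make_web_api_skill(
    uri: str, output_names: List[str], batch_size: int = 1
) -> Dict[str, Any]:
    """Build a Web API skill definition that passes the document URL as formUrl."""
    return {
        "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
        "description": "A custom skill that calls Azure AI Document Intelligence",
        "uri": uri,
        "batchSize": batch_size,
        "context": "/document",
        "inputs": [
            {"name": "formUrl", "source": "/document/metadata_storage_path"}
        ],
        # Assumption: each output keeps the same name in the enrichment tree.
        "outputs": [{"name": n, "targetName": n} for n in output_names],
    }


skill = make_web_api_skill(
    "https://contoso.com/formrecognizer", ["address", "recipient"]
)
print(json.dumps(skill, indent=4))
```

The output names must match the keys that the web service returns in each record's data object.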


Exercise - Build and deploy an Azure AI
Document Intelligence custom skill
Knowledge check
1. Which of the following values does AI Search use to match a form submitted to a custom skill
with the right response from that skill?

a) formUrl.
b) recordId.
c) formSasToken.

2. You're troubleshooting your AI Search indexing process. You have a single custom skill that
calls Azure AI Document Intelligence but requests are never received by your skill. Which of the
following stages of the indexing process might be causing the problem?

a) Push to index.
b) Output field mapping.
c) Document cracking.
