Module 4
Introduction
1 minute
Document intelligence describes AI capabilities that support processing text and making sense
of information in text. As an extension of optical character recognition (OCR), document
intelligence takes the next step a person might after reading a form or document. It automates
the process of extracting, understanding, and saving the data in text.
Consider an organization that needs to process large numbers of receipts for expense claims,
project costs, and other accounting purposes. Suppose someone needs to manually enter the
information into a database. The manual process is relatively slow and potentially error-prone.
Using document intelligence, the company can take a scanned image of a receipt, digitize the
text with OCR, and pair the field items with their field names in a database. Document
intelligence can identify specific data such as the merchant's name, merchant's address, total
value, and tax value.
Azure AI Document Intelligence supports features that can analyze documents and forms
with prebuilt and custom models. In this module, you explore how Azure AI services provide
access to document intelligence capabilities.
Document intelligence relies on machine learning models that are trained to recognize data in
text. The ability to extract text, layout, and key-value pairs is known as document analysis.
Document analysis provides locations of text on a page identified by bounding box
coordinates.
For example, the address printed on a receipt, 123 Main Street, is saved as a key, address,
and a value, 123 Main Street. Document analysis could record the location of the field
value as bounding box coordinates [4.1, 2.2], [4.3, 2.2], [4.3, 2.4], [4.1, 2.4]. Machine learning
models can interpret the data in a document or form because they're trained to recognize
patterns in bounding box coordinate locations and text.
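The key, value, and bounding box from the receipt example can be pictured as one small record. The following is a minimal sketch, not the service's actual response format: the ExtractedField class and its width and height helpers are hypothetical names introduced only for illustration, and the coordinates are the ones quoted above.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position of one corner on the page


@dataclass
class ExtractedField:
    """Hypothetical record for one key-value pair found by document analysis."""
    key: str
    value: str
    bounding_box: List[Point]  # four corners around the value text

    def width(self) -> float:
        xs = [x for x, _ in self.bounding_box]
        return max(xs) - min(xs)

    def height(self) -> float:
        ys = [y for _, y in self.bounding_box]
        return max(ys) - min(ys)


# The address example from the text, stored as a key, a value, and a location.
address = ExtractedField(
    key="address",
    value="123 Main Street",
    bounding_box=[(4.1, 2.2), (4.3, 2.2), (4.3, 2.4), (4.1, 2.4)],
)

print(address.key, "->", address.value)
print(f"box size: {address.width():.1f} x {address.height():.1f}")
```

Keeping the location next to the text is what lets a model learn that, on a given form layout, a value in a particular region of the page tends to be the address.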
A challenge in automating document analysis is that forms and documents
come in many different formats. For example, while tax forms and driver's licenses both
include an individual's name, the bounding box coordinates for the name differ. Separate
machine learning models need to be trained to provide high-quality results for different forms
and documents. Sometimes you can use prebuilt machine learning
models that have been trained on commonly used document formats. Other times, you might
need to customize a machine learning model to recognize a unique document format.
Automating the process of reading text and recording data can accelerate operations, create
better customer experiences, and improve decision making. Next, you explore how to
use Azure AI services to implement document intelligence.
Prebuilt models
The prebuilt models apply advanced machine learning to accurately identify and extract text,
key-value pairs, tables, and structures from forms and documents. The main types of
documents prebuilt models can process are financial services and legal, US tax, US mortgage,
and personal identification documents. Some examples of these capabilities include extracting:
For example, consider the prebuilt receipt model. It processes receipts by:
Matching field names to values
Identifying tables of data
Identifying specific fields, such as dates, telephone numbers, addresses, totals, and others
The receipt model has been trained to recognize data on several different receipt types, such
as thermal receipts (printed on heat-sensitive paper), hotel receipts, gas receipts, credit card
receipts, and parking receipts.
Fields recognized include:
After the resource has been created, you can use the resource in the Document Intelligence
Studio, a user interface for testing document analysis and prebuilt models, and for creating
custom models.
Knowledge check
Module assessment 3 minutes
1. You plan to use Azure AI Document Intelligence's prebuilt receipt model. Which kind of
Azure resource should you create?
2. What are the main types of documents that prebuilt models in Azure AI Document
Intelligence can process?
Introduction
2 minutes
Consider when people need to manually read through multiple pages of a document for
information. What are ways they could search for that information more quickly? This module
discusses how to improve the speed of information retrieval and gain novel insights.
Knowledge mining solutions provide automated information extraction from large volumes of
often unstructured data. One of these knowledge mining solutions is Azure AI Search, a cloud
search service that has tools for building and managing indexes. Azure AI Search can index
unstructured, typed, image-based, or handwritten media. The indexes can be used for
internal-only purposes, or to enable searchable content on public-facing internet assets.
Importantly, Azure AI Search can utilize the built-in capabilities of Azure AI services such as
image processing, content extraction, and natural language processing to perform knowledge
mining of documents. The product's AI capabilities make it possible to index previously
unsearchable documents and to extract and surface insights from large amounts of data
quickly.
Learning objectives
In this module, you will:
" 100 XP
Azure AI Search provides the infrastructure and tools to create search solutions that extract
data from various structured, semi-structured, and unstructured documents.
Azure AI Search results contain only your data, which can include text inferred or extracted
from images, or new entities and key phrases detected through text analytics. It's a Platform
as a Service (PaaS) solution. Microsoft manages the infrastructure and availability, allowing
your organization to benefit without the need to purchase or manage dedicated hardware
resources.
A search index contains your searchable content. In an Azure AI Search solution, you create a
search index by moving data through the following indexing pipeline:
1. Start with a data source: the storage location of your original data artifacts, such as PDFs,
video files, and images. For Azure AI Search, your data source could be files in Azure
Storage, or text in a database such as Azure SQL Database or Azure Cosmos DB.
2. Indexer: automates the movement of data from the data source through document cracking
and enrichment to indexing. An indexer automates a portion of data ingestion and
converts the original file content to JSON (in an action called JSON serialization).
3. Document cracking: the indexer opens files and extracts content.
4. Enrichment: the indexer moves data through AI enrichment, which applies Azure AI
services to your original data to extract more information. AI enrichment is achieved by adding
and combining skills in a skillset. A skillset defines the operations that extract and enrich
data to make it searchable. These AI skills can be either built-in skills, such as text
translation or Optical Character Recognition (OCR), or custom skills that you provide.
Examples of AI enrichment include adding captions to a photo and evaluating text
sentiment. AI enriched content can be sent to a knowledge store, which persists output
from an AI enrichment pipeline in tables and blobs in Azure Storage for independent
analysis or downstream processing.
5. Push to index: the serialized JSON data populates the search index.
6. The result is a populated search index which can be explored through queries. When
users make a search query such as "coffee", the search engine looks for that information
in the search index. A search index has a structure similar to a table, known as the index
schema. A typical search index schema contains fields, the field's data type (such as
string), and field attributes. The fields store searchable text, and the field attributes allow
for actions such as filtering and sorting. Below is an example of a search index schema:
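As a rough illustration of fields, data types, and attributes, an index schema can be written as JSON. The sketch below builds one as a Python dictionary. The index name and field names are hypothetical, the attribute names (key, searchable, filterable, sortable) follow the shape of Azure AI Search index definitions as an assumption to check against the documentation, and the fields_with helper is introduced only for this example.

```python
import json

# Hypothetical index of coffee shop reviews; all names are illustrative.
index_schema = {
    "name": "coffee-reviews",
    "fields": [
        # Exactly one field acts as the document key.
        {"name": "id", "type": "Edm.String", "key": True, "searchable": False},
        # Searchable fields hold the text that full text search runs over.
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "location", "type": "Edm.String", "searchable": True,
         "filterable": True, "sortable": True},
        # Field attributes enable actions such as filtering and sorting.
        {"name": "rating", "type": "Edm.Int32", "searchable": False,
         "filterable": True, "sortable": True},
    ],
}


def fields_with(schema, attribute):
    """Return the names of fields that have the given attribute enabled."""
    return [f["name"] for f in schema["fields"] if f.get(attribute, False)]


print(json.dumps(index_schema, indent=2))
print("searchable:", fields_with(index_schema, "searchable"))
print("filterable:", fields_with(index_schema, "filterable"))
```

In this sketch, a query for "coffee" would be matched against the content and location fields, while rating could only be used to filter or sort results.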
" 100 XP
The first step to creating an Azure AI Search solution is to provision an Azure AI Search
resource. Once the Azure AI Search resource is created, you can manage components of your
service from the resource Overview page in the portal.
Before you begin, identify your data source. You may also create an Azure Storage object to
contain your original data.
You can use one of several methods to create your search solution:
A unique aspect of the Azure portal's Import data wizard is that it both defines the
search index and runs the indexer. You can see it in action when you create any of the following
objects in the Azure portal:
Data Source: Persists connection information to source data, including credentials. A
data source object is used exclusively with indexers.
Index: Physical data structure used for full text search and other queries.
Indexer: A configuration object specifying a data source, target index, an optional AI
skillset, optional schedule, and optional configuration settings for error handling and
Base64 encoding.
Skillset: A complete set of instructions for manipulating, transforming, and shaping
content, including analyzing and extracting information from image files. Except for very
simple and limited structures, it includes a reference to an Azure AI services resource that
provides enrichment.
Knowledge store: Stores output from an AI enrichment pipeline in tables and blobs in
Azure Storage for independent analysis or downstream processing.
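To show how these objects refer to one another, here is a minimal sketch of a data source and an indexer definition written as JSON-style dictionaries. Every name and value is a placeholder. The property names (dataSourceName, targetIndexName, skillsetName, schedule, parameters) are assumed to follow the service's REST definitions and should be checked against the documentation, and missing_required is a hypothetical helper, not part of any SDK.

```python
import json

# Data source: persists connection information to the source data.
data_source = {
    "name": "reviews-blob-source",
    "type": "azureblob",
    "credentials": {"connectionString": "<storage-connection-string>"},
    "container": {"name": "reviews"},
}

# Indexer: ties a data source to a target index, with optional extras.
indexer = {
    "name": "reviews-indexer",
    "dataSourceName": data_source["name"],
    "targetIndexName": "coffee-reviews",
    "skillsetName": "reviews-skillset",    # optional AI skillset
    "schedule": {"interval": "PT2H"},      # optional: run every two hours
    "parameters": {"maxFailedItems": 10},  # optional error handling
}


def missing_required(indexer_def):
    """List the required indexer properties that are absent or empty."""
    required = ("name", "dataSourceName", "targetIndexName")
    return [key for key in required if not indexer_def.get(key)]


print(json.dumps(indexer, indent=2))
print("missing:", missing_required(indexer))
```

Only the data source and target index are required here. The skillset, schedule, and parameters entries mirror the optional parts of the indexer described in the list above.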
" 100 XP
Index and query design are closely linked. After you build the index, you can run queries. A
crucial point to understand is that the schema of the index determines what queries can
be answered.
Azure AI Search queries can be submitted as HTTP requests to the REST API, with the response
coming back as JSON. Queries can specify what fields are searched and returned, how search
results are shaped, and how the results should be filtered or sorted. A query that doesn't
specify the field to search will execute against all the searchable fields within the index.
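The parts of a query described above can be sketched as the URL and JSON body of a search request. This example only builds the request and does not send it. The service name, index name, field names, and api-version value are placeholders or assumptions, and build_search_request is a hypothetical helper rather than an SDK function.

```python
import json


def build_search_request(service, index, search_text, *, search_fields=None,
                         select=None, filter_expr=None, order_by=None,
                         api_version="2023-11-01"):
    """Build the URL and JSON body for a search request (nothing is sent)."""
    url = (f"https://{service}.search.windows.net/indexes/{index}"
           f"/docs/search?api-version={api_version}")
    body = {"search": search_text, "queryType": "simple"}
    if search_fields:
        body["searchFields"] = ",".join(search_fields)  # fields to search
    if select:
        body["select"] = ",".join(select)               # fields to return
    if filter_expr:
        body["filter"] = filter_expr                    # filter expression
    if order_by:
        body["orderby"] = order_by                      # sort order
    return url, body


url, body = build_search_request(
    "demo-service", "coffee-reviews", "coffee",
    select=["content", "rating"],
    filter_expr="rating ge 4",
    order_by="rating desc",
)
print(url)
print(json.dumps(body, indent=2))
```

Because search_fields is omitted in the usage example, the body has no searchFields entry, which corresponds to a query that runs against all searchable fields in the index.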
Azure AI Search supports two types of syntax: simple and full Lucene. Simple syntax covers all
of the common query scenarios, while full Lucene is useful for advanced scenarios.
Consider the following simple syntax query: coffee (-"busy" + "wifi")
This query is trying to find content about coffee, excluding busy and including wifi.
Breaking the query into components, it's made up of a search term (coffee), plus two
verbatim phrases, "busy" and "wifi", and operators (-, +, and ( )). The search terms can
be matched in the search index in any order or location in the content. The two phrases
only match exactly what is specified, so wi-fi would not be a match. Finally, a query can
contain a number of operators. In this example, the - operator tells the search engine that
the phrase that follows it should NOT be in the results. The parentheses group terms together and set
their precedence.
By default, the search engine will match any of the terms in the query. Content containing just
coffee would be a match. In this example, using -"busy" would lead to the search results
including all content that doesn't have the exact string "busy" in it.
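The matching behavior described above can be imitated with a toy matcher. This is only one reading of the description, not how the search engine is implemented: it assumes the query is coffee (-"busy" + "wifi"), treats the group as "not busy and wifi", and combines it with coffee using the default match-any behavior. The tokens and matches functions and the sample documents are invented for illustration.

```python
import re


def tokens(text):
    """Split text into lowercase word tokens; 'wi-fi' becomes 'wi' and 'fi'."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(text):
    """Toy reading of: coffee (-"busy" + "wifi")."""
    words = tokens(text)
    # Group: the phrase "busy" must be absent AND the phrase "wifi" present.
    group = ("busy" not in words) and ("wifi" in words)
    # Default behavior: matching any part of the query is enough.
    return ("coffee" in words) or group


documents = [
    "Great coffee, but always busy",
    "Quiet spot with free wifi",
    "Busy place with wifi",
    "Tea house with wi-fi",
]

for doc in documents:
    print(matches(doc), "-", doc)
```

The last document is not matched because wi-fi is not the exact phrase wifi, which mirrors the point made in the text.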
The simple query syntax in Azure AI Search excludes some of the more complex features of
the full Lucene query syntax, and it's the default search syntax for queries.
You can learn more about query syntax in the documentation.
Knowledge check
Module assessment 3 minutes
1. Which data format is accepted by Azure AI Search when you're pushing data to the index?
CSV.
SQL.
JSON.
Correct. Azure AI Search can index JSON documents. JSON is also used to define
index schemas, indexers, and data source objects.
3. If you set up a search index of written news documents without including any skillsets,
what information would you still be able to query?
The sentiment.
The full text.
Correct. Without AI skillsets, you can still perform full text search over indexes
containing alphanumeric content.