Module 3

Introduction
To interpret the subject of a text in a similar way to humans, computer systems use natural language processing (NLP), an area within AI that deals with understanding written or spoken language, and responding in kind. Text analysis describes NLP processes that extract information from unstructured text.
Azure AI Language is a cloud-based service that includes features for understanding and
analyzing text. Azure AI Language includes various features that support sentiment analysis,
key phrase identification, text summarization, and conversational language understanding.
In this module, you'll explore the capabilities of text analytics, and how you might use them.
" 100 XP
Before exploring the text analytics capabilities of the Azure AI Language service, let's examine
some general principles and common techniques used to perform text analysis and other
natural language processing (NLP) tasks.
Some of the earliest techniques used to analyze text with computers involve statistical analysis
of a body of text (a corpus) to infer some kind of semantic meaning. Put simply, if you can
determine the most commonly used words in a given document, you can often get a good
idea of what the document is about.
Tokenization
The first step in analyzing a corpus is to break it down into tokens. For the sake of simplicity,
you can think of each distinct word in the training text as a token, though in reality, tokens can
be generated for partial words, or combinations of words and punctuation.
For example, consider this phrase from a famous US presidential speech: "we choose to go
to the moon" . The phrase can be broken down into the following tokens, with numeric
identifiers:
1. we
2. choose
3. to
4. go
5. the
6. moon
Notice that "to" (token number 3) is used twice in the corpus. The phrase "we choose to go to the moon" can therefore be represented by the token sequence {1,2,3,4,3,5,6}.
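To make the idea concrete, here's a minimal Python sketch (illustrative only, not part of the service) that assigns a numeric identifier to each distinct word and represents the phrase as a sequence of those identifiers:

```python
# Minimal word-level tokenization: assign each distinct word a numeric
# identifier and represent the phrase as a sequence of identifiers.
phrase = "we choose to go to the moon"

token_ids = {}        # distinct word -> numeric identifier
token_sequence = []   # the phrase as a sequence of identifiers

for word in phrase.split():
    if word not in token_ids:
        token_ids[word] = len(token_ids) + 1
    token_sequence.append(token_ids[word])

print(token_ids)       # {'we': 1, 'choose': 2, 'to': 3, 'go': 4, 'the': 5, 'moon': 6}
print(token_sequence)  # [1, 2, 3, 4, 3, 5, 6]
```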
Note
We've used a simple example in which tokens are identified for each distinct word in the
text. However, consider the following concepts that may apply to tokenization depending
on the specific kind of NLP problem you're trying to solve:
Text normalization: Before generating tokens, you may choose to normalize the text
by removing punctuation and changing all words to lower case. For analysis that
relies purely on word frequency, this approach improves overall performance.
However, some semantic meaning may be lost - for example, consider the sentence
"Mr Banks has worked in many banks." You may want your analysis to
differentiate between the person "Mr Banks" and the "banks" in which he has
worked. You may also want to consider "banks." as a separate token from "banks",
because the inclusion of a period provides the information that the word comes at
the end of a sentence.
Stop word removal. Stop words are words that should be excluded from the
analysis. For example, "the" , "a" , or "it" make text easier for people to read but
add little semantic meaning. By excluding these words, a text analysis solution may
be better able to identify the important words.
n-grams are multi-term phrases such as "I have" or "he walked". A single-word
phrase is a unigram, a two-word phrase is a bi-gram, a three-word phrase is a
tri-gram, and so on. By considering words as groups, a machine learning model
can make better sense of the text.
Frequency analysis
After tokenizing the words, you can perform some analysis to count the number of
occurrences of each token. The most commonly used words (other than stop words such as
"a" , "the" , and so on) can often provide a clue as to the main subject of a text corpus. For
example, the most common words in the entire text of the "go to the moon" speech we
considered previously include "new" , "go" , "space" , and "moon" . If we were to tokenize the
text as bi-grams (word pairs), the most common bi-gram in the speech is "the moon" . From
this information, we can easily surmise that the text is primarily concerned with space travel
and going to the moon.
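As a simple illustration, the following Python sketch counts word and bi-gram frequencies for a short sample of tokens; the token list and stop-word set here are placeholders rather than the real speech text:

```python
from collections import Counter

# A short sample of tokens; a real analysis would tokenize the whole speech.
tokens = "we choose to go to the moon in this decade and do the other things".split()
stop_words = {"a", "an", "and", "do", "in", "the", "this", "to", "we"}

# Count individual words, excluding stop words.
word_counts = Counter(t for t in tokens if t not in stop_words)
print(word_counts.most_common(3))

# Count bi-grams (pairs of adjacent tokens).
bigram_counts = Counter(zip(tokens, tokens[1:]))
print(bigram_counts.most_common(3))
```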
Tip
Simple frequency analysis, in which you count the number of occurrences of each
token, can be an effective way to analyze a single document, but when you need to
differentiate across multiple documents within the same corpus, you need a way to
determine which tokens are most relevant in each document. Term frequency - inverse
document frequency (TF-IDF) is a common technique in which a score is calculated based
on how often a word or term appears in one document compared to its more general
frequency across the entire collection of documents. Using this technique, a high degree
of relevance is assumed for words that appear frequently in a particular document, but
relatively infrequently across a wide range of other documents.
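If you want to experiment with TF-IDF, the following sketch uses scikit-learn's TfidfVectorizer on a toy corpus of three invented documents:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Three toy documents standing in for a corpus.
documents = [
    "we choose to go to the moon",
    "the moon is a natural satellite",
    "we choose to invest in space exploration",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(documents)

# Print each term's TF-IDF score for the first document; terms that are
# common across all documents score lower than distinctive ones.
for term, score in zip(vectorizer.get_feature_names_out(), tfidf.toarray()[0]):
    if score > 0:
        print(f"{term}: {score:.2f}")
```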
With enough labeled reviews, you can train a classification model using the tokenized text as
features and the sentiment (0 or 1) as a label. The model will encapsulate a relationship between
tokens and sentiment - for example, reviews with tokens for words like "great", "tasty", or
"fun" are more likely to return a sentiment of 1 (positive), while reviews with tokens for
negative words are more likely to return a sentiment of 0 (negative).
- 4 ("dog"): [10.3.2]
- 5 ("bark"): [10,2,2]
- 8 ("cat"): [10,3,1]
- 9 ("meow"): [10,2,1]
- 10 ("skateboard"): [3,3,1]
We can plot the location of tokens based on these vectors in three-dimensional space.
The locations of the tokens in the embeddings space include some information about how
closely the tokens are related to one another. For example, the token for "dog" is close to
"cat" and also to "bark". The tokens for "cat" and "bark" are close to "meow". The token
for "skateboard" is farther away from the other tokens.
The language models we use in industry are based on these principles but have greater
complexity. For example, the vectors used generally have many more dimensions. There are
also multiple ways you can calculate appropriate embeddings for a given set of tokens.
Different methods result in different predictions from natural language processing models.
A generalized view of most modern natural language processing solutions: a large corpus of
raw text is tokenized and used to train language models, which can support many different
types of natural language processing tasks.
Common NLP tasks supported by language models include:
Text analysis, such as extracting key terms or identifying named entities in text.
Sentiment analysis and opinion mining to categorize text as positive or negative.
Machine translation, in which text is automatically translated from one language to
another.
Summarization, in which the main points of a large body of text are summarized.
Conversational AI solutions such as bots or digital assistants in which the language model
can interpret natural language input and return an appropriate response.
These capabilities and more are supported by the models in the Azure AI Language service,
which we'll explore next.
" 100 XP
Azure AI Language is a part of the Azure AI services offerings that can perform advanced
natural language processing over unstructured text. Azure AI Language's text analysis features
include:
Named entity recognition identifies people, places, events, and more. This feature can
also be customized to extract custom categories.
Entity linking identifies known entities together with a link to Wikipedia.
Personal identifying information (PII) detection identifies personally sensitive
information, including personal health information (PHI).
Language detection identifies the language of the text and returns a language code such
as "en" for English.
Sentiment analysis and opinion mining identifies whether text is positive or negative.
Summarization summarizes text by identifying the most important information.
Key phrase extraction lists the main concepts from unstructured text.
For example, named entity recognition can identify entities such as:

Entity type      Example
Organization     "Microsoft"
IP Address       "10.0.1.125"
Azure AI Language also supports entity linking to help disambiguate entities by linking to a
specific reference. For recognized entities, the service returns a URL for a relevant Wikipedia
article.
For example, suppose you use Azure AI Language to detect entities in a restaurant review
extract. Recognized entities, such as a well-known location mentioned in the review, are
returned together with links to their Wikipedia articles.
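As an illustration, the following sketch uses the Azure AI Language SDK for Python (the azure-ai-textanalytics package) to recognize linked entities; the endpoint, key, and review text are placeholders:

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

# Placeholder endpoint and key for an Azure AI Language resource.
client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

reviews = ["I had lunch at a cafe near the Space Needle in Seattle."]
result = client.recognize_linked_entities(reviews)[0]

# Each recognized entity includes a link to its Wikipedia article.
for entity in result.entities:
    print(f"{entity.name}: {entity.url} ({entity.data_source})")
```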
Language detection
Use the language detection capability of Azure AI Language to identify the language in which
text is written. You can submit multiple documents at a time for analysis. For each document
submitted, the service will detect:
The language name (for example "English").
The ISO 639-1 language code (for example, "en").
A score indicating a level of confidence in the language detection.
For example, consider a scenario where you own and operate a restaurant where customers
can complete surveys and provide feedback on the food, the service, staff, and so on. Suppose
you have received the following reviews from customers:
Review 1: "A fantastic place for lunch. The soup was delicious."
You can use the text analytics capabilities in Azure AI Language to detect the language for
each of these reviews; and it might respond with the following results:
Document   Language Name   ISO 639-1 Code   Score
Review 1   English         en               1.0
Review 2   Spanish         es               1.0
Review 3   English         en               0.9
Notice that the language detected for review 3 is English, despite the text containing a mix of
English and French. The language detection service will focus on the predominant language in
the text. The service uses an algorithm to determine the predominant language, such as length
of phrases or total amount of text for the language compared to other languages in the text.
The predominant language will be the value returned, along with the language code. The
confidence score might be less than 1 as a result of the mixed language text.
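A sketch of language detection for these reviews with the Python SDK (endpoint and key are placeholders):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

reviews = [
    "A fantastic place for lunch. The soup was delicious.",
    "Comida maravillosa y gran servicio.",
    "The croque monsieur avec frites was terrific. Bon appetit!",
]

# The service returns the predominant language, its ISO 639-1 code,
# and a confidence score for each document.
for doc in client.detect_language(reviews):
    lang = doc.primary_language
    print(lang.name, lang.iso6391_name, lang.confidence_score)
```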
There might be text that is ambiguous in nature, or that has mixed language content. These
situations can present a challenge. An ambiguous content example would be a case where the
document contains limited text, or only punctuation. For example, using Azure AI Language to
analyze the text ":-)", results in a value of unknown for the language name and the language
identifier, and a score of NaN (which is used to indicate not a number).
Sentiment analysis

Azure AI Language uses a prebuilt machine learning classification model to evaluate the text.
The service returns sentiment scores in three categories: positive, neutral, and negative. In
each category, a score between 0 and 1 is provided. Scores indicate how likely the provided
text is to convey a particular sentiment. An overall sentiment for the document is also provided.
For example, the following two restaurant reviews could be analyzed for sentiment:
Review 1: "We had dinner at this restaurant last night and the first thing I noticed was how
courteous the staff was. We were greeted in a friendly manner and taken to our table right
away. The table was clean, the chairs were comfortable, and the food was amazing."
and
Review 2: "Our dining experience at this restaurant was one of the worst I've ever had. The
service was slow, and the food was awful. I'll never eat at this establishment again."
The sentiment scores for the first review might be:

Document sentiment: positive
Positive score: .90
Neutral score: .10
Negative score: .00

The second review might return a response like:

Document sentiment: negative
Positive score: .00
Neutral score: .00
Negative score: .99
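A sketch of sentiment analysis for reviews like these with the Python SDK (endpoint, key, and the shortened review text are placeholders):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

reviews = [
    "The staff was courteous and the food was amazing.",
    "The service was slow, and the food was awful.",
]

# Each result includes an overall sentiment plus per-category scores.
for doc in client.analyze_sentiment(reviews):
    scores = doc.confidence_scores
    print(doc.sentiment, scores.positive, scores.neutral, scores.negative)
```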
"We had dinner here for a birthday celebration and had a fantastic experience. We were
greeted by a friendly hostess and taken to our table right away. The ambiance was relaxed,
the food was amazing, and service was terrific. If you like great food and attentive service,
you should try this place."
Key phrase extraction can provide some context to this review by extracting the following
phrases:
birthday celebration
fantastic experience
friendly hostess
great food
attentive service
dinner
table
ambiance
place
As well as using sentiment analysis to determine that this is a positive review, you can also use
the key phrase service to identify important elements of the review.
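A sketch of key phrase extraction for this review with the Python SDK (endpoint and key are placeholders):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

review = (
    "We had dinner here for a birthday celebration and had a fantastic "
    "experience. The ambiance was relaxed, the food was amazing, and "
    "service was terrific."
)

# The result lists the main concepts found in the text.
result = client.extract_key_phrases([review])[0]
print(result.key_phrases)
```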
" 200 XP
Knowledge check
Module assessment 3 minutes
1. You want to use Azure AI Language to determine the key talking points in a text
document. Which feature of the service should you use? *
Sentiment analysis
2. You use Azure AI Language to perform sentiment analysis on a sentence. The confidence
scores .04 positive, .36 neutral, and .60 negative are returned. What do these confidence
scores indicate about the sentence sentiment? *
3. When might you see NaN returned for a score in language detection? *
When the predominant language in the text is mixed with other languages
" 100 XP
Introduction
1 minute
We are used to being able to communicate at any time of the day or night, anywhere in the
world, putting organizations under pressure to react fast enough to their customers. We want
personal responses to our queries, without having to read in-depth documentation to find
answers. This often means that support staff get overloaded with requests for help through
multiple channels, and that people are left waiting for a response.
Conversational AI describes solutions that enable a dialog between an AI agent and a human.
Generically, conversational AI agents are known as bots. People can engage with bots through
channels such as web chat interfaces, email, social media platforms, and more.
Azure AI Language's question answering feature provides you with the ability to create
conversational AI solutions. Next you'll learn about question answering.
" 100 XP
As an example, consider a chatbot that uses natural language and provides options to a customer
to best handle their query. The user gets an answer to their question quickly, and only gets
passed to a person if their query is more complicated.
Next, learn how Azure AI services can be used to create a question answering project.
" 100 XP
You can easily create a question answering solution on Microsoft Azure using Azure AI
Language service. Azure AI Language includes a custom question answering feature that
enables you to create a knowledge base of question and answer pairs that can be queried
using natural language input.
Note
You can write code to create and manage projects using the Azure AI Language REST API
or SDK. However, in most scenarios it is easier to use the Language Studio.
To create a project, you must first provision a Language resource in your Azure subscription.
In many cases, a project is created using a combination of techniques - starting with a base
dataset of questions and answers from an existing FAQ document, and extending the
knowledge base with additional manual entries.
Questions in the project can be assigned alternative phrasings to help consolidate questions
with the same meaning. For example, you might include a question like:

"What is your head office location?"

You can anticipate different ways this question could be asked by adding an alternative
phrasing such as:

"Where is your head office located?"
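Once a project is deployed, a client application can query it with natural language input. The following sketch uses the azure-ai-language-questionanswering package for Python; the endpoint, key, project name, and deployment name are placeholders:

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.language.questionanswering import QuestionAnsweringClient

client = QuestionAnsweringClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

# Ask a question against a deployed custom question answering project.
response = client.get_answers(
    question="Where is your head office located?",
    project_name="<your-project>",
    deployment_name="production",
)

for answer in response.answers:
    print(f"{answer.answer} (confidence: {answer.confidence:.2f})")
```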
" 200 XP
Knowledge check
Module assessment 2 minutes
1. Your organization has an existing frequently asked questions (FAQ) document. You need
to create a knowledge base that includes the questions and answers from the FAQ with the
least possible effort. What should you do? *
Create an empty knowledge base, and then manually copy and paste the FAQ
entries into it.
2. You want to create a knowledge base for your organization’s bot service. Which Azure AI
service is best suited to creating a knowledge base? *
Question Answering
" Correct. Question Answering is part of the Azure AI Language service and enables
you to create a knowledge base of question and answer pairs
" 100 XP
Introduction
1 minute
In 1950, the British mathematician Alan Turing devised the Imitation Game, which has become
known as the Turing Test and hypothesizes that if a dialog is natural enough, you might not
know whether you're conversing with a human or a computer. As artificial intelligence (AI)
grows ever more sophisticated, this kind of conversational interaction with applications and
digital assistants is becoming more and more common, and in specific scenarios can result in
human-like interactions with AI agents. Common scenarios for this kind of solution include
customer support applications, reservation systems, and home automation, among others.
To realize the aspiration of the Imitation Game, computers need not only to be able to accept
language as input (either in text or audio format), but also to be able to interpret the semantic
meaning of the input - in other words, understand what is being said.
Azure AI Language service supports conversational language understanding (CLU). You can
use CLU to build language models that interpret the meaning of phrases in a conversational
setting. One example of a CLU application is one that's able to turn devices on and off based
on speech. The application is able to take in audio input such as, "Turn the light off", and
understand an action it needs to take, such as turning a light off. Many types of tasks involving
command and control, end-to-end conversation, and enterprise support can be completed
with Azure AI Language's CLU feature.
" 100 XP
To work with conversational language understanding (CLU), you need to take into account
three core concepts: utterances, entities, and intents.
Utterances
An utterance is an example of something a user might say, and which your application must
interpret. For example, when using a home automation system, a user might use the following
utterances:

"Switch the fan on."

"Turn on the light."
Entities
An entity is an item to which an utterance refers. For example, fan and light in the following
utterances:

"Switch the fan on."

"Turn on the light."
You can think of the fan and light entities as being specific instances of a general device
entity.
Intents
An intent represents the purpose, or goal, expressed in a user's utterance. For example, for
both of the previously considered utterances, the intent is to turn a device on; so in your CLU
application, you might define a TurnOn intent that is related to these utterances.
A CLU application defines a model consisting of intents and entities. Utterances are used to
train the model to identify the most likely intent and the entities to which it should be applied
based on a given input. The home assistant application we've been considering might include
multiple intents, like the following examples:
Intent     Related utterances
Greeting   "Hello"
           "Hi"
           "Hey"
           "Good morning"
In the table there are numerous utterances used for each of the intents. The intent should be a
concise way of grouping the utterance tasks. Of special interest is the None intent. You should
consider always using the None intent to help handle utterances that don't map to any of the
utterances you have entered. The None intent is considered a fallback, and is typically used to
provide a generic response to users when their requests don't match any other intent.
After defining the entities and intents with sample utterances in your CLU application, you can
train a language model to predict intents and entities from user input - even if it doesn't
match the sample utterances exactly. You can then use the model from a client application to
retrieve predictions and respond appropriately.
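As an illustration, the following sketch uses the azure-ai-language-conversations package for Python to request a prediction from a deployed CLU model; the endpoint, key, project name, and deployment name are placeholders:

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.language.conversations import ConversationAnalysisClient

client = ConversationAnalysisClient(
    "https://<your-resource>.cognitiveservices.azure.com/",
    AzureKeyCredential("<your-key>"),
)

# Submit an utterance and retrieve the predicted intent and entities.
result = client.analyze_conversation(
    task={
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {"id": "1", "participantId": "user", "text": "Turn the light off"}
        },
        "parameters": {"projectName": "<your-project>", "deploymentName": "production"},
    }
)

prediction = result["result"]["prediction"]
print(prediction["topIntent"])                 # for example, TurnOff
for entity in prediction["entities"]:
    print(entity["category"], entity["text"])  # for example, device light
```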
" 100 XP
Azure AI Language: A resource that enables you to build apps with industry-leading
natural language understanding capabilities without machine learning expertise. You can
use a language resource for authoring and prediction.
Azure AI services: A general resource that includes CLU along with many other Azure AI
services. You can only use this type of resource for prediction.
The separation of resources is useful when you want to track resource utilization for Azure AI
Language separately from client applications that use multiple Azure AI services.
Authoring
After you've created an authoring resource, you can use it to train a CLU model. To train a
model, start by defining the entities and intents that your application will predict as well as
utterances for each intent that can be used to train the predictive model.
CLU provides a comprehensive collection of prebuilt domains that include pre-defined intents
and entities for common scenarios, which you can use as a starting point for your model. You
can also create your own entities and intents.
When you create entities and intents, you can do so in any order. You can create an intent, and
select words in the sample utterances you define for it to create entities for them; or you can
create the entities ahead of time and then map them to words in utterances as you're creating
the intents.
You can write code to define the elements of your model, but in most cases it's easiest to
author your model using the Language studio - a web-based interface for creating and
managing CLU applications.
Predicting
When you are satisfied with the results from the training and testing, you can publish your
Conversational Language Understanding application to a prediction resource for consumption.
Client applications can use the model by connecting to the endpoint for the prediction
resource, specifying the appropriate authentication key, and submitting user input to get
predicted intents and entities. The predictions are returned to the client application, which can
then take appropriate action based on the predicted intent.
" 200 XP
Knowledge check
Module assessment 3 minutes
1. You need to provision an Azure resource that will be used to author a new conversational
language understanding application. What kind of resource should you create? *
Azure AI Speech
Azure AI Language
" Correct. To author a conversational language understanding model, you need an
Azure AI Language resource.
Azure AI services
Define a "city" entity and a "GetTime" intent with utterances that indicate the
city entity.
" Correct. The intent encapsulates the task (getting the time) and the entity
specifies the item to which the intent is applied (the city).
Create an intent for each city, each with an utterance that asks for the time in
that city.
" 100 XP
Introduction
1 minute
AI speech capabilities enable us to manage home and auto systems with voice instructions,
get answers from computers for spoken questions, generate captions from audio, and much
more.
To enable this kind of interaction, the AI system must support at least two capabilities:

Speech recognition - the ability to detect and interpret spoken input
Speech synthesis - the ability to generate spoken output
Azure AI Speech provides speech to text, text to speech, and speech translation capabilities
through speech recognition and synthesis. You can use prebuilt and custom Speech service
models for a variety of tasks, from transcribing audio to text with high accuracy, to identifying
speakers in conversations, creating custom voices, and more. Next you'll learn how AI speech
capabilities work.
" 100 XP
Speech recognition takes the spoken word and converts it into data that can be processed -
often by transcribing it into text. The spoken words can be in the form of a recorded voice in
an audio file, or live audio from a microphone. Speech patterns are analyzed in the audio to
determine recognizable patterns that are mapped to words. To accomplish this, the software
typically uses multiple models, including:
An acoustic model that converts the audio signal into phonemes (representations of
specific sounds).
A language model that maps phonemes to words, usually using a statistical algorithm
that predicts the most probable sequence of words based on the phonemes.
The recognized words are typically converted to text, which you can use for various purposes,
such as:

Providing closed captions for recorded or live videos
Creating a transcript of a phone call or meeting
Automated note dictation
Determining intended user input for further processing
Speech synthesis

Speech synthesis is concerned with vocalizing data, usually by converting text to speech. To
synthesize speech, the system typically tokenizes the text to break it down into individual
words, and assigns phonetic sounds to each word. It then breaks the phonetic transcription
into prosodic units (such as phrases, clauses, or sentences) to create phonemes that will be
converted to audio format. These phonemes are then synthesized as audio and can be
assigned a particular voice, speaking rate, pitch, and volume.
You can use the output of speech synthesis for many purposes, including:
Generating spoken responses to user input
Creating voice menus for phone systems
Reading email or text messages aloud in hands-free scenarios
Broadcasting announcements in public locations, such as railway stations or airports
" 100 XP
Microsoft Azure offers speech recognition and synthesis capabilities through Azure AI Speech
service, which supports many capabilities, including:
Speech to text
Text to speech
Note
This module covers speech to text and text to speech capabilities. A separate module
covers speech translation in Azure AI services.
Speech to text
You can use Azure AI Speech to text API to perform real-time or batch transcription of audio
into a text format. The audio source for transcription can be a real-time audio stream from a
microphone or an audio file.
The model that is used by the Speech to text API is based on the Universal Language Model
that was trained by Microsoft. The data for the model is Microsoft-owned and deployed to
Microsoft Azure. The model is optimized for two scenarios, conversational and dictation. You
can also create and train your own custom models including acoustics, language, and
pronunciation if the pre-built models from Microsoft don't provide what you need.
Real-time transcription: Real-time speech to text allows you to transcribe text in audio
streams. You can use real-time transcription for presentations, demos, or any other scenario
where a person is speaking.
In order for real-time transcription to work, your application needs to be listening for
incoming audio from a microphone, or other audio input source such as an audio file. Your
application code streams the audio to the service, which returns the transcribed text.
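As an illustration, the following sketch uses the Azure AI Speech SDK for Python (the azure-cognitiveservices-speech package) to transcribe a single utterance from the default microphone; the key and region are placeholders:

```python
import azure.cognitiveservices.speech as speech_sdk

# Placeholder key and region for an Azure AI Speech resource.
speech_config = speech_sdk.SpeechConfig(subscription="<your-key>", region="<your-region>")

# Listen on the default microphone and transcribe one utterance.
recognizer = speech_sdk.SpeechRecognizer(speech_config=speech_config)
result = recognizer.recognize_once()

if result.reason == speech_sdk.ResultReason.RecognizedSpeech:
    print(result.text)
```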
Batch transcription: Not all speech to text scenarios are real time. You might have audio
recordings stored on a file share, a remote server, or even on Azure storage. You can point to
audio files with a shared access signature (SAS) URI and asynchronously receive transcription
results.
Batch transcription should be run in an asynchronous manner because the batch jobs are
scheduled on a best-effort basis. Normally a job starts executing within minutes of the request
but there's no estimate for when a job changes into the running state.
Text to speech
The text to speech API enables you to convert text input to audible speech, which can either
be played directly through a computer speaker or written to an audio file.
Speech synthesis voices: When you use the text to speech API, you can specify the voice to be
used to vocalize the text. This capability offers you the flexibility to personalize your speech
synthesis solution and give it a specific character.
The service includes multiple pre-defined voices with support for multiple languages and
regional pronunciation, including neural voices that leverage neural networks to overcome
common limitations in speech synthesis with regard to intonation, resulting in a more natural
sounding voice. You can also develop custom voices and use them with the text to speech API.
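A sketch of speech synthesis with the same SDK; the key, region, and chosen voice are placeholders:

```python
import azure.cognitiveservices.speech as speech_sdk

speech_config = speech_sdk.SpeechConfig(subscription="<your-key>", region="<your-region>")
# Select one of the service's prebuilt neural voices.
speech_config.speech_synthesis_voice_name = "en-US-AriaNeural"

# Synthesize the text and play it through the default speaker.
synthesizer = speech_sdk.SpeechSynthesizer(speech_config=speech_config)
synthesizer.speak_text_async("You have one new notification.").get()
```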
Supported Languages
Both the speech to text and text to speech APIs support a variety of languages. Use the links
below to find details about the supported languages:
Speech to text languages.
Text to speech languages.
" 100 XP
Azure AI Speech is available for use through several tools and programming languages
including:
Studio interfaces
Command Line Interface (CLI)
REST APIs and Software Development Kits (SDKs)
To use Azure AI Speech, you need to create an appropriate resource in your Azure
subscription. You can choose to create either of the following types of resource:

A Speech resource - choose this resource type if you only plan to use Azure AI Speech, or
if you want to manage access and billing for the resource separately from other services.
An Azure AI services resource - choose this resource type if you plan to use Azure AI
Speech in combination with other Azure AI services, and you want to manage access and
billing for these services together.
" 200 XP
Knowledge check
Module assessment 2 minutes
1. You plan to build an application that uses Azure AI Speech to transcribe audio recordings
of phone calls into text, and then submit the transcribed text to Azure AI Language to
extract key phrases. You want to manage access and billing for the application services with
a single Azure resource. Which type of Azure resource should you create? *
Speech
Language
Azure AI services
" Correct. This resource would support both the Azure AI Speech and Azure AI
Language services.
2. You want to use Azure AI Speech service to build an application that reads incoming
email message subjects aloud. Which API should you use? *
Speech to text
Text to speech
" Correct. The Text to speech API converts text to audible speech.
Translator
" 100 XP
Introduction
1 minute
As organizations and individuals collaborate with people in other cultures and geographic
locations, they continue to need ways to remove language barriers.
One solution is to hire multilingual people to translate between languages. However, the
scarcity of such skills, and the number of possible language combinations, can make this
approach difficult to scale. Increasingly, automated translation, sometimes known as machine
translation, is being employed to solve this problem.
In this module, we explore Azure AI Translator and Azure AI Speech's cloud-based neural
machine translation capabilities.
" 100 XP
One of the many challenges of translation between languages is that words don't have a
one-to-one replacement between languages. Machine translation advancements are needed to
improve the communication of meaning and tone between languages.
" 100 XP
Microsoft provides Azure AI services that support translation. Specifically, you can use the
following services:
The Azure AI Translator service, which supports text-to-text translation.
The Azure AI Speech service, which enables speech to text and speech-to-speech
translation.
Azure AI Translator
Azure AI Translator is easy to integrate in your applications, websites, tools, and solutions. The
service uses a Neural Machine Translation (NMT) model for translation, which analyzes the
semantic context of the text and renders a more accurate and complete translation as a result.
Language support: Azure AI Translator supports text-to-text translation between more than
130 languages. When using the service, you must specify the language you are translating
from and the language you are translating to using ISO 639-1 language codes, such as en for
English, fr for French, and zh for Chinese. Alternatively, you can specify cultural variants of
languages by extending the language code with the appropriate ISO 3166-1 culture code - for
example, en-US for US English, en-GB for British English, or fr-CA for Canadian French. When
using Azure AI Translator, you can specify one from language with multiple to languages,
enabling you to simultaneously translate a source document into multiple languages.
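As an illustration, the following sketch calls the Translator text translation REST API to translate English text into French and German; the key and region are placeholders:

```python
import requests

# Placeholder key and region for an Azure AI Translator resource.
headers = {
    "Ocp-Apim-Subscription-Key": "<your-key>",
    "Ocp-Apim-Subscription-Region": "<your-region>",
    "Content-Type": "application/json",
}
params = {"api-version": "3.0", "from": "en", "to": ["fr", "de"]}
body = [{"text": "Hello, how are you?"}]

response = requests.post(
    "https://api.cognitive.microsofttranslator.com/translate",
    params=params, headers=headers, json=body,
)

# One translation is returned for each "to" language.
for translation in response.json()[0]["translations"]:
    print(f"{translation['to']}: {translation['text']}")
```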
Azure AI Speech
You can use Azure AI Speech to translate spoken audio from a streaming source, such as a
microphone or audio file, and return the translation as text or an audio stream. This enables
scenarios such as real-time closed captioning for a speech or simultaneous two-way
translation of a spoken conversation.
Language support: As with Azure AI Translator, you can specify one source language and one
or more target languages to which the source should be translated with Azure AI Speech. You
can translate speech into over 90 languages. The source language must be specified using the
extended language and culture code format, such as es-US for American Spanish. This
requirement helps ensure that the source is understood properly, allowing for localized
pronunciation and linguistic idioms. The target languages must be specified using a two-
character language code, such as en for English or de for German.
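A sketch of speech translation with the Azure AI Speech SDK for Python, translating spoken English into German text; the key and region are placeholders:

```python
import azure.cognitiveservices.speech as speech_sdk

# Placeholder key and region for an Azure AI Speech resource.
translation_config = speech_sdk.translation.SpeechTranslationConfig(
    subscription="<your-key>", region="<your-region>"
)
translation_config.speech_recognition_language = "en-US"  # extended source code
translation_config.add_target_language("de")              # two-character target code

recognizer = speech_sdk.translation.TranslationRecognizer(translation_config=translation_config)
result = recognizer.recognize_once()

if result.reason == speech_sdk.ResultReason.TranslatedSpeech:
    print(result.translations["de"])
```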
" 100 XP
You can use Azure AI Translator with a programming language of your choice or the REST API.
You can use some of its features with Language Studio.
You can get started with Azure AI Speech with Speech Studio or a programming language of
your choice or the REST API.
There are dedicated Translator and Speech resource types for these services, which you can
use if you want to manage access and billing for each service individually.
Alternatively, you can create an Azure AI services resource that provides access to both
services through a single Azure resource, consolidating billing and enabling applications to
access both services through a single endpoint and authentication key.
Azure AI Translator includes the following capabilities:

Text translation - used for quick and accurate text translation in real time across all
supported languages.
Document translation - used to translate multiple documents across all supported
languages while preserving original document structure.
Custom translation - used to enable enterprises, app developers, and language service
providers to build customized neural machine translation (NMT) systems.
Azure AI Translator's application programming interface (API) offers some optional
configuration to help you fine-tune the results that are returned, including:
Profanity filtering. Without any configuration, the service will translate the input text,
without filtering out profanity. Profanity levels are typically culture-specific but you can
control profanity translation by either marking the translated text as profane or by
omitting it in the results.
Selective translation. You can tag content so that it isn't translated. For example, you
may want to tag code, a brand name, or a word/phrase that doesn't make sense when
localized.
Azure AI Speech includes the following capabilities:

Speech to text - used to transcribe speech from an audio source to text format.
Text to speech - used to generate spoken audio from a text source.
Speech Translation - used to translate speech in one language to text or speech in
another.
Note
You can learn more about Azure AI Speech and Speech Studio with the learn module
Fundamentals of Azure AI Speech.
" 200 XP
Knowledge check
Module assessment 2 minutes
To translate spoken audio from a streaming source into text or an audio stream.
2. Your team would like to build an application that translates digital copies of books.
Which Azure AI Translator capability would you use? *
Text translation
Document translation
" Correct. Document translation supports the processing of multiple documents
and large files.
Custom translation
3. You're developing an application that must take English input from a microphone and
generate a real-time audio output in Hindi. Which capability of Azure AI Speech would you
use? *
Text-to-speech
Speech translation
" Correct. Azure AI Speech can translate audio from one language into audio in
another language.