A Developer’s Guide to Building AI Applications
Create Your First Intelligent Bot with Microsoft AI
Anand Raman & Wee Hyong Tok
Introduction
Artificial Intelligence is rapidly becoming a mainstream technology that is
helping transform and empower us in unexpected ways. Let’s take a trip to Nepal
to see a fascinating example. Like the vast majority of Nepalese, Melisha Ghimere
came from a remote village, from a family of subsistence farmers who raised
cows, goats and water buffaloes. Seven years ago, she watched her relatively
wealthy uncle and aunt lose a lot of their herd to an outbreak of anthrax; they
were never to recover their economic footing. Melisha went on to university
thinking about the plight of her family. In university, she worked to develop
a predictive early warning solution to help farmers. With a team of four students,
she researched livestock farming, veterinary practices and spoke to farmers. They
built a prototype for a monitoring device that tracks temperature, sleep patterns,
stress levels, motion and the activity of farm animals. Melisha’s AI system
predicts the likely health of each animal based on often subtle changes in these
observations. Farmers are able to track their animals, as well as receive alerts and
actionable recommendations. Although her project is still in its infancy, the field
tests have shown the solution was about 95% accurate in predicting risks to an
animal’s health. Melisha and her team were able to help a family prevent a deadly
outbreak of anthrax infection by identifying a diseased cow, before symptoms
were evident to the farmer. Melisha’s team was a regional finalist in Microsoft’s
Imagine Cup competition in 2016. 1
1 The Future Computed: Artificial Intelligence and its Role in Society – Microsoft
Let me give you another example much closer to home: the power of AI in
transforming the retail experience. Lowe’s Innovation Labs has now created a
unique prototype shopping experience for home remodelling. For example, a
customer can now walk in and share their dream kitchen photos with a design
specialist. Using an AI-powered application, the design specialist can gain deep
insight into the customer’s style and preferences. The application generates
a match from the Lowe’s dream kitchen collection, and the design of the kitchen
is then shown in very realistic holographic mixed-reality through a HoloLens. 2
The customer can now visualise, explore and change the design to their taste in
the mixed-reality environment in real time. Applications like these are
vanguards of the revolution in retail experiences that AI will bring
for consumers.
Healthcare is another field that is at the cusp of a revolution. With the power of
AI and a variety of data sources from genomics, electronic medical records,
medical literature and population data, scientists are now able to predict health
emergencies, diagnose better and optimise care. A unique example in this area
comes from Cochrane, a highly reputed non-profit organisation dedicated to
gathering and summarising the best evidence from research to help doctors
make informed choices about treatment. Cochrane conducts systematic reviews,
which digest and analyse dense medical literature, and reduce it into fairly short
and manageable pieces of information, in order to give doctors the best possible
guidance on the effects of healthcare interventions. For example, a recent
systematic review of medical studies looked at whether steroids can help with the
maturation of premature babies’ lungs. The review showed conclusively that
steroids can help save babies’ lives. This intervention has helped hundreds of
thousands of premature babies. However, such reviews are very labour intensive
and can take two to three years to complete. Cochrane’s Project Transform was
born out of the need to make systematic reviews more efficient, give more timely
and relevant guidance to doctors, and therefore help save more lives. Project
Transform uses AI to manipulate and analyse the literature and data very
efficiently, allowing researchers to understand the data and interpret
the findings. It creates a perfect partnership between human and machine, where
a significant amount of the heavy overhead of systematic reviews is reduced, and
human analysis skills can be directed where they are most needed for timeliness
and quality.
There’s no field that will be left untouched by the transformational power of AI.
I can point you to areas as diverse as astronomy, where AI has accelerated the
pace of new discoveries, and the area of conservation, where ecologists and
conservationists are working with AI-powered tools to help track, study and
protect elusive and endangered animals.
2 http://www.lowesinnovationlabs.com/hololens
The Intersection of Cloud, Data and AI
In the rest of this section, we will introduce AI and the powerful intersection
of data, cloud and AI tools that is creating a paradigm shift and enabling the
development of intelligent systems.
The Microsoft AI Platform
Here, we explore the Microsoft AI platform and point out the tools,
infrastructure and services that are available for developing AI applications.
Developing an Intelligent Chatbot
This section presents a discussion of chatbots and conversational AI, and
highlights some chatbot implementations. How do you create an intelligent
chatbot for your enterprise? We provide a high-level architecture using the
Conference Buddy bot example, including code samples, a discussion of the
design considerations and technologies involved, and a deep dive into
the abstraction layer of the bot, which we call the Bot Brain.
Adding “Plug and Play” Intelligence to your Bot
This section explores how you can easily give the bot new skills and
capabilities, such as vision, translation, speech and other custom AI abilities.
We also look at how you can develop the Bot Brain’s intelligence.
Building an Enterprise App to Gain Bot Insights: The Conference Buddy
Dashboard
This section highlights the Conference Buddy dashboard, which allows the
conference speaker and attendees to see the attendees’ questions and answer
them in real time. We also discuss how to instrument the bot to get metrics
and application insights.
Paving the Road Ahead
In the final section, we consider an exciting development in the AI world
with the release of Open Neural Network Exchange (ONNX) and also
Microsoft’s commitment to the six ethical principles – fairness, reliability
and safety, privacy and security, inclusivity, transparency and
accountability – to guide the cross-disciplinary development and use of AI.
[Figure: the questions answered by different machine learning approaches – Regression: How much? How many? Anomaly: Is it weird? Clustering: How is it organised?]
So how do you begin to design AI-powered solutions that take advantage
of all the aforementioned capabilities? We design AI solutions to
complement and unlock human potential and creative pursuits. Designing
technology for humans has significant implications: it means considering
ethical questions; understanding the context of how people work, play and
live; and creating tailored solutions that adapt over time.
One of the most fascinating areas of research is bridging emotional and
cognitive intelligence to create conversational AI systems that model
human language and have insight into both the logical and unpredictable
ways that humans interact.
According to Lili Cheng, corporate vice president of Microsoft AI and
Research, “This likely means AI needs to recognise when people are more
effective on their own – when to get out of the way, when not to help,
when not to record, when not to interrupt or distract.” 5
The time for AI is now, given the proliferation of data, the near-limitless
availability of computing power in the cloud and the rise of increasingly
powerful algorithms.
AI Algorithms and Tools
The explosion of use cases for AI, driven by online services and the digital
transformation, in turn catalysed enormous progress in AI algorithms. One of
the most profound innovations in recent years has been deep learning. This
technique, inspired by neural networks in the brain, allows computers to learn
deep concepts, relationships and representations from vast amounts of data
(such as images, video and text), and perform tasks such as object and speech
recognition with accuracy comparable to humans. Today, open source tools,
such as Cognitive Toolkit, PyTorch and TensorFlow, make deep learning
innovations accessible to a wide audience. And all the major cloud vendors now
have services that substantially simplify AI development to empower software
engineers.
Modern AI lives at the intersection of these three powerful trends: digital data
from which AI systems learn, cloud-hosted services that enable AI-powered
interactions and continuing innovations in algorithms that make AI capabilities
more powerful, while enabling novel applications and use cases.
3. Route your question for the speaker to a dashboard so the speaker can see
all the questions from the audience, pick the question to answer and
engage, as illustrated in Figure 1-4.
To get a feel for how this app looks and works, I encourage you to visit the
GitHub website, https://aka.ms/conferencebuddy, to see a demonstration
and review the code for this sample.
Conference Bot
The Conference Bot, built on the Bot Framework, intelligently handles all
participants’ message events. The bot is omnichannel, which means users can
email, Skype or use a custom message service that will connect through the bot
connector to reach the Conference Bot. Figure 1-6 shows the greeting when the
Conference Buddy app is invoked.
The Conference Buddy does several things that are indicative of good design
principles when it comes to the opening dialogue. (We illustrate how to teach
the Bot Brain new skills in the next section.) The end-to-end flow works as follows:
1. The user invokes the Conference Bot by sending the first message.
2. The Conference Bot responds with a greeting and introduction of what it
can do.
3. The user asks a question, for example, “Who is Lili Cheng?”
4. The Conference Bot routes the message to LUIS to determine the intent
of the message. LUIS parses the message and, for our example, returns
“This is an Ask Who task.”
5. The Conference Bot then selects the appropriate bot task within the Bot
Brain to call via HTTP POST. In our example, the “Ask Who” task will do
the following:
a. Send the string to Bing Web Search and grab the results.
b. Send the string to Bing Image Search in parallel.
c. Combine the image and text into a response object/data contract that the
Conference Bot understands.
6. The Conference Bot sends a graphical card with the results to the user.
7. The Conference Bot sends the results to Azure Search to be archived so that the
dashboard can use them.
8. The user can click the link on the card to get more information from the
source of the article.
Figure 1-7 illustrates the “Who is?” response card for “Lili Cheng”.
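To make step 5 concrete, here is a condensed sketch of what the “Ask Who” task can look like. It reuses the ServicesUtility.BingSearch helpers that appear later in this section; the property names on the Bing result types (Snippet, ContentUrl and so on) are assumptions for illustration and may differ from the sample repository.
private static async Task<AskQuestionResult> AskWhoAsync(string question)
{
    // Steps 5a and 5b: query Bing Web Search and Bing Image Search in parallel
    Task<BingWebSearchResult> webTask =
        ServicesUtility.BingSearch.SearchWebAsync(query: question, count: 1);
    Task<BingWebImagesResult> imageTask =
        ServicesUtility.BingSearch.SearchImagesAsync(query: question, count: 1);
    await Task.WhenAll(webTask, imageTask);

    // Step 5c: combine the top text hit and the top image into the data
    // contract the Conference Bot understands. The property names on the
    // Bing result objects below are assumptions for illustration.
    var topPage = webTask.Result.WebPagesResult.Values.First();
    var topImage = imageTask.Result.Values.First();

    return new AskQuestionResult
    {
        Answer = topPage.Snippet,
        Url = topPage.Url,
        UrlDisplayName = topPage.Name,
        ImageUrl = topImage.ContentUrl
    };
}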
Let’s demonstrate the “Learn More” task to illustrate this entire process:
1. Suppose that the user asks, “I want to learn more about Azure
Machine Learning.”
2. The Conference Bot routes the message to LUIS to determine the
intent of the message. LUIS parses the message and, for our example,
returns “This is a Learn More task.”
3. The Conference Bot then selects the appropriate bot task to call via
HTTP POST to process the message. In our example, the “Learn More”
task will call Text Analytics to extract key phrases and send
parallel requests to the following:
a. Video Indexer: This is a Cognitive Service that will get the
transcript of the video, break it into keywords, annotate the video,
analyse sentiments and moderate content. You can upload specific
Now let’s look at some of the design considerations and take a deeper dive
into the bot’s architecture.
Messaging Platform
On which messaging platform will the bot reside? What devices and platforms do
our users care about? There are popular existing messaging channels like Skype,
Facebook Messenger, Slack, Kik and others, or you can build a custom messaging
layer on an existing mobile app or website.
The key thing is first figuring out where your target audience spends time. Do you
have a popular gaming platform and want to introduce an in-game reward bot? Are
you a small business building a following on social media? A large bank with a popular
mobile app? The location of your bot will also be tied to the specific reason you
are building it.
To reach as many audience members as possible, we decided to make our
Conference Buddy an omnichannel bot. Ordinarily, to do this you would need to develop a
special plug-in for each source that takes care of the specific protocol between the
information source and your system. The Microsoft Azure Bot Service
Framework, however, allows you to connect with more than one messaging channel and
receive the information in a standard format regardless of its source. Figure 1-8
shows the Microsoft Azure Bot Service screen, via which adding new channels is
just a matter of several clicks. 10
Root Dialog
Whereas a traditional application starts with a main screen and users can
navigate back to it to start over, with bots you have the Root Dialog. The Root
Dialog guides the conversation flow. From a UI perspective, each dialog acts
like a new screen. This way, dialogs help the developer to logically separate out
the various areas of bot functionality.
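As a rough sketch, in the Bot Builder SDK v3 style that matches the context and activity objects used in the code excerpts below, the Root Dialog is simply a serializable dialog that waits for messages and routes them; the class and member names here are illustrative, not the sample's exact code.
[Serializable]
public class RootDialog : IDialog<object>
{
    public Task StartAsync(IDialogContext context)
    {
        // Wait for the first user message before doing anything
        context.Wait(MessageReceivedAsync);
        return Task.CompletedTask;
    }

    private async Task MessageReceivedAsync(
        IDialogContext context, IAwaitable<IMessageActivity> result)
    {
        IMessageActivity activity = await result;
        string message = activity.Text;

        // Route the message to LUIS and the bot tasks
        // (see the intent-handling code later in this section),
        // then wait for the user's next message.
        context.Wait(MessageReceivedAsync);
    }
}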
For the Conference Buddy bot, each dialog invokes the next, depending on
what the user types and the intent. This is called a waterfall dialog. The abridged
LUIS result below illustrates the kind of response the Root Dialog receives:
{
  "intents": [
    {
      "intent": "Greeting"
    }
  ],
  "entities": [
    {
      "entity": "buddy",
      "type": "Person",
      "startIndex": 20,
      "endIndex": 24,
      "score": 0.95678144
    }
  ]
}
When LUIS returns the intent as “Greeting”, the Root Dialog calls its greeting
handler (the ProcessGreetingIntent function in the code). This function displays the Greeting Dialog,
which in our example does not need to invoke a bot task. The control will
remain with the Greeting Dialog until the user types something else.
When the user responds, the Greeting Dialog closes and the Root Dialog
resumes control.
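A minimal sketch of this hand-off, assuming a GreetingDialog class in the sample and the Bot Builder v3 dialog stack; the names are illustrative:
private Task ProcessGreetingIntent(IDialogContext context, string message)
{
    // Hand control to the Greeting Dialog; it keeps control until the user replies
    context.Call(new GreetingDialog(), ResumeAfterGreetingAsync);
    return Task.CompletedTask;
}

private async Task ResumeAfterGreetingAsync(
    IDialogContext context, IAwaitable<object> result)
{
    await result;
    // The Greeting Dialog has closed; the Root Dialog resumes and waits
    // for the user's next message.
    context.Wait(MessageReceivedAsync);
}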
    // Excerpt from the Root Dialog's message handler. On the Skype channel
    // the bot's name (@mention) is stripped from the start of the message:
    message = message.Substring(activity.Recipient.Name.Length + 1).Trim();
}

// Handle intents
LUISResult luisResult = await GetEntityFromLUIS(message);
string intent =
    luisResult.intents?.FirstOrDefault()?.intent ?? string.Empty;
string[] entities =
    luisResult.entities?.Select(e => e.entity)?.ToArray() ?? new string[0];

if (intent == "greeting")
{
    await ProcessGreetingIntent(context, message);
}
else if (intent == "who")
{
    await ProcessQueryIntent(
        context, activity, BotTask.AskWho, message, entities);
}
else if (intent == "learnmore")
{
    await ProcessQueryIntent(
        context, activity, BotTask.AskLearnMore, message, entities);
}
else
{
    // All other intents fall through to the generic "Ask Question" task
    await ProcessQueryIntent(
        context, activity, BotTask.AskQuestion, message, entities);
}
The Root Dialog is not invoked unless a user types a message. When the
Conference Buddy bot receives the first message, we do special handling in the
code for messages coming from the Skype channel.
We discussed what happens when LUIS returns the Greeting Intent. In our
example chatbot, we anticipate three other possible intents from LUIS:
• If the intent is “Who”, the Root Dialog posts the question to the bot task
“Ask Who”.
• If the intent is “Learn More”, the Root Dialog posts the question to the bot
task “Learn More”.
• For all other intents, the Root Dialog sends the text to the “Ask Question”
task.
At this point, the Root Dialog hands control to the appropriate bot task.
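The routing above depends on the GetEntityFromLUIS helper, which calls the LUIS endpoint over HTTP and deserialises the JSON result. A minimal sketch follows; the app-setting names for the endpoint URL and key are placeholders, and the sample may implement this differently.
private static async Task<LUISResult> GetEntityFromLUIS(string message)
{
    // Placeholder setting names; the sample may store these differently
    string endpoint = ConfigurationManager.AppSettings["LuisEndpointUrl"];
    string key = ConfigurationManager.AppSettings["LuisSubscriptionKey"];

    using (HttpClient client = new HttpClient())
    {
        string url = string.Format(
            "{0}?subscription-key={1}&q={2}",
            endpoint, key, Uri.EscapeDataString(message));

        string json = await client.GetStringAsync(url);
        return JsonConvert.DeserializeObject<LUISResult>(json);
    }
}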
return "success";
}
What’s important in this layer, no matter which bot task is called, is that the request,
invocation and response are handled the same way. The Data Contract called
AskQuestionRequest combines the ConversationID, Query, SessionID and UserID to
pass to the bot task through an HTTP POST.
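A minimal sketch of what that request contract can look like, assuming the property names simply mirror the fields just described; the actual contract in the sample repository may differ slightly:
[DataContract]
public class AskQuestionRequest
{
    [DataMember(Name = "conversationId")]
    public string ConversationId { get; set; }

    [DataMember(Name = "question")]
    public string Question { get; set; }

    [DataMember(Name = "sessionId")]
    public string SessionId { get; set; }

    [DataMember(Name = "userId")]
    public string UserId { get; set; }
}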
The HTTP POST is the call to a bot task within the Bot Brain. When the appropriate
bot task executes the query, it prepares the response as an AskQuestionResponse
so that, no matter which bot task ran, the response is handled generically.
Because the Conference Buddy bot is omnichannel, the response card is displayed
differently according to the channel; the last part of the code shows how the bot
implements adaptive cards.
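The sample's card-building code is not reproduced in this excerpt, but attaching an adaptive card to a reply looks roughly like the following sketch. The card layout is illustrative, and result here stands for a single AskQuestionResult.
IMessageActivity reply = context.MakeMessage();
reply.Attachments = new List<Attachment>
{
    new Attachment
    {
        // Adaptive cards use this well-known content type
        ContentType = "application/vnd.microsoft.card.adaptive",
        Content = new JObject(
            new JProperty("type", "AdaptiveCard"),
            new JProperty("version", "1.0"),
            new JProperty("body", new JArray(
                new JObject(
                    new JProperty("type", "Image"),
                    new JProperty("url", result.ImageUrl)),
                new JObject(
                    new JProperty("type", "TextBlock"),
                    new JProperty("text", result.Answer),
                    new JProperty("wrap", true)))))
    }
};
await context.PostAsync(reply);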
[DataContract]
public class AskQuestionResponse
{
    // (Other members of the response, such as its Id, are omitted in this excerpt.)

    /// <summary>
    /// The results of the response
    /// </summary>
    [DataMember(Name = "results")]
    public AskQuestionResult[] Results { get; set; }
}
[DataContract]
public class AskQuestionResult
{
    /// <summary>
    /// The answer of the result
    /// </summary>
    [DataMember(Name = "answer")]
    public string Answer { get; set; }

    /// <summary>
    /// The image url of the result
    /// </summary>
    [DataMember(Name = "imageUrl")]
    public string ImageUrl { get; set; }

    /// <summary>
    /// The source of the result
    /// </summary>
    [DataMember(Name = "source")]
    public string Source { get; set; }

    /// <summary>
    /// The url of the result
    /// </summary>
    [DataMember(Name = "url")]
    public string Url { get; set; }

    /// <summary>
    /// The url display name of the result
    /// </summary>
    [DataMember(Name = "urlDisplayName")]
    public string UrlDisplayName { get; set; }
}
The Data Contract allows the separation of functions regarding how a query is
processed and how the response is generated. Think of the Data Contract as the
postal carrier. From the postman’s perspective, the specific details of the content
of the letter/package are irrelevant. What matters is the format of the “To” and
“From” addresses that allow delivery to the right location.
If we had to make different HTTP calls to each bot task, the Conference Buddy
bot would be unwieldy and difficult to build, test, deploy and scale. In the next
section, we see how the microservices implementation makes it simpler to develop
the Bot Brain’s intelligence and teach our Conference Buddy bot new skills.
For our bot, we will make an additional call to Cognitive Services: Microsoft
Translator. This is a machine translation service that supports more than
60 languages. The developer sends source text to the service with a parameter
indicating the target language, and the service sends back the translated text for
the client or web app to use.
The translated text can now be used with the other Cognitive Services that we
have used so far, such as Text Analytics and Bing Web Search.
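A minimal sketch of such a translation call, using the public Translator Text API v3 endpoint; the app-setting name for the key is an assumption for this example, and some Translator resources also require a region header.
private static async Task<string> TranslateToEnglishAsync(string text)
{
    // Assumed setting name for the Translator subscription key
    string key = ConfigurationManager.AppSettings["TranslatorTextApiKey"];

    using (HttpClient client = new HttpClient())
    {
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", key);
        // Some resources also need an "Ocp-Apim-Subscription-Region" header.

        var body = new[] { new { Text = text } };
        var content = new StringContent(
            JsonConvert.SerializeObject(body), Encoding.UTF8, "application/json");

        HttpResponseMessage response = await client.PostAsync(
            "https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=en",
            content);
        string json = await response.Content.ReadAsStringAsync();

        // The response is an array with one entry per input text,
        // each containing a "translations" array.
        JArray results = JArray.Parse(json);
        return (string)results[0]["translations"][0]["text"];
    }
}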
To make a call to a new Cognitive Service, you need to log into your Azure portal.
This Quick Guide walks you through editing the bot code and using Azure
Functions to invoke various APIs. In the sample code that follows, we illustrate
how to add the new translation bot task. Let’s explore the code:
[FunctionName("AskQuestion")]
public static async Task<HttpResponseMessage> Run(
    [HttpTrigger(AuthorizationLevel.Function, "post", Route = "AskQuestion")]
    HttpRequestMessage request,
    [Table("Session", Connection = "AzureWebJobsStorage")]
    ICollector<SessionTableEntity> sessionTable,
    TraceWriter log)
{
    // Read the AskQuestionRequest data contract from the POST body.
    // (Content-type validation and the translation and key-phrase extraction
    // steps described below are elided from this excerpt.)
    MediaTypeHeaderValue contentType = request.Content.Headers.ContentType;
    AskQuestionRequest askQuestionRequest =
        await request.Content.ReadAsAsync<AskQuestionRequest>();

    // An identifier for this response; the sample may derive it differently.
    string id = Guid.NewGuid().ToString();

    // Query Bing Web Search and Bing Image Search in parallel
    Task<BingWebSearchResult> bingWebSearchTask =
        ServicesUtility.BingSearch.SearchWebAsync(
            query: askQuestionRequest.Question,
            count: SettingsUtility.MaxResultsCount);
    Task<BingWebImagesResult> bingWebImagesTask =
        ServicesUtility.BingSearch.SearchImagesAsync(
            query: askQuestionRequest.Question,
            count: SettingsUtility.MaxResultsCount);
    await Task.WhenAll(bingWebSearchTask, bingWebImagesTask);
    BingWebSearchResult bingWebSearchResult = bingWebSearchTask.Result;

    // Process results
    AskQuestionResponse response = new AskQuestionResponse()
    {
        Id = id,
        Results = new AskQuestionResult[0]
    };
    if (bingWebSearchResult.WebPagesResult?.Values?.Count() > 0)
    {
        response.Results =
            ServicesUtility.GetAskQuestionResults(bingWebSearchResult);
    }

    // Wrap the data contract in an HTTP response for the function's caller
    return request.CreateResponse(HttpStatusCode.OK, response);
}
In the first part of the code, the function AskQuestion reads the content from the
request and translates the question into English using the translator. It then
extracts the key phrases using Text Analytics and sends the query to Bing Web
Search and Bing Image Search to create a card for the response (only the search
and response-building steps are shown in the excerpt above). The key phrases
go to Azure Search, to power the bot’s analytics, as well as the dashboard. In this
example, we do not translate the response back into the original language, but that
could be an option for other implementations.
Now that we have successfully added a new bot task, we can continue to develop
the Bot Brain’s intelligence, adding further abilities such as vision and speech
through the Cognitive Services APIs. Let’s discuss the Conference Buddy
dashboard.
Many web apps will need a search capability for the application content. Having
an easy-to-use search API in the cloud can be a big boon to developers. We
embed Azure Search in our Conference Buddy dashboard to search the questions
being asked. Azure Search is a simple Search-as-a-Service API that allows
developers to embed a sophisticated search experience into web and mobile
applications without having to worry about the complexities of full-text search
and without having to deploy, maintain or manage any infrastructure.
For the Conference Buddy dashboard, Azure Search does the following:
public SessionSearchService()
{
string searchServiceName =
ConfigurationManager.AppSettings["SearchServiceName"];
string index =
ConfigurationManager.AppSettings["SearchServiceIndexName"];
string apiKey =
ConfigurationManager.AppSettings["SearchServiceApiKey"];
SearchServiceClient searchServiceClient = new SearchServiceClient(
    searchServiceName, new SearchCredentials(apiKey));
this.IndexClient = searchServiceClient.Indexes.GetClient(index);
}
// Add filtering
IList<string> filters = new List<string>();
if (string.IsNullOrEmpty(sessionFacet) == false)
{
filters.Add(string.Format
("sessionId eq '{0}'", sessionFacet));
}
if (string.IsNullOrEmpty(skillFacet) == false)
{
filters.Add(string.Format("skill eq '{0}'", skillFacet));
}
if (string.IsNullOrEmpty(topicsFacet) == false)
{
    // The topics filter body is elided in the original excerpt; the expression
    // here mirrors the other facets and is an assumption.
    filters.Add(string.Format("topics eq '{0}'", topicsFacet));
}
if (string.IsNullOrEmpty(isAnsweredFacet) == false)
{
    filters.Add(string.Format
        ("isAnswered eq {0}", isAnsweredFacet));
}
sp.Filter =
    filters.Count() > 0 ?
    string.Join(" and ", filters) : string.Empty;
return await
    this.IndexClient.Documents.SearchAsync(searchText, sp);
}
catch (Exception ex)
{
Console.WriteLine
("Error querying index: {0}\r\n", ex.Message.ToString());
}
return null;
}
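The code above assumes a SearchParameters object, sp, constructed in an elided part of the method. A hedged sketch of that setup, using the same Microsoft.Azure.Search SDK as the constructor shown earlier; the facet names and page size are assumptions:
SearchParameters sp = new SearchParameters()
{
    SearchMode = SearchMode.All,
    Top = 50,    // assumed page size
    // Facets the dashboard filters on (assumed to match the filter code above)
    Facets = new List<string> { "sessionId", "skill", "topics", "isAnswered" }
};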
To enable Analytics for the bot (see Figure 1-10), do the following:
3. Type the information needed to connect the bot to Application Insights.
All fields are required.
AppInsights Application ID
To find this value, open Application Insights and then navigate to Configure →
API Access.
Channel
You can choose which channels appear in the graphs. Note that if a bot is not
enabled on a channel, there will be no data from that channel.
Grand Totals
View the numbers of active users and messages sent.
Retention
View how many users sent a message and came back, as demonstrated in
Figure 1-11.
Figure 1-11. Insights screen showing users who messaged again
Users
The Users graph tracks how many users accessed the bot using each
channel during the specified time frame, as shown in Figure 1-12.
Messages
The Message graph tracks how many messages were sent and received
using a given channel during the specified time frame (Figure 1-13).
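Beyond the built-in analytics described above, you can also emit your own telemetry from the bot code with the Application Insights SDK. A minimal sketch follows; the event name, the property values and the surrounding variables are illustrative, and it assumes the Microsoft.ApplicationInsights package is installed with an instrumentation key configured.
// Track a custom event each time a question is routed to a bot task
var telemetry = new TelemetryClient();
telemetry.TrackEvent("QuestionAsked", new Dictionary<string, string>
{
    { "sessionId", sessionId },   // illustrative values from the surrounding code
    { "intent", intent }
});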
11 There is an ongoing debate about who the originator of this quote is: Marshall McLuhan, Winston
Churchill and Robert Flaherty are among them. Check this link for the evolution of the discussion.