Course 1-AI NEP Notes (1) - 2
Unit 1:
Introduction to AI on Azure
With AI, we can build solutions that seemed like science fiction a short time ago;
enabling incredible advances in health care, financial management,
environmental protection, and other areas to make a better world for everyone.
Learning objectives
In this module, you'll learn about the kinds of solutions AI can make possible and
considerations for responsible AI practices.
Introduction to AI
AI enables us to build amazing software that can improve health care, enable
people to overcome physical disadvantages, empower smart infrastructure, create
incredible entertainment experiences, and even save the planet!
What is AI?
Simply put, AI is the creation of software that imitates human behaviors and capabilities. Key
workloads include:
Machine learning - This is often the foundation for an AI system, and is the way we
"teach" a computer model to make predictions and draw conclusions from data.
Anomaly detection - The capability to automatically detect errors or unusual activity in
a system.
Computer vision - The capability of software to interpret the world visually through
cameras, video, and images.
Natural language processing - The capability for a computer to interpret written or
spoken language, and respond in kind.
Knowledge mining - The capability to extract information from large volumes of often
unstructured data to create a searchable knowledge store.
Understand machine learning
Let's start by looking at a real-world example of how machine learning can be used to solve a
difficult problem.
Sustainable farming techniques are essential to maximize food production while protecting a
fragile environment. The Yield, an agricultural technology company based in Australia, uses
sensors, data and machine learning to help farmers make informed decisions related to weather,
soil and plant conditions.
So how do machines learn? The answer is: from data. In today's world, we create huge volumes of data as we go about our
everyday lives. From the text messages, emails, and social media posts we send to the
photographs and videos we take on our phones, we generate massive amounts of information.
More data still is created by millions of sensors in our homes, cars, cities, public transport
infrastructure, and factories.
Data scientists can use all of that data to train machine learning models that can make
predictions and inferences based on the relationships they find in the data.
Azure Machine Learning includes the following features:
Automated machine learning - This feature enables non-experts to quickly create an effective machine learning model from data.
Data and compute management - Cloud-based data storage and compute resources that professional data scientists can use to run data experiment code at scale.
Pipelines - Data scientists, software engineers, and IT operations professionals can define pipelines to orchestrate model training, deployment, and management tasks.
These kinds of scenarios can be addressed by using anomaly detection - a machine learning
based technique that analyzes data over time and identifies unusual changes.
Let's explore how anomaly detection might help in the racing car scenario.
1. Sensors in the car collect telemetry, such as engine revolutions, brake temperature, and
so on.
2. An anomaly detection model is trained to understand expected fluctuations in the
telemetry measurements over time.
3. If a measurement occurs outside of the normal expected range, the model reports an
anomaly that can be used to alert the race engineer to call the driver in for a pit stop to
fix the issue before it forces retirement from the race.
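As an illustrative sketch (not the actual anomaly detection service), steps 2 and 3 can be approximated in a few lines of Python: learn the expected range from a sliding window of recent telemetry and flag any reading that deviates sharply from it. The telemetry values below are invented for illustration.

```python
from statistics import mean, stdev

def detect_anomalies(readings, window=5, threshold=3.0):
    """Flag indices whose reading falls far outside the expected
    range learned from the previous `window` readings (a z-score test)."""
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Brake temperatures: stable around 420 C, then a sudden spike.
telemetry = [418, 421, 419, 420, 422, 420, 419, 421, 640]
print(detect_anomalies(telemetry))  # -> [8]
```

A real anomaly detection model learns more sophisticated seasonal and trend patterns, but the principle is the same: model the expected fluctuation, then alert on departures from it.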
The Seeing AI app is a great example of the power of computer vision. Designed for the blind
and low vision community, the Seeing AI app harnesses the power of AI to open up the visual
world and describe nearby people, text and objects.
https://www.microsoft.com/en-us/videoplayer/embed/RE4vC2Q?postJsllMsg=true
Common computer vision tasks include the following:
Image classification - Image classification involves training a machine learning model to classify images based on their contents.
Object detection - Object detection models are trained to classify individual objects within an image, and to identify their location with a bounding box.
Semantic segmentation - Semantic segmentation is an advanced technique in which individual pixels in the image are classified according to the object to which they belong.
Image analysis - You can create solutions that combine machine learning models with advanced image analysis techniques to extract information from images, including "tags" that could help catalog the image or even descriptive captions that summarize the scene shown in the image.
Face detection, analysis, and recognition - Face detection is a specialized form of object detection that locates human faces in an image. This can be combined with classification and facial geometry analysis techniques to recognize individuals based on their facial features.
Optical character recognition (OCR) - OCR is a technique used to detect and read text in images.
Azure provides the following computer vision services:
Computer Vision - You can use this service to analyze images and video, and extract descriptions, tags, objects, and text.
Custom Vision - Use this service to train custom image classification and object detection models using your own images.
Face - The Face service enables you to build face detection and facial recognition solutions.
Form Recognizer - Use this service to extract information from scanned forms and invoices.
Natural language processing (NLP) is the area of AI that deals with creating software that
understands written and spoken language.
Analyze and interpret text in documents, email messages, and other sources.
Interpret spoken language, and synthesize speech responses.
Automatically translate spoken or written phrases between languages.
Interpret commands and determine appropriate actions.
For example, Starship Commander is a virtual reality (VR) game from Human Interact that
takes place in a science fiction world. The game uses natural language processing to enable
players to control the narrative and interact with in-game characters and starship systems.
https://www.microsoft.com/en-us/videoplayer/embed/RE4vyDj?postJsllMsg=true
Azure provides the following services for natural language processing:
Language - Use this service to access features for understanding and analyzing text, training language models that can understand spoken or text-based commands, and building intelligent applications.
Translator - Use this service to translate text between more than 60 languages.
Speech - Use this service to recognize and synthesize speech, and to translate spoken languages.
Azure Bot Service - This service provides a platform for conversational AI, the capability of a software "agent" to participate in a conversation. Developers can use the Bot Framework to create a bot and manage it with Azure Bot Service - integrating back-end services like Language, and connecting to channels for web chat, email, Microsoft Teams, and others.
Azure Cognitive Search can utilize the built-in AI capabilities of Azure Cognitive Services,
such as image processing, content extraction, and natural language processing, to perform
knowledge mining of documents. The product's AI capabilities make it possible to index
previously unsearchable documents and to extract and surface insights from large amounts of
data quickly.
The following table shows some of the potential challenges and risks facing an AI application
developer.

Challenge or risk - Example
Bias can affect results - A loan-approval model discriminates by gender due to bias in the data with which it was trained.
Errors may cause harm - An autonomous vehicle experiences a system failure and causes a collision.
Data could be exposed - A medical diagnostic bot is trained using sensitive patient data, which is stored insecurely.
Solutions may not work for everyone - A home automation assistant provides no audio output for visually impaired users.
Who's liable for AI-driven decisions? - An innocent person is convicted of a crime based on evidence from facial recognition. Who's responsible?
Understand Responsible AI
At Microsoft, AI software development is guided by a set of six principles, designed to ensure
that AI applications provide amazing solutions to difficult problems without any unintended
negative consequences.
Fairness
AI systems should treat all people fairly. For example, suppose you create a machine learning
model to support a loan approval application for a bank. The model should predict whether the
loan should be approved or denied without bias. This bias could be based on gender, ethnicity,
or other factors that result in an unfair advantage or disadvantage to specific groups of
applicants.
Azure Machine Learning includes the capability to interpret models and quantify the extent to
which each feature of the data influences the model's prediction. This capability helps data
scientists and developers identify and mitigate bias in the model.
For more details about considerations for fairness, watch the following video.
https://www.microsoft.com/en-us/videoplayer/embed/RE4vqfa?postJsllMsg=true
Reliability and safety
AI systems should perform reliably and safely. For more information about considerations for reliability and safety, watch the following video.
https://www.microsoft.com/en-us/videoplayer/embed/RE4vvIl?postJsllMsg=true
Privacy and security
AI systems should be secure and respect privacy. For more details about considerations for privacy and security, watch the following video.
https://www.microsoft.com/en-us/videoplayer/embed/RE4voJF?postJsllMsg=true
Inclusiveness
AI systems should empower everyone and engage people. AI should bring benefits to all parts
of society, regardless of physical ability, gender, sexual orientation, ethnicity, or other factors.
For more details about considerations for inclusiveness, watch the following video.
https://www.microsoft.com/en-us/videoplayer/embed/RE4vl9v?postJsllMsg=true
Transparency
Dr. Bhagirathi Halalli, Assistant Professor, GFGC Raibag
Skill Enhancement Course (SEC): Artificial Intelligence for BA, BCOM, BSc, BBA, BSW III Sem
AI systems should be understandable. Users should be made fully aware of the purpose of the
system, how it works, and what limitations may be expected.
For more details about considerations for transparency, watch the following video.
https://www.microsoft.com/en-us/videoplayer/embed/RE4vqfb?postJsllMsg=true
Accountability
People should be accountable for AI systems. Designers and developers of AI-based solutions
should work within a framework of governance and organizational principles that ensure the
solution meets ethical and legal standards that are clearly defined.
For more details about considerations for accountability, watch the following video.
https://www.microsoft.com/en-us/videoplayer/embed/RE4vvIk?postJsllMsg=true
The principles of responsible AI can help you understand some of the challenges facing
developers as they try to create ethical AI solutions.
Knowledge check
1. You want to create a model to predict sales of ice cream based on historic data that includes
daily ice cream sales totals and weather measurements. Which Azure service should you use?
2. You are designing an AI application that uses images to detect cracks in car windshields and
warn drivers when a windshield should be repaired or replaced. What AI workload is
described?
Computer Vision
Anomaly Detection
Natural Language Processing
3. A predictive app provides audio output for visually impaired users. Which principle of
Responsible AI is reflected here?
Transparency
Inclusiveness
Fairness
Summary
Artificial Intelligence enables the creation of powerful solutions to many kinds of problems.
AI systems can exhibit human characteristics to analyze the world around them, make
predictions or inferences, and act on them in ways that we could only imagine a short time ago.
With this power, comes responsibility. As developers of AI solutions, we must apply principles
that ensure that everyone benefits from AI without disadvantaging any individual or section of
society.
Unit 2:
Use visual tools to create machine learning models with Azure
Machine Learning
Introduction
Machine Learning is the foundation for most artificial intelligence solutions. Creating an
intelligent solution often begins with the use of machine learning to train predictive models
using historic data that you have collected.
Azure Machine Learning is a cloud service that you can use to train and manage machine
learning models.
To complete this module, you'll need a Microsoft Azure subscription. If you don't already have
one, you can sign up for a free trial at https://azure.microsoft.com
Machine learning is a technique that uses mathematics and statistics to create a model that can
predict unknown values.
For example, suppose Adventure Works Cycles is a business that rents cycles in a city. The
business could use historic data to train a model that predicts daily rental demand in order to
make sure sufficient staff and cycles are available.
To do this, Adventure Works could create a machine learning model that takes information
about a specific day (the day of week, the anticipated weather conditions, and so on) as an
input, and predicts the expected number of rentals as an output.
Mathematically, you can think of machine learning as a way of defining a function (let's call it
f) that operates on one or more features of something (which we'll call x) to calculate a
predicted label (y) - like this:
f(x) = y
In this bicycle rental example, the details about a given day (day of the week, weather, and so
on) are the features (x), the number of rentals for that day is the label (y), and the function (f)
that calculates the number of rentals based on the information about the day is encapsulated in
a machine learning model.
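As a toy illustration of f(x) = y, the Python sketch below hard-codes a hypothetical rental-demand function. The coefficients are invented for illustration only; in practice, training derives f from historic examples rather than a programmer writing it by hand.

```python
# A trained model is, conceptually, a function f that maps
# features x (details about a day) to a predicted label y
# (the number of rentals). These coefficients are made up,
# not learned from any real rental dataset.

def f(day_of_week, temperature_c, is_raining):
    """Hypothetical rental-demand model: f(x) = y."""
    base = 300
    weekend_boost = 120 if day_of_week in ("Sat", "Sun") else 0
    weather = 8 * temperature_c - (150 if is_raining else 0)
    return max(0, base + weekend_boost + weather)

# Features x: a warm, dry Saturday. Label y: predicted rentals.
y = f("Sat", 22, False)
print(y)  # -> 596
```

The point of machine learning is that the body of f is not written by hand like this; it is learned from data by a training algorithm.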
The specific operation that the f function performs on x to calculate y depends on a number of
factors, including the type of model you're trying to create and the specific algorithm used to
train the model. Additionally in most cases, the data used to train the machine learning model
requires some pre-processing before model training can be performed.
The supervised machine learning approach requires you to start with a dataset with known
label values. Two types of supervised machine learning tasks include regression and
classification.
Regression: used to predict a continuous value; like a price, a sales total, or some other
measure.
Classification: used to determine a class label; an example of a binary class label is whether a
patient has diabetes or not; an example of multi-class labels is classifying text as positive,
negative, or neutral.
The unsupervised machine learning approach starts with a dataset without known label
values. One type of unsupervised machine learning task is clustering.
Clustering: used to determine labels by grouping similar information into label groups; like
grouping measurements from birds into species.
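To make the supervised/unsupervised distinction concrete, here is a minimal, stdlib-only sketch: a 1-nearest-neighbour classifier (supervised, labels known in advance) and a two-centroid clustering loop (unsupervised, groups discovered from the data). Both are simplified illustrations with invented data, not what Azure Machine Learning runs internally.

```python
def nearest_neighbour(train, query):
    """Supervised classification: predict the label of the closest
    labeled training example (1-NN on a single numeric feature)."""
    return min(train, key=lambda p: abs(p[0] - query))[1]

labeled = [(1.0, "low-risk"), (2.0, "low-risk"), (8.0, "high-risk")]
print(nearest_neighbour(labeled, 7.5))  # -> high-risk

def two_means(values, iters=10):
    """Unsupervised clustering: split values into two groups around
    iteratively refined centroids (k-means with k=2)."""
    c1, c2 = min(values), max(values)
    for _ in range(iters):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted(g1), sorted(g2)

# No labels supplied: the algorithm finds the two natural groups.
print(two_means([1.1, 0.9, 1.0, 5.2, 4.8, 5.0]))
```

Note how the supervised function needs labeled examples, while the clustering function receives only raw values and produces the groups itself.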
The following video discusses the various kinds of machine learning models you can create, and
the process generally followed to train and use them.
https://www.microsoft.com/en-us/videoplayer/embed/RE4xAok?postJsllMsg=true
Most importantly, Azure Machine Learning helps data scientists increase their efficiency by
automating many of the time-consuming tasks associated with training models; and it enables
them to use cloud-based compute resources that scale effectively to handle large volumes of
data while incurring costs only when actually used.
After you have created an Azure Machine Learning workspace, you can develop solutions with
the Azure machine learning service either with developer tools or the Azure Machine Learning
studio web portal.
Compute targets are cloud-based resources on which you can run model training and data
exploration processes.
In Azure Machine Learning studio, you can manage the compute targets for your data science
activities. There are four kinds of compute resource you can create:
Compute Instances: Development workstations that data scientists can use to work with data
and models.
Compute Clusters: Scalable clusters of virtual machines for on-demand processing of
experiment code.
Inference Clusters: Deployment targets for predictive services that use your trained models.
Attached Compute: Links to existing Azure compute resources, such as Virtual Machines or
Azure Databricks clusters.
Azure Machine Learning includes an automated machine learning capability that automatically
tries multiple pre-processing techniques and model-training algorithms in parallel. These
automated capabilities use the power of cloud compute to find the best performing supervised
machine learning model for your data.
Automated machine learning allows you to train models without extensive data science or
programming knowledge. For people with a data science and programming background, it
provides a way to save time and resources by automating algorithm selection and
hyperparameter tuning.
You can create an automated machine learning job in Azure Machine Learning studio.
In Azure Machine Learning, operations that you run are called jobs. You can configure multiple
settings for your job before starting an automated machine learning run. The run configuration
provides the information needed to specify your training script, compute target, and Azure ML
environment, and to run a training job.
1. Prepare data: Identify the features and label in a dataset. Pre-process, or clean and transform,
the data as needed.
2. Train model: Split the data into two groups, a training and a validation set. Train a machine
learning model using the training data set. Test the machine learning model for performance
using the validation data set.
3. Evaluate performance: Compare how close the model's predictions are to the known labels.
4. Deploy a predictive service: After you train a machine learning model, you can deploy the
model as an application on a server or device so that others can use it.
These are the same steps in the automated machine learning process with Azure Machine
Learning.
Prepare data
Machine learning models must be trained with existing data. Data scientists expend a lot of
effort exploring and pre-processing data, and trying various types of model-training algorithms
to produce accurate models, which is time consuming, and often makes inefficient use of
expensive compute hardware.
In Azure Machine Learning, data for model training and other operations is usually
encapsulated in an object called a dataset. You can create your own dataset in Azure Machine
Learning studio.
Train model
The automated machine learning capability in Azure Machine Learning supports supervised
machine learning models - in other words, models for which the training data includes known
label values. You can use automated machine learning to train models for several types of
tasks, including classification, regression, and time series forecasting.
In Automated Machine Learning, you can select configurations for the primary metric, type of
model used for training, exit criteria, and concurrency limits.
Importantly, AutoML will split data into a training set and a validation set. You can configure
the details in the settings before you run the job.
Evaluate performance
After the job has finished you can review the best performing model. In this case, you used exit
criteria to stop the job. Thus the "best" model the job generated might not be the best possible
model, just the best one found within the time allowed for this exercise.
The best model is identified based on the evaluation metric you specified, Normalized root
mean squared error.
A technique called cross-validation is used to calculate the evaluation metric. After the model
is trained using a portion of the data, the remaining portion is used to iteratively test, or cross-
validate, the trained model. The metric is calculated by comparing the predicted value from the
test with the actual known value, or label.
The difference between the predicted and actual value, known as the residuals, indicates the
amount of error in the model. The performance metric root mean squared error (RMSE) is
calculated by squaring the errors across all of the test cases, finding the mean of these squares,
and then taking the square root. The smaller this value is, the more accurate the model's
predictions are. The normalized root mean squared error (NRMSE) standardizes the RMSE
metric so it can be used to compare models whose variables are on different scales.
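The RMSE and NRMSE calculations described above are straightforward to express in code. Note that NRMSE can be normalized in more than one way; this sketch divides by the range of the actual values, which is one common convention.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error: square the residuals, take the
    mean of the squares, then take the square root."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return sqrt(sum(r * r for r in residuals) / len(residuals))

def nrmse(actual, predicted):
    """Normalize RMSE by the range of the actual values so models
    trained on differently scaled data can be compared."""
    return rmse(actual, predicted) / (max(actual) - min(actual))

# Toy example: known labels vs. a model's predictions.
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 195.0, 260.0]
print(round(rmse(actual, predicted), 2))   # -> 9.01
print(round(nrmse(actual, predicted), 4))  # -> 0.0601
```

Because NRMSE is unitless, a model predicting prices in the thousands and a model predicting temperatures in the tens can be compared on the same scale.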
The Residual Histogram shows the frequency of residual value ranges. Residuals represent
variance between predicted and true values that can't be explained by the model, in other words,
errors. You should hope to see the most frequently occurring residual values clustered around
zero. You want small errors with fewer errors at the extreme ends of the scale.
The Predicted vs. True chart should show a diagonal trend in which the predicted value
correlates closely to the true value. The dotted line shows how a perfect model should perform.
The closer the line of your model's average predicted value is to the dotted line, the better its
performance. A histogram below the line chart shows the distribution of true values.
After you've used automated machine learning to train some models, you can deploy the best
performing model as a service for client applications to use.
Knowledge check
1. An automobile dealership wants to use historic car sales data to train a machine learning
model. The model should predict the price of a pre-owned car based on its make, model, engine
size, and mileage. What kind of model should the dealership create with automated machine
learning?
Classification
Regression
2. A bank wants to use historic loan repayment records to categorize loan applications as low-
risk or high-risk based on characteristics like the loan amount, the income of the borrower, and
the loan period. What kind of model should the bank create with automated machine learning?
Classification
Regression
Time series forecasting
3. You want to use automated machine learning to train a regression model with the best
possible R2 score. How should you configure the automated machine learning experiment?
Set the Primary metric to R2 score
Correct. The primary metric determines the metric used to evaluate the best performing model.
Summary
In this module, you learned how to use automated machine learning in Azure Machine Learning to train, evaluate, and deploy a predictive model.
Unit 3:
Explore computer vision in Microsoft Azure
Introduction
Computer vision is one of the core areas of artificial intelligence (AI), and focuses on creating
solutions that enable AI applications to "see" the world and make sense of it.
Of course, computers don't have biological eyes that work the way ours do, but they are capable
of processing images; either from a live camera feed or from digital photographs or videos.
This ability to process images is the key to creating software that can emulate human visual
perception.
Content Organization: Identify people or objects in photos and organize them based on that
identification. Photo recognition applications like this are commonly used in photo storage and
social media applications.
Text Extraction: Analyze images and PDF documents that contain text and extract the text
into a structured format.
Spatial Analysis: Identify people or objects, such as cars, in a space and map their movement
within that space.
To an AI application, an image is just an array of pixel values. These numeric values can be
used as features to train machine learning models that make predictions about the image and
its contents.
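The following snippet illustrates the point: a tiny 3x3 "image" is just a grid of numbers, and flattening that grid produces a feature vector a model could learn from.

```python
# A 3x3 grayscale "image": each number is a pixel intensity (0-255).
image = [
    [0, 0, 255],
    [0, 255, 0],
    [255, 0, 0],
]

# Flatten the grid into a single feature vector for a model.
features = [pixel for row in image for pixel in row]
print(len(features), features[:3])  # -> 9 [0, 0, 255]
```

Real images have millions of pixels and three color channels, but the principle is the same: to software, an image is numeric input like any other.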
Training machine learning models from scratch can be very time intensive and require a large
amount of data. Microsoft's Computer Vision service gives you access to pre-trained computer
vision capabilities.
Learning objectives
In this module you will:
Identify image analysis tasks that can be performed with the Computer Vision service.
Provision a Computer Vision resource.
Use a Computer Vision resource to analyze an image.
Computer Vision: A specific resource for the Computer Vision service. Use this resource type
if you don't intend to use any other cognitive services, or if you want to track utilization and
costs for your Computer Vision resource separately.
Cognitive Services: A general cognitive services resource that includes Computer Vision along
with many other cognitive services; such as Text Analytics, Translator Text, and others. Use
this resource type if you plan to use multiple cognitive services and want to simplify
administration and development.
Whichever type of resource you choose to create, it will provide two pieces of information that
you will need to use it:
Note
If you create a Cognitive Services resource, client applications use the same key and endpoint
regardless of the specific service they are using.
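As a hypothetical sketch of how a client application might use those two pieces of information, the code below constructs a request to the Computer Vision analyze REST endpoint using only the Python standard library. The endpoint URL, key, and API version shown are placeholders and assumptions; consult the service documentation for the exact request format for your resource.

```python
import json
import urllib.request

ENDPOINT = "https://my-resource.cognitiveservices.azure.com"  # placeholder
KEY = "<your-key>"  # placeholder

def build_request(image_url, features=("Description", "Tags")):
    """Build an (unsent) HTTP request for image analysis: the key
    goes in a header, the endpoint forms the base of the URL."""
    url = f"{ENDPOINT}/vision/v3.2/analyze?visualFeatures={','.join(features)}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": KEY,
            "Content-Type": "application/json",
        },
    )

req = build_request("https://example.com/building.jpg")
print(req.full_url)
# response = urllib.request.urlopen(req)  # requires a real key and endpoint
```

In practice most applications would use one of the Azure SDKs rather than raw HTTP, but either way, the key and endpoint are the only credentials the client needs.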
Describing an image
Computer Vision has the ability to analyze an image, evaluate the objects that are detected, and
generate a human-readable phrase or sentence that can describe what was detected in the image.
Depending on the image contents, the service may return multiple results, or phrases. Each
returned phrase will have an associated confidence score, indicating how confident the
algorithm is in the supplied description. The highest confidence phrases will be listed first.
To help you understand this concept, consider the following image of the Empire State building
in New York. The returned phrases are listed below the image in the order of confidence.
The image descriptions generated by Computer Vision are based on a set of thousands of
recognizable objects, which can be used to suggest tags for the image. These tags can be
associated with the image as metadata that summarizes attributes of the image; and can be
particularly useful if you want to index an image along with a set of key terms that might be
used to search for images with specific attributes or contents.
For example, the tags returned for the Empire State building image include:
skyscraper
tower
building
Detecting objects
The object detection capability is similar to tagging, in that the service can identify common
objects; but rather than tagging, or providing tags for the recognized objects only, this service
can also return what is known as bounding box coordinates. Not only will you get the type of
object, but you will also receive a set of coordinates that indicate the top, left, width, and height
of the object detected, which you can use to identify the location of the object in the image,
like this:
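A simplified, assumed example of what such a response might look like, with a small helper that converts the top/left/width/height rectangle into corner coordinates (the JSON shape below is illustrative, not the service's exact schema):

```python
# Illustrative object-detection result: each detected object has a
# type, a confidence score, and a bounding box rectangle.
sample_response = {
    "objects": [
        {"object": "person",
         "confidence": 0.91,
         "rectangle": {"x": 25, "y": 43, "w": 120, "h": 212}},
    ]
}

def corners(rect):
    """Return (left, top, right, bottom) pixel coordinates
    for a top/left/width/height bounding box."""
    return (rect["x"], rect["y"],
            rect["x"] + rect["w"], rect["y"] + rect["h"])

for obj in sample_response["objects"]:
    print(obj["object"], corners(obj["rectangle"]))
```

Corner coordinates in this form can be passed directly to most image libraries to draw a rectangle over the detected object.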
Detecting brands
This feature provides the ability to identify commercial brands. The service has an existing
database of thousands of globally recognized logos from commercial brands of products.
When you call the service and pass it an image, it performs a detection task and determines
whether any of the identified objects in the image are recognized brands. The service compares the
brands against its database of popular brands spanning clothing, consumer electronics, and
many more categories. If a known brand is detected, the service returns a response that contains
the brand name, a confidence score (from 0 to 1 indicating how positive the identification is),
and a bounding box (coordinates) for where in the image the detected brand was found.
For example, in the following image, a laptop has a Microsoft logo on its lid, which is identified
and located by the Computer Vision service.
Detecting faces
The Computer Vision service can detect and analyze human faces in an image, including the
ability to determine age and a bounding box rectangle for the location of the face(s). The facial
analysis capabilities of the Computer Vision service are a subset of those provided by the
dedicated Face Service. If you need basic face detection and analysis, combined with general
image analysis capabilities, you can use the Computer Vision service; but for more
comprehensive facial analysis and facial recognition functionality, use the Face service.
The following example shows an image of a person with their face detected and approximate
age estimated.
Categorizing an image
Computer Vision can categorize images based on their contents. The service uses a parent/child
hierarchy with a "current" limited set of categories. When analyzing an image, detected objects
are compared to the existing categories to determine the best way to provide the categorization.
As an example, one of the parent categories is people_. This image of a person on a roof is
assigned a category of people_.
A slightly different categorization is returned for the following image, which is assigned to the
category people_group because there are multiple people in the image:
When categorizing an image, the Computer Vision service supports two specialized domain
models:
Celebrities - The service includes a model that has been trained to identify thousands of well-
known celebrities from the worlds of sports, entertainment, and business.
Landmarks - The service can identify famous landmarks, such as the Taj Mahal and the Statue
of Liberty.
For example, when analyzing the following image for landmarks, the Computer Vision service
identifies the Eiffel Tower, with a confidence of 99.41%.
The Computer Vision service can use optical character recognition (OCR) capabilities to detect
printed and handwritten text in images. This capability is explored in the Read text with the
Computer Vision service module on Microsoft Learn.
Additional capabilities
Detect image types - for example, identifying clip art images or line drawings.
Detect image color schemes - specifically, identifying the dominant foreground, background,
and overall colors in an image.
Generate thumbnails - creating small versions of images.
Moderate content - detecting images that contain adult content or depict violent, gory scenes.
Knowledge check
1. You want to use the Computer Vision service to analyze images. You also want to use the Language
service to analyze text. You want developers to require only one key and endpoint to access all of your
services. What kind of resource should you create in your Azure subscription?
Computer Vision
Cognitive Services
Correct. A Cognitive Services resource supports both Computer Vision and Language.
Custom Vision
2. You want to use the Computer Vision service to identify the location of individual items in an image.
Which of the following features should you retrieve?
Objects
Correct. Computer Vision returns objects with a bounding box to indicate their location in the
image.
Tags
Categories
3. You want to use the Computer Vision service to analyze images of locations and identify well-known
buildings. What should you do?
Retrieve the categories for the image, specifying the celebrities domain
Retrieve the categories for the image, specifying the landmarks domain
Correct. The landmarks domain includes many well-known buildings around the world.
Summary
The Computer Vision service provides many capabilities that you can use to analyze images,
including generating a descriptive caption, extracting relevant tags, identifying objects,
determining image type and metadata, detecting human faces, known brands, and celebrities,
and others.
You can find out more about using the Computer Vision service in the service documentation.
Clean-up
It's a good idea at the end of a project to identify whether you still need the resources you
created. Resources left running can cost you money.
If you are continuing on to other modules in this learning path you can keep your resources for
use in other labs.
If you have finished learning, you can delete the resource group or individual resources from
your Azure subscription:
1. In the Azure portal, in the Resource groups page, open the resource group you
specified when creating your resource.
2. Click Delete resource group, type the resource group name to confirm you want to
delete it, and select Delete. You can also choose to delete individual resources by
selecting the resource(s), clicking on the three dots to see more options, and clicking
Delete.
Unit 4:
Explore natural language processing
Introduction
Analyzing text is a process where you evaluate different aspects of a document or phrase, in
order to gain insights into the content of that text. For the most part, humans are able to read
some text and understand the meaning behind it. Even without considering grammar rules for
the language the text is written in, specific insights can be identified in the text.
As an example, you might read some text and identify some key phrases that indicate the main
talking points of the text. You might also recognize names of people or well-known landmarks
such as the Eiffel Tower. Although difficult at times, you might also be able to get a sense for
how the person was feeling when they wrote the text, also commonly known as sentiment.
Some commonly used techniques for analyzing text include:
Statistical analysis of terms used in the text. For example, removing common "stop words"
(words like "the" or "a", which reveal little semantic information about the text) and performing
frequency analysis of the remaining words (counting how often each word appears) can provide
clues about the main subject of the text.
Extending frequency analysis to multi-term phrases, commonly known as N-grams (a two-word
phrase is a bi-gram, a three-word phrase is a tri-gram, and so on).
Applying stemming or lemmatization algorithms to normalize words before counting them - for
example, so that words like "power", "powered", and "powerful" are interpreted as being the
same word.
Applying linguistic structure rules to analyze sentences - for example, breaking down sentences
into tree-like structures such as a noun phrase, which itself contains nouns, verbs, adjectives,
and so on.
Encoding words or terms as numeric features that can be used to train a machine learning model.
For example, to classify a text document based on the terms it contains. This technique is often
used to perform sentiment analysis, in which a document is classified as positive or negative.
Creating vectorized models that capture semantic relationships between words by assigning
them to locations in n-dimensional space. This modeling technique might, for example, assign
values to the words "flower" and "plant" that locate them close to one another, while
"skateboard" might be given a value that positions it much further away.
While these techniques can be used to great effect, programming them can be complex. In
Microsoft Azure, the Language cognitive service can help simplify application development
by using pre-trained models that can:
Extract key phrases from text that might indicate its main talking points.
Identify and categorize entities in the text. Entities can be people, places, organizations, or even
everyday items such as dates, times, quantities, and so on.
In this module, you'll explore some of these capabilities and gain an understanding of how you
might apply them to applications such as:
A social media feed analyzer to detect sentiment around a political campaign or a product in
market.
A document search application that extracts key phrases to help summarize the main subject
matter of documents in a catalog.
A tool to extract brand information or company names from documents or other text for
identification purposes.
These examples are just a small sample of the many areas that the Language service can help
with text analytics.
To use the Language service in an application, you must first create an appropriate resource in
your Azure subscription. You can choose to provision either of the following types of resource:
A Language resource - choose this resource type if you only plan to use natural language
processing services, or if you want to manage access and billing for the resource separately
from other services.
A Cognitive Services resource - choose this resource type if you plan to use the Language
service in combination with other cognitive services, and you want to manage access and billing
for these services together.
Language detection
Use the language detection capability of the Language service to identify the language in which
text is written. You can submit multiple documents at a time for analysis. For each document
submitted to it, the service will detect:
The language name (for example, "English").
The ISO 639-1 language code (for example, "en").
A score indicating a level of confidence in the language detection.
For example, consider a scenario where you own and operate a restaurant where customers can
complete surveys and provide feedback on the food, the service, staff, and so on. Suppose you
have received the following reviews from customers:
Review 1: "A fantastic place for lunch. The soup was delicious."
Review 3: "The croque monsieur avec frites was terrific. Bon appetit!"
You can use the text analytics capabilities in the Language service to detect the language for
each of these reviews. The results might indicate that both reviews are written in English.
Notice that the language detected for review 3 is English, despite the text containing a mix of
English and French. The language detection service focuses on the predominant language in
the text, using characteristics such as the length of phrases and the total amount of text in each
language compared to the others. The predominant language is the value returned, along with
its language code. The confidence score may be less than 1 as a result of the mixed-language
text.
There may be text that is ambiguous in nature, or that has mixed language content. These
situations can present a challenge to the service. An ambiguous content example would be a
case where the document contains limited text, or only punctuation. For example, using the
service to analyze the text ":-)", results in a value of unknown for the language name and the
language identifier, and a score of NaN (which is used to indicate not a number).
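As an illustration, a request to the service packages each document with an id and its text. The sketch below builds that JSON body in Python; the function name is illustrative, the endpoint and key in the comments are placeholders, and the v3.1 Text Analytics path shown is one documented route to the language detection capability:

```python
import json

def build_language_request(documents):
    """Package documents into the JSON shape the REST API expects."""
    return {
        "documents": [
            {"id": str(i + 1), "text": text} for i, text in enumerate(documents)
        ]
    }

payload = build_language_request([
    "A fantastic place for lunch. The soup was delicious.",
    "The croque monsieur avec frites was terrific. Bon appetit!",
])
print(json.dumps(payload, indent=2))

# Calling the service would then look something like this (requires the
# `requests` package and a real Language resource):
# import requests
# endpoint = "https://<your-resource>.cognitiveservices.azure.com"
# response = requests.post(
#     f"{endpoint}/text/analytics/v3.1/languages",
#     headers={"Ocp-Apim-Subscription-Key": "<your-key>"},
#     json=payload,
# )
```

The response pairs each document id with the detected language name, code, and confidence score.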
Sentiment analysis
The text analytics capabilities in the Language service can evaluate text and return sentiment
scores and labels for each sentence. This capability is useful for detecting positive and negative
sentiment in social media, customer reviews, discussion forums and more.
Using the pre-built machine learning classification model, the service evaluates the text and
returns a sentiment score in the range of 0 to 1, with values closer to 1 being a positive
sentiment. Scores that are close to the middle of the range (0.5) are considered neutral or
indeterminate.
For example, the following two restaurant reviews could be analyzed for sentiment:
"We had dinner at this restaurant last night and the first thing I noticed was how courteous the
staff was. We were greeted in a friendly manner and taken to our table right away. The table
was clean, the chairs were comfortable, and the food was amazing."
and
"Our dining experience at this restaurant was one of the worst I've ever had. The service was
slow, and the food was awful. I'll never eat at this establishment again."
The sentiment score for the first review might be around 0.9, indicating a positive sentiment;
while the score for the second review might be closer to 0.1, indicating a negative sentiment.
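A small sketch of how a client might interpret the returned score. The 0.1-wide neutral band around 0.5 is an illustrative assumption, not a threshold defined by the service:

```python
def interpret_sentiment(score, neutral_band=0.1):
    """Map a 0-to-1 sentiment score to a label.
    The neutral band around 0.5 is an illustrative choice."""
    if abs(score - 0.5) <= neutral_band:
        return "indeterminate"
    return "positive" if score > 0.5 else "negative"

print(interpret_sentiment(0.9))  # positive
print(interpret_sentiment(0.1))  # negative
print(interpret_sentiment(0.5))  # indeterminate
```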
Indeterminate sentiment
A score of 0.5 might indicate that the sentiment of the text is indeterminate. This could result
from text that does not have sufficient context to discern a sentiment, or from insufficient
phrasing. For example, a list of words in a sentence that has no structure could result in an
indeterminate score. Another case where a score may be 0.5 is when the wrong language code
was used. A language code (such as "en" for English, or "fr" for French) is used to inform the
service which language the text is in. If you pass text in French but tell the service the language
code is en for English, the service will return a score of precisely 0.5.
"We had dinner here for a birthday celebration and had a fantastic experience. We were
greeted by a friendly hostess and taken to our table right away. The ambiance was relaxed, the
food was amazing, and service was terrific. If you like great food and attentive service, you
should try this place."
Key phrase extraction can provide some context to this review by extracting the following
phrases:
attentive service
great food
birthday celebration
fantastic experience
table
friendly hostess
dinner
ambiance
place
Not only can you use sentiment analysis to determine that this review is positive, you can use
the key phrases to identify important elements of the review.
Entity recognition
You can provide the Language service with unstructured text and it will return a list of entities
in the text that it recognizes. The service can also provide links to more information about that
entity on the web. An entity is essentially an item of a particular type or category, and in some
cases subtype, such as those shown in the following table.
Type Example
Organization "Microsoft"
URL "https://www.bing.com"
Email "[email protected]"
IP Address "10.0.1.125"
The service also supports entity linking to help disambiguate entities by linking to a specific
reference. For recognized entities, the service returns a URL for a relevant Wikipedia article.
For example, you could use the Language service to detect entities in a restaurant review
extract.
In this exercise, you'll test the capabilities of the Language service. You'll use a simple command-line
application that runs in the Cloud Shell. The same principles and functionality apply in real-world
solutions, such as web sites or phone apps.
https://microsoftlearning.github.io/AI-900-AIFundamentals/instructions/04-module-04.html
1. You want to use the Language service to determine the key talking points in a text document.
Which feature of the service should you use?
Sentiment analysis
Key phrase extraction
Correct. Key phrases can be used to identify the main talking points in a text document.
Entity detection
2. You use the Language service to perform sentiment analysis on a document, and a score of
0.99 is returned. What does this score indicate about the document sentiment?
The document is positive.
Correct. Score values closer to 1 indicate a more positive sentiment, while scores closer to 0
indicate negative sentiment.
Correct. The service will return NaN when it cannot determine the language in the provided
text.
Summary
The Language service provides advanced natural language processing over raw text, and
includes four main functions: sentiment analysis, key phrase extraction, language detection,
and named entity recognition.
Clean-up
It's a good idea at the end of a project to identify whether you still need the resources you
created. Resources left running can cost you money.
If you are continuing on to other modules in this learning path you can keep your resources for
use in other labs.
If you have finished learning, you can delete the resource group or individual resources from
your Azure subscription:
1. In the Azure portal, in the Resource groups page, open the resource group you
specified when creating your resource.
2. Click Delete resource group, type the resource group name to confirm you want to
delete it, and select Delete. You can also choose to delete individual resources by
selecting the resource(s), clicking on the three dots to see more options, and clicking
Delete.
Unit v:
Explore anomaly detection
Introduction
Anomaly detection is an artificial intelligence technique used to determine whether values in a
series are within expected parameters.
There are many scenarios where anomaly detection is helpful. For example, a smart HVAC
system might use anomaly detection to monitor temperatures in a building and raise an alert if
the temperature goes above or below the expected value for a given period of time.
The Azure Anomaly Detector service is a cloud-based service that helps you monitor and detect
abnormalities in your historical time series and real-time data.
Learning objectives
After completing this module, you'll be able to:
In the graphic depicting the time series data, there is a light shaded area that indicates the
boundary, or sensitivity range. The solid blue line is used to indicate the measured values.
When a measured value is outside of the shaded boundary, an orange dot is used to indicate the
value is considered an anomaly. The sensitivity boundary is a parameter that you can specify
when calling the service. It allows you to adjust the boundary settings to tweak the results.
Anomaly detection is considered the act of identifying events, or observations, that differ in a
significant way from the rest of the data being evaluated. Accurate anomaly detection leads to
prompt troubleshooting, which helps to avoid revenue loss and maintain brand reputation.
Data format
The Anomaly Detector service accepts data in JSON format. You can use any numerical data
that you have recorded over time. The key aspects of the data being sent include the
granularity, a timestamp, and a value that was recorded for that timestamp. An example of a
JSON object that you might send to the API is shown in this code sample. The granularity is
set as hourly and is used to represent temperatures in degrees Celsius that were recorded at the
timestamps indicated.
{
"granularity": "hourly",
"series": [
{
"timestamp": "2021-03-02T01:00:00Z",
"value": -10.56
},
{
"timestamp": "2021-03-02T02:00:00Z",
"value": -8.30
},
{
"timestamp": "2021-03-02T03:00:00Z",
"value": -10.30
},
{
"timestamp": "2021-03-02T04:00:00Z",
"value": 5.95
}
]
}
The service supports a maximum of 8,640 data points. However, sending this many data
points in the same JSON object can result in latency for the response. You can improve the
response time by breaking your data points into smaller chunks (windows) and sending these
in a sequence.
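A minimal sketch of that windowing idea, splitting a long series into smaller chunks to send in sequence. The function name and the window size of 10 are purely illustrative (the documented maximum per request is 8,640 points):

```python
def window_series(series, window_size=2000):
    """Split a long list of data points into smaller chunks to send in sequence."""
    return [series[i:i + window_size] for i in range(0, len(series), window_size)]

# 24 hourly readings, chunked into windows of at most 10 points each.
points = [{"timestamp": f"2021-03-02T{h:02d}:00:00Z", "value": float(h)} for h in range(24)]
batches = window_series(points, window_size=10)
print(len(batches))  # 3 windows: 10 + 10 + 4 points
```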
The same JSON object format is used in a streaming scenario. The main difference is that you
will send a single value in each request. The streaming detection method will compare the
current value being sent and the previous value sent.
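As a toy illustration of the streaming pattern — not the service's actual model-based algorithm — a client-side check comparing each new value to the previous one might look like this; the function name and threshold are arbitrary assumptions:

```python
def is_anomaly_streaming(previous, current, max_change=5.0):
    """Toy streaming check: flag the latest point if it jumps too far from
    the previous one. The real service fits a model over previously seen
    points; the threshold here is illustrative."""
    return abs(current - previous) > max_change

readings = [-10.56, -8.30, -10.30, 5.95]
for prev, curr in zip(readings, readings[1:]):
    if is_anomaly_streaming(prev, curr):
        print(f"Anomaly: jump from {prev} to {curr}")
```

With these readings, only the final jump (from -10.30 to 5.95) exceeds the threshold.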
Suppose sampling occurs every few minutes and less than 10% of the expected number of data
points are missing. In this case, the impact on the detection results should be negligible.
If you have more than 10% missing, there are options to help "fill" the data set. Consider using
a linear interpolation method to fill in the missing values and complete the data set. This will
fill gaps with evenly distributed values.
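A minimal sketch of linear interpolation over gaps in a series. The function name is illustrative, and the sketch assumes the series begins and ends with real values:

```python
def fill_missing(values):
    """Fill None gaps with linearly interpolated values, so each gap is
    bridged with evenly distributed values between its neighbors."""
    filled = list(values)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            j = i
            while j < len(filled) and filled[j] is None:
                j += 1
            left, right = filled[i - 1], filled[j]  # assumes real values at both ends
            step = (right - left) / (j - i + 1)
            for k in range(i, j):
                filled[k] = left + step * (k - i + 1)
            i = j
        i += 1
    return filled

print(fill_missing([10.0, None, None, 16.0]))  # [10.0, 12.0, 14.0, 16.0]
```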
The Anomaly Detector service will provide the best results if your time series data is evenly
distributed. If the data is more randomly distributed, you can use an aggregation method to
create a more even distribution data set.
Batch detection
Batch detection involves applying the algorithm to an entire data series at one time. The concept
of time series data involves evaluation of a data set as a batch. Use your time series to detect
any anomalies that might exist throughout your data. This operation generates a model using
your entire time series data, with each point analyzed using the same model.
When using the batch detection mode, Anomaly Detector creates a single statistical model
based on the entire data set that you pass to the service. From this model, each data point in the
data set is evaluated and anomalies are identified.
Consider a pharmaceutical company that stores medications in storage facilities where the
temperature in the facilities needs to remain within a specific range. To evaluate whether the
medication remained stored in a safe temperature range in the past three months we need to
know:
If you are interested in evaluating compliance over historical readings, you can extract the
required time series data, package it into a JSON object, and send it to the Anomaly Detector
service for evaluation. You will then have a historical view of the temperature readings over
time.
Real-time detection
Real-time detection uses streaming data by comparing previously seen data points to the last
data point to determine if your latest one is an anomaly. This operation generates a model using
the data points you send, and determines if the target (current) point is an anomaly. By calling
the service with each new data point you generate, you can monitor your data as it's created.
Consider a scenario in the carbonated beverage industry where real-time anomaly detection
may be useful. The carbon dioxide added to soft drinks during the bottling or canning process
needs to stay in a specific temperature range.
Bottling systems use a device known as a carbo-cooler to achieve the refrigeration of the
product for this process. If the temperature goes too low, the product will freeze in the carbo-
cooler. If the temperature is too warm, the carbon dioxide will not adhere properly. Either
situation results in a product batch that cannot be sold to customers.
This carbonated beverage scenario is an example of where you could use streaming detection
for real-time decision making. It could be tied into an application that controls the bottling line
equipment. You may use it to feed displays that depict the system temperatures for the quality
control station. A service technician may also use it to identify equipment failure potential and
servicing needs.
You can use the Anomaly Detector service to create a monitoring application configured with
the above criteria to perform real-time temperature monitoring. You can perform anomaly
detection using both streaming and batch detection techniques. Streaming detection is most
useful for monitoring critical storage requirements that must be acted on immediately. Sensors
will monitor the temperature inside the compartment and send these readings to your
application or an event hub on Azure. Anomaly Detector will evaluate the streaming data points
and determine if a point is an anomaly.
Correct. A seasonal time series is a pattern in your data that occurs at regular
intervals. Examples would be hourly, daily, or monthly patterns.
It tells the service how to chunk up the results that are returned for review, independent of
the time series data pattern.
It is used to indicate the range of acceptable values.
3. How does the Anomaly Detector service evaluate real-time data for anomalies?
It collects all the values in a window of time and evaluates them all at once.
It evaluates the current value against the previous value.
Correct. It evaluates previously seen data points to determine if your latest one is an anomaly.
It uses interpolation based on the current value and the previous value to predict what the
expected value should be.
Summary
The Anomaly Detector service automatically detects anomalies in time series data. It supports
two basic detection modes: one detects anomalies across a batch of data, using a model trained
on the entire time series sent to the service; the other determines whether the latest point is an
anomaly, using a model trained on the points that preceded it.
By packaging your time series data into a JSON object and passing it to the API, you can detect
anomalies in the time series data. Using the returned results can help you identify issues with
industrial processes or recorded events. Batch series data is best used to evaluate recorded
events that represent seasonal patterns and don't require immediate action. Streaming data
points into the API can offer real-time awareness of anomalies that may require immediate
action.
The API can be integrated into your applications by using REST calls or by incorporating the
appropriate SDK into your code. Using the Anomaly Detector service does not require you to
devise, or to be knowledgeable in, machine learning algorithms.
Unit vi:
Tune Model Hyper parameters - Azure Machine Learning
(Reading)
Tune Model Hyperparameters
Learning Objectives
This article describes how to use the Tune Model Hyperparameters component in Azure
Machine Learning designer. The goal is to determine the optimum hyperparameters for a
machine learning model. The component builds and tests multiple models by using different
combinations of settings. It compares metrics over all models to find the combination of
settings that produces the best results.
The terms parameter and hyperparameter can be confusing. The model's parameters are what
you set in the right pane of the component. Basically, this component performs a parameter
sweep over the specified parameter settings. It learns an optimal set of hyperparameters, which
might be different for each specific decision tree, dataset, or regression method. The process
of finding the optimal configuration is sometimes called tuning.
The component supports the following method for finding the optimum settings for a model:
integrated train and tune. In this method, you configure a set of parameters to use. You then
let the component iterate over multiple combinations. The component measures accuracy until
it finds a "best" model. With most learner components, you can choose which parameters
should be changed during the training process, and which should remain fixed.
Depending on how long you want the tuning process to run, you might decide to exhaustively
test all combinations. Or you might shorten the process by establishing a grid of parameter
combinations and testing a randomized subset of the parameter grid.
This method generates a trained model that you can save for reuse.
Tip
You can do a related task. Before you start tuning, apply feature selection to determine the
columns or variables that have the highest information value.
This section describes how to perform a basic parameter sweep, which trains a model by using
the Tune Model Hyperparameters component.
1. Add the Tune Model Hyperparameters component to your pipeline in the designer.
2. Connect an untrained model to the leftmost input.
Note
3. Add the dataset that you want to use for training, and connect it to the middle input of
Tune Model Hyperparameters.
Optionally, if you have a tagged dataset, you can connect it to the rightmost input port
(Optional validation dataset). This lets you measure accuracy while training and
tuning.
4. In the right panel of Tune Model Hyperparameters, choose a value for Parameter
sweeping mode. This option controls how the parameters are selected.
o Entire grid: When you select this option, the component loops over a grid
predefined by the system, to try different combinations and identify the best
learner. This option is useful when you don't know what the best parameter
settings might be and want to try all possible combinations of values.
o Random sweep: When you select this option, the component will randomly
select parameter values over a system-defined range. You must specify the
maximum number of runs that you want the component to execute. This option
is useful when you want to increase model performance by using the metrics of
your choice but still conserve computing resources.
5. For Label column, open the column selector to choose a single label column.
6. Choose the number of runs:
o Maximum number of runs on random sweep: If you choose a random sweep, you
can specify how many times the model should be trained, by using a random
combination of parameter values.
7. For Ranking, choose a single metric to use for ranking the models.
When you run a parameter sweep, the component calculates all applicable metrics for
the model type and returns them in the Sweep results report. The component uses
separate metrics for regression and classification models.
However, the metric that you choose determines how the models are ranked. Only the
top model, as ranked by the chosen metric, is output as a trained model to use for
scoring.
8. For Random seed, enter an integer number as a pseudo random number generator state
used for randomly selecting parameter values over a pre-defined range. This parameter
is only effective if Parameter sweeping mode is Random sweep.
To view the sweep results, you can either right-click the component and select Visualize, or
right-click the left output port of the component and select Visualize.
The Sweep results include all parameter sweep and accuracy metrics that apply to the
model type. The metric that you selected for ranking determines which model is
considered "best."
To save a snapshot of the trained model, select the Outputs+logs tab in the right panel
of the Train model component. Select the Register dataset icon to save the model as
a reusable component.
Technical notes
This section contains implementation details and tips.
When you set up a parameter sweep, you define the scope of your search. The search might
use a finite number of parameters selected randomly. Or it might be an exhaustive search over
a parameter space that you define.
Random sweep: This option trains a model by using a set number of iterations.
You specify a range of values to iterate over, and the component uses a randomly
chosen subset of those values. Values are chosen with replacement, meaning that
numbers previously chosen at random are not removed from the pool of available
numbers. So the chance of any value being selected stays the same across all passes.
Entire grid: The option to use the entire grid means that every combination is tested.
This option is the most thorough, but it requires the most time.
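The two sweep strategies can be sketched in a few lines of Python. The parameter names and values in the grid, and the function names, are illustrative assumptions:

```python
import itertools
import random

param_grid = {
    "learning_rate": [0.01, 0.02, 0.04],
    "num_hidden_nodes": [50, 100, 200],
}

def entire_grid(grid):
    """Entire grid: every combination of parameter values is tested."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in itertools.product(*grid.values())]

def random_sweep(grid, max_runs, seed=42):
    """Random sweep: values are chosen with replacement, so the chance of
    any value being selected stays the same across all passes."""
    rng = random.Random(seed)
    return [{k: rng.choice(grid[k]) for k in grid} for _ in range(max_runs)]

print(len(entire_grid(param_grid)))               # 9 combinations (3 x 3)
print(len(random_sweep(param_grid, max_runs=4)))  # 4 randomly chosen combinations
```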
We recommend that you experiment with the settings to determine the most efficient method of
training on a particular dataset and model.
At the end of testing, the component presents a report that contains the accuracy for each model,
so that you can review the metric results.
However, during training, you must choose a single metric to use in ranking the models that
are generated during the tuning process. You might find that the best metric varies, depending
on your business problem and the cost of false positives and false negatives.
Mean absolute error averages all the errors in the model, where error means the
distance of the predicted value from the true value. It's often abbreviated as MAE.
Root of mean squared error measures the average of the squares of the errors, and
then takes the root of that value. It's often abbreviated as RMSE.
Relative absolute error represents the error as a percentage of the true value.
Relative squared error normalizes the total squared error by dividing by the total
squared error of the predicted values.
Coefficient of determination is a single number that indicates how well data fits a
model. A value of one means that the model exactly matches the data. A value of zero
means that the data is random or otherwise can't be fit to the model. It's often called r2,
R2, or r-squared.
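Three of these metrics are simple enough to compute directly from their definitions, as this sketch shows (the function name is illustrative):

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, and the coefficient of determination (R2)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n                 # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)      # root of mean squared error
    mean_t = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot                              # coefficient of determination
    return {"MAE": mae, "RMSE": rmse, "R2": r2}

print(regression_metrics([3.0, 5.0, 7.0], [2.5, 5.0, 7.5]))
```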
Almost all learners in Azure Machine Learning support cross-validation with an integrated
parameter sweep, which lets you choose the parameters to experiment with. If the learner doesn't
support setting a range of values, you can still use it in cross-validation. In this case, a range of
allowed values is selected for the sweep.
Unit vii:
Neural Network Regression: Module Reference – Azure
Machine Learning (Reading).
Neural Network Regression component
Learning Objectives
1. Component overview
2. Configure Neural Network Regression
3. Create a neural network model using the default architecture
4. Results
5. Next steps
Component overview
This article describes a component in Azure Machine Learning designer.
Use this component to create a regression model using a customizable neural network
algorithm.
Although neural networks are widely known for use in deep learning and modeling complex
problems such as image recognition, they are easily adapted to regression problems. Any class
of statistical models can be termed a neural network if they use adaptive weights and can
approximate non-linear functions of their inputs. Thus neural network regression is suited to
problems where a more traditional regression model cannot fit a solution.
Neural network regression is a supervised learning method, and therefore requires a tagged
dataset, which includes a label column. Because a regression model predicts a numerical value,
the label column must be a numerical data type.
You can train the model by providing the model and the tagged dataset as an input to Train
Model. The trained model can then be used to predict values for the new input examples.
If you accept the default neural network architecture, use the Properties pane to set
parameters that control the behavior of the neural network, such as the number of nodes
in the hidden layer, learning rate, and normalization.
Start here if you are new to neural networks. The component supports many
customizations, as well as model tuning, without deep knowledge of neural networks.
Use this option if you want to add extra hidden layers, or fully customize the network
architecture, its connections, and activation functions.
This option is best if you are already somewhat familiar with neural networks. You use
the Net# language to define the network architecture.
The number of nodes in the input layer is determined by the number of features in the
training data. In a regression model, there can be only one node in the output layer.
4. For Number of hidden nodes, type the number of hidden nodes. The default is one
hidden layer with 100 nodes. (This option is not available if you define a custom
architecture using Net#.)
5. For Learning rate, type a value that defines the step taken at each iteration, before
correction. A larger value for learning rate can cause the model to converge faster, but
it can overshoot local minima.
6. For Number of learning iterations, specify the maximum number of times the
algorithm processes the training cases.
7. For Momentum, type a value to apply during learning as a weight on nodes from
previous iterations.
8. Select the option, Shuffle examples, to change the order of cases between iterations. If
you deselect this option, cases are processed in exactly the same order each time you
run the pipeline.
9. For Random number seed, you can optionally type a value to use as the seed.
Specifying a seed value is useful when you want to ensure repeatability across runs of
the same pipeline.
10. Connect a training dataset and train the model:
o If you set Create trainer mode to Single Parameter, connect a tagged dataset
and the Train Model component.
o If you set Create trainer mode to Parameter Range, connect a tagged dataset
and train the model by using Tune Model Hyperparameters.
Note
If you pass a parameter range to Train Model, it uses only the default value in the single
parameter list.
If you pass a single set of parameter values to the Tune Model Hyperparameters
component, when it expects a range of settings for each parameter, it ignores the values,
and uses the default values for the learner.
If you select the Parameter Range option and enter a single value for any parameter,
that single value you specified is used throughout the sweep, even if other parameters
change across a range of values.
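To make these settings concrete, here is a from-scratch NumPy sketch of a one-hidden-layer regression network trained by gradient descent. It is only an illustration of the parameters described above (hidden nodes, learning rate, number of iterations, random seed) — not the implementation the designer component uses, and the data and sizes are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)  # random number seed, for repeatability across runs

# Toy training data: one input feature, noisy quadratic target.
X = np.linspace(-1.0, 1.0, 64).reshape(-1, 1)
y = X ** 2 + 0.05 * rng.normal(size=X.shape)

# One hidden layer (the designer default is 100 hidden nodes; 16 keeps the
# sketch small) and a single output node, as required for a regression model.
hidden = 16
W1 = rng.normal(scale=0.5, size=(1, hidden))
b1 = np.zeros((1, hidden))
W2 = rng.normal(scale=0.5, size=(hidden, 1))
b2 = np.zeros((1, 1))

learning_rate = 0.1  # the step taken at each iteration
iterations = 2000    # the number of learning iterations

for _ in range(iterations):
    # Forward pass through a tanh hidden layer.
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    # Backward pass: gradients of the mean squared error.
    grad_pred = 2.0 * (pred - y) / len(X)
    grad_h = grad_pred @ W2.T * (1.0 - h ** 2)
    W2 -= learning_rate * (h.T @ grad_pred)
    b2 -= learning_rate * grad_pred.sum(axis=0, keepdims=True)
    W1 -= learning_rate * (X.T @ grad_h)
    b1 -= learning_rate * grad_h.sum(axis=0, keepdims=True)

mse = float(((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2).mean())
print(f"final mean squared error: {mse:.4f}")
```

A larger learning rate would take bigger steps per iteration and may converge faster, but can overshoot, which is the trade-off described in step 5 above.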
Results
After training is complete:
To save a snapshot of the trained model, select the Outputs tab in the right panel of the Train
model component. Select the Register dataset icon to save the model as a reusable component.