AI Notes
https://learn.microsoft.com/en-us/training/modules/get-started-with-power-bi/3-building-blocks-of-power-bi
Simply put, AI is the creation of software that imitates human behaviors and capabilities.
Key workloads include:
Machine learning - This is often the foundation for an AI system, and is the way
we "teach" a computer model to make predictions and draw conclusions from
data.
Anomaly detection - The capability to automatically detect errors or unusual
activity in a system.
Computer vision - The capability of software to interpret the world visually
through cameras, video, and images.
Natural language processing - The capability for a computer to interpret written
or spoken language, and respond in kind.
Knowledge mining - The capability to extract information from large volumes of
often unstructured data to create a searchable knowledge store.
Where do machine learning models get their knowledge? The answer is: from data. In
today's world, we create huge volumes of data as we go
about our everyday lives. From the text messages, emails, and social media posts we
send to the photographs and videos we take on our phones, we generate massive
amounts of information. More data still is created by millions of sensors in our homes,
cars, cities, public transport infrastructure, and factories.
Data scientists can use all of that data to train machine learning models that can make
predictions and inferences based on the relationships they find in the data.
Feature - Capability
Automated machine learning - This feature enables non-experts to quickly create an
effective machine learning model from data.
Data and compute management - Cloud-based data storage and compute resources that
professional data scientists can use to run data experiment code at scale.
Imagine you're creating a software system to monitor credit card transactions and detect
unusual usage patterns that might indicate fraud. Or an application that tracks activity in
an automated production line and identifies failures. Or a racing car telemetry system
that uses sensors to proactively warn engineers about potential mechanical failures
before they happen.
Let's explore how anomaly detection might help in the racing car scenario.
1. Sensors in the car collect telemetry, such as engine revolutions, brake temperature, and so
on.
2. An anomaly detection model is trained to understand expected fluctuations in the
telemetry measurements over time.
3. If a measurement occurs outside of the normal expected range, the model reports an
anomaly that can be used to alert the race engineer to call the driver in for a pit stop to fix
the issue before it forces retirement from the race.
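The idea in steps 2-3 can be illustrated with a minimal sketch. This is not the Azure Anomaly Detector service, just a simple z-score threshold applied to invented brake-temperature telemetry:

```python
# Illustrative anomaly detection via a z-score threshold. A production
# service uses far more sophisticated models trained on historical
# telemetry; the readings below are made up for demonstration.

def detect_anomalies(readings, threshold=3.0):
    """Flag readings more than `threshold` standard deviations from the mean."""
    n = len(readings)
    mean = sum(readings) / n
    variance = sum((x - mean) ** 2 for x in readings) / n
    std = variance ** 0.5
    if std == 0:
        return []
    return [i for i, x in enumerate(readings) if abs(x - mean) / std > threshold]

# Brake temperatures (°C): one spike stands out from normal fluctuation.
brake_temps = [412, 418, 415, 420, 417, 413, 419, 640, 416, 414]
print(detect_anomalies(brake_temps, threshold=2.0))  # → [7]
```

The flagged index could then trigger an alert to the race engineer, as in step 3 above.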
Object detection machine learning models are trained to classify individual objects
within an image, and identify their location with a bounding box. For example, a
traffic monitoring solution might use object detection to identify the location of
different classes of vehicle.
Task - Description
Semantic segmentation - An advanced machine learning technique in which individual
pixels in the image are classified according to the object to which they belong.
Image analysis - You can create solutions that combine machine learning models with
advanced image analysis techniques to extract information from images, including "tags"
that could help catalog the image, or even descriptive captions that summarize the
scene shown in the image.
Face detection, analysis, and recognition - Face detection is a specialized form of object
detection that locates human faces in an image. This can be combined with classification
and facial geometry analysis techniques to recognize individuals based on their facial
features.
Optical character recognition (OCR) - A technique used to detect and read text in
images. You can use OCR to read text in photographs (for example, road signs or store
fronts) or to extract information from scanned documents such as letters, invoices, or
forms.
Natural language processing (NLP) is the area of AI that deals with creating software
that understands written and spoken language.
Analyze and interpret text in documents, email messages, and other sources.
Interpret spoken language, and synthesize speech responses.
Automatically translate spoken or written phrases between languages.
Interpret commands and determine appropriate actions.
Service - Capabilities
Language - Use this service to access features for understanding and analyzing text,
training language models that can understand spoken or text-based commands, and
building intelligent applications.
Translator - Use this service to translate text between more than 60 languages.
Speech - Use this service to recognize and synthesize speech, and to translate spoken
languages.
Azure Bot - This service provides a platform for conversational AI, the capability of a
software "agent" to participate in a conversation. Developers can use the Bot Framework
to create a bot and manage it with Azure Bot Service - integrating back-end services like
Language, and connecting to channels for web chat, email, Microsoft Teams, and others.
Understand knowledge mining
Knowledge mining is the term used to describe solutions that involve extracting
information from large volumes of often unstructured data to create a searchable
knowledge store.
Azure Cognitive Search can utilize the built-in AI capabilities of Azure Cognitive Services
such as image processing, content extraction, and natural language processing to
perform knowledge mining of documents. The product's AI capabilities make it
possible to index previously unsearchable documents and to extract and surface insights
from large amounts of data quickly.
Artificial Intelligence is a powerful tool that can be used to greatly benefit the world.
However, like any tool, it must be used responsibly.
Potential challenges and risks facing an AI application developer include bias affecting
results, errors causing harm, exposure of sensitive data, solutions that don't work for
everyone, users having to trust a complex system, and unclear liability for AI-driven
decisions.
Understand Responsible AI
At Microsoft, AI software development is guided by a set of six principles, designed to ensure
that AI applications provide amazing solutions to difficult problems without any unintended
negative consequences.
Fairness
AI systems should treat all people fairly. For example, suppose you create a machine
learning model to support a loan approval application for a bank. The model should
predict whether the loan should be approved or denied without bias. This bias could be
based on gender, ethnicity, or other factors that result in an unfair advantage or
disadvantage to specific groups of applicants.
Azure Machine Learning includes the capability to interpret models and quantify the
extent to which each feature of the data influences the model's prediction. This
capability helps data scientists and developers identify and mitigate bias in the model.
Reliability and safety
AI systems should perform reliably and safely. For example, consider an AI-based system
that controls an autonomous vehicle, or a machine learning model that diagnoses patient
symptoms. Unreliability in these kinds of systems can result in substantial risk to human
life.
Privacy and security
AI systems should be secure and respect privacy. The machine learning models on which
AI systems are based rely on large volumes of data, which may contain personal details
that must be kept private.
Inclusiveness
AI systems should empower everyone and engage people. AI should bring benefits to all
parts of society, regardless of physical ability, gender, sexual orientation, ethnicity, or
other factors.
Transparency
AI systems should be understandable. Users should be made fully aware of the purpose
of the system, how it works, and what limitations may be expected.
Accountability
People should be accountable for AI systems. Designers and developers of AI-based
solutions should work within a framework of governance and organizational principles
that ensure the solution meets ethical and legal standards that are clearly defined.
The principles of responsible AI can help you understand some of the challenges facing
developers as they try to create ethical AI solutions.
Machine Learning is the foundation for most artificial intelligence solutions. Creating an
intelligent solution often begins with the use of machine learning to train predictive
models using historic data that you have collected.
Azure Machine Learning is a cloud service that you can use to train and manage machine
learning models.
Clustering: used to determine labels by grouping similar information into label groups;
like grouping measurements from birds into species.
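A minimal sketch of that clustering idea: a tiny one-dimensional two-cluster k-means over invented bird wing-span measurements. Real clustering uses richer features and more robust algorithms; this only illustrates grouping similar values without known labels:

```python
# Group 1-D measurements into two clusters around two moving centers -
# a toy k-means with k=2. The wing-span values are invented.

def two_means(values, iters=20):
    """Cluster 1-D values into two groups around two moving centers."""
    centers = [min(values), max(values)]
    for _ in range(iters):
        groups = [[], []]
        for v in values:
            # Assign each value to its nearest center.
            groups[0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1].append(v)
        # Move each center to the mean of its group.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

wing_spans = [20, 21, 22, 55, 57, 58]   # cm; two species mixed together
print(two_means(wing_spans))  # → [[20, 21, 22], [55, 57, 58]]
```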
Automated machine learning allows you to train models without extensive data science
or programming knowledge. For people with a data science and programming
background, it provides a way to save time and resources by automating algorithm
selection and hyperparameter tuning.
You can create an automated machine learning job in Azure Machine Learning studio.
1. Prepare data: Identify the features and label in a dataset. Pre-process, or clean and
transform, the data as needed.
2. Train model: Split the data into two groups, a training and a validation set. Train a
machine learning model using the training data set. Test the machine learning model for
performance using the validation data set.
3. Evaluate performance: Compare how close the model's predictions are to the known
labels.
4. Deploy a predictive service: After you train a machine learning model, you can deploy
the model as an application on a server or device so that others can use it.
These are the same steps in the automated machine learning process with Azure
Machine Learning.
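The train-and-evaluate cycle in steps 2-3 can be sketched in plain Python. This is only an illustration with invented, perfectly linear data, not what Azure automated machine learning runs internally:

```python
# A minimal, stdlib-only sketch of the train/evaluate steps above: split
# the data, fit a simple model on the training set, then score it on the
# held-out validation set. (Azure automated ML does this across many
# algorithms; the data here is invented.)

def fit_linear(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Steps 1-2: prepare the data and split it into training/validation sets.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 4, 6, 8, 10, 12, 14, 16]          # label = 2 * feature
train_x, valid_x = xs[:6], xs[6:]
train_y, valid_y = ys[:6], ys[6:]

slope, intercept = fit_linear(train_x, train_y)

# Step 3: evaluate - compare predictions against the known validation labels.
preds = [slope * x + intercept for x in valid_x]
print(preds)  # → [14.0, 16.0] for this perfectly linear data
```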
Prepare data
Machine learning models must be trained with existing data. Data scientists expend a lot
of effort exploring and pre-processing data, and trying various types of model-training
algorithms to produce accurate models. This is time-consuming work that often makes
inefficient use of expensive compute hardware.
In Azure Machine Learning, data for model training and other operations is usually
encapsulated in an object called a data asset. You can create your own data asset in
Azure Machine Learning studio.
Train model
The automated machine learning capability in Azure Machine Learning
supports supervised machine learning models - in other words, models for which the
training data includes known label values. You can use automated machine learning to
train models for classification, regression, and time series forecasting tasks.
In Automated Machine Learning, you can select configurations for the primary metric,
type of model used for training, exit criteria, and concurrency limits.
Importantly, AutoML will split data into a training set and a validation set. You can
configure the details in the settings before you run the job.
Evaluate performance
After the job has finished, you can review the best performing model. In this case, you
used exit criteria to stop the job. Thus the "best" model the job generated might not be
the best possible model, just the best one found within the time allowed for this
exercise.
The best model is identified based on the evaluation metric you specified, Normalized
root mean squared error.
The difference between the predicted and actual value, known as the residuals, indicates
the amount of error in the model. The performance metric root mean squared
error (RMSE), is calculated by squaring the errors across all of the test cases, finding the
mean of these squares, and then taking the square root. What all of this means is that the
smaller this value is, the more accurate the model's predictions are. The normalized root
mean squared error (NRMSE) standardizes the RMSE metric so it can be used for
comparison between models which have variables on different scales.
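As a sketch, RMSE and a normalized variant might be computed like this. Normalizing by the range of the actual values is one common convention; AutoML's exact normalization formula may differ:

```python
# Compute RMSE (square the errors, average, take the square root) and a
# range-normalized variant, following the description above. The actual
# and predicted values are invented for illustration.
import math

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def nrmse(actual, predicted):
    # One common normalization: divide by the range of actual values.
    return rmse(actual, predicted) / (max(actual) - min(actual))

actual    = [100, 150, 200, 250]
predicted = [110, 140, 210, 240]
print(rmse(actual, predicted))   # → 10.0
print(nrmse(actual, predicted))  # → 0.0666...
```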
The Predicted vs. True chart should show a diagonal trend in which the predicted value
correlates closely to the true value. The dotted line shows how a perfect model should
perform. The closer the line of your model's average predicted value is to the dotted
line, the better its performance. A histogram below the line chart shows the distribution
of true values.
After you've used automated machine learning to train some models, you can deploy
the best performing model as a service for client applications to use.
Deploy a predictive service
In Azure Machine Learning, you can deploy a service to an Azure Container Instance
(ACI) or to an Azure Kubernetes Service (AKS) cluster. For production scenarios, an AKS
deployment is recommended, for which you must create an inference cluster compute
target. In this exercise, you'll use an ACI service, which is a suitable deployment target
for testing, and does not require you to create an inference cluster.
Introduction
Computer vision is one of the core areas of artificial intelligence (AI), and focuses on
creating solutions that enable AI applications to "see" the world and make sense of it.
Of course, computers don't have biological eyes that work the way ours do, but they are
capable of processing images, either from a live camera feed or from digital
photographs or videos. This ability to process images is the key to creating software that
can emulate human visual perception.
Content Organization: Identify people or objects in photos and organize them based on
that identification. Photo recognition applications like this are commonly used in photo
storage and social media applications.
Text Extraction: Analyze images and PDF documents that contain text and extract the text
into a structured format.
Spatial Analysis: Identify people or objects, such as cars, in a space and map their
movement within that space.
To an AI application, an image is just an array of pixel values. These numeric values can
be used as features to train machine learning models that make predictions about the
image and its contents.
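A tiny sketch of that idea: a 3x3 grayscale "image" flattened into the kind of numeric feature vector a model could consume. Real images are far larger and usually have three color channels:

```python
# To an AI application an image is just an array of pixel values.
# Here a tiny 3x3 grayscale image (0 = black, 255 = white) is flattened
# into a feature vector for a hypothetical model.
image = [
    [  0, 255,   0],
    [255, 255, 255],
    [  0, 255,   0],
]
features = [pixel for row in image for pixel in row]
print(features)       # → [0, 255, 0, 255, 255, 255, 0, 255, 0]
print(len(features))  # → 9 features for a 3x3 image
```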
Training machine learning models from scratch can be very time intensive and require a
large amount of data. Microsoft's Computer Vision service gives you access to pre-
trained computer vision capabilities.
Learning objectives
In this module you will:
Identify image analysis tasks that can be performed with the Computer Vision service.
Provision a Computer Vision resource.
Use a Computer Vision resource to analyze an image.
Get started with image analysis on Azure
The Computer Vision service is a cognitive service in Microsoft Azure that provides pre-
built computer vision capabilities. The service can analyze images, and return detailed
information about an image and the objects it depicts.
Computer Vision: A specific resource for the Computer Vision service. Use this resource
type if you don't intend to use any other cognitive services, or if you want to track
utilization and costs for your Computer Vision resource separately.
Cognitive Services: A general cognitive services resource that includes Computer Vision
along with many other cognitive services; such as Text Analytics, Translator Text, and
others. Use this resource type if you plan to use multiple cognitive services and want to
simplify administration and development.
Whichever type of resource you choose to create, it will provide two pieces of
information that you will need in order to use it: a key that is used to authenticate client
applications, and an endpoint that provides the HTTP address at which your resource
can be accessed.
Describing an image
Computer Vision has the ability to analyze an image, evaluate the objects that are
detected, and generate a human-readable phrase or sentence that can describe what
was detected in the image. Depending on the image contents, the service may return
multiple results, or phrases. Each returned phrase will have an associated confidence
score, indicating how confident the algorithm is in the supplied description. The highest
confidence phrases will be listed first.
To help you understand this concept, consider the following image of the Empire State
building in New York. The returned phrases are listed below the image in the order of
confidence.
The image descriptions generated by Computer Vision are based on a set of thousands
of recognizable objects, which can be used to suggest tags for the image. These tags
can be associated with the image as metadata that summarizes attributes of the image;
and can be particularly useful if you want to index an image along with a set of key
terms that might be used to search for images with specific attributes or contents.
For example, the tags returned for the Empire State building image include:
skyscraper
tower
building
Detecting objects
The object detection capability is similar to tagging, in that the service can identify
common objects; but rather than only providing tags for the recognized objects, this
service also returns what is known as bounding box coordinates. Not only
will you get the type of object, but you will also receive a set of coordinates that indicate
the top, left, width, and height of the object detected, which you can use to identify the
location of the object in the image, like this:
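For example, a client application might turn the returned top/left/width/height values into corner coordinates for drawing the box. The detection values below are invented for illustration, not an actual service response:

```python
# Sketch of working with a bounding box from an object detection result:
# top, left, width, and height (in pixels) locate the object in the image.

def box_corners(box):
    """Return (x1, y1, x2, y2) corner coordinates for a bounding box."""
    x1, y1 = box["left"], box["top"]
    return x1, y1, x1 + box["width"], y1 + box["height"]

# Hypothetical detection result for one object.
detected = {"object": "truck", "left": 120, "top": 40, "width": 200, "height": 150}
print(box_corners(detected))  # → (120, 40, 320, 190)
```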
Detecting brands
This feature provides the ability to identify commercial brands. The service has an
existing database of thousands of globally recognized logos from commercial brands of
products.
When you call the service and pass it an image, it performs a detection task and
determines if any of the identified objects in the image are recognized brands. The
service compares the brands against its database of popular brands spanning clothing,
consumer electronics, and many more categories. If a known brand is detected, the
service returns a response that contains the brand name, a confidence score (from 0 to 1
indicating how positive the identification is), and a bounding box (coordinates) for
where in the image the detected brand was found.
For example, in the following image, a laptop has a Microsoft logo on its lid, which is
identified and located by the Computer Vision service.
Detecting faces
The Computer Vision service can detect and analyze human faces in an image, including
the ability to determine age and a bounding box rectangle for the location of the
face(s). The facial analysis capabilities of the Computer Vision service are a subset of
those provided by the dedicated Face Service. If you need basic face detection and
analysis, combined with general image analysis capabilities, you can use the Computer
Vision service; but for more comprehensive facial analysis and facial recognition
functionality, use the Face service.
The following example shows an image of a person with their face detected and
approximate age estimated.
Categorizing an image
Computer Vision can categorize images based on their contents. The service uses a
parent/child hierarchy with a "current" limited set of categories. When analyzing an
image, detected objects are compared to the existing categories to determine the best
way to provide the categorization. As an example, one of the parent categories
is people_. This image of a person on a roof is assigned a category of people_.
A slightly different categorization is returned for the following image, which is assigned
to the category people_group because there are multiple people in the image:
When categorizing an image, the Computer Vision service supports two specialized
domain models:
Celebrities - The service includes a model that has been trained to identify thousands of
well-known celebrities from the worlds of sports, entertainment, and business.
Landmarks - The service can identify famous landmarks, such as the Taj Mahal and the
Statue of Liberty.
For example, when analyzing the following image for landmarks, the Computer Vision
service identifies the Eiffel Tower, with a confidence of 99.41%.
The Computer Vision service can use optical character recognition (OCR) capabilities to
detect printed and handwritten text in images. This capability is explored in the Read
text with the Computer Vision service module on Microsoft Learn.
Additional capabilities
Detect image types - for example, identifying clip art images or line drawings.
Detect image color schemes - specifically, identifying the dominant foreground,
background, and overall colors in an image.
Generate thumbnails - creating small versions of images.
Moderate content - detecting images that contain adult content or depict violent, gory
scenes.
Introduction
Analyzing text is a process where you evaluate different aspects of a document or phrase, in
order to gain insights into the content of that text. For the most part, humans are able to read
some text and understand the meaning behind it. Even without considering grammar rules for the
language the text is written in, specific insights can be identified in the text.
As an example, you might read some text and identify some key phrases that indicate the main
talking points of the text. You might also recognize names of people or well-known landmarks
such as the Eiffel Tower. Although difficult at times, you might also be able to get a sense for
how the person was feeling when they wrote the text, also commonly known as sentiment.
Statistical analysis of terms used in the text. For example, removing common "stop words" (words
like "the" or "a", which reveal little semantic information about the text), and
performing frequency analysis of the remaining words (counting how often each word appears)
can provide clues about the main subject of the text.
Extending frequency analysis to multi-term phrases, commonly known as N-grams (a two-word
phrase is a bi-gram, a three-word phrase is a tri-gram, and so on).
Applying stemming or lemmatization algorithms to normalize words before counting them - for
example, so that words like "power", "powered", and "powerful" are interpreted as being the
same word.
Applying linguistic structure rules to analyze sentences - for example, breaking down sentences
into tree-like structures such as a noun phrase, which itself contains nouns, verbs, adjectives, and
so on.
Encoding words or terms as numeric features that can be used to train a machine learning model.
For example, to classify a text document based on the terms it contains. This technique is often
used to perform sentiment analysis, in which a document is classified as positive or negative.
Creating vectorized models that capture semantic relationships between words by assigning them
to locations in n-dimensional space. This modeling technique might, for example, assign values to
the words "flower" and "plant" that locate them close to one another, while "skateboard" might
be given a value that positions it much further away.
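The first of these techniques, stop-word removal followed by frequency analysis, can be sketched in a few lines. The stop-word list here is a tiny illustrative sample, not a real linguistic resource:

```python
# Drop common stop words, then count the frequency of the remaining
# words - the simplest of the text analysis techniques described above.
from collections import Counter

STOP_WORDS = {"the", "a", "an", "and", "was", "is", "to", "of"}

def term_frequencies(text):
    words = [w.strip(".,!?").lower() for w in text.split()]
    return Counter(w for w in words if w and w not in STOP_WORDS)

review = "The food was amazing and the service was amazing too."
print(term_frequencies(review).most_common(2))
# 'amazing' dominates once the stop words are removed
```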
While these techniques can be used to great effect, programming them can be complex. In
Microsoft Azure, the Language cognitive service can help simplify application development by
using pre-trained models that can:
In this module, you'll explore some of these capabilities and gain an understanding of how you
might apply them to applications such as:
A social media feed analyzer to detect sentiment around a political campaign or a product in
market.
A document search application that extracts key phrases to help summarize the main subject
matter of documents in a catalog.
A tool to extract brand information or company names from documents or other text for
identification purposes.
These examples are just a small sample of the many ways the Language service can help
with text analytics.
The Language service is a part of the Azure Cognitive Services offerings that can perform
advanced natural language processing over raw text.
A Language resource - choose this resource type if you only plan to use natural language
processing services, or if you want to manage access and billing for the resource
separately from other services.
A Cognitive Services resource - choose this resource type if you plan to use the Language
service in combination with other cognitive services, and you want to manage access and
billing for these services together.
Language detection
Use the language detection capability of the Language service to identify the language
in which text is written. You can submit multiple documents at a time for analysis. For
each document submitted to it, the service will detect:
The language name (for example, "English").
The ISO 6391 language code (for example, "en").
A score indicating a level of confidence in the language detection.
For example, consider a scenario where you own and operate a restaurant where
customers can complete surveys and provide feedback on the food, the service, staff,
and so on. Suppose you have received the following reviews from customers:
Review 1: "A fantastic place for lunch. The soup was delicious."
Review 3: "The croque monsieur avec frites was terrific. Bon appetit!"
You can use the text analytics capabilities in the Language service to detect the
language for each of these reviews; and it might respond with the following results:
There may be text that is ambiguous in nature, or that has mixed language content.
These situations can present a challenge to the service. An ambiguous content example
would be a case where the document contains limited text, or only punctuation. For
example, using the service to analyze the text ":-)", results in a value of unknown for the
language name and the language identifier, and a score of NaN (which is used to
indicate not a number).
Sentiment analysis
The text analytics capabilities in the Language service can evaluate text and return
sentiment scores and labels for each sentence. This capability is useful for detecting
positive and negative sentiment in social media, customer reviews, discussion forums
and more.
Using the pre-built machine learning classification model, the service evaluates the text
and returns a sentiment score in the range of 0 to 1, with values closer to 1 being a
positive sentiment. Scores that are close to the middle of the range (0.5) are considered
neutral or indeterminate.
For example, the following two restaurant reviews could be analyzed for sentiment:
"We had dinner at this restaurant last night and the first thing I noticed was how
courteous the staff was. We were greeted in a friendly manner and taken to our table right
away. The table was clean, the chairs were comfortable, and the food was amazing."
and
"Our dining experience at this restaurant was one of the worst I've ever had. The service
was slow, and the food was awful. I'll never eat at this establishment again."
The sentiment score for the first review might be around 0.9, indicating a positive
sentiment; while the score for the second review might be closer to 0.1, indicating a
negative sentiment.
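A client application might interpret those 0-1 scores with simple thresholds, as in this sketch. The cut-off values are an illustrative choice, not part of the service:

```python
# Map a 0-1 sentiment score to a label, following the description above:
# values near 1 are positive, near 0 negative, and near 0.5 indeterminate.

def label_sentiment(score, margin=0.1):
    if score >= 0.5 + margin:
        return "positive"
    if score <= 0.5 - margin:
        return "negative"
    return "neutral or indeterminate"

print(label_sentiment(0.9))  # → positive
print(label_sentiment(0.1))  # → negative
print(label_sentiment(0.5))  # → neutral or indeterminate
```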
Indeterminate sentiment
A score of 0.5 might indicate that the sentiment of the text is indeterminate, and could
result from text that does not have sufficient context to discern a sentiment or
insufficient phrasing. For example, a list of words in a sentence that has no structure
could result in an indeterminate score. Another example where a score may be 0.5 is in
the case where the wrong language code was used. A language code (such as "en" for
English, or "fr" for French) is used to inform the service which language the text is in. If
you pass text in French but tell the service the language code is en for English, the
service will return a score of precisely 0.5.
"We had dinner here for a birthday celebration and had a fantastic experience. We were
greeted by a friendly hostess and taken to our table right away. The ambiance was
relaxed, the food was amazing, and service was terrific. If you like great food and attentive
service, you should try this place."
Key phrase extraction can provide some context to this review by extracting the
following phrases:
attentive service
great food
birthday celebration
fantastic experience
table
friendly hostess
dinner
ambiance
place
Not only can you use sentiment analysis to determine that this review is positive, but you
can also use the key phrases to identify important elements of the review.
Entity recognition
You can provide the Language service with unstructured text and it will return a list
of entities in the text that it recognizes. The service can also provide links to more
information about that entity on the web. An entity is essentially an item of a particular
type or a category; and in some cases, subtype, such as those as shown in the following
table.
Organization "Microsoft"
URL "https://www.bing.com"
Email "[email protected]"
IP Address "10.0.1.125"
For example, suppose you use the Language service to detect entities in the following
restaurant review extract:
What is conversational AI
Benefits of conversational AI