ML Notes

The document provides an overview of Machine Learning (ML), its definitions, classifications, and applications, emphasizing its ability to learn from data and improve performance over time. It categorizes ML into supervised, unsupervised, reinforcement, and semi-supervised learning, along with classification, regression, and clustering tasks. Additionally, it highlights the importance of data in ML and distinguishes between Machine Learning and Artificial Intelligence, explaining their interrelation and future scope.

An introduction to Machine Learning

The term Machine Learning was coined in 1959 by Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, who stated that it “gives computers the ability to learn without being explicitly programmed”.
And in 1997, Tom Mitchell gave a “well-posed” mathematical and relational definition: “A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.”

Machine Learning is the latest buzzword floating around, and it deserves the attention: it is one of the most interesting subfields of Computer Science. So what does Machine Learning really mean?

Let’s try to understand Machine Learning in layman’s terms. Consider you are trying to toss a paper
into a dustbin.

After the first attempt, you realize that you have put too much force into the throw. After the second attempt, you realize you are closer to the target but need to increase your throw angle. What is happening here is that after every throw we learn something and improve the end result. We are programmed to learn from our experience.

This definition of the tasks with which machine learning is concerned offers a fundamentally operational definition rather than defining the field in cognitive terms. It follows Alan Turing’s proposal in his paper “Computing Machinery and Intelligence”, in which the question “Can machines think?” is replaced with the question “Can machines do what we (as thinking entities) can do?”
Within the field of data analytics, machine learning is used to devise complex models and algorithms
that lend themselves to prediction; in commercial use, this is known as predictive analytics. These
analytical models allow researchers, data scientists, engineers, and analysts to “produce reliable,
repeatable decisions and results” and uncover “hidden insights” through learning from historical
relationships and trends in the data set (input).

Suppose that you decide to check out that offer for a vacation. You browse through the travel
agency website and search for a hotel. When you look at a specific hotel, just below the hotel
description there is a section titled “You might also like these hotels”. This is a common use case of
Machine Learning called “Recommendation Engine”. Again, many data points were used to train a
model in order to predict what will be the best hotels to show you under that section, based on a lot
of information they already know about you.

So if you want your program to predict, for example, traffic patterns at a busy intersection (task T),
you can run it through a machine learning algorithm with data about past traffic patterns
(experience E) and, if it has successfully “learned”, it will then do better at predicting future traffic
patterns (performance measure P).
The highly complex nature of many real-world problems, though, often means that inventing specialized algorithms that will solve them perfectly every time is impractical, if not impossible. Examples of machine learning problems include “Is this cancer?”, “Which of these people are good friends with each other?”, and “Will this person like this movie?”. Such problems are excellent targets for Machine Learning, and in fact machine learning has been applied to them with great success.

Classification of Machine Learning

Machine learning implementations are classified into the following major categories, depending on the nature of the learning “signal” or “response” available to the learning system:

1. Supervised learning: When an algorithm learns from example data and associated target
responses, which can be numeric values or string labels such as classes or tags, in order
to later predict the correct response when posed with new examples, it comes under the
category of Supervised learning. This approach is similar to human learning under the
supervision of a teacher: the teacher provides good examples for the student to memorize,
and the student then derives general rules from these specific examples.

2. Unsupervised learning: When an algorithm learns from plain examples without any
associated response, leaving the algorithm to determine the data patterns on its own,
it comes under Unsupervised learning. This type of algorithm tends to restructure the
data into something else, such as new features that may represent a class or a new
series of uncorrelated values. Such algorithms are quite useful in giving humans insights
into the meaning of data, and in providing new useful inputs to supervised machine
learning algorithms.
As a kind of learning, it resembles the methods humans use to figure out that certain
objects or events are of the same class, such as by observing the degree of similarity
between objects. Some recommendation systems that you find on the web in the form of
marketing automation are based on this type of learning.

3. Reinforcement learning: Here you present the algorithm with examples that lack labels,
as in unsupervised learning; however, you accompany each example with positive or negative
feedback according to the solution the algorithm proposes. Reinforcement learning is
connected to applications for which the algorithm must make decisions (so the product is
prescriptive, not just descriptive, as in unsupervised learning), and the decisions bear
consequences. In the human world, it is just like learning by trial and error (a toy
sketch of such a reward-driven loop follows this list).
Errors help you learn because they carry a penalty (cost, loss of time, regret, pain, and
so on), teaching you that a certain course of action is less likely to succeed than others.
An interesting example of reinforcement learning occurs when computers learn to play video
games by themselves.
In this case, an application presents the algorithm with examples of specific situations, such
as having the gamer stuck in a maze while avoiding an enemy. The application lets the
algorithm know the outcome of actions it takes, and learning occurs while trying to avoid
what it discovers to be dangerous and to pursue survival. You can have a look at how the
company Google DeepMind created a reinforcement learning program that plays old Atari
video games. When watching it, notice how the program is initially clumsy
and unskilled but steadily improves with training until it becomes a champion.
4. Semi-supervised learning: Here an incomplete training signal is given: a training set
with some (often many) of the target outputs missing. There is a special case of this
principle known as Transduction, where the entire set of problem instances is known at
learning time, except that part of the targets are missing.
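
The following is a minimal, self-contained sketch of that reward-driven loop: an epsilon-greedy agent choosing among three slot machines with hidden payout rates. Every number here is invented for illustration; real reinforcement learning systems are far more elaborate.

import random

true_payout = [0.2, 0.5, 0.8]      # hidden from the agent
estimates = [0.0, 0.0, 0.0]        # the agent's running value estimates
counts = [0, 0, 0]
epsilon = 0.1                      # how often to explore at random

for step in range(1000):
    if random.random() < epsilon:
        arm = random.randrange(3)              # explore a random arm
    else:
        arm = estimates.index(max(estimates))  # exploit the best guess so far
    reward = 1 if random.random() < true_payout[arm] else 0  # feedback signal
    counts[arm] += 1
    # incremental averaging: each reward nudges the estimate for its arm
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

print(estimates)  # drifts toward the hidden payout rates [0.2, 0.5, 0.8]

Running it shows the value estimates converging toward the hidden payout rates – the same trial-and-error improvement described above.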

Categorizing on the basis of required Output

Another categorization of machine learning tasks arises when one considers the desired output of a
machine-learned system:

1. Classification: When inputs are divided into two or more classes, and the learner must
produce a model that assigns unseen inputs to one or more (multi-label classification) of
these classes. This is typically tackled in a supervised way. Spam filtering is an example of
classification, where the inputs are email (or other) messages and the classes are “spam”
and “not spam”.

2. Regression: Also a supervised problem; the case when the outputs are continuous rather
than discrete, such as predicting a house price from its size.

3. Clustering: When a set of inputs is to be divided into groups. Unlike in classification,
the groups are not known beforehand, making this typically an unsupervised task (a
combined sketch of all three tasks follows this list).
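
Here is a compact sketch of all three output types using scikit-learn (one of the libraries these notes list later). The datasets are synthetic, and the particular model choices are illustrative assumptions rather than recommendations.

from sklearn.datasets import make_classification, make_regression, make_blobs
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.cluster import KMeans

# Classification: discrete class labels are known at training time.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)
print("predicted class:", clf.predict(X[:1]))

# Regression: the target is continuous rather than discrete.
X, y = make_regression(n_samples=200, n_features=4, random_state=0)
reg = LinearRegression().fit(X, y)
print("predicted value:", reg.predict(X[:1]))

# Clustering: no labels are given; the groups are discovered.
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster assignments:", km.labels_[:10])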

Machine Learning comes into the picture when problems cannot be solved by means of typical
approaches.

Introduction to Data in Machine Learning

DATA: Any unprocessed fact, value, text, sound, or picture that has not yet been interpreted and analyzed. Data is the most important part of Data Analytics, Machine Learning, and Artificial Intelligence: without data we can’t train any model, and all modern research and automation would be in vain. Big enterprises spend lots of money just to gather as much data as possible.

Example: Why did Facebook acquire WhatsApp for a huge price of $19 billion?
The answer is simple and logical – to have access to users’ information that Facebook may not have but WhatsApp does. This information about their users is of paramount importance to Facebook, as it facilitates the improvement of their services.

INFORMATION: Data that has been interpreted and manipulated and now has some meaningful inference for the users.
KNOWLEDGE: The combination of inferred information, experiences, learning, and insights; it results in awareness or concept building for an individual or organization.

How do we split data in Machine Learning?

 Training Data: The part of the data we use to train our model. This is the data that the
model actually sees (both input and output) and learns from.

 Validation Data: The part of the data used for frequent evaluation of the model as it fits
on the training dataset, and for tuning hyperparameters (parameters set before the model
begins learning). This data plays its part while the model is actually training.

 Testing Data: Once our model is completely trained, testing data provides an unbiased
evaluation. We feed in the inputs of the testing data and the model predicts values
(without seeing the actual outputs); we then evaluate the model by comparing these
predictions with the actual outputs present in the testing data. This is how we measure
how much the model has learned from the experiences fed in as training data (a split
sketch follows this list).
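
A minimal sketch of such a three-way split using scikit-learn follows; the 60/20/20 ratio is an assumption for illustration, not a rule.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# First hold out 20% of the data as the test set...
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0)

# ...then split the remainder into training and validation sets
# (0.25 of the remaining 80% = 20% of the whole).
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200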

Consider an example:
A shopping mart owner conducted a survey and now has a long list of questions and answers from his customers; this list of questions and answers is DATA. Whenever he wants to infer anything, he can’t just go through every question from thousands of customers to find something relevant, as that would be time-consuming and unhelpful. To reduce this overhead and make the work easier, the data is manipulated through software, calculations, graphs, etc.; the inference drawn from this manipulated data is Information. So, data is a must for information. Knowledge then plays its role in differentiating between two individuals having the same information: knowledge is not technical content but is linked to the human thought process.

Properties of Data –
1. Volume: Scale of data. With the growing world population and the spread of technology,
huge amounts of data are generated every millisecond.

2. Variety: Different forms of data – healthcare, images, videos, audio clippings.

3. Velocity: Rate of data streaming and generation.

4. Value: Meaningfulness of data in terms of information that researchers can infer from it.

5. Veracity: Certainty and correctness in data we are working on.

Some facts about Data:

 Compared to 2005, 300 times as much data – 40 zettabytes (1 ZB = 10^21 bytes) – was
projected to be generated by 2020.

 By 2011, the healthcare sector had accumulated about 161 billion gigabytes of data.

 About 400 million tweets are sent per day by about 200 million active users.

 Each month, users stream more than 4 billion hours of video.

 30 billion different pieces of content are shared every month by users.

 It is reported that about 27% of data is inaccurate, and so 1 in 3 business leaders
don’t trust the information on which they make decisions.

Machine Learning and Artificial Intelligence

Machine Learning and Artificial Intelligence are creating a huge buzz worldwide. The plethora of applications of Artificial Intelligence has changed the face of technology. The terms Machine Learning and Artificial Intelligence are often used interchangeably; however, there is a stark difference between the two that is still unclear even to many industry professionals.
Let’s start by taking the example of Virtual Personal Assistants, which have been familiar to most of us for quite some time now.

Working of Virtual Personal Assistants –

Siri (part of Apple Inc.’s iOS, watchOS, macOS, and tvOS operating systems), Google Now (a feature of Google Search offering predictive cards with information and daily updates in the Google app for Android and iOS), and Cortana (a virtual assistant created by Microsoft for Windows 10) are intelligent digital personal assistants on the iOS, Android, and Windows platforms respectively. To put it plainly, they help find relevant information when requested by voice. For instance, for queries like ‘What’s the temperature today?’ or ‘What is the way to the nearest supermarket?’, the assistant reacts by searching for information, relaying it from the phone, or sending commands to other applications.
AI is critical in these applications: they gather data about the user’s requests and use that data to recognize speech better and serve answers tailored to the user’s preferences. Microsoft says that Cortana “consistently finds out about its user” and will eventually develop the capacity to anticipate users’ needs and cater to them. Virtual assistants process a tremendous amount of information from a variety of sources to learn about users and become more effective in helping them organize and track their data. Machine learning is a vital part of these personal assistants, as they gather and refine data based on the user’s past interactions with them. This body of information is then used to render results tailored to the user’s preferences.

Roughly speaking, Artificial Intelligence (AI) is when a computer algorithm does intelligent work. On
the other hand, Machine Learning is a part of AI that learns from the data that also involves the
information gathered from previous experiences and allows the computer program to change its
behavior accordingly. Artificial Intelligence is the superset of Machine Learning i.e. all Machine
Learning is Artificial Intelligence but not all AI is Machine Learning.

Artificial Intelligence vs Machine Learning:

 AI addresses the broader problem of automating a system; this automation can draw on any
field, such as image processing, cognitive science, neural systems, machine learning, etc.
ML, by contrast, is specifically concerned with enabling machines to learn from the
external environment, where that environment can be sensors, electronic components,
external storage devices, and numerous other sources.

 AI is about making machines, frameworks, and other devices smart by enabling them to
think and perform tasks as humans generally do. What ML does depends on the user’s input
or query: the system checks whether the answer is available in its knowledge base; if it
is, it returns the related result to the user, and if it isn’t stored yet, the machine
takes in the user input and enhances its knowledge base to give a better answer to the
end user next time.

Future Scope –

 Artificial Intelligence is here to stay. It digs facts out of data through algorithms
for the meaningful execution of various decisions and goals predetermined by a firm.

 Artificial Intelligence and Machine Learning are likely to replace much of the current
mode of technology that we see these days; for example, traditional programming packages
like ERP and CRM are certainly losing their charm.

 Firms like Facebook and Google are investing hefty amounts in AI to get the desired
outcomes at relatively lower computational time.

 Artificial Intelligence is something that is going to redefine the world of software and IT in
the near future.

Basic Concept of Classification (Data Mining)

Data Mining: Data mining in general terms means mining or digging deep into data, which comes in different forms, to find patterns and gain knowledge from those patterns. In the process of data mining, large data sets are first sorted, then patterns are identified and relationships are established to perform data analysis and solve problems.

Classification: A data analysis task, i.e. the process of finding a model that describes and distinguishes data classes and concepts. Classification is the problem of identifying to which of a set of categories (subpopulations) a new observation belongs, on the basis of a training set of data containing observations whose category membership is known.
Example: Before starting any project, we need to check its feasibility. In this case, a classifier is required to predict class labels such as ‘Safe’ and ‘Risky’ for adopting the project and further approving it. This is a two-step process:

1. Learning Step (Training Phase): Construction of the classification model. Different
algorithms are used to build the classifier by making the model learn from the available
training set. The model has to be trained for the prediction of accurate results.

2. Classification Step: The model is used to predict class labels for test data; testing
the constructed model on test data gives an estimate of the accuracy of the classification
rules (a sketch of both steps follows).
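
Here is a minimal sketch of the two steps, assuming a decision tree classifier and scikit-learn's built-in iris dataset; both choices are illustrative, not part of the original notes.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Learning step: construct the classification model from the training set.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Classification step: predict labels for the test data and estimate accuracy.
predictions = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, predictions))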

Training and Testing:

Suppose a person is sitting under a fan and the fan starts falling on him; he should move aside so as not to get hurt. That is his training: learning to move away. In testing, if the person sees any heavy object coming towards him or falling on him and moves aside, then the system is tested positively; if the person does not move aside, the system is tested negatively.
The same is the case with data: it should be trained in order to get accurate and good results.
There are certain data types associated with data mining that tell us the format of the file (whether it is in text format or in numerical format).
Attributes – Represent different features of an object (a small pandas sketch follows this list). Different types of attributes are:

1. Binary: Possesses only two values, i.e. True or False.

Example: Suppose a survey evaluates some products and we need to check whether each is
useful or not, so the customer has to answer Yes or No.
Product usefulness: Yes / No

 Symmetric: Both values are equally important in all aspects.

 Asymmetric: Both values may not be equally important.

2. Nominal: When more than two outcomes are possible. Values are in alphabetic form rather
than integer form.
Example: One needs to choose some material but of different colors. So, the color might be
Yellow, Green, Black, Red.
Different Colors: Red, Green, Black, Yellow

 Ordinal: Values that must have some meaningful order.


Example: Suppose there are grade sheets of a few students, which might contain
different grades as per their performance, such as A, B, C, D
Grades: A, B, C, D

 Continuous: May have an infinite number of values, it is in float type


Example: Measuring the weight of a few students in a sequence or orderly manner, i.e.
50, 51, 52, 53
Weight: 50, 51, 52, 53

 Discrete: Finite number of values.


Example: Marks of a Student in a few subjects: 65, 70, 75, 80, 90
Marks: 65, 70, 75, 80, 90
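
To make these attribute types concrete, here is a toy pandas DataFrame; the column names and values are invented for illustration.

import pandas as pd

df = pd.DataFrame({
    "useful": [True, False, True],        # binary
    "color":  ["Red", "Green", "Black"],  # nominal
    "grade":  ["A", "C", "B"],            # ordinal
    "weight": [50.5, 51.2, 52.8],         # continuous
    "marks":  [65, 70, 75],               # discrete
})

# An ordered categorical captures the meaningful order of grades.
df["grade"] = pd.Categorical(df["grade"],
                             categories=["D", "C", "B", "A"], ordered=True)
print(df.dtypes)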

Syntax:

 Mathematical Notation: Classification is based on building a function f taking an input
feature vector X and predicting its outcome Y (a qualitative response taking values in a
set C), i.e. Y = f(X).

 Here a Classifier (or model) is used, which is a supervised function and can also be
designed manually based on expert knowledge. It is constructed to predict class labels
(Example: label “Yes” or “No” for the approval of some event).

Classifiers can be categorized into two major types:

1. Discriminative: A basic kind of classifier that determines just one class for each row
of data. It models directly from the observed data and depends heavily on the quality of
the data rather than on its distributions.
Example: Logistic Regression – e.g. predicting the acceptance of a student at a university
from test scores and grades.

2. Generative: Models the distribution of the individual classes and tries to learn the
model that generates the data behind the scenes by estimating the assumptions and
distributions of the model. It is used to predict unseen data.
Example: Naive Bayes Classifier
Detecting spam emails by looking at previous data. Suppose there are 100 emails, divided
1:3, i.e. Class A: 25% (spam emails) and Class B: 75% (non-spam emails). Now a user wants
to check whether an email containing the word “cheap” should be termed spam.
In Class A (the 25 spam emails), 20 out of 25 contain the word “cheap”; in Class B (the
75 non-spam emails), 70 out of 75 do not contain the word, so 5 do.
So, if the email contains the word “cheap”, what is the probability of it being spam? (= 80%)
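
A worked version of that computation, assuming (as the counts above suggest) that 20 of the 25 spam emails and 5 of the 75 non-spam emails contain the word:

P(spam) = 25/100, P(cheap | spam) = 20/25, P(cheap | not spam) = 5/75

P(spam | cheap) = P(cheap | spam) P(spam) / [P(cheap | spam) P(spam) + P(cheap | not spam) P(not spam)]
                = (0.8)(0.25) / [(0.8)(0.25) + (1/15)(0.75)]
                = 0.20 / (0.20 + 0.05)
                = 0.80, i.e. 80%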

Classifiers Of Machine Learning:

1. Decision Trees

2. Bayesian Classifiers

3. Neural Networks

4. K-Nearest Neighbour

5. Support Vector Machines

6. Linear Regression

7. Logistic Regression

Associated Tools and Languages: Used to mine/extract useful information from raw data.

 Main Languages used: R, SAS, Python, SQL

 Major Tools used: RapidMiner, Orange, KNIME, Spark, Weka

 Libraries used: Jupyter, NumPy, Matplotlib, Pandas, ScikitLearn, NLTK, TensorFlow, Seaborn,
Basemap, etc.

Real-Life Examples:

 Market Basket Analysis:


It is a modeling technique based on frequently bought combinations of items in
transactions.
Example: Amazon and many other retailers use this technique: while viewing some products,
suggestions are shown for commodities that other people have bought together in the past.

 Weather Forecasting:
Changing patterns in weather conditions need to be observed based on parameters such as
temperature, humidity, and wind direction. This keen observation also requires the use of
previous records in order to make accurate predictions.

Advantages:

 Mining Based Methods are cost-effective and efficient

 Helps in identifying criminal suspects

 Helps in predicting the risk of diseases

 Helps banks and financial institutions identify likely defaulters before approving
cards, loans, etc.

Disadvantages:
Privacy: There are chances that a company may give some information about its customers to other vendors or use it for its own profit.
Accuracy Problem: An accurate model must be selected in order to get the best accuracy and results.

APPLICATIONS:

 Marketing and Retailing

 Manufacturing

 Telecommunication Industry

 Intrusion Detection

 Education System

 Fraud Detection

GIST OF DATA MINING:

1. Choose the correct classification method, like decision trees, Bayesian networks, or
neural networks.

2. Obtain a sample of data where all class values are known. The data is then divided into
two parts, a training set and a test set.

Now the training set is given to a learning algorithm, which derives a classifier. The classifier is then tested with the test set, where all class values are hidden.
If the classifier classifies most cases in the test set correctly, it can be assumed that it will also work accurately on future data; otherwise, the wrong model may have been chosen.

Introduction to Natural Language Processing

The essence of Natural Language Processing (NLP) lies in making computers understand natural language. That’s not an easy task, though. Computers can understand the structured form of data like spreadsheets and tables in a database, but human languages, texts, and voices form an unstructured category of data that is difficult for a computer to understand, and there arises the need for Natural Language Processing.

There’s a lot of natural language data out there in various forms, and things would get very easy if computers could understand and process that data. We can train models in accordance with the expected output in different ways. Humans have been writing for thousands of years and there is a lot of literature available, so it would be great if we could make computers understand it. But the task is never going to be easy. There are various challenges involved, like understanding the correct meaning of a sentence, correct Named-Entity Recognition (NER), correct prediction of the various parts of speech, and coreference resolution (the most challenging of these, in my opinion).

Computers can’t truly understand human language. If we feed in enough data and train a model properly, it can distinguish and try categorizing the various parts of speech (noun, verb, adjective, etc.) based on previously fed data and experience. If it encounters a new word, it tries to make the nearest guess, which can be embarrassingly wrong at times.

It’s very difficult for a computer to extract the exact meaning from a sentence. For example – “The boy radiated fire-like vibes.” Did the boy have a very motivating personality, or did he actually radiate fire? As you can see, parsing English with a computer is going to be complicated.

There are various stages involved in training a model. Solving a complex problem in Machine
Learning means building a pipeline. In simple terms, it means breaking a complex problem into a
number of small problems, making models for each of them and then integrating these models. A
similar thing is done in NLP. We can break down the process of understanding English for a model
into a number of small pieces.

It would be really great if a computer could understand that San Pedro is a town in the Belize district in Central America with a population of 16,444 and that it is the second largest town in Belize. But to make the computer understand this, we need to teach it the very basic concepts of written language.

So let’s start by creating an NLP pipeline. It has various steps which will give us the desired
output(maybe not in a few rare cases) at the end.

 Step #1: Sentence Segmentation

Breaking the piece of text into separate sentences.

Input: San Pedro is a town on the southern part of the island of Ambergris Caye in the Belize District of the nation of Belize, in Central America. According to 2015 mid-year estimates, the town has a population of about 16,444. It is the second-largest town in the Belize District and largest in the Belize Rural South constituency.

Output:
1. San Pedro is a town on the southern part of the island of Ambergris Caye in the Belize District of the nation of Belize, in Central America.
2. According to 2015 mid-year estimates, the town has a population of about 16,444.
3. It is the second-largest town in the Belize District and largest in the Belize Rural South constituency.

For coding a sentence segmentation model, we can consider splitting a sentence whenever we encounter a punctuation mark. But modern NLP pipelines have techniques to split text even if the document isn’t formatted properly.
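
A minimal segmentation sketch using NLTK (one of the libraries listed earlier in these notes), assuming its standard punkt sentence tokenizer has been downloaded:

import nltk
nltk.download("punkt", quiet=True)  # one-time download of the sentence model

text = ("San Pedro is a town on the southern part of the island of "
        "Ambergris Caye in the Belize District of the nation of Belize, "
        "in Central America. According to 2015 mid-year estimates, the "
        "town has a population of about 16,444.")

for sentence in nltk.sent_tokenize(text):
    print(sentence)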

 Step #2: Word Tokenization

Breaking the sentence into individual words, called tokens. We can tokenize whenever we encounter a space, and we can train a model that way. Even punctuation marks are considered individual tokens, as they carry meaning.

Input: San Pedro is a town on the southern part of the island of Ambergris Caye in the Belize District of the nation of Belize, in Central America. According to 2015 mid-year estimates, the town has a population of about 16,444. It is the second-largest town in the Belize District and largest in the Belize Rural South constituency.

Output: ‘San’, ‘Pedro’, ‘is’, ‘a’, ‘town’, and so on.
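
A tokenization sketch with NLTK; note that a plain word tokenizer splits “San Pedro” into two tokens and treats punctuation as tokens of its own (recognizing “San Pedro” as one unit is the job of later steps):

import nltk
nltk.download("punkt", quiet=True)

sentence = "San Pedro is a town in the Belize District, in Central America."
print(nltk.word_tokenize(sentence))
# ['San', 'Pedro', 'is', 'a', 'town', 'in', 'the', 'Belize', 'District',
#  ',', 'in', 'Central', 'America', '.']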

 Step #3: Predicting Parts of Speech for each token

Predicting whether each word is a noun, verb, adjective, adverb, pronoun, etc. This helps to understand what the sentence is talking about. It can be achieved by feeding the tokens (and the words around them) to a pre-trained part-of-speech classification model. This model was fed a lot of English words with their parts of speech tagged, so that it classifies similar words it encounters in the future into the various parts of speech. Again, the model doesn’t really understand the ‘sense’ of the words; it just classifies them on the basis of its previous experience. It’s pure statistics.

The process will look like this:

Input: the tokens, fed to the part-of-speech classification model

Output: Town - common noun

Is - verb

The - determiner

And similarly, it will classify the remaining tokens.
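
A POS-tagging sketch with NLTK's pre-trained tagger, which returns Penn Treebank tags (NNP = proper noun, VBZ = verb, DT = determiner, and so on):

import nltk
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = nltk.word_tokenize("San Pedro is a town in Central America.")
print(nltk.pos_tag(tokens))
# [('San', 'NNP'), ('Pedro', 'NNP'), ('is', 'VBZ'), ('a', 'DT'),
#  ('town', 'NN'), ...]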

 Step #4: Lemmatization

Feeding the model the root form of each word.
For example –

There’s a Buffalo grazing in the field.

There are Buffaloes grazing in the field.

Here, both “Buffalo” and “Buffaloes” mean the same thing, but the computer may confuse them as two different terms, as it doesn’t know anything about word forms. So we have to teach the computer that both terms mean the same and that both sentences are talking about the same concept. We need to find the most basic form, root form, or lemma of each word and feed that to the model accordingly.

In a similar fashion, we can use this for verbs too: ‘Play’ and ‘Playing’ should be considered the same.
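
A lemmatization sketch using NLTK's WordNet lemmatizer; the pos argument tells it whether to treat a word as a noun ("n", the default) or a verb ("v"):

import nltk
nltk.download("wordnet", quiet=True)
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("buffaloes"))         # buffalo
print(lemmatizer.lemmatize("playing", pos="v"))  # play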

 Step #5: Identifying stop words

Various words in the English language are used very frequently, like ‘a’, ‘and’, ‘the’, etc. These words create a lot of noise in statistical analysis, so we can take them out. Some NLP pipelines categorize these words as stop words and filter them out during statistical analysis. They are, however, sometimes needed to understand the dependency between tokens and get the exact sense of the sentence, so the list of stop words varies depending on what kind of output you are expecting.
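
A filtering sketch using NLTK's built-in English stop-word list:

import nltk
nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)
from nltk.corpus import stopwords

stop_words = set(stopwords.words("english"))
tokens = nltk.word_tokenize("It is the second-largest town in the Belize District")
print([t for t in tokens if t.lower() not in stop_words])
# ['second-largest', 'town', 'Belize', 'District']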

 Step 6.1: Dependency Parsing

This means finding out how the words in a sentence are related to each other. In dependency parsing we create a parse tree whose root is the main verb of the sentence. For the first sentence in our example, ‘is’ is the main verb, and it will be the root of the parse tree. We can construct a parse tree for every sentence, with one root word (the main verb) associated with it. We can also identify the kind of relationship that exists between two words: in our example, ‘San Pedro’ is the subject and ‘island’ is the attribute, so the relationships between ‘San Pedro’ and ‘is’, and between ‘island’ and ‘is’, can be established.

Just as we trained a Machine Learning model to identify parts of speech, we can train a model to identify the dependency between words by feeding it many examples. It’s a complex task, though. In 2016, Google released a new dependency parser, Parsey McParseface, which used a deep learning approach.
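
A dependency-parsing sketch. spaCy is not mentioned in these notes but is a common modern choice; this assumes the small English model has been installed (python -m spacy download en_core_web_sm):

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("San Pedro is a town on the island of Ambergris Caye.")
for token in doc:
    # each token points at its head word through a typed relation
    print(f"{token.text:10} --{token.dep_:8}--> {token.head.text}")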

 Step 6.2: Finding Noun Phrases

We can group words that represent the same idea. For example – “It is the second-largest town in the Belize District and largest in the Belize Rural South constituency.” Here, the tokens ‘second’, ‘largest’ and ‘town’ can be grouped together, as together they refer to one thing, the town. We can use the output of dependency parsing to combine such words. Whether to do this step depends entirely on the end goal, but it is a quick win if we don’t need much information about which words are adjectives and would rather focus on other important details.
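
A noun-phrase grouping sketch, again assuming spaCy's small English model:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("It is the second-largest town in the Belize District.")
print([chunk.text for chunk in doc.noun_chunks])
# ['It', 'the second-largest town', 'the Belize District']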

 Step #7: Named Entity Recognition (NER)

San Pedro is a town on the southern part of the island of Ambergris Caye in the Belize District of the nation of Belize, in Central America.
Here, NER maps the words to real-world places – places that actually exist in the physical world. We can automatically extract the real-world places present in a document using NLP.

If the above sentence is the input, NER will map it like this way:

San Pedro - Geographic Entity

Ambergris Caye - Geographic Entity

Belize - Geographic Entity

Central America - Geographic Entity

NER systems look at how a word is placed in a sentence and use other statistical models to identify what kind of word it actually is. For example – ‘Washington’ can be a geographical location as well as a person’s last name. A good NER system can tell the difference (a sketch follows the list below).

Kinds of objects that a typical NER system can tag:

 People’s names

 Company names

 Geographical locations

 Product names

 Dates and times

 Amounts of money

 Events
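
A tagging sketch using spaCy's pre-trained pipeline (assumed installed, as above); it labels place names with entity types such as GPE (geopolitical entity):

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("San Pedro is a town on the island of Ambergris Caye in the "
          "Belize District of the nation of Belize, in Central America.")
for ent in doc.ents:
    print(ent.text, "-", ent.label_)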

 Step #8: Coreference Resolution:

San Pedro is a town on the southern part of the island of Ambergris Caye in the Belize District of the
nation of Belize, in Central America. According to 2015 mid-year estimates, the town has a
population of about 16, 444. It is the second-largest town in the Belize District and largest in the
Belize Rural South constituency.

Here, we know that ‘it’ in the third sentence stands for San Pedro, but for a computer it isn’t possible to understand that both tokens refer to the same thing, because it treats the sentences as two different things while processing them. Pronouns are used with high frequency in English, and it becomes difficult for a computer to understand that both mentions are the same.

How Machine Learning Is Used by Famous Companies?

Machine Learning is the technology of today! While some people claim that this technology could end the world, others believe it could make life even easier. And it is no surprise that almost all companies are using it to attract as many customers as possible by providing personalized customer experiences. In fact, there has been a 270% increase in the number of companies embracing ML over the last four years.

However, it is much easier for established tech companies with large resources to invest in and research Machine Learning and Artificial Intelligence. That is why this article focuses on the interesting ways in which ML is used by famous companies like Google, Facebook, Twitter, Baidu, and Pinterest. So let’s check out these companies and the various ways in which they use Machine Learning.

1. Google

Rather than wondering “Which Google applications use ML?”, it is better to ask “Do any Google applications not use ML?”. And the answer is most probably no!!! Google is heavily invested in Machine Learning research and plans to eventually integrate it fully into all its products. Even now, ML is used in all Google flagship products like Google Search, Google Translate, Google Photos, Google Assistant, etc.

Google Search uses RankBrain which is a deep neural network that helps in providing the required
search results. In case there are any unique words or phrases on Google Search (like “CEO of Apple”)
then RankBrain makes intelligent guesses to find out that your search probably means “Tim Cook”.
Google Translate, on the other hand, analyses millions of documents that are already translated
from one language to another and then looks for the common patterns and basic vocabulary of the
language.

Google Photos uses Image Recognition, wherein Deep Learning is used to sort millions of images on
the internet in order to classify them more accurately. Google Assistant also uses Image Recognition
and Natural Language Processing to appear as a multitalented assistant that can answer all your
questions!

2. Facebook

Facebook is the most popular social networking site in the world with 2.41 Billion Monthly Active
Users! If you want to check out your friends, follow celebrities or watch cat photos, Facebook is the
place to go! And this level of popularity is only possible with the help of Machine Learning. Facebook uses ML in everything, ranging from its News Feed to Targeted Advertising.
Facebook uses Facial Recognition to recognize your friends on social media and suggest their names. If you have “tag suggestions” or “face recognition” turned on in Facebook, then the Machine Learning system analyses the pixels of the face in the image and creates a template that is unique for every face. This facial fingerprint can then be used to detect the face again and suggest a tag.

And targeted advertising on Facebook is done using deep neural networks that analyze your age,
gender, location, page likes, interests, etc. to profile you into select categories and then show you
ads specifically targeted towards these categories. Facebook also uses Chatbots now that provide
you human-like customer support interactions. These chatbots use ML and NLP to interact with the
users and appear almost like humans.

3. Twitter

Twitter is the go-to place for interesting tweets and intelligent debates! Want to know about the current political climate, the dangers of global warming, or smart comments from your favorite celebrities? Then go to Twitter! And guess how all these tweets are managed? That’s right, using Machine Learning!

Twitter uses an ML algorithm to organize the tweets in your timeline. Tweets matching the type of content you like, as well as tweets posted by friends, family, and so on, are given higher priority and appear higher in your feed. Tweets that are quite popular, with lots of retweets or likes, also have a higher chance of visibility. You may also see some of these tweets in the “In case you missed it” section. Earlier, tweets were arranged in reverse chronological order, which remains popular with some people, who are demanding it back! Currently, Twitter is also using the Natural Language Processing capabilities of IBM Watson to track and delete abusive tweets.

Twitter also uses deep learning to identify what is going on in a live feed. This is done by training a neural network to recognize the images in videos using tags: suppose you put the tags “Dog”, “Animal”, “Pug” etc. in your video; the algorithm can learn that this is a dog and then use that to identify dogs in other videos.

4. Baidu

Baidu is Google for China! While that is not strictly true, Baidu is a Chinese search engine that is most commonly compared to Google. And just like Google, it uses Machine Learning in many of its applications, like Baidu Search, DuerOS (Baidu’s voice assistant), and the Xiaoyu Zaijia (Little Fish) home robot, which is similar to Alexa.

Now, the primary focus of Baidu is its search engine, as 75% of Chinese people use it. So Machine Learning algorithms are used for voice recognition and image recognition to provide the best possible (and smartest!) service. Baidu has also invested heavily in natural language processing, which is visible in DuerOS.

DuerOS, Baidu’s voice assistant, uses natural language processing along with voice and image recognition to create a smart system that can hold a full conversation with you while sounding human. This voice assistant uses ML to understand the complexities of human language and respond naturally. Another application of Baidu’s mastery of NLP is the Little Fish home robot, which is like Alexa but also different: it can turn its head to “listen” in the direction the voice is coming from and respond accordingly!
5. Pinterest

In case you want to pin the images, videos, and GIFs that interest you, Pinterest is the place for it!
And whether you are a regular pinner or just a novice, Pinterest’s immense popularity guarantees
you have heard its name. Now, since this application is dependent on saving images from the
internet, it stands to reason that its most important feature would be to identify images.

And that’s where Machine Learning comes in! Pinterest uses Image Recognition algorithms to identify the patterns in an image you have pinned, so that similar images are displayed when you search for them. Suppose you have pinned a green shirt: you will be able to view images of more green shirts thanks to Image Recognition. But Pinterest doesn’t guarantee that these green shirts are fashionable!!!

Another application of ML is that Pinterest provides more personalized recommendations based on your personal pinning history. This is different from the ML algorithms of other social networking applications, which also factor in your friends, age, gender, etc.
