Unit 1 Notes


Applications of Machine learning

Machine learning is one of today's fastest-growing technologies, and we use it in daily life, often without realizing it, in services such as Google Maps, Google Assistant, and Alexa. Below are some of the most prominent real-world applications of machine learning:

1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used to identify objects, persons, places, and other content in digital images. A popular use case of image recognition and face detection is the automatic friend-tagging suggestion: whenever we upload a photo with our Facebook friends, Facebook automatically suggests tags with their names, and the technology behind this is a machine-learning face detection and recognition algorithm.

It is based on Facebook's "DeepFace" project, which is responsible for face recognition and person identification in pictures.
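Facebook's DeepFace itself is not public, but the flavour of face detection can be illustrated with a small sketch. This is a minimal example, assuming the opencv-python package is installed and that "photo.jpg" is a placeholder image file; it uses OpenCV's bundled Haar cascade rather than any Facebook model.

    import cv2

    # Load OpenCV's pre-trained frontal-face Haar cascade
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    face_cascade = cv2.CascadeClassifier(cascade_path)

    image = cv2.imread("photo.jpg")                    # placeholder image file
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)     # detection works on grayscale

    # detectMultiScale returns one (x, y, w, h) bounding box per detected face
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    print("Detected", len(faces), "face(s)")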


2. Speech Recognition
Google's "Search by voice" option is an example of speech recognition, which is a popular application of machine learning.

Speech recognition is the process of converting voice instructions into text; it is also known as "speech to text" or "computer speech recognition." At present, machine learning algorithms are widely used in speech recognition applications. Google Assistant, Siri, Cortana, and Alexa all use speech recognition technology to follow voice instructions.
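A minimal speech-to-text sketch, assuming the third-party SpeechRecognition package is installed, an internet connection is available (it calls Google's free web speech API), and "command.wav" is a placeholder audio file:

    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile("command.wav") as source:      # placeholder recording
        audio = recognizer.record(source)            # read the whole file

    # recognize_google sends the audio to Google's web speech API and returns text
    text = recognizer.recognize_google(audio)
    print("You said:", text)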

3. Traffic prediction:
When we want to visit a new place, we take the help of Google Maps, which shows us the best path with the shortest route and predicts the traffic conditions.

It predicts traffic conditions, such as whether traffic is clear, slow-moving, or heavily congested, in two ways:

o Real-time location of vehicles from the Google Maps app and sensors
o Average time taken on past days at the same time of day

Everyone who uses Google Maps is helping to make the app better. It takes information from users and sends it back to its database to improve performance.

4. Product recommendations:
Machine learning is widely used by e-commerce and entertainment companies such as Amazon and Netflix for product recommendation. Whenever we search for a product on Amazon, we start seeing advertisements for the same product while browsing the web in the same browser, and this is because of machine learning. Google learns the user's interests using various machine learning algorithms and suggests products according to those interests.

Similarly, when we use Netflix, we see recommendations for series, movies, and other entertainment, and this is also done with the help of machine learning.
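A toy sketch of the item-to-item idea behind such recommenders: compute the similarity between item rating columns and suggest the item most similar to one the user liked. The ratings matrix and item names below are invented for illustration; real systems use far larger data and more sophisticated models.

    import numpy as np

    items = ["laptop", "mouse", "keyboard", "headphones"]
    ratings = np.array([
        [5, 4, 4, 1],   # user 1's ratings for the four items
        [4, 5, 5, 2],   # user 2
        [1, 2, 1, 5],   # user 3
    ])

    def cosine(a, b):
        # cosine similarity between two rating columns
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # similarity of every other item to "laptop" (column 0)
    sims = [cosine(ratings[:, 0], ratings[:, j]) for j in range(1, len(items))]
    best = int(np.argmax(sims)) + 1
    print("Users who liked a laptop may also like:", items[best])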

5. Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars, in which machine learning plays a significant role. Tesla, a well-known car manufacturer, is working on self-driving cars and uses machine-learning models trained to detect people and objects while driving.

6. Email Spam and Malware Filtering:


Whenever we receive a new email, it is automatically filtered as important, normal, or spam. Important mail arrives in our inbox marked with the important symbol, and spam emails land in the spam box; the technology behind this is machine learning. Below are some spam filters used by Gmail:

o Content filters
o Header filters
o General blacklist filters
o Rules-based filters
o Permission filters

Machine learning algorithms such as the Multi-Layer Perceptron, Decision Tree, and Naïve Bayes classifier are used for email spam filtering and malware detection.
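A minimal Naïve Bayes spam-filter sketch with scikit-learn; the tiny hand-written email dataset is purely illustrative.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    emails = [
        "win a free prize now", "cheap loans click here",        # spam
        "meeting at 10 am tomorrow", "project report attached",  # ham
    ]
    labels = [1, 1, 0, 0]   # 1 = spam, 0 = ham

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(emails)          # bag-of-words features
    model = MultinomialNB().fit(X, labels)        # Naïve Bayes classifier

    test = vectorizer.transform(["free prize inside"])
    print("spam" if model.predict(test)[0] == 1 else "ham")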

7. Virtual Personal Assistant:


We have various virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri. As the name suggests, they help us find information using voice instructions. These assistants can help us in many ways with nothing but voice commands, such as playing music, calling someone, opening an email, or scheduling an appointment.

Machine learning algorithms are an important part of these virtual assistants. They record our voice instructions, send them to a server in the cloud, decode them using ML algorithms, and act accordingly.
8. Online Fraud Detection:
Machine learning makes our online transactions safe and secure by detecting fraudulent transactions. Whenever we perform an online transaction, fraud can occur in various ways, such as fake accounts, fake IDs, or money being stolen in the middle of a transaction. To detect this, a feed-forward neural network helps us by checking whether a transaction is genuine or fraudulent.

For each genuine transaction, the output is converted into hash values, and these values become the input for the next round. Genuine transactions follow a specific pattern, and that pattern changes for a fraudulent transaction; the network detects the change and makes our online transactions more secure.
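A sketch of a small feed-forward network (multi-layer perceptron) flagging suspicious transactions with scikit-learn. The features (amount, hour of day, new-device flag) and the six example transactions are invented; a real fraud model would use many more features and examples.

    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # [amount, hour of day, is_new_device]
    X = [
        [25.0, 14, 0], [40.0, 10, 0], [12.5, 19, 0],    # genuine
        [900.0, 3, 1], [1500.0, 2, 1], [780.0, 4, 1],   # fraudulent
    ]
    y = [0, 0, 0, 1, 1, 1]

    # Scale the features, then train a small feed-forward network
    clf = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
    )
    clf.fit(X, y)

    print(clf.predict([[1100.0, 3, 1]]))   # expected output: [1], i.e. likely fraud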

9. Stock Market trading:


Machine learning is widely used in stock market trading. In the stock market, share prices are always at risk of going up and down, so machine learning's long short-term memory (LSTM) neural network is used to predict stock market trends.
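A minimal LSTM sketch for next-value prediction on a toy price series. It assumes TensorFlow/Keras is installed; the sine-wave data simply stands in for real price quotes, and the tiny model and five training epochs are for illustration only.

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    prices = np.sin(np.linspace(0, 20, 200))           # stand-in for a price curve
    window = 10
    X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
    y = prices[window:]
    X = X.reshape(-1, window, 1)                        # (samples, timesteps, features)

    model = Sequential([LSTM(16, input_shape=(window, 1)), Dense(1)])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=5, verbose=0)

    print(model.predict(X[-1:]).ravel())                # forecast for the next step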

10. Medical Diagnosis:


In medical science, machine learning is used for disease diagnosis. With it, medical technology is advancing very quickly and can build 3D models that predict the exact position of lesions in the brain.

This helps in finding brain tumors and other brain-related diseases more easily.

11. Automatic Language Translation:


Nowadays, visiting a place where we do not know the language is no longer a problem, because machine learning helps us by converting text into a language we know. Google's GNMT (Google Neural Machine Translation) provides this feature: it is a neural machine translation system that translates text into our familiar language, and this is called automatic translation.

The technology behind automatic translation is a sequence-to-sequence learning algorithm, which, combined with image recognition, translates text from one language to another.
ML | Types of Learning – Supervised Learning

What is Learning for a machine?

A machine is said to learn from past experience (the data fed in) with respect to some class of tasks if its performance on those tasks improves with experience. For example, assume that a machine has to predict whether a customer will buy a specific product, say an antivirus, this year or not. The machine will do this by looking at past experience, i.e., the data on the products the customer has bought every year; if the customer buys an antivirus every year, there is a high probability that they will buy one this year as well. This is how machine learning works at the basic conceptual level.

Supervised Learning : 
Supervised learning is when the model is trained on a labelled dataset. A labelled dataset is one that has both input and output parameters. In this type of learning, both the training and validation datasets are labelled. Two example labelled datasets, referred to below as Figure A and Figure B, are:
 Figure A: It is a dataset of a shopping store that is useful in predicting
whether a customer will purchase a particular product under consideration
or not based on his/ her gender, age, and salary. 
Input: Gender, Age, Salary 
Output: Purchased i.e. 0 or 1; 1 means yes the customer will purchase and
0 means that the customer won’t purchase it. 
 Figure B: It is a Meteorological dataset that serves the purpose of
predicting wind speed based on different parameters. 
Input: Dew Point, Temperature, Pressure, Relative Humidity, Wind
Direction 
Output: Wind Speed 
Training the system: 
While training the model, data is usually split in an 80:20 ratio, i.e., 80% as training data and the rest as testing data. For the training data, we feed in both the inputs and the outputs. The model learns from the training data only. We use different machine learning algorithms (discussed in detail in later topics) to build our model. By learning, we mean that the model builds some logic of its own.

Once the model is ready, it can be tested. At testing time, the input is taken from the remaining 20% of the data, which the model has never seen before; the model predicts some value, we compare it with the actual output, and we calculate the accuracy. A minimal sketch of this workflow follows below.
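A minimal sketch of the 80:20 split and accuracy check with scikit-learn; the built-in iris dataset and the decision tree classifier are used purely as stand-ins for the kind of labelled data described above.

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)

    # 80% for training, 20% held out for testing
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    model = DecisionTreeClassifier().fit(X_train, y_train)   # learns from the 80%
    preds = model.predict(X_test)                            # predictions on the unseen 20%
    print("accuracy:", accuracy_score(y_test, preds))
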
Types of Supervised Learning:  
1. Classification: A supervised learning task where the output has defined labels (discrete values). For example, in Figure A above, the output Purchased has the defined labels 0 or 1; 1 means the customer will purchase and 0 means the customer won't. The goal here is to predict discrete values belonging to a particular class and to evaluate the model on the basis of accuracy.
It can be either binary or multi-class classification. In binary classification, the model predicts either 0 or 1 (yes or no), while in multi-class classification, the model chooses among more than two classes.
Example: Gmail classifies mail into more than one class, such as social, promotions, updates, and forums.
2. Regression: A supervised learning task where the output has a continuous value.
For example, in Figure B above, the output Wind Speed does not have discrete values but is continuous within a particular range. The goal here is to predict a value as close to the actual output as the model can, and evaluation is done by calculating the error value. The smaller the error, the greater the accuracy of our regression model. A short sketch contrasting the two tasks is given below.
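A small sketch contrasting classification and regression with scikit-learn. The numbers loosely mirror the Figure A and Figure B ideas (salary is in thousands), but they are invented for illustration.

    from sklearn.linear_model import LogisticRegression, LinearRegression

    # Classification: Age, Salary (in thousands) -> Purchased (discrete label 0/1)
    X_cls = [[22, 20], [35, 60], [48, 90], [26, 30]]
    y_cls = [0, 1, 1, 0]
    clf = LogisticRegression().fit(X_cls, y_cls)
    print("Purchased?", clf.predict([[40, 75]]))                 # discrete output, e.g. [1]

    # Regression: Temperature, Pressure -> Wind Speed (continuous value)
    X_reg = [[20, 1010], [25, 1005], [30, 1000], [18, 1015]]
    y_reg = [12.0, 18.5, 25.0, 9.5]
    reg = LinearRegression().fit(X_reg, y_reg)
    print("Predicted wind speed:", reg.predict([[27, 1003]]))    # continuous output
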
Example of Supervised Learning Algorithms:  
 Linear Regression
 Nearest Neighbor
 Gaussian Naive Bayes
 Decision Trees
 Support Vector Machine (SVM)
 Random Forest

Advantages of Supervised learning:


o With the help of supervised learning, the model can predict outputs on the basis of prior experience.
o In supervised learning, we can have an exact idea about the classes of objects.
o Supervised learning models help us solve various real-world problems, such as fraud detection and spam filtering.

Disadvantages of supervised learning:


o Supervised learning models are not suitable for handling very complex tasks.
o Supervised learning cannot predict the correct output if the test data is very different from the training dataset.
o Training requires a lot of computation time.
o In supervised learning, we need enough knowledge about the classes of objects.
Unsupervised Machine Learning
https://www.coursera.org/lecture/machine-learning/unsupervised-learning-olRZo (video link)

In the previous topic, we learned about supervised machine learning, in which models are trained using labeled data. But there may be many cases in which we do not have labeled data and need to find the hidden patterns in a given dataset. To solve such cases in machine learning, we need unsupervised learning techniques.

What is Unsupervised Learning?


As the name suggests, unsupervised learning is a machine learning technique in which models are not supervised using a labelled training dataset. Instead, the models themselves find the hidden patterns and insights in the given data. It can be compared to the learning that takes place in the human brain when learning new things. It can be defined as:

Unsupervised learning is a type of machine learning in which models are trained using an unlabeled dataset and are allowed to act on that data without any supervision.

Unsupervised learning cannot be applied directly to a regression or classification problem because, unlike supervised learning, we have the input data but no corresponding output data. The goal of unsupervised learning is to find the underlying structure of the dataset, group the data according to similarities, and represent the dataset in a compressed format.

Example: Suppose an unsupervised learning algorithm is given an input dataset containing images of different types of cats and dogs. The algorithm has never been trained on the given dataset, which means it has no prior idea of the dataset's features. The task of the unsupervised learning algorithm is to identify the image features on its own. It will perform this task by clustering the image dataset into groups according to the similarities between images.

Why use Unsupervised Learning?
Below are some main reasons which describe the importance of Unsupervised
Learning:

o Unsupervised learning is helpful for finding useful insights in data.
o Unsupervised learning is much like how a human learns to think through their own experiences, which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data, which makes it all the more important.
o In the real world, we do not always have input data with corresponding outputs, so to solve such cases we need unsupervised learning.

Working of Unsupervised Learning


The working of unsupervised learning can be understood as follows: we take unlabelled input data, which means it is not categorized and no corresponding outputs are given. This unlabelled input data is fed to the machine learning model in order to train it. The model first interprets the raw data to find hidden patterns, and then a suitable algorithm such as k-means clustering or hierarchical clustering is applied.

Once a suitable algorithm is applied, it divides the data objects into groups according to the similarities and differences between the objects. A minimal clustering sketch follows below.
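A minimal k-means sketch on unlabelled 2-D points with scikit-learn. The two synthetic blobs stand in for real feature vectors (image features, transactions, etc.); no labels are given to the algorithm.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
    blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
    X = np.vstack([blob_a, blob_b])            # unlabelled input data

    # k-means groups the points into 2 clusters purely from their similarities
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    print(kmeans.cluster_centers_)             # roughly [0, 0] and [5, 5]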

Types of Unsupervised Learning Algorithm:


Unsupervised learning algorithms can be further categorized into two types of problems:

o Clustering: Clustering is a method of grouping objects into clusters such that objects with the most similarities remain in one group and have few or no similarities with the objects of another group. Cluster analysis finds the commonalities between data objects and categorizes them according to the presence or absence of those commonalities.
o Association: An association rule is an unsupervised learning method used for finding relationships between variables in a large database. It determines the sets of items that occur together in the dataset. Association rules make marketing strategy more effective; for example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A typical example of association rules is market basket analysis, sketched below.
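An association-rule sketch based on the Apriori algorithm. It assumes the third-party mlxtend package is installed; the four tiny shopping baskets are invented to echo the bread/butter/jam example.

    import pandas as pd
    from mlxtend.frequent_patterns import apriori, association_rules

    # Each row is one basket; True means the item was bought
    baskets = pd.DataFrame(
        [[1, 1, 0], [1, 1, 1], [1, 0, 0], [0, 1, 1]],
        columns=["bread", "butter", "jam"],
    ).astype(bool)

    frequent = apriori(baskets, min_support=0.5, use_colnames=True)   # frequent itemsets
    rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
    print(rules[["antecedents", "consequents", "support", "confidence"]])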

Unsupervised Learning algorithms:


Below is the list of some popular unsupervised learning algorithms:

o K-means clustering
o KNN (k-nearest neighbors)
o Hierarchical clustering
o Anomaly detection
o Neural Networks
o Principal Component Analysis
o Independent Component Analysis
o Apriori algorithm
o Singular Value Decomposition

Advantages of Unsupervised Learning


o Unsupervised learning is used for more complex tasks compared to supervised learning because, in unsupervised learning, we don't have labeled input data.
o Unsupervised learning is preferable because it is easier to obtain unlabeled data than labeled data.

Disadvantages of Unsupervised Learning


o Unsupervised learning is intrinsically more difficult than supervised learning because there are no corresponding outputs.
o The result of an unsupervised learning algorithm may be less accurate because the input data is not labeled and the algorithm does not know the exact output in advance.

Semi-supervised Learning:

In this type of learning, the algorithm is trained on a combination of labeled and unlabelled data. Typically, this combination contains a very small amount of labeled data and a very large amount of unlabelled data. The basic procedure is that the programmer first clusters similar data using an unsupervised learning algorithm and then uses the existing labeled data to label the rest of the unlabelled data. The typical use cases of such algorithms share a common property: acquiring unlabelled data is relatively cheap, while labeling that data is very expensive. A minimal self-training sketch is shown below.
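A minimal semi-supervised sketch using scikit-learn's SelfTrainingClassifier: only two points are labelled, the rest are marked -1 (unlabelled), and the model propagates labels on its own. The synthetic blobs are for illustration only.

    import numpy as np
    from sklearn.semi_supervised import SelfTrainingClassifier
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
    y = np.full(100, -1)            # -1 marks "unlabelled"
    y[0], y[50] = 0, 1              # only two labelled examples

    base = SVC(probability=True)                      # base learner needs predict_proba
    model = SelfTrainingClassifier(base).fit(X, y)    # labels the rest by itself
    print(model.predict([[0.2, -0.1], [4.8, 5.1]]))   # expected: [0 1]
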
Intuitively, one may picture the three types of learning as follows: supervised learning, where a student is under the supervision of a teacher at both home and school; unsupervised learning, where a student has to figure out a concept on their own; and semi-supervised learning, where a teacher teaches a few concepts in class and assigns homework questions based on similar concepts.
A Semi-Supervised algorithm assumes the following about the data
1. Continuity Assumption: The algorithm assumes that the points
which are closer to each other are more likely to have the same
output label.
2. Cluster Assumption: The data can be divided into discrete
clusters and points in the same cluster are more likely to share an
output label.
3. Manifold Assumption: The data lie approximately on a manifold of
much lower dimension than the input space. This assumption
allows the use of distances and densities which are defined on
a manifold.
Practical applications of Semi-Supervised Learning – 
1. Speech Analysis: Since labeling audio files is a very labour-intensive task, semi-supervised learning is a very natural approach to this problem.
2. Internet Content Classification: Labeling each webpage is an impractical and infeasible process, so semi-supervised learning algorithms are used. Even the Google search algorithm uses a variant of semi-supervised learning to rank the relevance of a webpage for a given query.
3. Protein Sequence Classification: Since DNA strands are typically very large, semi-supervised learning has become increasingly important in this field.
In 2016, Google launched a semi-supervised learning tool called Google Expander.

What is Reinforcement Learning?


o Reinforcement learning is a feedback-based machine learning technique in which an agent learns to behave in an environment by performing actions and seeing the results of those actions. For each good action, the agent gets positive feedback, and for each bad action, the agent gets negative feedback or a penalty.
o In reinforcement learning, the agent learns automatically from feedback, without any labeled data, unlike supervised learning.
o Since there is no labeled data, the agent is bound to learn from its experience only.
o RL solves a specific type of problem where decision making is sequential and the goal is long-term, such as game playing, robotics, etc.
o The agent interacts with the environment and explores it by itself. The primary goal of an agent in reinforcement learning is to improve its performance by getting the maximum positive reward.
o The agent learns through trial and error, and based on this experience, it learns to perform the task in a better way. Hence, we can say that "Reinforcement learning is a type of machine learning method where an intelligent agent (computer program) interacts with the environment and learns to act within it." How a robotic dog learns the movement of its legs is an example of reinforcement learning.
o It is a core part of artificial intelligence, and many AI agents work on the concept of reinforcement learning. Here we do not need to pre-program the agent, as it learns from its own experience without any human intervention.
o Example: Suppose there is an AI agent in a maze environment and its goal is to find the diamond. The agent interacts with the environment by performing actions; based on those actions, the state of the agent changes, and it also receives a reward or penalty as feedback.
o The agent keeps doing these three things (take an action, change state or remain in the same state, and get feedback), and by doing so it learns and explores the environment.
o The agent learns which actions lead to positive feedback or rewards and which actions lead to negative feedback or penalties. As a positive reward, the agent gets a positive point, and as a penalty, it gets a negative point.

Terms used in Reinforcement Learning


o Agent: An entity that can perceive/explore the environment and act upon it.
o Environment: The situation in which the agent is present or by which it is surrounded. In RL, we assume a stochastic environment, which means it is random in nature.
o Action: Actions are the moves taken by an agent within the environment.
o State: The situation returned by the environment after each action taken by the agent.
o Reward: Feedback returned to the agent from the environment to evaluate the agent's action.
o Policy: The strategy applied by the agent to decide the next action based on the current state.
o Value: The expected long-term return with a discount factor, as opposed to the short-term reward.
o Q-value: Similar to the value, but it takes an additional parameter, the current action (a).
Key Features of Reinforcement Learning
o In RL, the agent is not instructed about the environment or which actions to take.
o It is based on a trial-and-error process.
o The agent takes the next action and changes state according to the feedback from the previous action.
o The agent may get a delayed reward.
o The environment is stochastic, and the agent needs to explore it to collect the maximum positive reward.

Approaches to implement Reinforcement Learning
There are mainly three ways to implement reinforcement learning in ML:

1. Value-based:
The value-based approach tries to find the optimal value function, which is the maximum value of a state under any policy. The agent therefore expects the long-term return of any state s under policy π.
2. Policy-based:
The policy-based approach tries to find the optimal policy for the maximum future reward without using the value function. In this approach, the agent tries to apply a policy such that the action performed at each step helps to maximize the future reward.
The policy-based approach has two main types of policy:
o Deterministic: The same action is produced by the policy (π) in any given state.
o Stochastic: The produced action is determined by a probability distribution.
3. Model-based: In the model-based approach, a virtual model of the environment is created, and the agent explores that environment to learn it. There is no single solution or algorithm for this approach because the model representation is different for each environment.
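A tabular Q-learning sketch of the value-based approach: the agent lives in a 5-state corridor, starts in state 0, and receives a reward only on reaching state 4. The environment, rewards, and hyperparameters are invented for illustration.

    import numpy as np

    n_states, n_actions = 5, 2              # actions: 0 = move left, 1 = move right
    Q = np.zeros((n_states, n_actions))     # Q-value table
    alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration rate

    def step(state, action):
        nxt = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
        reward = 1.0 if nxt == n_states - 1 else 0.0
        return nxt, reward, nxt == n_states - 1

    for _ in range(500):                    # episodes of trial and error
        state, done = 0, False
        while not done:
            if np.random.rand() < epsilon:
                action = np.random.randint(n_actions)     # explore
            else:
                action = int(np.argmax(Q[state]))         # exploit current knowledge
            nxt, reward, done = step(state, action)
            # Q-learning update rule
            Q[state, action] += alpha * (reward + gamma * Q[nxt].max() - Q[state, action])
            state = nxt

    print(Q.argmax(axis=1))                 # learned policy: mostly 1 ("move right")
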
Train vs. Validation vs. Test set
To train and test our model, we should break our data down into three distinct dataset splits.

The Training Set


It is the set of data that is used to train and make the model learn the hidden
features/patterns in the data.
In each epoch, the same training data is fed to the neural
network repeatedly, and the model continues to learn the features of the
data.
The training set should have a diversified set of inputs so that the model is
trained in all scenarios and can predict any unseen data sample that may
appear in the future.

The Validation Set


The validation set is a set of data, separate from the training set, that is used
to validate our model performance during training.
This validation process gives information that helps us tune the model’s
hyperparameters and configurations accordingly. It is like a critic telling us
whether the training is moving in the right direction or not.
The model is trained on the training set, and, simultaneously, the model
evaluation is performed on the validation set after every epoch.
The main idea of splitting the dataset into a validation set is to prevent our
model from overfitting i.e., the model becomes really good at classifying the
samples in the training set but cannot generalize and make accurate
classifications on the data it has not seen before. 

The Test Set


The test set is a separate set of data used to test the model after completing
the training.
It provides an unbiased final model performance metric in terms of accuracy,
precision, etc. To put it simply, it answers the question of "How well does the
model perform?"

How to split your Machine Learning data?
The creation of different samples and splits in the dataset helps us judge the
true model performance. 
The dataset split ratio depends on the number of samples present in
the dataset and the model.
Some common inferences that can be derived on dataset split include:
 If there are several hyperparameters to tune, the machine learning
model requires a larger validation set to optimize the model
performance. Similarly, if the model has fewer or no
hyperparameters, it would be easy to validate the model using a
small set of data.
 If a model use case is such that a false prediction can drastically
hamper the model performance—like falsely predicting cancer—it’s
better to validate the model after each epoch to make the model
learn varied scenarios.
 With an increase in the dimensions/features of the data, the hyperparameters of the neural network also increase, making the model more complex. In these scenarios, a large portion of the data should be kept in the training set, along with a validation set.
The truth is that there is no optimal split percentage.
One has to arrive at a split percentage that suits the requirements and meets the model's needs.
However, there are two major concerns while deciding on the optimum split:
 If there is less training data, the machine learning model will show
high variance in training.
 With less testing data/validation data, your model evaluation/model
performance statistic will have greater variance.
Essentially, you need to come up with an optimum split that suits the need of
the dataset/model.
A rough standard split that you might encounter is something like 70% for training, 15% for validation, and 15% for testing; a sketch of producing such a split follows.
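A common way to produce a three-way split is to call scikit-learn's train_test_split twice. The 70/15/15 ratio and the placeholder data below are illustrative, not a rule.

    import numpy as np
    from sklearn.model_selection import train_test_split

    X = np.arange(1000).reshape(-1, 1)     # 1000 placeholder samples
    y = X.ravel() % 2                      # dummy labels

    # First carve out 30% for validation + test, then split that portion in half
    X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, random_state=0)
    X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0)

    print(len(X_train), len(X_val), len(X_test))   # 700 / 150 / 150
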
3 common pitfalls in the training data split
Finally, let's briefly discuss common mistakes that data scientists make when
building their models.

Low-quality training data


The quality of the training data is crucial for the model performance to
improve.
If the training data is “garbage,” one cannot expect the model to perform well.
Moreover, since the machine learning algorithms are sensitive to the training
data, even small variations/errors in the training set can lead to significant
errors in the model performance. 
Overfitting
Overfitting happens when the machine learning model memorizes the patterns in the training data to such an extent that it fails to classify unseen data.
The noise or fluctuations in the training data are treated as features and learned by the model. This leads to the model performing very well on the training set but poorly on the validation and test sets.

Overemphasis on Validation and Test Set metrics
The validation set metric is the one that guides the path of the model's training.
After each epoch, the machine learning model is evaluated on the validation set. Based on the validation set metrics, the corresponding loss terms are calculated and the hyperparameters are adjusted.
Metrics should be chosen so that they have a positive effect on the overall trajectory of the model's performance.

Descriptive and Predictive Analysis in Machine Learning

Descriptive analysis is used to understand the past, and predictive analysis is used to predict the future. Both of these concepts are important in machine learning because a clear understanding of the problem and its implications is the best way to make the right decisions.

Descriptive and predictive analysis are types of statistical analysis techniques, structured as a sequence of steps you need to take to gain comprehensive domain knowledge and solve complex business problems. These techniques give you a clear understanding of the business problem so that you can make the right decisions. Let's take a look at descriptive and predictive analytics in machine learning one by one.

Descriptive Analysis:
Before using a machine learning algorithm, it is very important to acquire a broad understanding of the problem. The goal of descriptive analysis is to reach an accurate understanding of the problem by asking questions of historical data. Let's understand the descriptive analysis process with an example. Suppose your task is to optimize the supply chain of a department store, and for this task we have purchase and sales data. After analyzing the data, we observe that sales increase on the day just before the weekend. This suggests that our machine learning model should account for periodicity. So, descriptive analysis helps us understand the deep patterns in the data and uncover special features that were overlooked at the initial stage.
In short, the purpose of descriptive analysis is to help us understand whether a machine learning model will perform poorly or whether it is a good fit for a particular problem.
Predictive Analysis:
Predictive analytics is an important concept in machine learning. Once we have formed a machine learning model based on descriptive analysis, the next goal is to infer its future behaviour given some initial conditions. Predictive analytics is used to discover and define the rules that underlie a process, so that a particular condition can be anticipated in time. For example, the object detector of a self-driving car can be extremely precise at detecting an obstacle in time, but another model must take the action that minimizes the risk of damage and maximizes the likelihood of safe movement.
Predictive analytics therefore means observing a problem over time and taking the most appropriate action to avoid any type of risk.

Reinforcement learning video link

https://www.youtube.com/watch?v=e3Jy2vShroE
