
Null 5

The document discusses machine learning, including its definition, how it works, types of machine learning algorithms, and supervised learning. Machine learning allows computers to learn from data without being explicitly programmed. There are four main types of machine learning: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Supervised learning uses labeled training data to predict output values or classify inputs.

Uploaded by

eshakalifathima
Copyright
© © All Rights Reserved

UNIT - V

Machine Learning:
• Machine Learning is a subset of artificial intelligence that is mainly
concerned with the development of algorithms which allow a computer to learn
from data and past experiences on its own.
• Machine Learning is the field of study that gives computers the capability to
learn without being explicitly programmed.
• A Machine Learning system learns from historical data, builds prediction
models, and, whenever it receives new data, predicts the output for it.

Basic Difference in ML and Traditional Programming

• Traditional Programming: We feed in DATA (input) + PROGRAM (logic), run it
on the machine, and get the output.

• Machine Learning: We feed in DATA (input) + OUTPUT, run it on the machine
during training, and the machine creates its own program (logic), which can be
evaluated during testing.
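
The contrast above can be sketched in a few lines of Python. This is only an illustration under assumed data: the temperatures, the "hot" labels, and the function names are hypothetical and do not come from the notes. The first function is a rule written by a programmer; the second derives a comparable rule from (input, output) examples.

```python
# Traditional programming: a human writes the rule (the "program").
def rule_based_is_hot(temp_c):
    return temp_c >= 30  # threshold chosen by the programmer


# Machine learning: the rule is derived from (input, output) examples.
def learn_threshold(temps, labels):
    """Pick the candidate threshold that misclassifies the fewest examples."""
    best_t, best_errors = None, len(temps) + 1
    for t in sorted(temps):
        errors = sum((x >= t) != y for x, y in zip(temps, labels))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


# Hypothetical training data: temperatures and whether people called them "hot".
temps = [18, 22, 25, 27, 29, 31, 33, 36]
labels = [False, False, False, False, True, True, True, True]

threshold = learn_threshold(temps, labels)  # the "program" created by training


def learned_is_hot(temp_c):
    return temp_c >= threshold
```

Calling `learned_is_hot` on temperatures that were not in the training data corresponds to the testing phase described above.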

How Does Machine Learning Work?

There are four key steps you would follow when creating a machine learning
model.

1. Choose and Prepare a Training Data Set

Training data is information that is representative of the data the machine
learning application will ingest to tune model parameters. Training data is
sometimes labeled, meaning it has been tagged to call out classifications or
expected values the machine learning model is required to predict. Other training
data may be unlabeled, so the model will have to extract features and assign
clusters autonomously.

Labeled data should be divided into a training subset and a testing subset.
The former is used to train the model and the latter to evaluate the effectiveness
of the model and find ways to improve it.
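
A minimal sketch of such a split, using only the Python standard library. The function name `train_test_split`, the 25% test fraction, and the toy data are assumptions made for illustration; libraries such as scikit-learn offer their own utilities for this.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and split it into (train, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labeled data: (input, expected output) pairs.
data = [(x, 2 * x + 1) for x in range(20)]
train, test = train_test_split(data)
```

Shuffling before splitting avoids a test subset that only contains, say, the largest inputs.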
2. Select an Algorithm to Apply to the Training Data Set

The type of machine learning algorithm you choose will primarily depend on a
few aspects:

• Whether the use case is prediction of a value or classification, which uses
labeled training data, or clustering or dimensionality reduction, which uses
unlabeled training data
• How much data is in the training set
• The nature of the problem the model seeks to solve

For prediction or classification use cases, you would usually use regression
algorithms such as ordinary least squares regression (to predict a value) or
logistic regression (to classify). With unlabeled data, you are likely to rely on
clustering algorithms such as k-means or hierarchical clustering. Some
algorithms, like neural networks, can be configured to work with both clustering
and prediction use cases.

3. Train the Algorithm to Build the Model

Training the algorithm is the process of tuning model variables and parameters to
more accurately predict the appropriate results. Training the machine learning
algorithm is usually iterative and uses a variety of optimization methods
depending upon the chosen model. These optimization methods run without
human intervention, which is part of the power of machine learning: the machine
learns from the data you give it with little to no specific direction from the user.
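
As one possible illustration of iterative training, the sketch below tunes the two parameters of a straight-line model with gradient descent on the mean squared error. Gradient descent is only one of many optimization methods, and the learning rate, epoch count, and data here are assumptions chosen for the example.

```python
def train_linear_model(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = w * x + b by repeatedly stepping against the error gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train_linear_model(xs, ys)
```

No rule for the slope or intercept is written by hand; the loop moves both parameters toward values that reduce the prediction error.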

4. Use and Improve the Model

The last step is to feed new data to the model as a means of improving its
effectiveness and accuracy over time. Where the new information will come from
depends on the nature of the problem to be solved. For instance, a machine
learning model for self-driving cars will ingest real-world information on road
conditions, objects and traffic laws.

Need for machine learning:

The need for machine learning is increasing day by day because it can perform
tasks that are too complex for a person to implement directly. As humans, we
have limitations: we cannot process huge amounts of data manually, so we need
computer systems, and this is where machine learning makes things easier for
us.
We can train machine learning algorithms by providing them with a huge
amount of data and letting them explore the data, construct models, and predict
the required output automatically. The performance of a machine learning
algorithm depends on the amount of data, and it can be measured by a cost
function. With the help of machine learning, we can save both time and money.
The importance of machine learning can be easily understood from its use
cases. Currently, machine learning is used in self-driving cars, cyber fraud
detection, face recognition, friend suggestions on Facebook, etc. Various
top companies such as Netflix and Amazon have built machine learning models
that use vast amounts of data to analyze user interest and recommend
products accordingly.

Importance of Machine Learning:

o Rapid growth in the production of data
o Solving complex problems, which are difficult for a human
o Decision making in various sectors, including finance
o Finding hidden patterns and extracting useful information from data.

Features of Machine Learning:

o Machine learning uses data to detect various patterns in a given dataset.
o It can learn from past data and improve automatically.
o It is a data-driven technology.
o Machine learning is very similar to data mining, as it also deals with
huge amounts of data.

Classification of Machine Learning


At a broad level, machine learning can be classified into four types:

1. Supervised learning
2. Semi-supervised learning
3. Unsupervised learning
4. Reinforcement learning

1) Supervised Learning
Supervised learning is a type of machine learning method in which we
provide sample labeled data to the machine learning system in order to train it,
and on that basis, it predicts the output.
The goal of supervised learning is to map input data to the output data.
Supervised learning is based on supervision, in the same way that a
student learns under the supervision of a teacher. Supervised learning can be
grouped further into two categories of algorithms:

o Classification
o Regression

2) Semi-supervised learning:

In semi-supervised learning, an incomplete training signal is given: a training
set with some (often many) of the target outputs missing. Semi-supervised
learning is an approach to machine learning that combines a small amount of
labeled data with a large amount of unlabeled data during training. It falls
between unsupervised learning and supervised learning.
3) Unsupervised Learning
Unsupervised learning is a learning method in which a machine learns
without any supervision.
The machine is trained on a set of data that has not been labeled,
classified, or categorized, and the algorithm needs to act on that data without
any supervision. The goal of unsupervised learning is to restructure the input
data into new features or groups of objects with similar patterns.
In unsupervised learning, we don't have a predetermined result. The
machine tries to find useful insights from the huge amount of data. It can be
further classified into two categories of algorithms:

o Clustering
o Association

4) Reinforcement Learning
Reinforcement learning is a feedback-based learning method in which a
learning agent gets a reward for each right action and a penalty for each
wrong action. The agent learns automatically from this feedback and improves
its performance. In reinforcement learning, the agent interacts with the
environment and explores it. The goal of the agent is to collect the most reward
points, and in pursuing that goal it improves its performance.
Supervised learning
Supervised learning, as the name indicates, involves a supervisor acting as a
teacher. Basically, supervised learning is when we teach or train the machine using
data that is well labelled, which means the data is already tagged with the correct
answer. After that, the machine is provided with a new set of examples (data), and
the supervised learning algorithm, having analysed the training data (set of training
examples), produces an outcome for the new examples based on the labelled data.

How Supervised Learning Works?


In supervised learning, models are trained using a labelled dataset, where the model
learns about each type of data. Once the training process is completed, the model is
tested on test data (a held-out subset of the labelled data that was not used for
training), and then it predicts the output.
The working of Supervised learning can be easily understood by the below example
and diagram:

Steps Involved in Supervised Learning:


o First, determine the type of training dataset.
o Collect/gather the labelled training data.
o Split the labelled dataset into a training dataset, a test dataset, and a
validation dataset.
o Determine the input features of the training dataset, which should carry enough
information for the model to predict the output accurately.
o Determine a suitable algorithm for the model, such as support vector
machine, decision tree, etc.
o Execute the algorithm on the training dataset. Sometimes we need a validation
set, held out from the training data, to tune control parameters.
o Evaluate the accuracy of the model on the test set. If the model
predicts the correct outputs, the model is accurate.
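
The steps above can be sketched end to end in plain Python. Everything specific here is an assumption for illustration: the synthetic "cat"/"dog" measurements, the 30% test fraction, and the nearest-centroid classifier, which stands in for the support vector machine or decision tree mentioned in the list because it fits in a few lines.

```python
import random


def split(examples, test_fraction=0.3, seed=1):
    """Shuffle the labelled examples and hold out a test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Training: compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Prediction: return the class whose centroid is closest to the input."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, examples):
    correct = sum(predict(centroids, f) == y for f, y in examples)
    return correct / len(examples)


# Hypothetical labelled data: (height_cm, weight_kg) -> "cat" or "dog".
rng = random.Random(0)
data = [((rng.gauss(25, 2), rng.gauss(4, 0.5)), "cat") for _ in range(30)]
data += [((rng.gauss(60, 5), rng.gauss(25, 3)), "dog") for _ in range(30)]

train, test = split(data)                # split the labelled data
model = fit_centroids(train)             # execute the algorithm on the training set
test_accuracy = accuracy(model, test)    # evaluate on the held-out test set
```

A validation set is omitted because this toy classifier has no control parameters to tune.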

Types of supervised Machine learning Algorithms:


Supervised learning can be further divided into two types of problems:

1. Regression

Regression algorithms are used when there is a relationship between the input
variables and a continuous output variable. They are used to predict continuous
quantities, for example in weather forecasting or market trend analysis. Below are
some popular regression algorithms which come under supervised learning:

o Linear Regression
o Regression Trees
o Non-Linear Regression
o Bayesian Linear Regression
o Polynomial Regression
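
For the first algorithm in the list, simple linear regression with one input variable, the best-fitting line can be computed directly with the least squares formulas. The sketch below assumes made-up sunshine and temperature data; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours of sunshine vs. afternoon temperature in degrees Celsius.
hours = [1, 2, 3, 4, 5]
temps = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(hours, temps)
predicted = slope * 6 + intercept  # continuous prediction for an unseen input
```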

2. Classification

Classification algorithms are used when the output variable is categorical, which
means it takes one of two or more classes, such as Yes-No, Male-Female,
True-False, etc. A typical example is spam filtering. Below are some popular
classification algorithms which come under supervised learning:

o Random Forest
o Decision Trees
o Logistic Regression
o Support Vector Machines
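
As a sketch of one algorithm from the list, the code below trains a one-feature logistic regression with gradient descent and uses it as a two-class classifier. The spam-word counts, the learning rate, and the epoch count are assumptions for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit p(y = 1 | x) = sigmoid(w * x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def classify(w, b, x):
    """Return class 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical data: suspicious words in an email -> spam (1) or not spam (0).
counts = [0, 1, 2, 3, 6, 7, 8, 9]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(counts, is_spam)
```

Unlike regression, the output of `classify` is a category rather than a continuous value.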
Advantages of Supervised learning:
o With the help of supervised learning, the model can predict the output on the
basis of prior experiences.
o In supervised learning, we can have an exact idea about the classes of objects.
o Supervised learning model helps us to solve various real-world problems such
as fraud detection, spam filtering, etc.

Disadvantages of supervised learning:


o Supervised learning models can struggle with very complex tasks.
o Supervised learning cannot reliably predict the correct output if the test data
differs substantially from the training dataset.
o Training requires a lot of computation time.
o In supervised learning, we need enough knowledge about the classes of objects.

Unsupervised learning
Unsupervised learning is the training of a machine using information that is neither
classified nor labeled and allowing the algorithm to act on that information without
guidance. Here the task of the machine is to group unsorted information according to
similarities, patterns, and differences without any prior training of data. Unsupervised
learning cannot be directly applied to a regression or classification problem because
unlike supervised learning, we have the input data but no corresponding output data.
The goal of unsupervised learning is to find the underlying structure of a dataset,
group the data according to similarities, and represent the dataset in a
compressed format.

Example: Suppose the unsupervised learning algorithm is given an input dataset
containing images of different types of cats and dogs. The algorithm is never trained
on labels for the given dataset, which means it has no prior idea about the features
of the dataset. The task of the unsupervised learning algorithm is to identify the
image features on its own. It performs this task by clustering the image dataset
into groups according to similarities between images.

Why use Unsupervised Learning?


Below are some main reasons which describe the importance of Unsupervised
Learning:

o Unsupervised learning is helpful for finding useful insights from the data.
o Unsupervised learning is similar to how a human learns to think from their own
experiences, which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data, which makes
it all the more important.
o In the real world, we do not always have input data with the corresponding output,
so to solve such cases we need unsupervised learning.

Working of Unsupervised Learning


Working of unsupervised learning can be understood by the below diagram:

Here, we have taken unlabeled input data, which means it is not categorized and
corresponding outputs are not given. This unlabeled input data is fed to the
machine learning model in order to train it. First, the model interprets the raw data
to find hidden patterns, and then it applies a suitable algorithm such as k-means
clustering, hierarchical clustering, etc.
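
A minimal k-means sketch in plain Python, to show what such an algorithm does with unlabeled points. The naive initialisation (the first k points), the fixed iteration count, and the sample points are assumptions made to keep the example short and deterministic; practical implementations usually initialise more carefully.

```python
def k_means(points, k, iterations=10):
    """Group points into k clusters by alternating assignment and update steps."""
    centroids = [list(p) for p in points[:k]]  # naive initialisation: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical unlabeled data: two visually separate groups of 2-D points.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
labels, centroids = k_means(points, k=2)
```

No output labels are supplied; the grouping emerges only from distances between the points.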

Types of Unsupervised Learning Algorithm:


The unsupervised learning algorithm can be further categorized into two types of
problems:
o Clustering: Clustering is a method of grouping objects into clusters such
that objects with the most similarities remain in one group and have few or no
similarities with the objects of another group. Cluster analysis finds the
commonalities between the data objects and categorizes them according to the
presence and absence of those commonalities.
o Association: Association rule learning is an unsupervised learning method
used for finding relationships between variables in a large database. It
determines the sets of items that occur together in the dataset. Association
rules make marketing strategy more effective. For example, people who buy item X
(say, bread) also tend to purchase item Y (butter or jam). A typical
example of association rule learning is Market Basket Analysis.
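
The bread and butter example can be made concrete with the two standard measures used in market basket analysis: support (how often an itemset occurs) and confidence (how often the rule holds when its left-hand side occurs). The baskets below are invented for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "milk"},
]

bread_support = support(baskets, {"bread"})                   # 4 of 5 baskets
rule_confidence = confidence(baskets, {"bread"}, {"butter"})  # 3 of the 4 bread baskets
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds instead of checking one rule at a time.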

Unsupervised Learning algorithms:


Below is the list of some popular unsupervised learning algorithms:

o K-means clustering
o KNN (k-nearest neighbors; more commonly used as a supervised method)
o Hierarchical clustering
o Anomaly detection
o Neural Networks
o Principal Component Analysis
o Independent Component Analysis
o Apriori algorithm
o Singular value decomposition

Applications of Unsupervised Learning


o Network Analysis: Unsupervised learning is used in document network analysis
of text data, for example to identify plagiarism and copyright issues in scholarly
articles.
o Recommendation Systems: Recommendation systems widely use
unsupervised learning techniques for building recommendation applications for
different web applications and e-commerce websites.
o Anomaly Detection: Anomaly detection is a popular application of
unsupervised learning, which can identify unusual data points within the
dataset. It is used to discover fraudulent transactions.
o Singular Value Decomposition: Singular Value Decomposition (SVD) factorizes a
data matrix into a small number of components and is used to extract particular
information from the database. For example, extracting information about the
users located at a particular location.
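
A very small sketch of the anomaly detection application listed above: values that lie far from the mean, measured in standard deviations, are flagged as unusual. The z-score rule, the threshold of 2.0, and the transaction amounts are assumptions for illustration; real fraud detection systems use far richer models.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical transaction amounts with one unusual payment.
amounts = [20, 22, 19, 21, 23, 20, 22, 500]
outliers = find_anomalies(amounts)
```

No transaction is labeled as fraudulent in advance; the unusual point is identified only from the distribution of the data.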

Advantages of Unsupervised Learning


o Unsupervised learning is used for more complex tasks as compared to
supervised learning because, in unsupervised learning, we don't have labeled
input data.
o Unsupervised learning is preferable as it is easy to get unlabeled data in
comparison to labeled data.

Disadvantages of Unsupervised Learning


o Unsupervised learning is intrinsically more difficult than supervised learning as
it does not have corresponding output.
o The result of the unsupervised learning algorithm might be less accurate as
input data is not labeled, and algorithms do not know the exact output in
advance.

Supervised vs. Unsupervised Machine Learning

Parameter | Supervised machine learning | Unsupervised machine learning
Input data | Algorithms are trained using labeled data. | Algorithms are used against data that is not labeled.
Computational complexity | Simpler method | Computationally complex
Accuracy | Highly accurate | Less accurate
No. of classes | No. of classes is known | No. of classes is not known
Data analysis | Uses offline analysis | Uses real-time analysis of data
Algorithms used | Linear and logistic regression, Random forest, Support Vector Machine, Neural Network, etc. | K-Means clustering, Hierarchical clustering, Apriori algorithm, etc.

Semi-supervised Learning
• Semi-supervised machine learning is a combination
of supervised and unsupervised learning.
• It uses a small amount of labeled data and a large amount of unlabeled data,
which provides the benefits of both unsupervised and supervised learning while
avoiding the challenges of finding a large amount of labeled data.
• That means you can train a model to label data without having to use as much
labeled training data.
• The basic disadvantage of supervised learning is that it requires hand-labeling
by ML specialists or data scientists, which is costly.
• Unsupervised learning, in turn, has a limited spectrum of applications.
• To overcome these drawbacks of supervised learning and unsupervised
learning algorithms, the concept of semi-supervised learning is
introduced.
• In this approach, training data is a combination of both labeled and unlabeled
data.
• However, the labeled data makes up only a very small part of it, while the
unlabeled data makes up a huge part.
• Initially, similar data is clustered with an unsupervised learning algorithm,
which then helps to turn the unlabeled data into labeled data.
• This is useful because labeled data is comparatively more expensive to acquire
than unlabeled data.
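
One simple way to picture this is label propagation: each unlabeled point takes the label of its closest already-labeled neighbour, starting with the closest pairs. This is a hedged sketch of the general idea and not a specific published algorithm; the function name, the one-dimensional points, and the "small"/"large" labels are hypothetical, and the points are assumed to be distinct.

```python
def propagate_labels(labeled, unlabeled):
    """Repeatedly give the unlabeled point nearest to any labeled point that point's label."""
    known = dict(labeled)        # point -> label, grows as pseudo-labels are added
    remaining = list(unlabeled)
    while remaining:
        # Find the (unlabeled, labeled) pair with the smallest distance.
        point, neighbour = min(
            ((u, k) for u in remaining for k in known),
            key=lambda pair: abs(pair[0] - pair[1]),
        )
        known[point] = known[neighbour]
        remaining.remove(point)
    return known


# Two labeled examples and six unlabeled ones (hypothetical 1-D measurements).
labeled = {1.0: "small", 10.0: "large"}
unlabeled = [2.0, 3.0, 4.5, 7.0, 8.5, 9.0]

result = propagate_labels(labeled, unlabeled)
```

Only two hand-labeled points were needed; the remaining labels follow from the structure of the unlabeled data.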

Real-world applications of Semi-supervised Learning-


Semi-supervised learning models are becoming more popular in the industries. Some
of the main applications are as follows.

o Speech Analysis- It is the most classic example of semi-supervised learning
applications. Since labeling audio data is a very laborious task that
requires many human resources, this problem can be naturally overcome by
applying a semi-supervised learning model.
o Web content classification- It is practically impossible to label each page
on the internet because doing so needs too much human intervention.
Still, this problem can be reduced through semi-supervised learning
algorithms.
Further, Google also uses semi-supervised learning algorithms to rank a
webpage for a given query.
o Protein sequence classification- Because DNA strands are large, labeling them
requires extensive human intervention. As a result, semi-supervised models
have become increasingly common in this field.
o Text document classifier- As we know, it is very hard to find a
large amount of labeled text data, so semi-supervised learning is an ideal model
to overcome this.

Reinforcement learning
• Reinforcement learning is an area of Machine Learning.
• It is about taking suitable action to maximize reward in a particular situation.
• It is employed by various software and machines to find the best possible
behaviour or path to take in a specific situation.
• Reinforcement learning differs from supervised learning: in supervised
learning the training data comes with the answer key, so the model is
trained with the correct answers, whereas in reinforcement learning there
is no answer key and the reinforcement agent decides what to do to perform the
given task.
• In the absence of a training dataset, the agent is bound to learn from its
experience.

Example: The problem is as follows: We have an agent and a reward, with many
hurdles in between. The agent is supposed to find the best possible path to reach the
reward.
The image for this example shows a robot, a diamond, and fire. The goal of the robot
is to get the reward, which is the diamond, and avoid the hurdles, which are the fire.
The robot learns by trying all the possible paths and then choosing the path which
gives it the reward with the fewest hurdles. Each right step gives the robot a reward
and each wrong step subtracts from the robot's reward. The total reward is calculated
when the robot reaches the final reward, the diamond.
Main points in Reinforcement learning –

• Input: The input should be an initial state from which the model will start
• Output: There are many possible outputs as there are a variety of solutions to
a particular problem
• Training: The training is based upon the input. The model will return a state, and
the user will decide to reward or punish the model based on its output.
• The model continues to learn.
• The best solution is decided based on the maximum reward.
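
The robot, diamond, and fire example can be sketched with tabular Q-learning, one common reinforcement learning algorithm (the notes do not name a specific one). The world is simplified here to a corridor of five cells with fire at one end and the diamond at the other; the reward values, learning rate, discount factor, and episode count are all assumptions for illustration.

```python
import random

N_STATES = 5            # positions 0..4 in a corridor
FIRE, DIAMOND = 0, 4    # terminal states
ACTIONS = (-1, +1)      # move left or move right


def step(state, action):
    """Environment: return (next_state, reward, done) for one move."""
    next_state = state + action
    if next_state == DIAMOND:
        return next_state, 10.0, True    # reward for reaching the diamond
    if next_state == FIRE:
        return next_state, -10.0, True   # penalty for stepping into the fire
    return next_state, -1.0, False       # small cost for every other step


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 2, False           # every episode starts in the middle cell
        while not done:
            action = rng.choice(ACTIONS)  # explore by trying random moves
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q = q_learning()
# The learned policy picks, in each non-terminal cell, the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in (1, 2, 3)}
```

In this sketch the environment, not a human user, hands out the rewards and penalties, and the best solution is the policy with the maximum expected reward.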

Types of Reinforcement: There are two types of Reinforcement:


1. Positive –
Positive reinforcement occurs when an event that follows a particular
behavior increases the strength and the frequency of that behavior. In other
words, it has a positive effect on behavior.
Advantages of positive reinforcement:
• Maximizes performance
• Sustains change for a long period of time
Disadvantage of positive reinforcement:
• Too much reinforcement can lead to an overload of states, which can
diminish the results
2. Negative –
Negative reinforcement is defined as the strengthening of a behavior because a
negative condition is stopped or avoided.
Advantages of negative reinforcement:
• Increases behavior
• Provides defiance to a minimum standard of performance
Disadvantage of negative reinforcement:
• It only provides enough to meet the minimum behavior

Real-world Use cases of Reinforcement Learning


o Video Games:

RL algorithms are very popular in gaming applications, where they are used to
achieve super-human performance. Well-known systems that use RL algorithms
are AlphaGo and AlphaGo Zero, which play the board game Go.

o Resource Management:

The "Resource Management with Deep Reinforcement Learning" paper
showed how to use RL to automatically learn to allocate and schedule computer
resources to waiting jobs in order to minimize average job slowdown.

o Robotics:
RL is widely used in robotics applications. Robots are used in
industrial and manufacturing settings, and these robots are made more powerful
with reinforcement learning. Different industries have their own vision
of building intelligent robots using AI and machine learning technology.
o Text Mining:
Text mining, one of the major applications of NLP, is now being implemented
with the help of reinforcement learning by Salesforce.
Advantages and Disadvantages of Reinforcement Learning
Advantages

o It helps in solving complex real-world problems which are difficult to solve
with general techniques.
o The learning model of RL is similar to the learning of human beings; hence
very accurate results can be found.
o Helps in achieving long-term results.

Disadvantages

o RL algorithms are not preferred for simple problems.
o RL algorithms require huge amounts of data and computation.
o Too much reinforcement learning can lead to an overload of states, which can
weaken the results.

Difference between Reinforcement learning and Supervised learning:

Reinforcement learning | Supervised learning
Reinforcement learning is all about making decisions sequentially. In simple words, the output depends on the state of the current input, and the next input depends on the output of the previous input. | In supervised learning, the decision is made on the initial input, or the input given at the start.
In reinforcement learning, decisions are dependent, so we give labels to sequences of dependent decisions. | In supervised learning, the decisions are independent of each other, so labels are given to each decision.
Example: Chess game | Example: Object recognition

Machine Learning Platforms


Machine Learning is one of the most popular technologies of the 21st century, with
capabilities such as text recognition, image recognition, training, tuning, etc. Some
of the best-known machine learning platforms and software are listed below; using
them, you can effectively deploy machine learning in your business.

o Amazon Sagemaker
o TIBCO Software
o Alteryx Analytics
o SAS
o H2O.ai
o DataRobot
o RapidMiner
