
UNIT 5: TYPES OF LEARNING

TYPES OF LEARNING

• Supervised Learning
• Unsupervised Learning
• Semi-supervised Learning
• Reinforcement Learning
SUPERVISED LEARNING
• Supervised learning uses labeled datasets to train algorithms to classify data or predict outcomes accurately.
• It relies on guidance and supervision.
• Example: an exit poll.
• Supervised learning involves training a machine
from labeled data.
• Labeled data consists of examples with the
correct answer or classification.
• The machine learns the relationship between
inputs (fruit images) and outputs (fruit labels).
• The trained machine can then make predictions
on new, unlabeled data.
SUPERVISED LEARNING ALGORITHMS
1. Linear Regression: Used for regression tasks; it models the relationship between a dependent variable and one or more independent variables.

2. Decision Tree: This algorithm partitions the dataset into smaller subsets based on features.

3. Random Forest: This method combines multiple decision trees to improve accuracy.

4. Naïve Bayes: This algorithm works well for text classification and spam filtering.

5. K-Nearest Neighbor: A simple classification algorithm that classifies data points based on the majority class among their K nearest neighbors in the feature space.
CATEGORIES/TYPES OF SUPERVISED
MACHINE LEARNING
REGRESSION
• Regression algorithms are used when there is a relationship between the input variable and the output variable.

• They are used for the prediction of continuous variables, such as weather forecasting, market trends, etc.

• Examples:
1. Linear Regression
2. Regression Trees
3. Non-Linear Regression
4. Bayesian Linear Regression
CLASSIFICATION
• Classification algorithms are used when the output variable is categorical.

• This means the output falls into discrete classes, such as Yes-No, Male-Female, True-False, etc.

• Examples:
1. Spam Filtering
2. Random Forest
3. Decision Trees
4. Logistic Regression
5. Support vector Machines
DIFFERENCE BETWEEN
CLASSIFICATION AND REGRESSION

Parameter | Classification | Regression
Basic | Mapping function is used for mapping values to predefined classes | Mapping function is used for mapping values to continuous output
Involves prediction of | Discrete values | Continuous values
Nature of the predicted data | Unordered | Ordered
Method of calculation | By measuring accuracy | By measuring root mean square error
Example algorithms | Decision tree, Logistic regression | Linear regression, Random forest
ADVANTAGES OF SUPERVISED
LEARNING:

• With the help of supervised learning, the model can predict the output on the basis of prior experience.

• In supervised learning, we can have an exact idea about the classes of objects.

• Supervised learning models help us solve various real-world problems such as fraud detection, spam filtering, etc.
DISADVANTAGES OF SUPERVISED LEARNING:

• Supervised learning models are not suitable for handling complex tasks.

• Supervised learning cannot predict the correct output if the test data is different from the training dataset.

• Training requires a lot of computation time.

• In supervised learning, we need enough knowledge about the classes of objects.
APPLICATIONS OF SUPERVISED
LEARNING:
• Fraud detection: Helps identify fraudulent
transactions in banking and finance
• Spam detection: Uses keywords and content
to identify spam emails
• Translation: Uses large amounts of digital
written material to create models that can
translate text from one language to another
• Image recognition: A computer identifies an
object in an image by looking for patterns that
match what it has seen before
• Product recommendations: A popular
feature on e-commerce websites
• Social media features: Facebook uses
machine learning to automatically suggest
friend tags by identifying faces in a user's photo
UNSUPERVISED LEARNING
• Unsupervised machine learning uses machine learning algorithms to analyze and cluster unlabeled datasets.

• These algorithms discover hidden patterns or data groupings without the need for human intervention.
TYPES OF UNSUPERVISED LEARNING
ALGORITHM:
CLUSTERING:
Clustering is a method of grouping objects into clusters such that objects with the most similarities remain in one group and have few or no similarities with the objects of another group.

It does this by finding similar patterns in the unlabeled dataset, such as shape, size, color, behavior, etc., and divides the data according to the presence or absence of those patterns.

It is an unsupervised learning method, hence no supervision is provided to the algorithm, and it deals with unlabeled datasets.

Example: grouping customers in a mall or supermarket.
ASSOCIATION
• An association rule is an unsupervised learning method used for finding relationships between variables in a large database.

• It determines the set of items that occur together in the dataset. Association rules make marketing strategy more effective.

• For example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A typical example of association rules is Market Basket Analysis.
APPLICATION OF ASSOCIATION
• Retail: For market basket analysis to understand customer buying habits and to drive sales through promotions and store layout optimizations.
• Healthcare: For identifying combinations of symptoms and diagnoses that frequently occur together, which can help in the diagnosis of new patients.
• Web Usage Mining: For analyzing patterns in web usage data to improve website design and personalized content delivery.
• Finance: For fraud detection by identifying unusual patterns of transactions.
UNSUPERVISED LEARNING
ALGORITHMS
1. K-Means Clustering: The K-means clustering algorithm is one of the most popular unsupervised machine learning algorithms, used for data segmentation. It works by partitioning a data set into k clusters, where each cluster has a mean computed from the training data.

2. Principal Component Analysis (PCA): The PCA algorithm is used for dimensionality reduction of datasets.

3. Convolutional Neural Networks (CNNs): They work by taking an input image and splitting it into small square tiles called "windows." Each window is then passed through a neuron in the first layer.
ADVANTAGES OF UNSUPERVISED
MACHINE LEARNING
• Uncovering hidden patterns and structures
in data without needing labeled examples.
• Ability to explore and discover insights from
large and complex datasets.
• Flexibility in handling diverse data types
and domains.
• Useful for exploratory data analysis and
feature engineering.
DISADVANTAGES OF UNSUPERVISED
MACHINE LEARNING
• Results may be unpredictable or difficult to understand.

• Difficult to measure accuracy or effectiveness due to lack of predefined answers during training.
APPLICATION OF UNSUPERVISED
LEARNING

• Recommendation systems: Unsupervised learning can identify patterns and similarities in user behavior and preferences to recommend products, movies, or music that align with their interests.
• Customer segmentation: Unsupervised
learning can identify groups of customers with
similar characteristics, allowing businesses to
target marketing campaigns and improve
customer service more effectively.
• Image analysis: Unsupervised learning can
group images based on their content, facilitating
tasks such as image classification, object
detection, and image retrieval.
DIFFERENCE BETWEEN SUPERVISED
AND UNSUPERVISED MACHINE
LEARNING

Parameters | Supervised machine learning | Unsupervised machine learning
Input Data | Algorithms are trained using labeled data | Algorithms are used against data that is not labeled
Computational Complexity | Simpler method | Computationally complex
Accuracy | Highly accurate | Less accurate
No. of classes | No. of classes is known | No. of classes is not known
Data Analysis | Uses offline analysis | Uses real-time analysis of data
Output | Desired output is given | Desired output is not given
DIFFERENCE
Parameters | Supervised machine learning | Unsupervised machine learning
Complex models | It is not possible to learn larger and more complex models with supervised learning | It is possible to learn larger and more complex models with unsupervised learning
Model | We can test our model | We cannot test our model
Also called | Supervised learning is also called classification | Unsupervised learning is also called clustering
Example | Optical character recognition | Finding a face in an image
Training data | Uses training data to infer the model | No training data is used
SEMI SUPERVISED MACHINE
LEARNING
• A semi-supervised learning approach uses small amounts of labeled data together with large amounts of unlabeled data.

• With semi-supervised learning, you train an initial model on a few labeled samples and then iteratively apply it to the larger unlabeled dataset.
STEPS:
• We train a model with labeled data.

• We use the trained model to predict labels for the unlabeled data, which creates pseudo-labeled data.

• We retrain the model with the pseudo-labeled and labeled data together.

• This process repeats iteratively as the model improves and is able to perform with a greater degree of accuracy.
MODEL EVALUATION
• Model evaluation is the process of using metrics to analyze the performance of a model.

• Model development is a multi-step process, and a check should be kept on how well the model generalizes to future predictions.

• Therefore, evaluating a model plays a vital role so that we can judge its performance.

• Evaluation also helps to analyze a model's key weaknesses.
Accuracy
Accuracy is defined as the ratio of the number of correct predictions to the total number of predictions. This is the most fundamental metric used to evaluate a model.

Precision and Recall
Precision is the ratio of true positives to the sum of true positives and false positives. It analyzes the model's positive predictions.
Recall measures how many of the actual positive samples in the dataset the model identified as positive (true positives).

F1 score
The F1 score is the harmonic mean of precision and recall. In the precision-recall trade-off, increasing precision tends to decrease recall and vice versa. The F1 score combines precision and recall into a single measure.
TRAINING AND TESTING
The training data is used to train the machine learning algorithm. Once you have trained your machine learning model on a dataset, you must test it on unseen data to evaluate its performance.

This unseen data is called the testing data.

This is similar to the test data used in software testing; only the context differs. In software testing, we use test data to ensure the software works well for the given inputs.

In machine learning, we use testing data to ensure the model works well on data it was not trained on.
NEED OF DATA SET SPLITTING.
The train-test split is a technique for evaluating the performance of a machine learning algorithm.

The procedure involves taking a dataset and dividing it into two subsets. The first subset is used to fit the model and is referred to as the training dataset.

The second subset is not used to train the model; instead, its input elements are provided to the model, predictions are made, and these are compared to the expected values. This second dataset is referred to as the test dataset.
Dataset Splitting:
scikit-learn (imported as sklearn) is one of the most useful and robust libraries for machine learning in Python. It provides the model_selection module, which includes the splitter function train_test_split().
OVERFITTING AND UNDERFITTING IN ML
Overfitting occurs when our machine learning model tries to cover all the data points, or more than the required data points, present in the given dataset.
Because of this, the model starts capturing noise and inaccurate values present in the dataset, and these factors reduce the efficiency and accuracy of the model.
UNDERFITTING
Underfitting occurs when our machine learning model is not able to capture the underlying trend of the data.
To avoid overfitting, feeding of training data can be stopped at an early stage, but then the model may not learn enough from the training data and may fail to find the best fit.

How to avoid underfitting:
• By increasing the training time of the model.
• By increasing the number of features.
PERFORMANCE METRICS IN ML
Confusion Matrix:
A table with two rows and two columns that reports the number of
true positives, false negatives, false positives, and true negatives.

• True Positive (TP): The model correctly predicted a positive outcome (the actual outcome was positive).
• True Negative (TN): The model correctly predicted a negative outcome (the actual outcome was negative).
• False Positive (FP): The model incorrectly predicted a positive outcome (the actual outcome was negative). Also known as a Type I error.
• False Negative (FN): The model incorrectly predicted a negative outcome (the actual outcome was positive). Also known as a Type II error.

This matrix is especially helpful in evaluating a model's performance beyond basic accuracy metrics.
METRICS BASED ON CONFUSION MATRIX
DATA

1. Accuracy
Accuracy is used to measure the performance of the model. It is the ratio of total correct instances to total instances.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

2. Precision
Precision is a measure of how accurate a model's positive predictions are. It is defined as the ratio of true positive predictions to the total number of positive predictions made by the model.

Precision = TP / (TP + FP)

3. Recall
Recall is the ratio of the number of true positive (TP) instances to the sum of true positive and false negative (FN) instances: out of all actual positive classes, how many did the model predict correctly?

Recall = TP / (TP + FN)

Recall should be as high as possible.

4. F1-Score
The F1-score is used to evaluate the overall performance of a classification model. It is the harmonic mean of precision and recall:

F1-Score = (2 × Precision × Recall) / (Precision + Recall)
AUC-ROC CURVE
The AUC-ROC curve, or Area Under the
Receiver Operating Characteristic curve, is a
graphical representation of the performance
of a binary classification model at various
classification thresholds.
It is commonly used in machine learning to
assess the ability of a model to distinguish
between two classes, typically the positive
class (e.g., presence of a disease) and the
negative class (e.g., absence of a disease).
RECEIVER OPERATING CHARACTERISTICS
(ROC) CURVE
ROC stands for Receiver Operating
Characteristics, and the ROC curve is the
graphical representation of the effectiveness of
the binary classification model. It plots the true
positive rate (TPR) vs the false positive rate
(FPR) at different classification thresholds.
Area Under the Curve (AUC):
AUC stands for Area Under the Curve; here it is the area under the ROC curve. It measures the overall performance of the binary classification model. Since both TPR and FPR range between 0 and 1, the area always lies between 0 and 1, and a greater AUC denotes better model performance.
LOG LOSS
Logarithmic Loss, commonly known as Log
Loss or Cross-Entropy Loss, is a crucial metric
in machine learning, particularly in
classification problems. It quantifies the
performance of a classification model by
measuring the difference between predicted
probabilities and actual outcomes.
CROSS VALIDATION
Cross validation is a technique used in
machine learning to evaluate the performance
of a model on unseen data. It involves
dividing the available data into multiple folds
or subsets, using one of these folds as a
validation set, and training the model on the
remaining folds. This process is repeated
multiple times, each time using a different
fold as the validation set.
Finally, the results from each validation step
are averaged to produce a more robust
estimate of the model’s performance.
The main purpose of cross validation is to
prevent overfitting, which occurs when a
model is trained too well on the training data
and performs poorly on new, unseen data.

Methods of Cross Validation

1. Validation
In this method, we divide the input dataset into a training set and a test (validation) set.

Both subsets are given 50% of the dataset.

A big disadvantage is that we use only 50% of the dataset to train the model, so it may miss important information in the data. It also tends to give an underfitted model.
2. LOOCV (Leave One Out Cross Validation)
In this method, we perform training on the whole dataset but leave out a single data point, iterating over each data point in turn.
In LOOCV, the model is trained on n−1 samples and tested on the one omitted sample, repeating this process for each data point in the dataset.

An advantage of this method is that we make use of all data points, hence it has low bias.

The major drawback is that it leads to higher variance in the testing estimate, since we test against a single data point each time. If that data point is an outlier, variance increases further.

Another drawback is that it takes a lot of execution time, as it iterates as many times as there are data points.
K FOLD
The k-fold cross-validation approach divides the input dataset into k groups of samples of equal size. These samples are called folds. For each learning set, the prediction function uses k-1 folds, and the remaining fold is used as the test set.

Let's take an example of 5-fold cross-validation. The dataset is grouped into 5 folds. On the 1st iteration, the first fold is reserved for testing the model, and the rest are used for training. On the 2nd iteration, the second fold is used to test the model, and the rest are used to train it. This process continues until each fold has been used as the test fold.
ADVANTAGES AND DISADVANTAGES OF
CROSS VALIDATION
Advantages:
Overcoming Overfitting: Cross validation helps to prevent overfitting
by providing a more robust estimate of the model’s performance on
unseen data.
Model Selection: Cross validation can be used to compare different
models and select the one that performs the best on average.

Data Efficient: Cross validation allows the use of all the available data
for both training and validation, making it a more data-efficient
method compared to traditional validation techniques.
Disadvantages:
Computationally Expensive: Cross validation can be computationally
expensive, especially when the number of folds is large or when the
model is complex and requires a long time to train.
Time-Consuming: Cross validation can be time-consuming, especially
when there are many hyperparameters to tune or when multiple
models need to be compared.
