Unit 2 Supervised Learning

The document provides an overview of machine learning types, including supervised, unsupervised, and semi-supervised learning, detailing their concepts, goals, and examples. It discusses specific case studies, such as digit recognition using the MNIST dataset and wine quality prediction using Principal Component Analysis (PCA). Additionally, it highlights the challenges of labeling data and the use of semi-supervised techniques like label propagation and active learning to enhance model training.

Machine learning can be broadly categorized into several types, primarily based on the kind of data the algorithms learn from and the type of task they perform.

Here's a breakdown of the main types:

1. Supervised Learning:
 Concept: The algorithm learns from labelled data, which means the data includes both the input features and the correct output (or target variable). It's like learning with a teacher who provides the answers.
 Goal: To learn a mapping function that can predict the output for new, unseen input data.
 Examples:
o Classification: Predicting categories (e.g., spam or not spam, cat or dog). Algorithms include Logistic Regression, Support Vector Machines, Decision Trees, Random Forests, Naive Bayes.
o Regression: Predicting continuous values (e.g., house prices, stock prices). Algorithms include Linear Regression, Polynomial Regression, Support Vector Regression, Decision Tree Regression, Random Forest Regression.
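Both task types follow the same fit/predict pattern in Scikit-learn. A minimal sketch on synthetic data (the data and model choices here are illustrative, not from the examples above):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))

# Classification: predict a category (which side of a line each point falls on)
y_class = (X[:, 0] + X[:, 1] > 0).astype(int)
clf = LogisticRegression().fit(X, y_class)

# Regression: predict a continuous value (a noisy linear function of the inputs)
y_reg = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)
reg = LinearRegression().fit(X, y_reg)

print(clf.predict(X[:2]))  # category labels (0 or 1)
print(reg.predict(X[:2]))  # continuous values
```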
2. Unsupervised Learning:
 Concept: The algorithm learns from unlabelled data, meaning the data only includes input features without corresponding outputs. It's like learning without a teacher, discovering patterns on its own.
 Goal: To find patterns, structures, or groupings in the data.
 Examples:
o Clustering: Grouping similar data points together (e.g., customer segmentation, document clustering). Algorithms include K-Means Clustering, Hierarchical Clustering, DBSCAN.
o Dimensionality Reduction: Reducing the number of features while preserving important information (e.g., feature extraction, data visualization). Algorithms include Principal Component Analysis (PCA), t-SNE.
o Association Rule Mining: Discovering relationships between variables (e.g., market basket analysis, recommendation systems). Algorithms include Apriori, FP-Growth.
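Clustering and dimensionality reduction share the same unsupervised interface: no target labels are passed to fit. A short sketch on synthetic data (the blob positions and cluster count are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Two well-separated groups of points in 5 dimensions, with no labels attached
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(8, 1, (50, 5))])

# Clustering: group similar points together
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Dimensionality reduction: project the 5 features down to 2
X2 = PCA(n_components=2).fit_transform(X)
print(labels[:5], X2.shape)
```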
3. Semi-Supervised Learning:
 Concept: A combination of supervised and unsupervised learning. The
algorithm learns from a dataset that contains both labelled and unlabelled
data.
 Goal: To leverage the limited labelled data to improve the learning from the
abundant unlabelled data.
 Examples:
o Image classification with a small number of labelled images and a large
number of unlabelled images.
o Natural language processing tasks where labelled data is scarce.
1. Supervised Learning:

CASE STUDY: DISCERNING DIGITS FROM IMAGES

These images aren’t unlike the Captcha checks many websites have in place to make sure
you’re not a computer trying to hack into the user accounts.

Our research goal is to let a computer recognize images of numbers (step one of the data science process).
The data we'll be working on is the MNIST data set, which is often used in the data science literature for teaching and benchmarking.
The MNIST images can be found in the datasets package of Scikit-learn and are already normalized for you (all scaled to the same size: 8×8 pixels), so we won't need much data preparation (step three of the data science process). But let's first fetch our data as step two of the data science process, with the following listing.
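That fetch step can be sketched as follows, assuming Scikit-learn's bundled digits subset:

```python
from sklearn import datasets

digits = datasets.load_digits()  # Scikit-learn's bundled MNIST-style subset
print(digits.images.shape)       # (1797, 8, 8): 1,797 images of 8x8 pixels
print(digits.target[:10])        # the label (actual digit) for each image
```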
Each entry in digits.images is a two-dimensional array (a matrix) reflecting the shape of the image; pl.matshow() displays it as a picture, but the classifier expects a flat list of features per image. To flatten it, we need to call reshape() on digits.images. The net result will be a one-dimensional array that looks something like this:
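The flattening step can be sketched like this (assuming the digits object loaded from Scikit-learn):

```python
from sklearn import datasets

digits = datasets.load_digits()
n = len(digits.images)
# Each image is an 8x8 matrix; the classifier wants one row of 64 values per image
data = digits.images.reshape((n, -1))
print(digits.images[0].shape, "->", data[0].shape)  # (8, 8) -> (64,)
```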
From this point on, it’s a standard classification problem, which brings us to step five of the data science process:
model building.

Now that we have a way to pass the contents of an image into the classifier, we need to pass it a training data set so it
can start learning how to predict the numbers in the images.

We mentioned earlier that Scikit-learn contains a subset of the MNIST database (about 1,800 images), so we'll use that. Each image is also labeled with the number it actually shows. Training on these labeled images builds a probabilistic model in memory of the most likely digit shown in an image given its grayscale values.

Once the program has gone through the training set and built the model, we can then pass it the test set of data to
see how well it has learned to interpret the images using the model.
The following listing shows how to implement these steps in code.
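A sketch of those steps, using a Gaussian Naive Bayes classifier (the split ratio and random seed are assumptions, not taken from the original listing):

```python
from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))  # flatten 8x8 -> 64 features
y = digits.target

# Hold out a test set so the model is judged on images it never saw
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB().fit(X_train, y_train)  # build the probabilistic model
y_pred = model.predict(X_test)

cm = confusion_matrix(y_test, y_pred)  # rows: actual digit, columns: predicted digit
print(cm)
```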
The end result of this code is called a confusion matrix, such as the one shown in figure 3.6. Returned as a two-
dimensional array, it shows how often the number predicted was the correct number on the main diagonal and also in the
matrix entry (i,j), where j was predicted but the image showed i.

Looking at figure 3.6, we can see that the model predicted the number 2 correctly 17 times (at coordinates 3,3), but also that it predicted the number 8 fifteen times when the image actually showed a 2 (at 9,3).
From the confusion matrix, we can deduce that for most images the predictions are quite accurate. In a good
model you’d expect the sum of the numbers on the main diagonal of the matrix (also known as the matrix trace)
to be very high compared to the sum of all matrix entries, indicating that the predictions were correct for the
most part.

Let’s assume we want to show off our results in a more easily understandable way or we want to inspect several of
the images and the predictions our program has made: we can use the following code to display one next to the
other. Then we can see where the program has gone wrong and needs a little more training. If we’re satisfied with
the results, the model building ends here and we arrive at step six: presenting the results.
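Displaying a handful of test images next to the model's guesses can be sketched as follows (the layout details and output file name are assumptions):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so this also runs without a display
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test, img_train, img_test = train_test_split(
    X, digits.target, digits.images, random_state=0)

y_pred = GaussianNB().fit(X_train, y_train).predict(X_test)

# Show the first six test images with the predicted digit above each one
fig, axes = plt.subplots(1, 6, figsize=(9, 2))
for ax, image, pred in zip(axes, img_test, y_pred):
    ax.matshow(image, cmap=plt.cm.gray_r)
    ax.set_title(f"pred: {pred}")
    ax.axis("off")
fig.savefig("predictions.png")
```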
Figure 3.7 shows how all predictions seem to be correct except for the digit number 2, which the model labels as 8.
We should forgive this mistake, as this 2 does share visual similarities with an 8.
The bottom left number is ambiguous, even to humans; is it a 5 or a 3? It’s debatable, but the algorithm
thinks it’s a 3. By discerning which images were misinterpreted, we can train the model further by labeling
them with the correct number they display and feeding them back into the model as a new training set
(step 5 of the data science process). This will make the model more accurate, so the cycle of learn, predict,
correct continues and the predictions become more accurate. This is a controlled data set we’re using for
the example. All the examples are the same size and they are all in 16 shades of gray.

In this supervised learning example, it’s apparent that without the labels associated with each image telling the
program what number that image shows, a model cannot be built and predictions cannot be made.
Unsupervised Learning

CASE STUDY: FINDING LATENT VARIABLES IN A WINE QUALITY DATA SET

In this short case study, you’ll use a technique known as Principal Component Analysis (PCA) to find latent variables in a
data set that describes the quality of wine.
Then you’ll compare how well a set of latent variables works in predicting the quality of wine against the original
observable set.
Part one of the data science process is to set our research goal: We want to explain the subjective “wine quality”
feedback using the different wine properties.
Our first job then is to download the data set (step two: acquiring data), as shown in the following listing, and
prepare it for analysis (step three: data preparation).
Then we can run the PCA algorithm and view the results to look at our options.
Because PCA is an explorative technique, we now arrive at step four of the data science process:
Data exploration, as shown in the following listing.
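The wine-quality CSV itself isn't reproduced here, so the following sketch uses Scikit-learn's bundled wine data (13 physicochemical features) as a stand-in; the same calls apply to the UCI wine-quality file. Standardizing first matters because PCA is sensitive to feature scales:

```python
from sklearn.datasets import load_wine  # stand-in for the UCI wine-quality data
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(load_wine().data)  # put features on one scale

pca = PCA().fit(X)
# explained_variance_ratio_ is what the elbow plot draws: the share of total
# information each successive latent variable captures
for i, ratio in enumerate(pca.explained_variance_ratio_):
    print(i, round(ratio, 3))
```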
The plot generated from the wine data set is shown in figure 3.8. What you hope to see is an elbow or hockey stick
shape in the plot. This indicates that a few variables can represent the majority of the information in the data set while
the rest only add a little more.
In our plot, PCA tells us that reducing the set down to one variable can capture approximately 28% of the total
information in the set (the plot is zero-based, so variable one is at position zero on the x axis), two variables will capture
approximately 17% more or 45% total, and so on. Table 3.3 shows you the full read-out.

An elbow shape in the plot suggests that five variables can hold most of the information found inside the data. You could argue for a cut-off at six or seven variables instead, but we'll opt for the simpler data set and accept that it retains slightly less of the original variance.
At this point, we could go ahead and see if the original data set recoded with five latent variables is good enough to
predict the quality of the wine accurately, but before we do, we’ll see how we might identify what they represent.
INTERPRETING THE NEW VARIABLES
With the initial decision made to reduce the data set from 11 original variables to 5 latent variables, we can check to
see whether it’s possible to interpret or name them based on their relationships with the originals.
Actual names are easier to work with than codes such as lv1, lv2, and so on. We can add the line of code in the
following listing to generate a table that shows how the two sets of variables correlate.
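That correlation table can be sketched as follows (again using the bundled wine data as a stand-in; the lv1, lv2, … names are illustrative):

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

wine = load_wine()
X = StandardScaler().fit_transform(wine.data)

n_latent = 5
scores = PCA(n_components=n_latent).fit_transform(X)  # the latent variables
lv = pd.DataFrame(scores, columns=[f"lv{i + 1}" for i in range(n_latent)])
orig = pd.DataFrame(X, columns=wine.feature_names)

# One row per original variable, one column per latent variable
corr = pd.concat([orig, lv], axis=1).corr().loc[wine.feature_names, lv.columns]
print(corr.round(2))
```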
COMPARING THE ACCURACY OF THE ORIGINAL DATA SET WITH LATENT VARIABLES
Now that we’ve decided our data set should be recoded into 5 latent variables rather than the 11 originals, it’s time to
see how well the new data set works for predicting the quality of wine when compared to the original. We’ll use the
Naïve Bayes Classifier algorithm we saw in the previous example for supervised learning to help.
Let’s start by seeing how well the original 11 variables could predict the wine quality scores.
The following listing presents the code to do this.
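A sketch of that baseline, using the bundled wine data as a stand-in for the wine-quality set (its class label stands in for the quality score):

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

wine = load_wine()  # stand-in: the UCI set has 11 variables plus a quality score
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, random_state=0)

model = GaussianNB().fit(X_train, y_train)  # train on all original variables
baseline = model.score(X_test, y_test)      # fraction of test wines predicted correctly
print(f"accuracy with all original variables: {baseline:.2f}")
```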
Now we’ll run the same prediction test, but starting with only 1 latent variable instead of the original 11. Then we’ll
add another, see how it did, add another, and so on to see how the predictive performance improves. The following
listing shows how this is done.
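The add-one-variable-at-a-time loop can be sketched like this (same stand-in data; the accuracies collected in scores are what a plot like figure 3.9 would draw):

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler

wine = load_wine()
X = StandardScaler().fit_transform(wine.data)
X_train, X_test, y_train, y_test = train_test_split(X, wine.target, random_state=0)

scores = []
for k in range(1, X.shape[1] + 1):
    pca = PCA(n_components=k).fit(X_train)  # fit PCA on training data only
    model = GaussianNB().fit(pca.transform(X_train), y_train)
    scores.append(model.score(pca.transform(X_test), y_test))
    print(k, round(scores[-1], 3))
```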
The resulting plot is shown in figure 3.9.
The plot in figure 3.9 shows that with only 3 latent variables, the classifier does a better job of predicting
wine quality than with the original 11. Also, adding more latent variables beyond 5 doesn’t add as much
predictive power as the first 5.
This shows our choice of cutting off at 5 variables was a good one, as we’d hoped.
We looked at how to group similar variables, but it’s also possible to group observations.
Semi-supervised learning

It shouldn’t surprise you to learn that while we’d like all our data to be labeled so we can use the more powerful
supervised machine learning techniques, in reality we often start with only minimally labeled data, if it’s labeled at
all. We can use our unsupervised machine learning techniques to analyze what we have and perhaps add labels to
the data set, but it will be prohibitively costly to label it all.

Our goal then is to train our predictor models with as little labeled data as possible. This is where semi-supervised
learning techniques come in—hybrids of the two approaches we’ve already seen.
Take for example the plot in figure 3.12. In this case, the data has only two labeled observations; normally this is too
few to make valid predictions.
A common semi-supervised learning technique is label propagation. In this technique, you start with a labeled data set and give the same label to similar data points. This is similar to running a clustering algorithm over the data set and labeling each cluster based on the labels it contains. If we were to apply this approach to the data set in figure 3.12, we might end up with something like figure 3.13.

One special approach to semi-supervised learning worth mentioning here is active learning. In active learning, the program points out the observations it wants to see labeled for its next round of learning, based on criteria you have specified. For example, you might have it ask for labels on the observations the algorithm is least certain about, or you might use multiple models to make a prediction and select the points where the models disagree the most.
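Label propagation is available directly in Scikit-learn, where unlabeled points are marked with -1. A minimal sketch on two synthetic clusters with one labeled observation each (the cluster positions are illustrative, mimicking figure 3.12):

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

rng = np.random.default_rng(0)
# Two clusters of points, as in figure 3.12
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(4, 0.5, (50, 2))])

# Only one labeled observation per cluster; -1 means "unlabeled"
y = np.full(100, -1)
y[0], y[50] = 0, 1

model = LabelPropagation().fit(X, y)  # spread labels to nearby, similar points
print(model.transduction_[:3], model.transduction_[50:53])
```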
