Supervised Learning (Classification and Regression)


Unit 1:

Supervised Learning (Regression/Classification)

Basic methods: Distance-based methods, Nearest-Neighbours, Decision Trees, Naive Bayes

Linear models: Linear Regression, Logistic Regression, Generalized Linear Models

Support Vector Machines, Nonlinearity and Kernel Methods

Beyond Binary Classification: Multi-class/Structured Outputs, Ranking

----------------------------------------------------------------------------------------------------------------------

Traditional programming (algorithm): Data + Program -> Computer -> Output

Machine Learning: Data + Output -> Computer -> Program
Tom Mitchell, in his book Machine Learning, provides a definition:

A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with
experience E.

Experience E - the data

Task T - prediction / classification / acting

Performance measure P - the measure of improvement

For example, in spam filtering, E is a collection of emails labelled as spam or not spam, T is classifying new emails, and P is the fraction of emails classified correctly.

Learning System:
We feed experience (data) and some background knowledge to the learner.

The learner then creates a model.

A reasoner takes a new problem and uses the model to solve it.

Steps for creating a learner

1. Choose the training experience -> features
2. Choose the target function (the function that needs to be learned)
3. Choose how to represent the target function (an appropriate class of functions for the given features)
4. Choose a learning algorithm to infer the target function

- If we choose a rich representation, we can express more complex functions, but they may be more difficult to learn.
- The class of functions is called the hypothesis language.
Supervised Learning

Discrete output - whether or not it will rain tomorrow; predicting from the symptoms whether a patient has a particular disease.

Continuous output - the price of a house based on various factors.
Training data: each instance has input features X1 ... Xn and an output label Y.

Instance   X1  X2  X3  ...  Xn  |  Output Y
I1         a1  a2  a3  ...  an  |  Y1
I2         b1  b2  b3  ...  bn  |  Y2
I3         c1  c2  c3  ...  cn  |  Y3
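In code, a training set like the one above is simply a feature matrix X (one row per instance, one column per feature) and a label vector Y. A minimal sketch with made-up values, assuming scikit-learn is installed:

```python
# Sketch: a training set as a feature matrix and a label vector.
# The numbers and class names are made up; assumes scikit-learn is installed.
from sklearn.tree import DecisionTreeClassifier

X = [                  # rows = instances I1..I3, columns = features X1..X3
    [5.1, 3.5, 1.4],   # I1
    [6.2, 2.9, 4.3],   # I2
    [5.9, 3.0, 5.1],   # I3
]
Y = ["class_a", "class_b", "class_b"]        # one output label per instance

model = DecisionTreeClassifier().fit(X, Y)   # learn the mapping X -> Y
print(model.predict([[6.0, 3.1, 4.5]]))      # predict the label of a new instance
```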
Example (labelling words in text): "Microsoft Corporation acquired XYZ"

- Microsoft - first word of a company name
- Corporation - later word of the same company name
- acquired - outside any company name (not part of a name)
- XYZ - a company name
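This is a structured-output (sequence labelling) task: every word in the sentence gets its own tag, and the whole tag sequence is the prediction. A minimal sketch of the desired labelling; the tag names below (FIRST/LATER/OUTSIDE) are illustrative, not from any particular library:

```python
# Sketch: sequence labelling for company-name recognition.
# Tag names are illustrative only (FIRST = first word of a company name,
# LATER = later word of the same name, OUTSIDE = not part of any name).
sentence = ["Microsoft", "Corporation", "acquired", "XYZ"]
labels   = ["FIRST",     "LATER",       "OUTSIDE",  "FIRST"]

for word, tag in zip(sentence, labels):
    print(f"{word:12s} -> {tag}")
```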


In supervised learning, the goal is to learn the mapping (the rules) between a set of inputs and outputs.

For example, the inputs could be the weather forecast, and the outputs would be the number of visitors to the beach. The goal in supervised learning would be to learn the mapping that describes the relationship between temperature and the number of beach visitors.

Labelled examples of past input-output pairs are provided during the learning process to teach the model how it should behave; hence, 'supervised' learning. For the beach example, new forecast temperatures can then be fed in, and the machine learning algorithm will output a prediction of the future number of visitors.

Being able to adapt to new inputs and make predictions is the crucial generalisation part of machine learning. In training, we want to maximise generalisation, so that the supervised model captures the real 'general' underlying relationship. If the model is over-trained, we cause over-fitting to the specific examples used, and the model would be unable to adapt to new, previously unseen inputs.
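As a rough illustration of over-fitting, the sketch below (hypothetical data; assumes numpy and scikit-learn are installed) fits the same noisy points with a simple model and with a very flexible high-degree polynomial, then evaluates both on unseen data:

```python
# Sketch: over-fitting vs generalisation on made-up data.
# Assumes numpy and scikit-learn are installed.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

def make_data(n):
    x = np.sort(rng.uniform(0, 1, size=(n, 1)), axis=0)
    y = np.sin(2 * np.pi * x).ravel() + rng.normal(0, 0.2, size=n)  # noisy target
    return x, y

X_train, y_train = make_data(20)   # small training set
X_test, y_test = make_data(200)    # unseen data

simple = make_pipeline(PolynomialFeatures(degree=3), LinearRegression()).fit(X_train, y_train)
flexible = make_pipeline(PolynomialFeatures(degree=15), LinearRegression()).fit(X_train, y_train)

for name, model in [("degree-3", simple), ("degree-15", flexible)]:
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name:10s} train MSE: {train_err:.3f}   test MSE: {test_err:.3f}")
# The over-trained degree-15 model fits the training points more closely but
# typically generalises worse to the unseen test data.
```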

A side effect to be aware of in supervised learning is that the supervision we provide introduces bias into the learning. The model can only imitate what it was shown, so it is very important to show it reliable, unbiased examples. Also, supervised learning usually requires a lot of data before it learns well. Obtaining enough reliably labelled data is often the hardest and most expensive part of using supervised learning. (Hence why data has been called the new oil!)

The output from a supervised Machine Learning model could be a category from a finite set, e.g. [low, medium, high], for the number of visitors to the beach:

Input [temperature=20] -> Model -> Output = [visitors=high]


When this is the case, it’s deciding how to classify the input, and so is known as classification.

Alternatively, the output could be a real-world scalar (output a number):

Input [temperature=20] -> Model -> Output = [visitors=300]

When this is the case, it is known as regression.
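The same input can feed either kind of model. A minimal sketch of this distinction, using a tiny made-up history of (temperature, visitors) pairs and assuming scikit-learn is installed:

```python
# Sketch: the same input handled by a classifier and by a regressor.
# The data are made up; assumes scikit-learn is installed.
from sklearn.linear_model import LogisticRegression, LinearRegression

temps = [[5], [12], [18], [24], [30]]                       # input: forecast temperature
visitor_counts = [40, 90, 180, 300, 420]                    # regression target: a number
visitor_levels = ["low", "low", "medium", "high", "high"]   # classification target: a category

clf = LogisticRegression().fit(temps, visitor_levels)   # classification
reg = LinearRegression().fit(temps, visitor_counts)     # regression

print(clf.predict([[20]]))   # outputs a category, e.g. 'medium' or 'high'
print(reg.predict([[20]]))   # outputs a number (a predicted visitor count)
```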

Classification
Classification is used to group similar data points into distinct classes. Machine learning is used to find the rules that explain how to separate the different data points.

But how are these rules created? There are multiple ways to discover them; many approaches focus on using the data and their labels (the answers) to discover rules that linearly separate the data points.

Linear separability is a key concept in machine learning. All that linear separability means is 'can the different data points be separated by a line?'. So, put simply, many classification approaches try to find the best way to separate data points with a line (or, in higher dimensions, a hyperplane).

The lines drawn between classes are known as the decision boundaries. The entire area that is chosen
to define a class is known as the decision surface. The decision surface defines that if a data point falls
within its boundaries, it will be assigned a certain class.
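A minimal sketch of learning a linear decision boundary, using logistic regression on two made-up, linearly separable clusters (assumes numpy and scikit-learn are installed):

```python
# Sketch: learning a linear decision boundary between two classes.
# The two clusters are made up; assumes numpy and scikit-learn are installed.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
class0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))   # cluster around (0, 0)
class1 = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(50, 2))   # cluster around (3, 3)
X = np.vstack([class0, class1])
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression().fit(X, y)

# The learned decision boundary is the line w1*x1 + w2*x2 + b = 0.
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]
print(f"decision boundary: {w1:.2f}*x1 + {w2:.2f}*x2 + {b:.2f} = 0")
print(clf.predict([[0.2, 0.1], [2.8, 3.2]]))   # points on either side -> [0 1]
```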

Machine learning for classification

- Logistic Regression (LR)
- K-Nearest Neighbours (K-NN)
- Support Vector Machine (SVM)
- Kernel SVM
- Naïve Bayes
- Decision Tree Classification
- Random Forest Classification

Methods like SVM and random forests are often considered to work best, but keep in mind that there are no one-size-fits-all rules, and a method that works well in general may still not perform well on your particular task.
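Because no single method is best for every task, it is common to try several and compare them on held-out data. A minimal sketch with scikit-learn; the dataset here is synthetic and the accuracy numbers are only illustrative:

```python
# Sketch: comparing several classifiers on the same synthetic data.
# Assumes scikit-learn is installed; make_classification generates toy data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbours": KNeighborsClassifier(),
    "SVM (RBF kernel)": SVC(),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
}
for name, model in models.items():
    accuracy = model.fit(X_train, y_train).score(X_test, y_test)
    print(f"{name:22s} accuracy: {accuracy:.2f}")
```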

Use cases of Classification Algorithms

Classification algorithms can be used in many different places. Below are some popular use cases of classification algorithms:

1. Email spam detection
2. Speech recognition
3. Identification of cancer tumour cells
4. Drug classification
5. Biometric identification, etc.

Regression
Regression is another form of supervised learning. The difference between classification and regression is that regression outputs a number rather than a class. Therefore, regression is useful for predicting number-based problems such as stock market prices, the temperature for a given day, or the probability of an event.

Examples

Regression is used in financial trading to find patterns in stocks and other assets, to decide when to buy or sell and make a profit.

For classification, it is already being used to classify whether an email you receive is spam.

Both classification and regression supervised learning techniques can be extended to much more complex tasks, such as those involving speech and audio, image classification, object detection, and chatbots.

A recent example uses a model trained with supervised learning to realistically fake videos of people talking.

You might be wondering how this complex image-based task relates to classification or regression. It comes back to everything in the world, even complex phenomena, being fundamentally describable with maths and numbers. In this example, a neural network is still only outputting numbers, just as in regression; here, the numbers are the 3D coordinate values of a facial mesh.
Machine learning for regression

Below is a short list of machine learning methods (each with its own advantages and disadvantages) that can be used for regression:

- Linear regression
- Polynomial regression
- Ridge regression
- Decision trees
- SVR (Support Vector Regression)
- Random forest

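As a rough illustration, the sketch below fits several of these regression methods to the same synthetic data and compares their held-out R^2 scores (assumes scikit-learn is installed; the numbers are only illustrative):

```python
# Sketch: comparing several regression methods on the same synthetic data.
# Assumes scikit-learn is installed; make_regression generates toy data.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Linear regression": LinearRegression(),
    "Ridge regression": Ridge(),
    "Decision tree": DecisionTreeRegressor(),
    "SVR": SVR(),
    "Random forest": RandomForestRegressor(),
}
for name, model in models.items():
    r2 = model.fit(X_train, y_train).score(X_test, y_test)   # R^2 on held-out data
    print(f"{name:20s} R^2: {r2:.2f}")
```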

Difference between Regression and Classification

Regression Algorithm vs. Classification Algorithm

- Output type: In Regression, the output variable must be continuous or a real value. In Classification, the output variable must be a discrete value.
- Task: The regression algorithm maps the input value (x) to a continuous output variable (y). The classification algorithm maps the input value (x) to a discrete output variable (y).
- Data: Regression algorithms are used with continuous data. Classification algorithms are used with discrete data.
- Goal: In Regression, we try to find the best-fit line, which can predict the output more accurately. In Classification, we try to find the decision boundary, which can divide the dataset into different classes.
- Example problems: Regression algorithms are used to solve problems such as weather prediction and house price prediction. Classification algorithms are used to solve problems such as identification of spam emails, speech recognition, and identification of cancer cells.
- Subtypes: Regression algorithms can be further divided into Linear and Non-linear Regression. Classification algorithms can be divided into Binary Classifiers and Multi-class Classifiers.
