
Unit 2 Supervised Learning and Applications

Classification Algorithm in Machine Learning


Classification is a supervised machine learning method in which the model predicts the correct label for a given input. The model is fully trained on the training data and evaluated on test data before being used to make predictions on new, unseen data.

For instance, an algorithm can learn to predict whether a given email is spam or ham (not spam), as illustrated below.

Classification algorithms can be better understood using the below diagram. In the diagram there are two classes, Class A and Class B. Points within a class have features similar to each other and dissimilar to those of the other class.
The algorithm that implements classification on a dataset is known as a classifier. There are two types of classification:

Binary Classifier: If the classification problem has only two possible outcomes, it is called a binary classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.

Multi-class Classifier: If a classification problem has more than two outcomes, it is called a multi-class classifier.
Examples: classification of types of crops, classification of types of music.

Types of ML Classification Algorithms:


Classification algorithms can be further divided into two main categories:

o Linear Models
   o Logistic Regression
   o Support Vector Machines
o Non-linear Models
   o K-Nearest Neighbours
   o Kernel SVM
   o Naive Bayes
   o Decision Tree Classification
   o Random Forest Classification

Linear Regression
Linear regression is one of the easiest and most popular Machine Learning algorithms. It is a
statistical method that is used for predictive analysis. Linear regression makes predictions for
continuous/real or numeric variables such as sales, salary, age, product price, etc.

The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name linear regression. Because the relationship is linear, the model describes how the value of the dependent variable changes with the value of the independent variable.

The linear regression model provides a sloped straight line representing the relationship between
the variables. Consider the below image:
Mathematically, we can represent a linear regression as:

y = a0 + a1x + ε

Here,
y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (scale factor applied to each input value)
ε = random error
The x and y values come from the training dataset used to fit the model.
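
As a minimal sketch of fitting this model, the snippet below uses scikit-learn's LinearRegression on small made-up data (the x and y values are purely illustrative, not from the text):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative training data: x is the independent variable, y the dependent one
x = np.array([[1], [2], [3], [4], [5]])   # shape (n_samples, 1)
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])  # roughly y = 1 + 2x + noise

model = LinearRegression().fit(x, y)

print("intercept a0:", model.intercept_)  # estimate of a0
print("coefficient a1:", model.coef_[0])  # estimate of a1
print("prediction for x = 6:", model.predict([[6]])[0])
```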

Types of Linear Regression


Linear regression can be further divided into two types:

Simple Linear Regression:


If a single independent variable is used to predict the value of a numerical dependent variable, then
such a Linear Regression algorithm is called Simple Linear Regression.
Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical dependent
variable, then such a Linear Regression algorithm is called Multiple Linear Regression.

Evaluation Metrics:
Regression metrics are quantitative measures used to evaluate the quality of a regression model. Scikit-learn provides several metrics, each with its own strengths and limitations, to assess how well a model fits the data.

1. R-squared (R²): Measures the proportion of variance in the dependent variable that is predictable from the independent variables. It ranges from 0 to 1.
2. Mean Squared Error (MSE): The average squared difference between the actual and predicted values.
3. Root Mean Squared Error (RMSE): The square root of MSE, which expresses the error in the same units as the dependent variable.
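
A minimal sketch computing these three metrics with scikit-learn (the y_true and y_pred values are made up for illustration):

```python
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error

y_true = np.array([3.0, 5.0, 7.0, 9.0])  # actual values (illustrative)
y_pred = np.array([2.8, 5.3, 6.9, 9.4])  # model predictions (illustrative)

r2 = r2_score(y_true, y_pred)             # proportion of variance explained
mse = mean_squared_error(y_true, y_pred)  # average squared error
rmse = np.sqrt(mse)                       # error in the same units as y

print(f"R^2: {r2:.3f}, MSE: {mse:.3f}, RMSE: {rmse:.3f}")
```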

What is multivariate regression?


Multivariate regression is a statistical technique that allows researchers to examine the relationships
between multiple independent variables (predictors) and multiple dependent variables (outcomes).
Unlike multiple linear regression, which focuses on a single dependent variable, multivariate
regression analyzes multiple dependent variables simultaneously. This helps researchers understand
how several factors influence multiple outcomes at the same time.

Characteristics of Multivariate Regression

 Multivariate regression gives a fuller view of the relationships among variables by modelling all the outcomes jointly rather than one at a time.
 It helps you predict the behaviour of the response variables as the predictor variables change.
 Multivariate regression can be applied in many fields, including economics, science, and medical research.
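
A minimal sketch of multivariate regression, assuming scikit-learn: LinearRegression fits one equation per outcome column when the target Y is two-dimensional. All data values below are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data: 2 predictors per sample, 2 outcomes per sample
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
Y = np.array([[5.0, 3.0], [4.0, 4.0], [11.0, 6.0], [10.0, 8.0], [15.0, 9.0]])

# With a 2-D Y, LinearRegression fits both outcomes simultaneously
model = LinearRegression().fit(X, Y)

print("coefficients (one row per outcome):\n", model.coef_)
print("predictions for a new sample:", model.predict([[2.5, 2.5]]))
```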

Nonlinear regression

Nonlinear regression is a form of regression analysis used to model relationships between a dependent variable and one or more independent variables, where the relationship does not follow a straight line or a simple linear pattern. In contrast to linear regression, which assumes that changes in the dependent variable are directly proportional to changes in the independent variables, nonlinear regression models more complex relationships by fitting data to a curve.

Key Concepts of Nonlinear Regression:

1. Nonlinear Relationship: Unlike linear regression, which models a straight-line relationship, nonlinear regression captures relationships that are curved or more complex (e.g., exponential, logarithmic, polynomial).
2. Equation: Nonlinear regression equations take forms that are not linear in their parameters. A general nonlinear regression model can be written as:

y = f(x, β) + ϵ

where:

 y is the dependent variable,
 f(x, β) is a nonlinear function of the independent variable(s) x and the parameters β,
 ϵ is the error term (residual).
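
A minimal sketch of fitting such a model, assuming SciPy's curve_fit (not mentioned in the text; the exponential form f(x, β) = β0·exp(β1·x) is an illustrative choice):

```python
import numpy as np
from scipy.optimize import curve_fit

# Assumed model form: f(x, b0, b1) = b0 * exp(b1 * x)
def f(x, b0, b1):
    return b0 * np.exp(b1 * x)

rng = np.random.default_rng(0)
x = np.linspace(0, 4, 20)
y = 2.0 * np.exp(0.8 * x) + rng.normal(0, 0.5, x.size)  # noisy synthetic data

# curve_fit estimates the parameters by iterative least squares;
# p0 is the initial guess the iterations start from
beta, _ = curve_fit(f, x, y, p0=[1.0, 0.5])
print("estimated parameters:", beta)
```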

Advantages of Nonlinear Regression:

 Flexibility: Can model more complex relationships between variables.
 Greater Accuracy: Often fits the data better when the true relationship is not linear.

Disadvantages of Nonlinear Regression:

 Complexity: More difficult to interpret than linear models.
 Computational Challenges: Requires iterative optimization methods, which may not always converge or provide a unique solution.

Nonlinear vs. Linear Regression:

 Linear Regression assumes a straight-line relationship between variables, making it easier to interpret and compute.
 Nonlinear Regression can handle more complex relationships but requires iterative algorithms to estimate parameters, making it computationally more intensive.

K-Nearest Neighbor (KNN) Algorithm for Machine Learning

o K-Nearest Neighbour is one of the simplest Machine Learning algorithms, based on the Supervised Learning technique.
o The K-NN algorithm assumes similarity between the new case/data and the available cases, and puts the new case into the category most similar to the available categories.
o The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be assigned to a well-suited category using the K-NN algorithm.
o The K-NN algorithm can be used for regression as well as classification, but it is mostly used for classification problems.
o K-NN is a non-parametric algorithm, which means it makes no assumptions about the underlying data.
o It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it stores the dataset and performs an action on it at classification time.

o At the training phase, the KNN algorithm just stores the dataset; when it gets new data, it classifies that data into the category most similar to the new data.
o Example: Suppose we have an image of a creature that resembles both a cat and a dog, and we want to know whether it is a cat or a dog. For this identification we can use the KNN algorithm, since it works on a similarity measure. The KNN model will compare the features of the new image with those of the cat and dog images and, based on the most similar features, place it in either the cat or the dog category.
Why do we need a K-NN Algorithm?
Suppose there are two categories, Category A and Category B, and we have a new data point x1. Which of these categories does the point belong to? To solve this type of problem, we need the K-NN algorithm. With K-NN, we can easily identify the category or class of a particular data point. Consider the below diagram:

How does K-NN work?

The working of K-NN can be explained with the following steps:

o Step-1: Select the number K of neighbors.
o Step-2: Calculate the Euclidean distance from the new data point to the training points.
o Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.
o Step-4: Among these K neighbors, count the number of data points in each category.
o Step-5: Assign the new data point to the category for which the number of neighbors is maximum.
o Step-6: Our model is ready.
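
A minimal sketch of these steps, assuming scikit-learn's KNeighborsClassifier on a synthetic two-class dataset (all names and values are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class dataset for illustration
X, y = make_classification(n_samples=200, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# n_neighbors is the K from Step-1; Euclidean distance is the default metric
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)  # a lazy learner: fit() essentially stores the data

print("test accuracy:", knn.score(X_test, y_test))
print("predicted class for the first test point:", knn.predict(X_test[:1])[0])
```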
Decision Tree Classification
o A Decision Tree is a Supervised learning technique that can be used for both classification and regression problems, but it is mostly preferred for solving classification problems. It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents the outcome.

o It is a graphical representation for getting all the possible solutions to a problem/decision based on given conditions.

o It is called a decision tree because, similar to a tree, it starts with the root node, which expands into further branches and constructs a tree-like structure.
o To build a tree, we use the CART algorithm, which stands for Classification and Regression Tree algorithm.
o A decision tree simply asks a question and, based on the answer (Yes/No), further splits into subtrees.
o The below diagram explains the general structure of a decision tree:

Note: A decision tree can handle categorical data (YES/NO) as well as numeric data.

Why use Decision Trees?


There are various algorithms in Machine Learning, so choosing the best algorithm for the given dataset and problem is the main point to remember when creating a machine learning model. Below are two reasons for using a decision tree:
o Decision trees usually mimic human thinking when making a decision, so they are easy to understand.
o The logic behind a decision tree is easy to follow because it shows a tree-like structure.

Decision Tree Terminologies


 Root Node: The root node is where the decision tree starts. It represents the entire dataset, which further gets divided into two or more homogeneous sets.

 Leaf Node: Leaf nodes are the final output nodes; the tree cannot be segregated further after reaching a leaf node.

 Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes according to the given conditions.

 Branch/Sub Tree: A subtree formed by splitting the tree.

 Pruning: Pruning is the process of removing unwanted branches from the tree.

 Parent/Child node: The root node of the tree is called the parent node, and the other nodes are called child nodes.
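
A minimal sketch of training a decision tree classifier, assuming scikit-learn (whose DecisionTreeClassifier is CART-based, matching the algorithm named above); the iris dataset and depth limit are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X, y = iris.data, iris.target

# max_depth keeps the printed tree small; random_state makes the run repeatable
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Print the learned rules: internal nodes test features, leaves give the class
print(export_text(tree, feature_names=iris.feature_names))
```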

Support Vector Machine

Support Vector Machine, or SVM, is one of the most popular Supervised Learning algorithms, used for classification as well as regression problems. However, it is primarily used for classification problems in Machine Learning.

The goal of the SVM algorithm is to create the best line or decision boundary that segregates n-dimensional space into classes, so that we can easily put a new data point in the correct category in the future. This best decision boundary is called a hyperplane.

SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed a Support Vector Machine. Consider the below diagram, in which two different categories are classified using a decision boundary or hyperplane:
Example: SVM can be understood with the example used for the KNN classifier. Suppose we see a strange cat that also has some features of a dog. If we want a model that can accurately identify whether it is a cat or a dog, such a model can be created with the SVM algorithm. We first train the model with many images of cats and dogs so that it learns their different features, and then test it with the strange creature. The SVM creates a decision boundary between the two classes (cat and dog) and chooses the extreme cases (support vectors) of each. On the basis of the support vectors, it will classify the creature as a cat. Consider the below diagram:

SVM algorithm can be used for Face detection, image classification, text categorization, etc.
Types of SVM

SVM can be of two types:

o Linear SVM: Linear SVM is used for linearly separable data: if a dataset can be classified into two classes using a single straight line, it is termed linearly separable data, and the classifier used is called a Linear SVM classifier.
o Non-linear SVM: Non-linear SVM is used for non-linearly separable data: if a dataset cannot be classified using a straight line, it is termed non-linear data, and the classifier used is called a Non-linear SVM classifier.

Key Concepts of SVM:

1. Hyperplane:
o The goal of SVM is to find the optimal hyperplane that best separates the data points into
different classes. In a two-dimensional space, this hyperplane is simply a line, but in higher
dimensions, it becomes a plane or a hyperplane.
o The best hyperplane is the one that maximizes the margin—the distance between the
hyperplane and the nearest data points from each class, known as support vectors.
2. Support Vectors:
o These are the data points that lie closest to the hyperplane. They are critical because they
define the position and orientation of the hyperplane. Removing these points would change
the decision boundary.
3. Margin:
o The margin is the distance between the hyperplane and the nearest support vectors from
both classes. A larger margin is generally better, as it means the model is more confident in
its classification.
4. Linear Separability:
o In cases where the data is linearly separable (i.e., it can be perfectly separated by a straight
line or hyperplane), SVM will find the hyperplane that maximizes the margin between the
two classes.
5. Non-Linearly Separable Data:
o When the data is not linearly separable, SVM uses a technique called the kernel trick to
transform the data into a higher-dimensional space where it becomes linearly separable.
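
A minimal sketch contrasting a linear SVM with a kernel SVM, assuming scikit-learn's SVC; the concentric-circles dataset is an illustrative non-linearly-separable case:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: impossible to separate with a straight line
X, y = make_circles(n_samples=200, noise=0.05, factor=0.5, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)  # Linear SVM
rbf_svm = SVC(kernel="rbf").fit(X, y)        # kernel trick via the RBF kernel

print("linear kernel accuracy:", linear_svm.score(X, y))  # poor on this data
print("rbf kernel accuracy:", rbf_svm.score(X, y))        # near-perfect here
print("support vectors per class:", rbf_svm.n_support_)
```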

Applications of supervised learning in multiple domains


Supervised machine learning has transformative potential in various business domains, including pricing,
customer relationship management (CRM), sales, and marketing.

1. Pricing Optimization

Problem: Businesses need to find the optimal price point for products or services to maximize
revenue and profits while staying competitive.

Supervised Learning Solution:

 Regression models (e.g., Linear Regression, Ridge Regression, and Decision Trees) can predict the
optimal price for a product by analyzing historical sales data, competitor pricing, demand
fluctuations, and customer behavior.
 Example: An online retailer uses a supervised learning model to determine dynamic pricing strategies based on competitor pricing, time of year, and customer demand, adjusting prices in real time to optimize profits.

2. Customer Relationship Management (CRM)

Problem: Businesses want to enhance customer relationships by identifying at-risk customers, providing personalized experiences, and improving satisfaction.

Supervised Learning Solution:

 Churn Prediction: Classification algorithms like Logistic Regression, Support Vector Machines (SVM),
or Random Forests can predict which customers are likely to churn based on past interactions,
purchase frequency, and customer service calls.
 Customer Lifetime Value (CLV) Prediction: Regression models can predict the long-term value of a
customer based on transaction history, buying frequency, and demographic data.
 Example: A telecom company uses churn prediction models to identify at-risk customers and offers
targeted retention programs (e.g., discounts or personalized services) to prevent churn.

3. Sales Forecasting

Problem: Businesses need accurate sales forecasts to manage inventory, plan production, and
allocate resources efficiently.

Supervised Learning Solution:

 Supervised models can predict future sales based on historical sales data, market trends, and
seasonal factors.
 Lead Scoring: Classification models like Logistic Regression or Decision Trees can predict the
likelihood of a sales lead converting into a customer based on factors like interaction history, product
interest, and lead demographics.
 Example: A retail chain uses supervised learning models to predict future sales across different
locations based on factors like previous sales, seasonal trends, and local economic conditions.

4. Marketing Campaign Optimization

Problem: Marketers need to design personalized campaigns and allocate resources to the most
effective channels for customer acquisition and retention.

Supervised Learning Solution:

 Targeted Marketing: Supervised models can segment customers based on their behavior, preferences, and demographics. Marketers can then create personalized campaigns targeted at specific customer segments.
 Campaign Response Prediction: Regression models can predict how likely a customer is to respond
to a specific marketing campaign (e.g., email, social media ad) based on past campaign data.
 Example: An e-commerce company uses a supervised learning model to segment customers into
categories like frequent buyers, discount seekers, and one-time shoppers, creating targeted email
campaigns for each group.

Model evaluation is the process of using metrics to analyze the performance of a model. Model development is a multi-step process, and a check should be kept on how well the model generalizes to future predictions. Evaluating a model therefore plays a vital role: it lets us judge the model's performance and also helps identify its key weaknesses. Common metrics include Accuracy, Precision, Recall, F1 score, Area Under the Curve, the Confusion Matrix, and Mean Squared Error. Cross-validation is one technique that is followed during the training phase, and it serves as a model evaluation technique as well.
Evaluation Metrics for Classification Task
In the Python example below, we import the iris dataset, which has features like the length and width of sepals and petals. The target values are Iris setosa, Iris virginica, and Iris versicolor. After importing the dataset, we divide it into train and test sets in an 80:20 ratio. We then train a Decision Tree model, perform prediction, and calculate the accuracy score, precision, recall, and F1 score. We also plot the confusion matrix.
Importing Libraries and Dataset
Python libraries make it very easy for us to handle the data and perform typical and complex tasks
with a single line of code.
 Pandas – This library helps to load the data frame in a 2D array format and has multiple functions to perform analysis tasks in one go.
 Numpy – Numpy arrays are very fast and can perform large computations in a very short time.
 Matplotlib/Seaborn – These libraries are used to draw visualizations.
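
The code listing itself is not reproduced in the source; the sketch below follows the workflow described above (the random_state and macro averaging are illustrative choices):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, ConfusionMatrixDisplay)

# Load the iris dataset and split it 80:20 into train and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train a decision tree and predict on the held-out test set
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Iris is a three-class task, so precision/recall/F1 are macro-averaged
print("accuracy:", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred, average="macro"))
print("recall:", recall_score(y_test, y_pred, average="macro"))
print("f1 score:", f1_score(y_test, y_pred, average="macro"))

# Plot the confusion matrix
ConfusionMatrixDisplay.from_predictions(y_test, y_pred)
plt.show()
```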
