MACHINE LEARNING ALGORITHMS
Foundation Course of Machine Learning using Python Conducted by NIELIT Bhubaneswar
CONTENTS
• Introduction
• Types of ML algorithms
• Importance of algorithms
• Performance measures
• Bias – Variance
• Underfitting vs Overfitting
• Classification algorithms
• Regression algorithms
• Clustering algorithms
INTRODUCTION
A machine learning algorithm is a step-by-step computational procedure that allows computers to learn patterns
from data and make predictions or decisions without being explicitly programmed for specific tasks.
TYPES OF ML ALGORITHMS
Machine learning algorithms fall into three broad categories:
• Supervised Learning
• Unsupervised Learning
• Semi-Supervised Learning
SUPERVISED LEARNING
• Data is labeled and the algorithms learn to predict the output from the input data
• Learning stops when the algorithm achieves an acceptable level of performance
• It is classified as
o Regression
o Classification
SUPERVISED LEARNING
• Regression
o Linear Regression
o Decision Tree (RPART)
• Classification
o Logistic Regression
o Support Vector Machines
o Naive Bayes
o Decision Tree (C5.0)
Fig: types of supervised learning algorithms
REGRESSION
• Regression is a technique from statistics used to predict values of a desired target
quantity
• A regression problem is when the output variable is a real or continuous value
• Types of Regression
o Simple Linear Regression
o Multiple Linear Regression
o Polynomial Regression
CLASSIFICATION
• Classification is a technique for determining which class the dependent variable belongs to, based on one or more independent variables
• A classification problem is when the output variable is a category
• Types of Classification
o Logistic Regression
o Support Vector Machine
o Naive Bayes
o Decision tree
CLASSIFICATION - EXAMPLE
Fig: example of email filtering
UNSUPERVISED LEARNING
• Data is unlabeled and the algorithms learn the inherent structure from the input data
• A self-learning approach that deals with unlabeled data
• It is classified as
o Clustering
o Association
UNSUPERVISED LEARNING
• Clustering
o K-means Clustering
• Association
o Apriori Rules
Fig: types of unsupervised learning algorithms
CLUSTERING
• It segregates data points with similar traits and assigns them into clusters
• The goal is to find similar groups in the data
• Examples of Clustering
o K-means Clustering
o Hierarchical Clustering
ASSOCIATION
• A rule-based machine learning method for discovering relations between variables
• It is intended to identify strong rules discovered in databases using some measures of
interestingness
• One Example of Association is:
o Apriori algorithm
SEMI – SUPERVISED LEARNING
• A mixture of supervised and unsupervised techniques is used
• Allows models to mimic human inductive logic and sort unknown information quickly and accurately without human intervention
• Deals with labeled as well as unlabeled data
PERFORMANCE MEASURES
In Machine Learning (ML), performance measures (or evaluation metrics) help to determine how well an
algorithm is performing on a given task.
• Regression
o Mean Squared Error (MSE)
o Mean Absolute Error (MAE)
o Root Mean Square Error (RMSE)
• Classification
o Confusion Matrix
o Accuracy
o Recall
o Precision
o F1 score
o ROC
MEAN SQUARED ERROR
• Measures the average of the squares of the errors
• Always non-negative, and values closer to zero are better
• A commonly used performance metric for regression
MSE = (1/n) * Σ (Pi − Ai)²
❑ Pi - predicted values
❑ Ai - observed values
❑ n - number of observations
MEAN ABSOLUTE ERROR
• Mean Absolute Error (MAE) is a measure of difference between two continuous variables
• Average of the absolute errors
• More intuitive and less sensitive to outliers than MSE
MAE = (1/n) * Σ |Pi − Ai|
❑ Pi - predicted values
❑ Ai - observed values
❑ n - number of observations
ROOT MEAN SQUARE ERROR
• Measures the differences between the values predicted by a model and the values actually observed
RMSE = sqrt((1/n) * Σ (Pi − Oi)²)
❑ Pi - predicted values
❑ Oi - observed values
❑ n - number of observations
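All three measures are a one-liner each in Python. A minimal sketch, assuming scikit-learn is available and using made-up toy values:

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

A = np.array([3.0, 5.0, 2.5, 7.0])   # observed values (toy data)
P = np.array([2.5, 5.0, 4.0, 8.0])   # predicted values (toy data)

mse = mean_squared_error(A, P)       # average of the squared errors
mae = mean_absolute_error(A, P)      # average of the absolute errors
rmse = np.sqrt(mse)                  # square root of the MSE

print(f"MSE = {mse:.3f}, MAE = {mae:.3f}, RMSE = {rmse:.3f}")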
CONFUSION MATRIX
In a classification problem, the errors can be represented using a "Confusion Matrix"
Confusion Matrix
• A table with two dimensions, viz. "Actual" and "Predicted"
• It is the easiest way to measure the performance of a classification problem where the output can be of two or more types of classes
ACCURACY
Accuracy is the percent of predictions that were correct
Accuracy = (TP + TN) / (TP + TN + FP + FN)
For example, consider 10,000 customer records:

Action                      Actually bought    Actually did not buy
Predicted (will buy)        TP: 500            FP: 100
Predicted (will not buy)    FN: 400            TN: 9,000

The "accuracy" is (9,000 + 500) out of 10,000 = 95%
PRECISION
Precision is the percent of positive predictions that were correct
Precision = TP / (TP + FP)
For example, consider 10,000 customer records:

Action                      Actually bought    Actually did not buy
Predicted (will buy)        TP: 500            FP: 100
Predicted (will not buy)    FN: 400            TN: 9,000

The "precision" is 500 out of 600 = 83.33%
RECALL
Recall is the percent of positive cases that you were able to catch
Recall = TP / (TP + FN)
For example, consider 10,000 customer records:

Action                      Actually bought    Actually did not buy
Predicted (will buy)        TP: 500            FP: 100
Predicted (will not buy)    FN: 400            TN: 9,000

The "recall" is 500 out of 900 = 55.55%
F1 - SCORE
It is the harmonic mean of recall and precision, as shown in the equation below:
F1-Score = 2 * (precision * recall) / (precision + recall)
It provides a better sense of a model's overall performance, particularly for imbalanced datasets.
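A quick Python check of the worked example above (TP = 500, FP = 100, FN = 400, TN = 9,000):

TP, FP, FN, TN = 500, 100, 400, 9000

accuracy = (TP + TN) / (TP + TN + FP + FN)            # 0.95
precision = TP / (TP + FP)                            # ~0.8333
recall = TP / (TP + FN)                               # ~0.5556
f1 = 2 * precision * recall / (precision + recall)    # ~0.6667

print(accuracy, precision, recall, f1)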
Receiver Operating Characteristics (ROC) Curve
• Shows tradeoff between True Positive Rate(sensitivity) and
False Positive Rate(1-specificity)
• The closer the curve is to the left-hand border and the top border of the ROC space, the more accurate the test
• The closer the curve is to the 45-degree diagonal of the ROC space, the less accurate the test
• The area under the curve is a measure of test accuracy.
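A minimal sketch for computing the ROC curve and its area with scikit-learn; the labels and scores below are toy values for illustration:

from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 1, 0]                     # actual classes (toy data)
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3]   # predicted probabilities (toy data)

fpr, tpr, thresholds = roc_curve(y_true, y_score)     # FPR = 1 - specificity, TPR = sensitivity
auc = roc_auc_score(y_true, y_score)                  # area under the curve
print(f"AUC = {auc:.2f}")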
TYPES OF ERROR
Training Error
• Fraction of training data misclassified
Test Error
• Fraction of test data misclassified
Generalization Error
• Probability of misclassifying new random data
BIAS AND VARIANCE TRADEOFF
BIAS
The difference between the average prediction of a model and the correct value it is trying to predict
• Low Bias
o Suggests fewer assumptions about the form of the target function
o For example, k-Nearest Neighbours and Support Vector Machines
• High Bias
o Suggests more assumptions about the form of the target function
o For example, Linear Regression and Linear Discriminant Analysis
VARIANCE ERROR
Variance is the amount by which the predicted value of the target function would change if different training data were used.
• Low Variance
o Suggests small changes in the predicted value with changes to the training data
o For example, Linear Regression and Linear Discriminant Analysis
• High Variance
o Suggests large changes in the predicted value with changes to the training data
o For example, k-Nearest Neighbors and Support Vector Machines
IRREDUCIBLE ERROR
• Even with a perfect model, we may not be able to remove its errors completely
• This is because the training data itself may contain noise. This error is called irreducible error, Bayes' error rate, or optimum error rate
• While we cannot do anything about the Irreducible Error, we focus on reducing the errors
due to bias and variance
TOTAL ERROR
• The prediction error for any machine learning algorithm can be broken down into three parts:
o Bias Error
o Variance Error
o Irreducible Error
Total Error = Bias² + Variance + Irreducible Error
TRAINING, VALIDATION AND TEST SETS
We can divide the received data into three parts:
• Training Set: typically 60% of the data, used for training the model
• Validation Set: typically 20% of the data, used to test the quality of the trained model
• Test Set: typically 20% of the data, used to report the accuracy of the final model
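One common way to get a 60/20/20 split is two successive calls to scikit-learn's train_test_split; a sketch with a toy feature matrix X and label vector y:

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # toy feature matrix (assumption)
y = np.arange(10) % 2              # toy labels (assumption)

# First split off 40% as a temporary hold-out set, then halve it
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)
# Result: 60% train, 20% validation, 20% test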
EFFECT OF MODEL COMPLEXITY
• If the model is not performing well, it is usually a high bias or a high variance
problem
• The figure below graphically shows the effect of model complexity on error due
to bias and variance
EFFECT OF MODEL COMPLEXITY
• Increasing model complexity reduces the error due to bias in the model
• This continues only up to a particular point
• As the complexity of the model increases further, variance increases and overfitting takes place
BIAS – VARIANCE TRADE - OFF
• The trade-off is the tension between the error introduced by bias and the error introduced by variance
• Striking a balance between these two types of errors is known as the trade-off management of bias-variance errors
• Ensemble learning is one example
DATA INCONSISTENCIES IN MACHINE LEARNING
DATA INCONSISTENCIES
• Some Data Inconsistencies are:
o Unpredictable Data Formats
o Underfitting
o Overfitting
o Data Instability
• There are some established processes to address these inconsistencies
UNPREDICTABLE DATA FORMAT
• Complexity creeps in when new data entering the system comes in formats that are not supported by the machine learning system
• Machine learning is meant to work with new data constantly coming into the system and learning from that data
• It is difficult to say whether our models will work well for new data with unstable formats, unless a mechanism is built to handle this
UNDERFITTING
• A model is said to be under-fitting when it doesn't take into consideration enough
information to accurately model the actual data
OVERFITTING
• A statistical model is said to be overfitted if it describes the noise instead of the underlying relationships
• A large dataset also runs the risk of having the model overfit the data
DATA INSTABILITY
• Machine Learning Algorithms are usually robust to noise within the data
• A problem will occur if the outliers are due to manual error or misinterpretation of the
relevant data
• This will result in a skewing of the data, which will ultimately end up in an incorrect
model
• There is a strong need to have a process to correct or handle human errors that can result in
building an incorrect model
PARAMETRIC AND NON-PARAMETRIC ML ALGORITHMS
PARAMETRIC ML ALGORITHM
Algorithms that simplify the function to a known form are called
parametric machine learning algorithms.
A parametric algorithm involves two steps:
• Select a form for the function
• Learn the coefficients for the function from the training data
Examples - Linear Regression, Logistic Regression and Naïve Bayes
PARAMETRIC ML ALGORITHM
• Consider a mapping function which is used in linear regression:
y = b0 + b1 * x1 + b2 * x2
• Where
o b0, b1 and b2 are the coefficients of the line
o x1 and x2 are input variables
o y is output variable
• Estimating the coefficients of the line equation gives us a predictive model for the problem
• Often the assumed functional form is a linear combination of the input variables; such algorithms are called linear machine learning algorithms
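A short sketch, assuming scikit-learn and synthetic data, of learning the coefficients b0, b1 and b2 from training data:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.random((100, 2))                  # columns are x1 and x2
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]   # true b0 = 3, b1 = 2, b2 = -1 (noise-free toy data)

model = LinearRegression().fit(X, y)
print("b0 =", model.intercept_)           # ~3.0
print("b1, b2 =", model.coef_)            # ~[2.0, -1.0]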
ADVANTAGES OF PARAMETRIC ML ALGORITHM
• Simpler: These methods are easier to understand and interpret results
• Speed: Parametric models are very fast to learn from data
• Less Data: They do not require as much training data and can work well even if
the fit to the data is not perfect
DISADVANTAGES OF PARAMETRIC ML ALGORITHM
• Constrained: By choosing a functional form these methods are highly constrained to
the specified form
• Limited Complexity: The methods are more suited to simpler problems
• Poor Fit: In practice the methods are unlikely to match the underlying mapping function
NON-PARAMETRIC ML ALGORITHMS
Algorithms that do not make strong assumptions about the form of the mapping function are called non-parametric
machine learning algorithms.
• Free to learn any functional form from the training data
• Assumes that inputs which are close to one another are likely to have a similar output variable
• The method works well when there is a lot of data and it becomes easier to choose the right features
• Examples - KNN, Decision Trees, Random Forests, SVM
ADVANTAGES OF NON-PARAMETRIC ML ALGORITHMS
• Flexibility: Capable of fitting a large number of functional forms
• Power: No (or weak) assumptions about the underlying function
• Performance: Can result in higher performance models for prediction
DISADVANTAGES OF NON-PARAMETRIC ML ALGORITHMS
• More data: Require a lot of training data to estimate the mapping function
• Slower: Slower to train as they often have far more parameters to train
• Overfitting: Higher risk of overfitting the training data, and it is harder to explain why specific predictions are made
CLASSIFICATION ALGORITHMS
KNN
• K-Nearest Neighbors (KNN) is a supervised learning algorithm that classifies a data point based on the
majority class among its k closest neighbors in the training set.
• It is a non-parametric, instance-based method that uses distance metrics like Euclidean distance to find
similarity.
WHAT IS K IN KNN?
In the k-Nearest Neighbors algorithm k is just a number that tells the algorithm how many nearby points or neighbors to look
at when it makes a decision.
Example: Imagine you're deciding which fruit a new one is, based on its shape and size, by comparing it to fruits you already know.
• If k = 3, the algorithm looks at the 3 closest fruits to the new one.
• If 2 of those 3 fruits are apples and 1 is a banana, the algorithm says the new fruit is an apple
because most of its neighbors are apples.
HOW TO CHOOSE K VALUE?
• Start with odd values (like 3, 5, 7) to avoid ties in classification.
• Use cross-validation to test different k values and choose the one with best accuracy.
• Plot a graph of error rate vs. k and choose the k where the error is lowest and stable.
• Smaller k:
• More flexible
• Low bias, high variance (risk of overfitting)
• Larger k:
• More stable
• High bias, low variance (risk of underfitting)
Note - Cross-validation is a model evaluation technique that splits the dataset into multiple parts to ensure reliable performance.
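A minimal sketch of this selection process with scikit-learn, using the built-in Iris dataset as stand-in data:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Try odd k values and keep the one with the best mean cross-validated accuracy
scores = {}
for k in range(1, 20, 2):
    knn = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(knn, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print("best k:", best_k)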
DISTANCE METRICS USED IN KNN
1. Euclidean Distance
Euclidean distance is defined as the straight-line distance between two points in a plane or space:
d(x, y) = sqrt(Σ (xi − yi)²)
You can think of it like the shortest path you would walk if you were to go directly from one point to another.
DISTANCE METRICS USED IN KNN
2. Manhattan Distance
This is the total distance you would travel if you could only move along horizontal and vertical lines, like a grid or city streets:
d(x, y) = Σ |xi − yi|
It's also called "taxicab distance" because a taxi can only drive along the grid-like streets of a city.
DISTANCE METRICS USED IN KNN
3. Minkowski Distance
Minkowski distance is like a family of distances, which includes both Euclidean and Manhattan distances as special cases:
d(x, y) = (Σ |xi − yi|^p)^(1/p)
With p = 2 it reduces to the Euclidean distance, and with p = 1 to the Manhattan distance.
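All three metrics take a few lines of NumPy; the two sample points below are arbitrary:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 0.0, 3.0])

euclidean = np.sqrt(np.sum((a - b) ** 2))           # straight-line distance
manhattan = np.sum(np.abs(a - b))                   # grid ("taxicab") distance
p = 3
minkowski = np.sum(np.abs(a - b) ** p) ** (1 / p)   # general form; p = 1 or p = 2 recover the above

print(euclidean, manhattan, minkowski)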
WORKING OF KNN
1. Load the dataset.
2. Choose the value of k (number of neighbors).
3. Calculate the distance between the test point and all training points.
4. Identify the k closest training points.
5. Use majority vote among neighbors to predict the class.
6. Assign the predicted class to the test point.
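These six steps map almost one-to-one onto scikit-learn calls; a minimal sketch on the built-in Iris dataset:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Step 1: load the dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 2: choose k; steps 3-5 (distances, neighbor search, majority vote) run inside predict()
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X_train, y_train)          # lazy learner: fit() essentially just stores the training data

# Step 6: assign predicted classes to the test points
print(knn.score(X_test, y_test))   # fraction of correct predictions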
APPLICATIONS OF KNN
• Handwriting/Digit Recognition
– Used in OCR systems to recognize handwritten numbers (e.g., postal codes, bank cheques).
• Medical Diagnosis
– Helps predict diseases (like cancer or diabetes) by comparing patient data to historical cases.
• Recommender Systems
– Suggests movies, music, or products based on preferences of similar users.
• Image Classification
– Categorizes images (e.g., animals, objects) based on feature similarity.
• Spam Email Detection
– Identifies spam by comparing new emails to known spam and non-spam messages.
ADVANTAGES OF KNN
• Simple and intuitive: Easy to understand and implement.
• No training phase: It’s a lazy learner—stores all data and computes at prediction time.
• Adaptable to multiclass problems: Naturally handles multiple output classes.
• Effective with well-separated data: Works well when data classes are clearly
distinguishable.
• Flexible with distance metrics: Can use various distance formulas (Euclidean, Manhattan,
etc.).
DISADVANTAGES OF KNN
• Slow with large datasets: Prediction requires distance computation for every training
point.
• Memory intensive: Needs to store the entire training dataset.
• Sensitive to irrelevant features: Unimportant data can mislead distance calculations.
• Affected by feature scaling: Requires normalization for fair distance comparison.
• Struggles with imbalanced data: Biased toward the majority class if class distribution
is uneven.
NAÏVE BAYES CLASSIFIER
A Naive Bayes classifier is a supervised probabilistic machine learning algorithm based on Bayes' Theorem, used
for classification tasks. It assumes that all features (inputs) are independent of each other.
Why is it called Naïve Bayes?
• Naïve: Named as Naïve because it assumes the presence of one feature does not affect other features.
• Bayes: Named Bayes for the basis in Bayes’ Theorem.
BAYES THEOREM
• Bayes' Theorem is a mathematical formula that helps determine the conditional probability of an event based on
prior knowledge and new evidence.
• It adjusts probabilities when new information comes in and helps make better decisions in uncertain situations.
Formula:
P(A | B) = P(B | A) * P(A) / P(B)
where P(A | B) is the posterior probability, P(B | A) the likelihood, P(A) the prior, and P(B) the evidence.
TYPES OF NAÏVE BAYES MODEL
• Gaussian: Gaussian Naïve Bayes assumes that the features follow a normal distribution. The predictors X are continuous and assumed to have been sampled from a Gaussian distribution.
• Multinomial: The Multinomial Naïve Bayes classifier is used when the data has a multinomial distribution. Word frequencies are typical predictors, used as the features for the classifier.
• Bernoulli: The Bernoulli classifier works in the same way as the Multinomial classifier, but its predictor variables are Boolean, for example whether a word occurs in a document or not. This model is also one of the most popular for document classification tasks.
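scikit-learn exposes the three variants directly; a sketch of how each would be instantiated (training data assumed elsewhere):

from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

gnb = GaussianNB()      # continuous features, Gaussian likelihoods
mnb = MultinomialNB()   # count features, e.g. word frequencies
bnb = BernoulliNB()     # binary features, e.g. word presence/absence
# Each is then used the same way: model.fit(X_train, y_train); model.predict(X_test)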
WORKING OF NAÏVE BAYES
1. Terminology
Consider a classification problem (like predicting if someone plays golf based on
weather). Then:
• y is the class label (e.g. "Yes" or "No" for playing golf)
• X=(x1,x2,..., xn) is the feature vector (e.g. Outlook, Temperature, Humidity, Wind)
A sample row from the dataset: (Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = False) → Play = No
WORKING OF NAÏVE BAYES
2. The Naive Assumption
The "naive" in Naive Bayes comes from the assumption that all features are independent given the
class. That is:
Thus, Bayes' theorem becomes:
Since the denominator is constant for a given input, we can write:
WORKING OF NAÏVE BAYES
3. Constructing the Naive Bayes Classifier
We compute the posterior for each class y and choose the class with the highest probability:
ŷ = argmax_y P(y) * Π P(xi | y)
EXAMPLE
Sample Dataset
Outlook Temperature Humidity Wind Play
Sunny Hot High False No
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
EXAMPLE
Input: X=(Sunny , Hot, Normal, False)
Goal: Predict if golf will be played (Yes or No).
Conditional probabilities:
Feature        Value     P(Value | Yes)    P(Value | No)
Outlook        Sunny     2/9               3/5
Temperature    Hot       2/9               2/5
Humidity       Normal    6/9               1/5
Wind           False     6/9               2/5
EXAMPLE
Calculate the posterior probabilities, with priors P(Yes) = 9/14 and P(No) = 5/14 from the dataset:
P(Yes | X) ∝ (9/14) * (2/9) * (2/9) * (6/9) * (6/9) ≈ 0.0141
P(No | X) ∝ (5/14) * (3/5) * (2/5) * (1/5) * (2/5) ≈ 0.0069
Normalize the probabilities for comparison:
P(Yes | X) ≈ 0.0141 / (0.0141 + 0.0069) ≈ 0.67
P(No | X) ≈ 0.0069 / (0.0141 + 0.0069) ≈ 0.33
Final Prediction: since P(Yes | X) > P(No | X), Naïve Bayes predicts that golf will be played (Yes).
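The same computation in a few lines of Python, with the values taken straight from the tables above:

# Priors and per-feature likelihoods from the dataset
p_yes, p_no = 9/14, 5/14
like_yes = (2/9) * (2/9) * (6/9) * (6/9)   # Sunny, Hot, Normal, False given Yes
like_no = (3/5) * (2/5) * (1/5) * (2/5)    # the same values given No

post_yes = p_yes * like_yes                # unnormalized posterior for Yes
post_no = p_no * like_no                   # unnormalized posterior for No

total = post_yes + post_no
print(post_yes / total, post_no / total)   # ~0.67 vs ~0.33 -> predict "Yes"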
ADVANTAGES AND DISADVANTAGES
Advantages
• It is easy to understand and quick to implement.
• Works well with large datasets and high-dimensional data.
• Performs well with small amounts of training data.
• Handles both continuous and categorical data.
Disadvantages
• Assumes that all features are independent, which is rarely true.
• If a feature value is missing in the training data, it may give zero probability (zero-frequency problem).
• Can be less accurate compared to more complex models on certain datasets.
APPLICATIONS
• Spam detection – Classifies emails as spam or not spam.
• Sentiment analysis – Determines if a review is positive or negative.
• Document classification – Categorizes news, articles, or emails into topics.
• Medical diagnosis – Predicts diseases based on symptoms.
• Credit scoring – Assesses the risk of lending based on customer data.
• Recommendation systems – Suggests products based on user preferences.
• Face recognition – Used in early-stage models for image classification.
SUPPORT VECTOR MACHINE
Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification and regression
tasks. It tries to find the best boundary known as hyperplane that separates different classes in the data. It is useful
when you want to do binary classification like spam vs. not spam or cat vs. dog.
TERMINOLOGIES OF SVM
• Hyperplane: A decision boundary separating different classes in feature space and is represented by the
equation wx + b = 0 in linear classification.
• Support Vectors: The closest data points to the hyperplane, crucial for determining the hyperplane and
margin in SVM.
• Margin: The distance between the hyperplane and the support vectors. SVM aims to maximize this margin
for better classification performance.
• Kernel: A function that maps data to a higher-dimensional space enabling SVM to handle non-linearly
separable data.
• Hard Margin: A maximum-margin hyperplane that perfectly separates the data without misclassifications.
• Soft Margin: Allows some misclassifications by introducing slack variables, balancing margin
maximization and misclassification penalties when data is not perfectly separable.
TERMINOLOGIES OF SVM
• C: A regularization term balancing margin maximization and misclassification penalties. A
higher C value forces stricter penalty for misclassifications.
• Hinge Loss: A loss function penalizing misclassified points or margin violations and is
combined with regularization in SVM.
WORKING OF SVM
1. Plot the Data
Each data point is plotted in n-dimensional space (where n is the number of features). The goal is to
separate the classes using a line (in 2D), a plane (in 3D), or a hyperplane (in higher dimensions).
2. Find the Optimal Hyperplane
SVM tries to find the hyperplane that separates the classes with the largest margin, meaning the
distance from the hyperplane to the nearest data points (support vectors) is maximized.
3. Identify Support Vectors
These are the critical data points closest to the decision boundary. They define the margin and are
used to build the model. Removing them would change the hyperplane.
WORKING OF SVM
4. Maximize the Margin
The margin is the gap between support vectors of different classes. A larger margin means better
generalization on unseen data.
5. Use Kernel Trick (if non-linear)
If the data isn’t linearly separable, SVM applies a kernel function to project the data into a higher-
dimensional space where it becomes linearly separable.
6. Make Predictions
After training, new data points are classified based on which side of the hyperplane they fall on.
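A compact scikit-learn sketch of the whole pipeline; the synthetic data, RBF kernel and C value are illustrative choices:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

scaler = StandardScaler().fit(X_train)   # SVMs are sensitive to feature scales
clf = SVC(kernel="rbf", C=1.0)           # kernel trick for non-linear boundaries
clf.fit(scaler.transform(X_train), y_train)

print("support vectors:", len(clf.support_))                     # points defining the margin
print("test accuracy:", clf.score(scaler.transform(X_test), y_test))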
MATHEMATICAL COMPUTATION OF SVM
The equation for the linear hyperplane can be written as:
w · x + b = 0
Where:
• w is the normal vector to the hyperplane (the direction perpendicular to it)
• b is the offset or bias term, representing the distance of the hyperplane from the origin along the normal vector w
MATHEMATICAL COMPUTATION OF SVM
Distance from a Data Point to the Hyperplane
The distance between a data point x_i and the decision boundary can be calculated as:
d = |w · x_i + b| / ||w||
where ||w|| represents the Euclidean norm of the normal vector w.
ADVANTAGES OF SVM
1. Effective in high-dimensional spaces, even when the number of features exceeds the
number of samples.
2. Works well for clear margin of separation between classes.
3. Memory efficient – only support vectors are used in the decision function.
4. Flexible kernel trick allows SVM to model complex, non-linear decision boundaries.
5. Robust to overfitting in high-dimensional space, especially with proper regularization.
DISADVANTAGES OF SVM
1. Not suitable for large datasets – training time can be high.
2. Poor performance with overlapping classes or noisy data.
3. Requires careful tuning of kernel and hyperparameters (e.g., C, gamma).
4. Not directly probabilistic – needs additional methods to estimate probabilities.
5. Hard to interpret the model (especially with non-linear kernels).
APPLICATIONS OF SVM
1. Used in text classification tasks like spam detection and sentiment analysis.
2. Applied in bioinformatics for cancer diagnosis and protein classification.
3. Useful in image recognition, such as face and handwriting detection.
4. Helps in financial forecasting and fraud detection.
5. Used in industrial systems for fault detection and quality control.
6. Applied in cybersecurity for intrusion detection and malware classification.
THANK YOU