
UNIT-V: ASSESSING A SINGLE CLASSIFICATION ALGORITHM AND TWO CLASSIFICATION ALGORITHMS

| Feature | Supervised Learning | Unsupervised Learning | Reinforcement Learning |
|---|---|---|---|
| Definition | Learns from labeled data | Learns from unlabeled data | Learns by interacting with environment through rewards |
| Goal | Predict outcomes (classification/regression) | Discover hidden patterns or structure | Learn optimal actions to maximize reward over time |
| Input Data | Labeled (input-output pairs) | Unlabeled (only inputs) | States, actions, and rewards from environment |
| Output | Predicted label or value | Groupings, associations, or structure | Policy or value function |
| Examples | Linear regression, SVM, decision trees, neural networks | K-means, PCA, hierarchical clustering, autoencoders | Q-learning, Deep Q-Networks (DQN), policy gradient methods |
| Evaluation | Accuracy, precision, recall, F1-score | Silhouette score, cohesion/separation metrics | Cumulative reward, convergence rate |
| Application Domains | Spam detection, medical diagnosis, stock prediction | Customer segmentation, anomaly detection, data compression | Robotics, game playing, recommendation systems |
| Dependency on Data | High — needs large labeled datasets | Moderate — no need for labels, but quality of insights varies | High — needs many interactions with environment |
| Training Complexity | Generally straightforward, but depends on algorithm | Usually faster, but interpretation can be tricky | Complex, involves exploration vs. exploitation trade-offs |

WHAT IS A SUPERVISED LEARNING ALGORITHM?

A supervised learning algorithm is a type of machine learning algorithm that learns from
labeled training data to make predictions or decisions.

Key Characteristics of Supervised Learning:

• Input: Features (X)
• Output: Labels (Y)
• Goal: Learn a mapping from inputs to outputs, i.e., f(X) ≈ Y

Types of Supervised Learning Tasks:

1. Classification: Predict discrete labels (e.g., spam or not spam)
2. Regression: Predict continuous values (e.g., house price)

Examples of Supervised Algorithms:

• Linear Regression (for regression)
• Logistic Regression (for binary classification)
• Support Vector Machines (SVM)
• Decision Trees / Random Forests
• k-Nearest Neighbors (k-NN)
• Neural Networks

LINEAR REGRESSION

Linear Regression is a supervised learning algorithm used to model the relationship
between a dependent variable (target) and one or more independent variables (features) by
fitting a linear equation.

Equation of Linear Regression:

For Simple Linear Regression (1 feature):

y = mx + b

Where:

• y = predicted output
• x = input feature
• m = slope (coefficient)
• b = intercept

For Multiple Linear Regression (multiple features):

y = b0 + b1x1 + b2x2 + ⋯ + bnxn

Types of Linear Regression:

1. Simple Linear Regression
o One independent variable
o Example: Predicting house price based on area alone
2. Multiple Linear Regression
o Multiple independent variables
o Example: Predicting house price based on area, number of rooms, location, etc.
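A minimal sketch of multiple linear regression using scikit-learn; the house-price numbers below are invented purely for illustration:

```python
# Minimal sketch: multiple linear regression with scikit-learn.
# The house-price data is invented purely for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

# Features: [area in sq. ft, number of rooms]
X = np.array([[1000, 2], [1500, 3], [2000, 3], [2500, 4]])
y = np.array([200000, 280000, 340000, 410000])  # target: price

model = LinearRegression()
model.fit(X, y)  # learns b0 (intercept) and b1, b2 (coefficients)

print("coefficients (b1, b2):", model.coef_)
print("intercept (b0):", model.intercept_)
print("prediction for 1800 sq. ft, 3 rooms:", model.predict([[1800, 3]]))
```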

Advantages:

• Simple and easy to implement
• Efficient training even on large datasets
• Good baseline model for regression problems
• Works well when the data is linearly related

Disadvantages:

• Sensitive to outliers
• Can underfit if the true relationship is non-linear
• Performance drops if features are not scaled or normalized

Logistic Regression:

Logistic Regression is a supervised learning algorithm used for classification tasks. It predicts
the probability that a given input belongs to a particular class (typically binary), using a logistic
(sigmoid) function to map linear combinations of input features to a probability between 0 and
1.

Logistic regression estimates the probability that a data point belongs to a class using the
formula:

p(y = 1 | x) = 1 / (1 + e^(−(b0 + b1x1 + b2x2 + ⋯ + bnxn)))

Where:

• P(y = 1 | x) is the probability of class 1 given features x
• b0 is the intercept; b1, …, bn are the feature weights
• The output is interpreted as a probability, and a threshold (e.g., 0.5) is used to classify
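A minimal sketch of binary logistic regression in scikit-learn on a tiny synthetic dataset; predict_proba exposes the sigmoid probabilities and predict applies the default 0.5 threshold:

```python
# Minimal sketch: binary logistic regression on synthetic 1-D data.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])  # two classes

clf = LogisticRegression()
clf.fit(X, y)

# Probabilities P(y=0|x) and P(y=1|x) from the sigmoid, then the
# class obtained by applying the default 0.5 threshold.
print(clf.predict_proba([[2.0]]))
print(clf.predict([[2.0]]))
```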

Types of Logistic Regression:

1. Binary Logistic Regression
o Two possible classes (e.g., spam vs not spam)
2. Multinomial Logistic Regression
o More than two classes without order (e.g., predicting fruit type: apple, banana, orange)
3. Ordinal Logistic Regression
o More than two ordered categories (e.g., low, medium, high satisfaction)

Advantages:

• Simple and efficient for binary classification
• Probabilistic interpretation of outputs
• Fast to train, even on large datasets
• Works well when the classes are linearly separable
• Can handle nonlinear boundaries with feature engineering

Disadvantages:

• Assumes a linear decision boundary
• Not suitable for complex relationships without transformations
• Sensitive to outliers
• Doesn’t perform well if features are highly correlated (multicollinearity)

SUPPORT VECTOR MACHINE (SVM)

A Support Vector Machine (SVM) is a supervised machine learning algorithm used for
classification and sometimes regression tasks. SVM aims to find the optimal hyperplane that
maximally separates data points of different classes in the feature space.

Key Concept:

• SVM finds a decision boundary (hyperplane) that has the maximum margin between the two classes.
• The support vectors are the data points closest to the hyperplane—they define the margin.

| Aspect | SVM |
|---|---|
| Task | Classification, Regression |
| Linear? | Works for both linear and non-linear data |
| Key Feature | Maximizing margin between classes |
| Sensitive To | Parameter tuning (C, gamma), kernel choice |
| Strengths | High accuracy, handles high-dimensional data well |
| Weaknesses | Computationally expensive, less interpretable |

Types:

1. Linear SVM
o Used when data is linearly separable
o Finds a straight-line hyperplane in 2D, or a flat hyperplane in higher dimensions
2. Non-linear SVM (using Kernel Trick)
o Used when data is not linearly separable
o Transforms data into a higher-dimensional space using kernels to make it separable
3. Support Vector Regression (SVR)
o Applies the SVM principles to regression problems rather than classification
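A minimal sketch contrasting a linear SVM with an RBF-kernel SVM on a toy non-linear dataset; the data, C, and gamma settings are illustrative:

```python
# Minimal sketch: linear vs. RBF-kernel SVM on non-linearly separable data.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=42)

linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)  # kernel trick

print("linear kernel accuracy:", linear_svm.score(X, y))
print("RBF kernel accuracy:", rbf_svm.score(X, y))  # handles the curved boundary
```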

Advantages:

• High accuracy for classification tasks, especially in high-dimensional space
• Works well for both linearly and non-linearly separable data using kernels
• Robust to overfitting, especially in high-dimensional space

Disadvantages:

• Computationally expensive, especially with large datasets
• Not suitable for very large datasets
• Difficult to interpret and tune (e.g., kernel choice, parameters like C and gamma)

NAIVE BAYES

Naive Bayes is a supervised learning algorithm based on Bayes’ Theorem with the naive
assumption that features are independent given the class label. For a class C and features X,
Bayes’ Theorem gives P(C | X) = P(X | C) · P(C) / P(X); the independence assumption lets
P(X | C) factor into a product of per-feature probabilities.

Types of Naive Bayes:

1. Gaussian Naive Bayes
2. Multinomial Naive Bayes
3. Bernoulli Naive Bayes
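A minimal sketch of Multinomial Naive Bayes on a toy spam-detection task; the four-message corpus is invented for illustration:

```python
# Minimal sketch: Multinomial Naive Bayes for text classification.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win a free prize now", "meeting at noon",
         "free cash offer", "lunch tomorrow?"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vec = CountVectorizer()   # bag-of-words counts
X = vec.fit_transform(texts)

nb = MultinomialNB().fit(X, labels)
print(nb.predict(vec.transform(["free prize today"])))  # likely spam
```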

Advantages:

• Fast and efficient even on large datasets
• Performs well in text classification tasks
• Simple and easy to implement
• Handles noise well
• Requires less training data

Disadvantages:

• Poor performance if the independence assumption is violated significantly
• Not suitable for datasets with highly correlated features
• Predictions can be less accurate compared to more complex models (e.g., SVM, Random Forest)

DECISION TREE

A Decision Tree is a supervised learning algorithm used for both classification and
regression tasks. It works by splitting the dataset into subsets based on the feature values to
form a tree-like structure where each node represents a feature, each branch a decision rule,
and each leaf node a final output.

Types of Decision Trees:

1. Classification Trees
o Used when the target variable is categorical
o Example: Classifying an email as spam or not
2. Regression Trees
o Used when the target variable is continuous
o Example: Predicting house price
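A minimal sketch of a classification tree on the Iris dataset, printing the learned rules; the depth limit is an illustrative choice to curb overfitting:

```python
# Minimal sketch: a classification tree with its rules printed as text.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)  # limit depth
tree.fit(iris.data, iris.target)

# Each node tests one feature; each leaf gives a class prediction.
print(export_text(tree, feature_names=iris.feature_names))
```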

Advantages:

• Easy to understand and interpret (tree structure is visual and intuitive)
• Requires little data preprocessing (no need for scaling or normalization)
• Can handle both numerical and categorical features
• Performs well on small to medium-sized datasets
• Non-parametric: makes no assumptions about feature distributions

Disadvantages:

• Prone to overfitting, especially with deep trees
• Unstable: small changes in data can lead to very different trees
• Can create biased trees if some classes dominate
• Less accurate than ensemble methods (e.g., Random Forest, Gradient Boosting)

RANDOM FOREST

Random Forest is an ensemble learning algorithm that builds a collection (a "forest") of
decision trees and combines their outputs to improve overall performance. It is used for both
classification and regression tasks and is known for being more accurate and robust than a
single decision tree.

Random Forest works by:

• Building multiple decision trees during training
• Each tree is trained on a random subset of the data (bagging)
• At each split, it uses a random subset of features
• The final prediction is made by:
o Majority vote (for classification)
o Average prediction (for regression)
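A minimal sketch comparing a single decision tree against a 100-tree forest on the same train/test split; the dataset and settings are illustrative:

```python
# Minimal sketch: bagged forest of trees vs. a single tree.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

single = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

print("single tree accuracy:", single.score(X_test, y_test))
print("random forest accuracy:", forest.score(X_test, y_test))  # typically higher
```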

Advantages:

• High accuracy and performance
• Reduces overfitting compared to single decision trees
• Handles large datasets and high-dimensional feature spaces well
• Robust to noise and outliers
• Can handle missing values to some extent
• Works well with both classification and regression tasks

Disadvantages:

• Slower training and prediction compared to simpler models
• Less interpretable than a single decision tree
• Large memory usage due to multiple trees
• Not ideal for real-time applications where fast inference is needed

K-NEAREST NEIGHBORS (KNN)

K-Nearest Neighbors (KNN) is a supervised learning algorithm used for both classification
and regression. It is an instance-based or lazy learning algorithm, meaning it doesn’t learn a
model during training—instead, it makes predictions based on the closest training examples
in the feature space.

How KNN Works:

1. Choose the number of neighbors k
2. Measure the distance between the test data point and all training data (commonly using Euclidean distance)
3. Select the k closest points
4. For classification: return the most common class among the neighbors
5. For regression: return the average value of the neighbors
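A minimal sketch of KNN with k = 5, scaling the features first since the distance computation is scale-sensitive; the dataset and choice of k are illustrative:

```python
# Minimal sketch: KNN classification with feature scaling.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale features, then vote among the 5 nearest neighbors.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```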

Advantages:

• Simple and intuitive to understand and implement
• No training phase – great for small datasets
• Naturally handles multi-class problems
• Can adapt to complex decision boundaries with enough data

Disadvantages:

• Computationally expensive at prediction time (slow on large datasets)
• Sensitive to irrelevant or redundant features
• Affected by the choice of distance metric
• Poor performance with high-dimensional data (curse of dimensionality)
• Needs feature scaling for good results (e.g., normalization)

UNSUPERVISED LEARNING

Unsupervised Learning is a type of machine learning where the algorithm is trained on
unlabeled data. The goal is to find hidden patterns or structures in the input data without
predefined outputs or target labels.

Key Characteristics:

• No labeled outputs (no "correct answers")
• Focuses on exploring data structure, grouping, or dimensionality reduction
• Often used for clustering, association, and anomaly detection

Common Types of Unsupervised Learning:

1. Clustering
o Groups similar data points together
o Example: K-Means, Hierarchical Clustering, DBSCAN
2. Dimensionality Reduction
o Reduces the number of input variables
o Example: PCA (Principal Component Analysis), t-SNE (a PCA sketch follows this list)
3. Association Rule Learning
o Finds relationships between variables
o Example: Apriori, Eclat (used in market basket analysis)
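As a minimal dimensionality-reduction sketch, PCA can compress the four Iris features into two components; the component count here is an illustrative choice:

```python
# Minimal sketch: PCA reducing 4-D Iris features to 2 components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)  # labels ignored: unsupervised setting

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("reduced shape:", X_2d.shape)  # (150, 2)
print("variance explained per component:", pca.explained_variance_ratio_)
```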

K-MEANS CLUSTERING

K-Means is an unsupervised learning algorithm used for clustering. It partitions a dataset
into K distinct, non-overlapping clusters based on feature similarity. The algorithm groups
data so that points in the same cluster are more similar to each other than to those in other
clusters.

How K-Means Works:

1. Choose the number of clusters K
2. Randomly initialize K centroids
3. Assign each data point to the nearest centroid
4. Recalculate the centroids as the mean of assigned points
5. Repeat steps 3–4 until centroids don’t change significantly (convergence)
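A minimal sketch of K-Means with K = 3 on synthetic blob data; n_init reruns the random initialization several times to reduce the risk of a poor local minimum:

```python
# Minimal sketch: K-Means clustering on synthetic blob data.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

km = KMeans(n_clusters=3, n_init=10, random_state=42)  # K chosen in advance
labels = km.fit_predict(X)  # steps 2-5 run internally until convergence

print("centroids:\n", km.cluster_centers_)
print("first ten cluster assignments:", labels[:10])
```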

Advantages:

• Simple and fast for small to medium-sized datasets
• Scales well to large datasets (especially with Mini-Batch K-Means)
• Efficient and easy to implement
• Works well when clusters are well-separated and spherical

Disadvantages:

• Requires specifying K in advance
• Assumes clusters are spherical and equally sized
• Sensitive to initialization (can converge to local minima)
• Poor performance on non-linear or overlapping clusters
• Not suitable for categorical data without preprocessing
• Sensitive to outliers and noise

GAUSSIAN MIXTURE MODEL (GMM)

A Gaussian Mixture Model (GMM) is an unsupervised learning algorithm used for clustering
and density estimation. It assumes that the data is generated from a mixture of several
Gaussian distributions, each with its own mean and covariance.

The model is typically trained using the Expectation-Maximization (EM) algorithm.

Types of Gaussian Mixture Models:

1. Spherical GMM
o Each component has the same variance in all directions.
2. Diagonal GMM
o Each component has a diagonal covariance matrix (features are uncorrelated).
3. Full GMM
o Each component has a full covariance matrix (features can be correlated).
4. Tied GMM
o All components share the same covariance matrix.
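A minimal sketch of a full-covariance GMM on synthetic data, showing the soft (probabilistic) assignments that distinguish it from K-Means; the parameters are illustrative:

```python
# Minimal sketch: Gaussian Mixture Model fit with EM, giving soft assignments.
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=42)
gmm.fit(X)  # trained internally with the EM algorithm

print("hard labels:", gmm.predict(X[:5]))
print("soft assignments (one probability per component):")
print(gmm.predict_proba(X[:5]))
```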

Advantages:

• Flexible clustering – can model elliptical clusters (unlike K-Means' circular ones)
• Probabilistic approach – gives soft assignments (probabilities of belonging to each cluster)
• Works well when clusters overlap
• More powerful than K-Means for complex distributions

Disadvantages:

• Can be computationally expensive
• Assumes data comes from Gaussian distributions
• Requires specifying number of components (clusters)
• Sensitive to initialization and outliers
• May converge to a local minimum (depends on EM algorithm's initialization)
