Unit 2

Introduction to Modelling and Evaluation

• SYLLABUS: Modelling and Evaluation: Introduction, Selecting a


Model, Training a Model, Model Representation and
Interpretability, Evaluating Performance of a Model, Improving
performance of a Model, Feature subset selection, Dimensionality
Reduction - PCA, SVD, FA, LDA.

Introduction to Modelling and Evaluation
• Modeling and evaluation are two fundamental aspects of the machine learning
process, working hand in hand to create effective predictive models and assess
their performance.
• Modeling:

• Modeling involves the development of mathematical representations or


algorithms that can learn patterns and relationships from data.
• This process typically consists of several steps:

1. Data Preprocessing: Cleaning the data, handling missing values, encoding


categorical variables, and scaling features to prepare the dataset for modeling.
Introduction to Modelling and Evaluation
2. Model Selection: Choosing the appropriate machine learning algorithm or
model architecture based on the nature of the problem (e.g., regression,
classification, clustering) and the characteristics of the data.

3. Training: Fitting the selected model to the training data by adjusting its
parameters to minimize the error or loss function. This is done through
optimization techniques such as gradient descent.

4. Validation: Assessing the performance of the trained model on a separate


validation dataset to fine-tune hyperparameters and prevent overfitting.

5. Testing: Evaluating the final model on a separate testing dataset to estimate


its performance on unseen data and assess its generalization ability.
Introduction to Modelling and Evaluation
6. Evaluation: Evaluation involves assessing the performance of a trained
machine learning model to determine how well it can make predictions or
classifications on new, unseen data.

• Common evaluation metrics depend on the nature of the problem and may
include accuracy, precision, recall, F1-score, mean squared error, or area
under the receiver operating characteristic (ROC) curve.

• The evaluation process typically follows these steps:

1. Data Splitting: Dividing the available data into separate training, validation,
and testing sets.
Introduction to Modelling and Evaluation
2. Performance Measurement: Applying the trained model to the testing dataset
and computing relevant evaluation metrics to quantify its performance.
• This step helps assess the model's accuracy, robustness, and generalization
ability.
3. Comparison: Comparing the performance of different models or variations of
the same model using the same evaluation metrics.
This allows practitioners to select the best-performing model for the given
task and dataset.

4. Interpretation: Interpreting the evaluation results to gain insights into the


strengths and weaknesses of the model.
Introduction to Modelling and Evaluation
• MODELLING ( Advantages & Disadvantages )

• Advantages

1. Prediction and Decision-Making: Models allow for the prediction of outcomes or


decisions based on historical data, enabling informed decision-making.
2. Automation: Once trained, models can automate repetitive tasks, saving time
and effort.
3. Insight Discovery: Models can uncover hidden patterns, relationships, and
trends in data that may not be apparent through manual analysis.
4. Scalability: Many models can scale with larger datasets and computational
resources, accommodating growing needs.
Introduction to Modelling and Evaluation
• Disadvantages

1. Overfitting: Models may capture noise or irrelevant patterns in the training


data, leading to poor generalization to new data.
2. Underfitting: Models may be too simplistic to capture the underlying
complexity of the data, resulting in poor performance.
3. Data Dependency: Models heavily rely on the quality, quantity, and
representativeness of the training data.
4. Interpretability: Some complex models lack interpretability, making it
challenging to understand their decision-making process.
5. Computational Complexity: Training and deploying complex models, especially
deep learning models, can require significant computational resources.
Introduction to Modelling and Evaluation
• MODELLING ( Limitations and Applications )
•Limitations

1. Data Quality: The quality of predictions is directly influenced by the quality of


the training data, including biases and missing information.
2. Assumptions: Many models make assumptions about the underlying data
distribution, which may not always hold true in real-world scenarios.
3. Bias and Fairness: Models can perpetuate the biases present in the training
data, leading to unfair or discriminatory outcomes.
4. Generalization: Models may not generalize well to new, unseen data if they are
not properly validated and evaluated.
Introduction to Modelling and Evaluation
• Applications

1. Predictive Analytics: Models are used for predicting customer behavior, stock
prices, disease outbreaks, and more.
2. Recommendation Systems: ML Models are powerful in personalized
recommendations for products, movies, music, and content.
3. Image and Speech Recognition: Models classify images, transcribe speech, and
enable facial recognition in various applications.
4. Natural Language Processing: Models analyze and generate human language,
facilitating translation, sentiment analysis, chat bots, and more.
Introduction to Modelling and Evaluation
5. Healthcare: Models assist in diagnosis, prediction, treatment
recommendation, drug discovery, and personalized medicine.
6. Finance: Models predict market trends, assess credit risk, detect fraud, and
optimize investment strategies.
7. Autonomous Vehicles: Models enable real- time decision-making for
autonomous vehicles, including obstacle detection, path planning, and
collision avoidance.

8. Climate Modeling: Models simulate climate patterns, predict weather


conditions, and assess the impact of environmental changes.
Selecting a Model in Machine Learning
• Selecting the right model is a crucial step in the machine learning pipeline as it
directly impacts the performance and interpretability of the final solution.

• Here's a detailed guide on how to select a model in machine learning along with
precautions and reasons for each step:

• Step 1: Understand the Problem

• Step 2: Explore Available Models

• Step 3: Data Exploration and Preparation

• Step 4: Model Training and Evaluation

• Step 5: Model Selection and Refinement


Selecting a Model in Machine Learning
• Step 1: Understand the Problem

• Identify the type of machine learning task: Is it


classification (spam vs. not spam), regression (predicting
house prices), or clustering (grouping similar customers) ?
• Define the desired outcome: What do you want the model to
predict or recommend ?
Selecting a Model in Machine Learning
• Example: You want to build a system to predict customer churn (leaving your
service). This is a classification problem where the model needs to predict
whether a customer is likely to churn or not.
• Reasons:

• Clearly define the problem: A well-defined problem helps you choose models
suited for that specific task.

• Consider data limitations: If your data is limited, complex models might


overfit and perform poorly.
Selecting a Model in Machine Learning
• Step 2: Explore Available Models

• Research models commonly used for your chosen task type (e.g., decision
trees, logistic regression for classification).
• Consider factors like interpretability (how easy it is to understand the
model's predictions) and computational cost (training time and resources).
• Example: For customer churn prediction, you might consider decision trees
for their interpretability or random forests for better accuracy.
• Reasons:

• Don't be limited by tradition: Explore newer models that might be better


suited for your specific problem.
• Beware of hype: Not all popular models are suitable for every task.
Selecting a Model in Machine Learning
• Step 3: Data Exploration and Preparation

• Analyze your data for characteristics relevant to model selection (e.g., data
size, feature types, presence of missing values).
• Preprocess your data to ensure it's compatible with the chosen model (e.g.,
scaling numerical features, handling categorical features).

• Example: If you have a lot of missing values, a model like k-Nearest Neighbors
might not be ideal as it relies heavily on complete data points.
Selecting a Model in Machine Learning
• Reasons:

• Data quality matters: Poor quality data leads to poor model


performance, regardless of the chosen model.
• Understand data biases: Biases in data can lead to biased predictions.
Consider techniques to mitigate bias.
Selecting a Model in Machine Learning
• Step 4: Model Training and Evaluation

• Split your data into training and evaluation sets.

• Train multiple models with different configurations and hyperparameters.

• Evaluate each model's performance on unseen data using relevant metrics


(e.g., accuracy, precision, recall for classification).
• Example: Train a decision tree and a random forest model for customer
churn prediction. Evaluate them on a separate test set to see which one
performs better in predicting churn.
Selecting a Model in Machine Learning
• Step 5: Model Selection and Refinement

• Based on evaluation results, choose the model with the best performance on
unseen data.

• Consider refining the chosen model by adjusting hyperparameters or trying
feature engineering techniques.
• Example: If the random forest performs better in predicting customer churn,
you can further refine it by trying different hyperparameter combinations.
Selecting a Model in Machine Learning
• Input variables can be denoted by X, while individual input variables are represented
as X1, X2, X3, …, Xn, and the output variable by the symbol Y.

• The relationship between X and Y is represented in the general form:


• Y = f(X) + e,
• where 'f' is the target function and 'e' is a random error term
• Similar to target function some other functions are used like:
• Cost function (also called error function) helps to measure the extent to which the
model is going wrong in estimating the relationship between X and Y.
• Loss function is almost similar to cost function – only difference being loss function is
usually a function defined on a data point, while cost function is for the entire
training data set.
Selecting a Model in Machine Learning
• Objective function takes in the data and the model (along with its parameters)
as input and returns a value. Hence, machine learning can be framed as an
optimization problem: training searches for the parameter values that
optimize the objective function.

• Machine learning algorithms are broadly of two types:

• 1. models for supervised learning, which primarily focus on solving


predictive problems and,

• 2. models for unsupervised learning, which solve descriptive


problems.
Selecting a Model in Machine Learning
• 1. Predictive models :

• Models for supervised learning or predictive models try to predict certain


value using the values in an input data set.

• The learning model attempts to establish a relation between the target


feature, i.e. the feature being predicted, and the predictor features.

• Some predictive model examples are :

• 1. Predicting win/loss in a cricket match.

• 2. Predicting whether a transaction is fraud.

• 3. Predicting whether a customer may move to another product.


Selecting a Model in Machine Learning
• The models which are used for prediction of target features of categorical value are
known as classification models.

• Some of the popular classification models include k-Nearest Neighbor (kNN), Naïve
Bayes, and Decision Tree.

• Predictive models may also be used to predict numerical values of the target feature
based on the predictor features.

• Some examples are :

• 1. Prediction of revenue growth in the succeeding year

• 2. Prediction of rainfall amount in the coming monsoon

• 3. Prediction of potential flu patients and demand for flu shots next winter
Selecting a Model in Machine Learning
• The models which are used for prediction of the numerical value of the target
feature of a data instance are known as regression models.

• Linear Regression is a popular regression model. (Despite its name, Logistic
Regression outputs class probabilities and is generally used for classification
rather than for predicting numerical values.)

• 2. Descriptive models

• Models for unsupervised learning or descriptive models are used to describe a


data set or gain insight from a data set.

• There is no target feature or single feature of interest in case of unsupervised


learning.
Selecting a Model in Machine Learning
• Descriptive models which group together similar data instances, i.e. data
instances having a similar value of the different features are called clustering
models.

• Examples of clustering include:

• 1. Customer grouping or segmentation on social, demographic, ethnic, etc.

• 2. Grouping of music based on different aspects like genre, language etc.

• The most popular model for clustering is k-Means.


Training a Model in Machine Learning
• Training a model in machine learning is a systematic process that
involves teaching the model to make predictions or decisions based on
input data. Here's a step-by-step guide along with the reasons for each
step:

• Step 1: Data Preprocessing
• Step 2: Split the Data
• Step 3: Choose a Model
• Step 4: Initialize the Model
• Step 5: Train the Model
• Step 6: Evaluate the Model
• Step 7: Hyperparameter Tuning
• Step 8: Model Deployment


Training a Model in Machine Learning
• Step 1: Data Preprocessing:

• Reason: Preprocessing the data ensures that it is clean, consistent, and in a


suitable format for training the model. This step helps in mitigating issues
such as missing values, outliers, and feature scaling disparities, which can
adversely affect model performance.

1. Handle missing values: Impute missing values or remove rows/columns with


missing data.

2. Outlier detection and treatment: Identify and handle outliers to prevent


them from skewing the model's learning process.
3. Feature scaling: Normalize or standardize features to bring them to a similar
scale, avoiding biases in model training.
Training a Model in Machine Learning
• Step 2: Split the Data:

• Reason: Splitting the data into training and testing sets allows for
evaluating the model's performance on unseen data and helps prevent
overfitting.
• Steps:

1. Split the dataset into training and testing sets, typically using a ratio
such as 70-30 or 80-20.

2. Optionally, perform cross-validation on the training set to assess the


model's generalization performance.
Training a Model in Machine Learning
• Step 3: Choose a Model:

• Reason: Selecting an appropriate model architecture or algorithm that


suits the problem domain and data characteristics is crucial for
achieving optimal performance.
• Steps:

1. Consider the problem type (e.g., regression, classification) and choose a


model accordingly.

2. Experiment with different algorithms and architectures, considering


factors such as model complexity, interpretability, and computational
resources.
Training a Model in Machine Learning
• Step 4: Initialize the Model:

• Reason: Initializing the model parameters prepares it for training
and sets the initial conditions for optimization.

• Steps:

• Initialize the model parameters randomly or using predefined


values, depending on the algorithm and architecture.
Training a Model in Machine Learning
• Step 5: Train the Model:

• Reason: Training the model involves iteratively updating its parameters


to minimize a loss function, thereby improving its predictive performance.
• Steps:

1. Feed the training data into the model and obtain predictions.

2. Calculate the loss between the predicted and actual values using a
suitable loss function.

3. Use an optimization algorithm (e.g., gradient descent) to update the


model parameters and minimize the loss.
Training a Model in Machine Learning
• Repeat the process for multiple epochs or until convergence criteria are met.

• Step 6: Evaluate the Model:

• Reason: Evaluating the trained model on the testing set provides insights into

its generalization performance and helps to identify potential issues such as

overfitting or underfitting.

• Steps:

1. Use evaluation metrics such as accuracy, precision, recall, F1-score, or mean

squared error to assess the model's performance.


Training a Model in Machine Learning
2. Compare the model's performance on the testing set with its performance on

the training set to detect overfitting or underfitting.

• Step 7: Hyperparameter Tuning:

• Reason: Fine-tuning the model's hyperparameters helps optimize its
performance and improve its generalization ability.

• Steps:

1. Experiment with different values for hyperparameters such as learning rate,
regularization strength, and network architecture.


Training a Model in Machine Learning
• Use techniques like grid search or random search to explore the hyperparameter
space efficiently.
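
• For instance, a minimal grid search sketch with scikit-learn (the data set and the parameter grid below are illustrative assumptions, not prescriptions):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Hypothetical search space over two hyperparameters
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}

# Exhaustively try every combination, scoring each with 5-fold CV
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best hyperparameters:", search.best_params_)
print("best CV accuracy:", search.best_score_)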
• Step 8: Model Deployment:

• Reason: Deploying the trained model in a production environment allows for


making predictions on new, unseen data and deriving actionable insights.
• Steps:

1. Save the trained model parameters and architecture to disk for future use.

2. Integrate the model into the production system or application, ensuring


compatibility and scalability.
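
• A minimal persistence sketch using scikit-learn with joblib (the model choice and file name are hypothetical):

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Save the trained model to disk for future use
joblib.dump(model, "trained_model.joblib")  # hypothetical file name

# Later, in the production application: reload and predict on new data
restored = joblib.load("trained_model.joblib")
print(restored.predict(X[:5]))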
Training a Model in Machine Learning
• FOR SUPERVISED LEARNING

• 1. Holdout method:

• In case of supervised learning, a model is trained using the labelled input data.
However, how can we understand the performance of the model ?

• The test data may not be available immediately.

• Hence, a part of the input data is held back for evaluation of the model.

• This subset of the input data is used as the test data for evaluating the
performance of the trained model.
Training a Model in Machine Learning
• The method of partitioning the input data into two parts – training and test data –
by holding back a part of the input data for validating the trained model
is known as the holdout method.

• The figure below shows the Holdout method.


Training a Model in Machine Learning
• In certain cases, the input data is partitioned into three portions – a
training and a test data, and a third validation data.

• The validation data is used in place of test data, for measuring the
model performance. It is used in iterations and to refine the model
in each iteration.

• The test data is used only once, after the model is refined and
finalized.
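
• A minimal holdout sketch using scikit-learn's train_test_split (the iris data set and the split ratios are illustrative):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First hold out 20% as the final test set, used only once at the end
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Optionally carve a validation set out of the remaining training portion
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42)  # 0.25 * 0.8 = 20%

print(len(X_train), "train /", len(X_val), "validation /", len(X_test), "test")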
Training a Model in Machine Learning
2. K-fold Cross-validation method

• In k-fold cross-validation, the data set is divided into k completely distinct
(non-overlapping) random partitions called folds.

• Figure below depicts an overall approach for k-fold cross-validation.


Training a Model in Machine Learning
• The value of ‘k’ in k-fold cross-validation can be set to any number.
• However, there are two approaches which are extremely popular:
• 2.a 10-fold cross-validation (10-fold CV)
• 2.b Leave-one-out cross-validation (LOOCV)
• In 10-fold cross-validation, the data set is divided into 10 folds, each
comprising approximately 10% of the data. In each iteration, one of the folds
is used as the test data for validating model performance.
• Training is based on the remaining 9 folds (or 90% of the data).
• This is repeated 10 times, once for each of the 10 folds being used as the
test data and the remaining folds as the training data.
Training a Model in Machine Learning
• The average performance across all folds is reported.

• Figure below depicts the detailed approach of selecting the ‘k’ folds in k-fold
cross-validation.
Training a Model in Machine Learning
• 2.b Leave-one-out cross-validation (LOOCV) : is an extreme case of
k-fold cross-validation using one record or data instance at a time
as a test data.

• This is done to maximize the amount of data used to train the model.

• The number of iterations for which it has to be run is equal to the total
number of data instances in the input data set.

• Hence, obviously, it is computationally very expensive and not used


much in practice.
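
• Both variants can be sketched with scikit-learn (the k-NN classifier and the iris data set are illustrative choices):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, KFold, LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier()

# 10-fold CV: each fold serves as test data once; the other 9 train the model
scores = cross_val_score(model, X, y,
                         cv=KFold(n_splits=10, shuffle=True, random_state=42))
print("10-fold mean accuracy:", scores.mean())

# LOOCV: one instance as test data per iteration (n iterations in total)
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print("LOOCV mean accuracy:", loo_scores.mean())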
Training a Model in Machine Learning
• 3. Bootstrap sampling:

• Bootstrap sampling or simply bootstrapping is a popular way to identify


training and test data sets from the input data set.

• It uses the technique of Simple Random Sampling with Replacement (SRSWR),


which is a well-known technique in sampling theory for drawing random
samples.

• Bootstrapping randomly picks data instances from the input data set, with the
possibility of the same data instance to be picked multiple times.

• Figure below briefly presents the approach followed in bootstrap sampling.


Training a Model in Machine Learning
• This technique is particularly useful in case of input data sets of small size, i.e.
having very few data instances.
• Given an input data set having 'n' data instances, bootstrapping can create one
or more training data sets, each also having 'n' data instances.
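
• A minimal numpy sketch of SRSWR-based bootstrap sampling (the tiny data set is illustrative; treating the never-drawn "out-of-bag" instances as test data is a common companion technique, not something prescribed by the slides):

import numpy as np

rng = np.random.default_rng(42)
data = np.arange(10)          # a small input data set of n = 10 instances

# SRSWR: draw n indices with replacement; duplicate picks are expected
boot_idx = rng.integers(0, len(data), size=len(data))
train = data[boot_idx]

# Instances never drawn form a natural "out-of-bag" test set
test = np.setdiff1d(data, train)

print("bootstrap training sample:", train)
print("out-of-bag test instances:", test)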
Training a Model in Machine Learning
• 4. Lazy vs. Eager learner

• In machine learning, lazy learners and eager learners refer to different


approaches for building models based on training data.

• An eager learner, generalizes the training data during the training phase
itself. It builds a model as soon as it receives the training data, and this model
is used to make predictions.

• When the test data comes in for classification, the eager learner is ready with
the model and doesn’t need to refer back to the training data.
Training a Model in Machine Learning
• Eager learners take more time in the learning phase than the lazy learners.
Some of the algorithms which adopt eager learning approach include Decision
Tree, Support Vector Machine, Neural Network, etc.

• A lazy learner is a learning model that defers the process of generalization until
a query is made.

• In other words, it stores the training data and waits until it receives a query (i.e.,
an input) before performing any computation.

• This means that lazy learners don't build models based on training data until
they are needed for prediction.
Training a Model in Machine Learning
• Lazy learners take very little time in training because not much of
training actually happens.

• One of the most popular algorithms for lazy learning is k-Nearest Neighbor.
Model Representation and Interpretability Machine Learning
• Model representation and interpretability in machine learning refer to how a
trained model captures and represents patterns and relationships in the data
and how understandable and explainable the model's decisions are to humans.
The step-by-step guide with suitable real-time examples is:

• Step 1: Model Representation:

• Step 2: Model Interpretability:

• Step 3: Trade-offs and Considerations:

• Step 4: Validation and Feedback:

• Step 5: Documentation and Communication:


Model Representation and Interpretability Machine Learning
• Step 1: Model Representation:

• Definition: Model representation involves understanding how the model


internally represents the learned patterns and relationships from the
data.
• Steps:

1. Feature Importance: Analyze the importance of different features


in the model's decision-making process.

2. Coefficients (for linear models): Examine the coefficients assigned to


each feature in linear models, indicating their impact on the target
variable.
Model Representation and Interpretability Machine Learning
3. Decision Boundaries (for classification models): Visualize decision
boundaries to understand how the model separates different classes
in the feature space.

• Example: Consider a logistic regression model trained to predict


customer churn in a telecommunications company.

• By examining the coefficients assigned to different customer attributes


(e.g., contract length, monthly charges), we can understand which factors
contribute most significantly to churn prediction.
Model Representation and Interpretability Machine Learning
• Step 2: Model Interpretability:

• Definition: Model interpretability refers to how understandable and


explainable the model's predictions are to stakeholders, such as domain experts
or end-users.
• Steps:

1. Feature Importance Rankings: Rank features based on their contribution to


model predictions, providing insights into which features are most
influential.
2. Partial Dependence Plots: Visualize the relationship between individual
features and the model's predictions while marginalizing over other
features.
Model Representation and Interpretability Machine Learning
3. LIME (Local Interpretable Model-agnostic Explanations): Generate local
explanations for individual predictions by perturbing feature values and
observing changes in predictions.
• Example: Suppose we have a random forest classifier for diagnosing medical
conditions based on patient symptoms.
• By generating partial dependence plots for important features (e.g., blood
pressure, cholesterol levels), we can understand how changes in these features
affect the model's predictions for different conditions.
Model Representation and Interpretability Machine Learning
• Step 3: Trade-offs and Considerations:

1. Trade-offs: Balance between model complexity and interpretability, as more


complex models may sacrifice interpretability for improved performance.
2. Considerations: Consider the audience's expertise level and the importance of
model interpretability in the application domain.
• Example: In finance, a bank may prioritize model interpretability in credit
risk assessment models to comply with regulatory requirements and provide
transparent explanations for loan approval decisions, even if it means using
simpler models with slightly lower predictive accuracy.
Model Representation and Interpretability Machine Learning
• Step 4: Validation and Feedback:

1. Validation: Validate the model's representations and interpretations through
domain expert validation and quantitative assessments.
2. Feedback: Gather feedback from stakeholders to improve the model's
interpretability and ensure that the explanations provided are meaningful
and actionable.
• Example: After deploying a machine learning model for predicting customer
preferences in an e-commerce platform, collect feedback from customer service
representatives to validate the model's feature importance rankings and ensure
they align with their domain knowledge.
Model Representation and Interpretability Machine Learning
• Step 5: Documentation and Communication:

1. Documentation: Document the model's representations and


interpretations, including feature importance rankings, decision rules,
and explanation methods used.
2. Communication: Clearly communicate the model's interpretations to
stakeholders using visualizations, reports, and interactive tools.
• Example: Prepare a detailed report documenting the feature importance
rankings and partial dependence plots of a machine learning model for
predicting stock market trends, and present the findings to investors and
financial analysts in a clear and accessible manner.
Model Representation and Interpretability Machine Learning
• The goal of supervised machine learning is to learn or derive a target function
which can best determine the target variable from the set of input variables.

• Fitness of a target function approximated by a learning algorithm determines


how correctly it is able to classify a set of data it has never seen.

• Model representation involves understanding how the model internally


represents the learned patterns and relationships from the data.

• Some issues that are faced during the model representation are :

• 1. Underfitting
• 2. Overfitting
• 3. Bias–variance trade-off
3.1 Errors due to 'Bias'
3.2 Errors due to 'Variance'


Model Representation and Interpretability Machine Learning
• 1. Underfitting
• If the target function is kept too simple, it may not be able to capture the
essential information and represent the underlying data well.
• A typical case of underfitting may occur when trying to represent non-linear
data with a linear model.
• Reasons for underfitting
1. Many times underfitting happens due to unavailability of sufficient
training data.
2. Underfitting results in both poor performance with training data as well
as poor generalization to test data.
Model Representation and Interpretability Machine Learning
• How to avoid underfitting ?

• Underfitting can be avoided by :

• 1. using more training data, and
• 2. reducing features by effective feature selection.

• 2. Overfitting

• Overfitting refers to a situation where the model has been designed in


such a way that it emulates the training data too closely.

• Any specific deviation in the training data, like noise or outliers, gets
embedded in the model.
Model Representation and Interpretability Machine Learning
• It adversely impacts the performance of the model on the test data.

• Overfitting results in good performance with training data set, but poor
performance with test data set.

• Hence, model behaves with poor generalization.

• Overfitting can be avoided by :

• 1. using re-sampling techniques like k-fold cross validation

• 2. hold back of a validation data set

• 3. removing (pruning) the nodes which have little or no predictive power for the
given machine learning problem.
Model Representation and Interpretability Machine Learning
• The representation of underfitting and overfitting with a sample
data set is shown in figure below:
• The target function, in these
cases, tries to make sure all
training data points are
correctly partitioned by the
decision boundary.
Model Representation and Interpretability Machine Learning
• The figure shows examples in both
regression (top row) and
classification (bottom row).

• Underfitting :

• Regression (top-left): The model is too simple (e.g., a straight line).

• Classification (Bottom-left): The


decision boundary (line separating
classes) is too simple, and it does
not capture the true distribution of
the two classes.
Model Representation and Interpretability Machine Learning
• Balanced fitting :

• Regression (top-middle): The model appropriately captures the
underlying pattern.

• Classification (Bottom-middle): The


decision boundary is complex
enough to separate the two classes
accurately, but not overly complex.
Model Representation and Interpretability Machine Learning
• Overfitting:

• Regression (Top-right): The model


is too complex, fitting every data
point, including noise.

• Classification (Bottom-right): The


decision boundary is overly
complicated, capturing even minor
variations and noise.

• In machine learning, the goal is to achieve a balanced fit, which provides good
performance on both training and unseen test data.
Model Representation and Interpretability Machine Learning
• 3. Bias – variance trade-off

• In supervised learning, the class value assigned by the learning model


based on the training data may differ from the actual class value.

• This error in learning can be of two types – errors due to ‘bias’ and error
due to ‘variance’.

• 3.1 Errors due to ‘Bias’.

• Errors due to bias arise from simplifying assumptions made by the


model to make the target function less complex or easier to learn.

• In short, it is due to underfitting of the model.


Model Representation and Interpretability Machine Learning
• For example, a linear model trying to fit complex, nonlinear data will
have high bias.

• 3.2 Errors due to ‘Variance’

• Variance refers to the model's sensitivity to fluctuations in the training


data.

• A model with high variance tends to overfit the training data, capturing
noise and random fluctuations.

• For example, a very complex model (e.g., a high-degree polynomial) may


fit the training data perfectly, but it won't generalize well to unseen data.
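
• A minimal sketch of both error types, assuming scikit-learn: fitting polynomials of increasing degree to noisy non-linear (illustrative) data shows high bias at degree 1 and high variance at degree 15:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 30)  # noisy non-linear data

X_test = np.linspace(0, 1, 100).reshape(-1, 1)
y_test = np.sin(2 * np.pi * X_test).ravel()                 # noise-free truth

for degree in (1, 4, 15):   # underfit, balanced, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    # High bias: both errors high; high variance: train error low, test error high
    print(degree,
          mean_squared_error(y, model.predict(X)),             # training error
          mean_squared_error(y_test, model.predict(X_test)))   # test error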
Model Representation and Interpretability Machine Learning
• The problems in training a model can happen because either:
• (a) the model is too simple and hence fails to capture the structure of the
data, or
• (b) the model is extremely complex and magnifies even small differences
in the training data.

• Increasing the bias will decrease the variance, and increasing the
variance will decrease the bias.

• The best solution is to have a model with low bias as well as low variance,
which may not always be possible, as shown in the figure below:
Model Representation and Interpretability Machine Learning
Evaluating Performance of a Model in Machine Learning
• Evaluating the performance of a machine learning model is essential to assess
its effectiveness in making predictions or classifications. Here's a detailed
step-by-step procedure with real-time examples:
• Step 1: Define Evaluation Metrics

1. Choose metrics relevant to your specific problem type. Don't rely solely on
overall accuracy!

2. Classification Tasks (e.g., spam filtering, customer churn prediction):

• Accuracy: Proportion of correct predictions (both positive and negative).

• Precision: Ratio of true positives to all predicted positives (how good is the
model at identifying true positives?).
Evaluating Performance of a Model in Machine Learning
• Recall: Ratio of true positives to all actual positives (how good is the model at
finding all the positives?).

3. Regression Tasks (e.g., stock price prediction, house price prediction):

• Mean Squared Error (MSE): Average squared difference between predicted and
actual values (lower MSE indicates better performance).
• R-Squared: Proportion of variance in the target variable explained by the
model (higher R-squared indicates better fit).
• Example: Spam filter evaluation:

1. Accuracy: Tells you the overall percentage of emails correctly classified as spam or
not-spam.
Evaluating Performance of a Model in Machine Learning
2. Precision: Measures how good the filter is at identifying actual spam emails
(avoiding false positives).
3. Recall: Measures how good the filter is at catching all spam emails (avoiding
false negatives).
Example

Confusion Matrix : A confusion matrix is an N x N matrix used for evaluating the


performance of the classification model, where N is the number of target
classes.
This matrix compares the actual target values with those predicted by the
machine learning model.
Evaluating Performance of a Model in Machine Learning
• Consider a binary classifier predicting whether an email is spam or not
spam.

• The classifier made a total of 150 predictions: out of 150 emails, it
predicted "yes" (spam) 100 times and "no" 50 times.
Evaluating Performance of a Model in Machine Learning
• Accuracy: overall correct predictions (both positive and negative).
• Accuracy = (TP + TN) / (TP + TN + FP + FN)
• = (95 + 45) / (95 + 45 + 5 + 5)
• = 93.33%

• Recall (or sensitivity, or true positive rate): Measures how good the
filter is at catching all actual spam emails (avoiding false negatives).
• Recall = TP / actual "yes"
• = 95 / 100 = 95%
Evaluating Performance of a Model in Machine Learning
• Precision: Measures how good the filter is at identifying
actual spam emails (avoiding false positives).

• Precision = TP / predicted "yes"

• = 95 / 100 = 95%

• F1-Score: The harmonic mean of precision and recall,
F1 = 2 × (Precision × Recall) / (Precision + Recall).
The F1-score is especially useful when the dataset is imbalanced, meaning
there are significantly more instances of one class than another.
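
• The metrics for the 150-email example above can be checked with a few lines of Python (TP = 95, TN = 45, FP = 5, FN = 5, as read from the confusion matrix):

# Counts taken from the 150-email confusion matrix above
TP, TN, FP, FN = 95, 45, 5, 5

accuracy  = (TP + TN) / (TP + TN + FP + FN)   # 140 / 150 = 0.9333
recall    = TP / (TP + FN)                    # 95 / 100 = 0.95
precision = TP / (TP + FP)                    # 95 / 100 = 0.95
f1        = 2 * precision * recall / (precision + recall)

print("accuracy:", accuracy, "precision:", precision,
      "recall:", recall, "F1:", f1)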
Evaluating Performance of a Model in Machine Learning

• Step 2: Split Data into Training, Validation, and Test Sets

1. Training Set (Majority): Used to train the model (typically 60-80% of the
data).

2. Validation Set (Optional): Used for hyperparameter tuning during
training (10-20% of the data). Not always used, but helpful for avoiding
overfitting.

3. Test Set (Minority): Used for final evaluation of the trained model's
performance on unseen data (typically 10-20% of the data).
Evaluating Performance of a Model in Machine Learning
• Reasoning: The training set teaches the model, the validation set helps fine-tune
it, and the test set provides an unbiased assessment of its generalizability.

• Step 3: Train the Model

• Use the training data to train your chosen machine learning algorithm.
• Step 4: Evaluate the Model on the Test Set

1. Apply the trained model to the unseen test data.

2. Calculate the chosen evaluation metrics based on the model's predictions and
the actual target values in the test set.
Evaluating Performance of a Model in Machine Learning
• Step 5: Analyze the Results

1. Interpret the evaluation metrics. High accuracy might not be enough!


Consider precision, recall, or other relevant metrics.
2. Look for potential issues:

1. Overfitting: The model performs well on training data but poorly on the
test set.

2. Underfitting: The model is too simple and doesn't capture the underlying
patterns in the data.
Evaluating Performance of a Model in Machine Learning
• Step 6: (Optional) Retrain and Re-evaluate

• Based on the evaluation results, you might need to:

1. Refine the model: Adjust hyperparameters, try feature engineering
techniques, or even explore a different algorithm.

2. Collect more data: If the data is limited, consider gathering more data to
improve model performance.
Improving performance of a Model in Machine Learning
• Achieving optimal performance from a machine learning model is an ongoing
process.

• Improving the performance of a machine learning model involves optimizing


its predictive accuracy, generalization ability, and efficiency.

• A detailed step-by-step procedure with real-time examples is:


• Step 1: Identify Performance Metrics:

• Procedure: Define the performance metrics that are most relevant to the
problem at hand (e.g., accuracy, precision, recall, F1-score, ROC-AUC for
classification; mean squared error, R-squared for regression).
• Explanation: Identifying appropriate performance metrics provides a clear
objective for model improvement.
Improving performance of a Model in Machine Learning
• Step 2: Analyze Model Errors:

• Procedure: Analyze the errors made by the model on the validation/testing set.

• Explanation: Understanding the types of errors (e.g., false positives, false


negatives) helps identify patterns and areas for improvement.

• Step 3: Feature Engineering:


• Procedure: Engineer new features or transform existing ones to capture
more relevant information from the data.
• Explanation: Feature engineering enhances the model's ability to learn
complex relationships and patterns in the data.
Improving performance of a Model in Machine Learning
• Step 4: Hyperparameter Tuning:

• Procedure: Experiment with different hyperparameter settings (e.g., learning
rate, regularization strength, tree depth) using techniques like grid search or
random search.
• Explanation: Optimizing hyperparameters improves the model's performance
and generalization ability.
• Step 5: Algorithm Selection:

• Procedure: Consider alternative algorithms or models that may better suit the
problem domain or data characteristics.
• Explanation: Different algorithms have different strengths and weaknesses,
and switching to a more suitable algorithm can lead to improved performance.
Improving performance of a Model in Machine Learning
• Step 6: Ensemble Methods:

• Procedure: Combine multiple models (e.g., bagging, boosting, stacking) to


leverage their collective predictive power.

• Explanation: Ensemble methods often outperform individual models by


reducing variance and bias, leading to better performance.

• Step 7: Cross-Validation:

• Procedure: Perform cross-validation to obtain more reliable estimates of the


model's performance and generalization ability.
• Explanation: Cross-validation helps assess how well the model generalizes to
unseen data and provides insights into its stability and robustness.
Improving performance of a Model in Machine Learning
• Step 8: Regularization:

• Procedure: Apply regularization techniques (e.g., L1/L2 regularization,


dropout) to prevent overfitting and improve the model's generalization ability.
• Explanation: Regularization penalizes overly complex models, leading to
smoother decision boundaries and better generalization to unseen data.
• Step 9: Data Augmentation (for image/audio data):

• Procedure: Generate additional training examples by applying


transformations such as rotation, flipping, or scaling to the original data.
• Explanation: Data augmentation increases the diversity of the training data,
helping the model learn more robust and invariant features.
Improving performance of a Model in Machine Learning
• Step 10: Model Enrichment (for text data):

• Procedure: Incorporate pre-trained word embeddings (e.g., Word2Vec, GloVe,
BERT) or language models to capture richer semantic information from text
data.
• Explanation: Pre-trained embeddings provide contextualized
representations of words or sentences, improving the model's understanding
of text semantics.
• Real-time Example: Image Classification for Autonomous Vehicles:
Feature subset selection in Machine Learning
1. Feature subset selection in machine learning refers to the process of
identifying and selecting a subset of the most relevant features from the original
set of features in a dataset.
2. The goal is to improve the performance of the model by reducing
dimensionality, minimizing overfitting, and enhancing interpretability.
3. It's like decluttering a closet – we keep the items that we use often (relevant
features) and discard the rest (irrelevant features).

• Why Use Feature subset selection ?

1. Improved Model Performance: Removing irrelevant or redundant features can


lead to simpler models that generalize better to unseen data and potentially
reduce overfitting.
Feature subset selection in Machine Learning
2. Reduced Training Time: Fewer features mean less computation needed to train
the model, saving time and resources.
3. Enhanced Interpretability: With fewer features, it becomes easier to
understand the model's decision-making process.

• Procedure:

1. Data Preparation: Clean and pre-process your data before feature selection.

2. Evaluation Metric: Define an evaluation metric to assess the performance of


different feature subsets (e.g., accuracy, F1-score).
3. Feature Ranking/Selection Techniques: Choose a method to identify relevant
features. Here are some common approaches:
Feature subset selection in Machine Learning
• 3.1 Filter Methods: These methods score each feature independently based on
their statistical properties (e.g., variance, correlation) and select features
exceeding a certain threshold.

• Example: Using a filter method like chi-squared test to identify features that
have a strong correlation with the target variable in a classification task.

• 3.2 Wrapper Methods: These methods evaluate feature subsets based on their
impact on the performance of a machine learning model. They train the model
with different feature subsets and choose the one that yields the best
performance.
Feature subset selection in Machine Learning
• Example: Using a wrapper method like recursive feature elimination (RFE) to
iteratively remove the least informative feature and retrain the model until a
desired performance level is reached.

• 3.3 Embedded Methods: These methods integrate feature selection as part of the
model training process.

• Some algorithms, like decision trees, inherently perform feature selection during
training.

4. Evaluation and Refinement: Evaluate the performance of the model using the
selected features with your chosen evaluation metric. (A minimal sketch of the
filter and wrapper approaches from steps 3.1 and 3.2 follows below.)
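
• A minimal sketch of the filter (3.1) and wrapper (3.2) approaches, assuming scikit-learn and its bundled breast-cancer data set as an illustrative example:

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE, SelectKBest, chi2
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Filter method: score each feature independently (chi-squared) and keep the top 5
X_filtered = SelectKBest(chi2, k=5).fit_transform(X, y)

# Wrapper method: RFE repeatedly drops the least informative feature,
# retraining the model until 5 features remain
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=5)
X_wrapped = rfe.fit_transform(X, y)

print("filter-selected shape:", X_filtered.shape)
print("wrapper-selected shape:", X_wrapped.shape)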
Feature subset selection in Machine Learning
• Advantages:

1. Improved model performance and generalization.
2. Reduced training time and computational cost.
3. Enhanced model interpretability; can help identify important features and
relationships within the data.

• Disadvantages:

1. Choosing the "best" subset can be subjective and depends on the chosen
evaluation metric.
2. Removing relevant features can lead to underfitting and decreased model
performance.
3. Feature selection algorithms can be computationally expensive for large
datasets.
Feature subset selection in Machine Learning
• Limitations:

1. Feature selection methods might not always find the optimal subset.

2. The effectiveness depends on the quality and characteristics of the data.

3. Curse of Dimensionality: Feature subset selection may not be effective in high-
dimensional datasets where the number of features is much larger than the
number of data instances.
4. Model Bias: The choice of feature selection method and search strategy may
introduce bias into the model, affecting its performance.
5. Task Dependency: The effectiveness of feature subset selection depends on the
specific machine learning task and the characteristics of the dataset.
Feature subset selection in Machine Learning
• Applications:

1. Bioinformatics: Identifying relevant genes or protein features for disease


prediction or drug discovery.
2. Healthcare:

• Example: In medical diagnosis, feature subset selection can help identify the
most relevant patient characteristics.
3. Finance:

• Example: In credit scoring, selecting a subset of the most predictive financial


attributes (e.g., credit history, income, debt-to-income ratio) can improve the
accuracy of credit risk assessment models.
Feature subset selection in Machine Learning
4. Image Recognition:

• Example: In image classification tasks, identifying a subset of


discriminative image features (e.g., edges, textures, colors) can reduce
computational complexity and improve the efficiency of convolutional
neural networks.
5. Text Classification:

• Example: In sentiment analysis, selecting a subset of informative words or


n-grams from text documents can enhance the performance of classification
models by focusing on the most relevant textual features.
Principal Component Analysis (PCA) in Machine Learning:
• PCA (Principal Component Analysis) is a dimensionality reduction technique
widely used in machine learning and data analysis.

• Principal Component Analysis (PCA) is used to reduce the dimensionality of a


data set by finding a new set of variables, smaller than the original set of
variables, retaining most of the sample’s information, and useful for the
regression and classification of data.

• Principal Component Analysis (PCA) is a technique for dimensionality


reduction that identifies a set of orthogonal axes, called principal components,
that capture the maximum variance in the data.
Principal Component Analysis (PCA) in Machine Learning:

• The first principal component captures the most variation in the data, and
the second principal component captures the maximum remaining variance
that is orthogonal to the first principal component.

• Principal Component Analysis can be used for a variety of purposes, including


data visualization, feature selection, and data compression.
Principal Component Analysis (PCA) in Machine Learning:
• Why is PCA Important?

• PCA tackles high-dimensional data, datasets with many features. It


transforms the data into a lower-dimensional space while preserving the most
significant information.

• This is crucial because:

• Reduced Complexity: PCA simplifies complex datasets, making them easier to


visualize, analyze, and use for machine learning algorithms.

• Improved Performance: By focusing on the most informative features (principal


components), PCA can lead to better performance in tasks like classification,
clustering, and regression.
Principal Component Analysis (PCA) in Machine Learning:
• Noise Reduction: PCA can remove noise and redundant information from the
data, leading to more robust models.

• Procedure:

1. Data Standardization: Prepare your data by standardizing it. This ensures all
features contribute equally to the analysis.
2. Covariance Matrix Calculation: Calculate the covariance matrix, which
captures the linear relationships between all features in your data.
3. Eigenvalue Decomposition: Decompose the covariance matrix to find its
eigenvectors (principal components) and eigenvalues.
Principal Component Analysis (PCA) in Machine Learning:
• Eigenvectors represent the directions of greatest variance in the data, and
eigenvalues represent the amount of variance explained by each
component.
4. Dimensionality Reduction: Select the top 'n' principal components that
explain the majority of the variance in the data (e.g., 90%). This 'n'
represents the new, lower-dimensional space.
5. Data Transformation: Project the original data onto the selected
principal components, effectively creating a lower-dimensional
representation of the data.
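
• The whole procedure can be sketched in a few lines with scikit-learn (the iris data set and the choice of 2 components are illustrative):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = load_iris().data

# Step 1: standardize so every feature contributes equally
X_std = StandardScaler().fit_transform(X)

# Steps 2-5: PCA handles the covariance matrix, the eigen-decomposition,
# component selection, and the projection internally
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_std)

print("explained variance ratio:", pca.explained_variance_ratio_)
print("reduced shape:", X_reduced.shape)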
Principal Component Analysis (PCA) in Machine Learning:
• Advantages of PCA:

1. Reduced Complexity: Makes data easier to visualize, analyze, and


interpret.

2. Improved Model Performance: Can lead to better results in machine


learning tasks.

3. Reduced Training Time: Lower-dimensional data requires less time and


computational resources to train models.
4. Reduced Noise Sensitivity: Less susceptible to noise in the data.
Principal Component Analysis (PCA) in Machine Learning:

• Disadvantages of PCA:
1. Information Loss: PCA discards information by eliminating less
significant principal components.

2. Interpretability Loss: Principal components might be harder to


interpret than original features, as they are linear combinations of the
original features.
3. Assumes Linear Relationships: Works best when features have linear
relationships. If the relationships are highly non-linear, PCA might not
be as effective.
Principal Component Analysis (PCA) in Machine Learning:
• Applications of PCA:
PCA's versatility makes it a valuable tool across various domains:
1. Image Compression: Reduce image size for storage and transmission
while preserving essential details (e.g., compressing photos for faster
web loading).
2. Anomaly Detection: Identify outliers and data points that deviate
significantly from the norm (e.g., detecting fraudulent transactions in
financial data).
3. Recommendation Systems: Recommend products or services based on
user preferences in a lower-dimensional space (e.g., suggesting movies
to users on streaming platforms).
Principal Component Analysis (PCA) in Machine Learning:

4. Natural Language Processing (NLP): Reduce the dimensionality of
text data for tasks like sentiment analysis or topic modeling (e.g.,
analyzing customer reviews to understand overall sentiment).
5. Feature Engineering: Create new features (principal components)
that capture the most important information in the data,
potentially improving machine learning model performance.
Singular Value Decomposition (SVD) in Machine Learning
• Singular Value Decomposition (SVD) is a powerful mathematical technique
used in various fields, including machine learning and data analysis.

• It decomposes a matrix into its underlying building blocks, revealing hidden


patterns and relationships within the data.

• Importance of Singular Value Decomposition (SVD):

1. Dimensionality Reduction: Like PCA (Principal Component Analysis), SVD


can be used to reduce the dimensionality of data while preserving important
information.
2. Data Compression: SVD can compress data by identifying the most
significant components, making it useful for storage and transmission.
Singular Value Decomposition (SVD) in Machine Learning
3. Noise Reduction: SVD can help remove noise and redundancy from the
data, improving the quality and interpretability of the data.
4. Pattern Recognition: SVD excels at uncovering hidden patterns and
relationships within complex datasets, aiding tasks like anomaly detection
and recommendation systems.
Procedure of SVD:
Singular Value Decomposition (SVD) is a powerful mathematical tool that
factorizes a given matrix into three matrices: a unitary matrix, a diagonal
matrix of singular values, and the (conjugate) transpose of another unitary matrix.
Singular Value Decomposition (SVD) in Machine Learning
• Matrix Representation: Represent your data as a matrix, where rows represent
data points and columns represent features.

• Decomposition: Decompose the matrix into three matrices:

• The SVD of an M×N matrix A is given by the formula A = U Σ Vᵀ

• where:

• U: the left singular matrix, whose orthonormal columns span the data's
column space (the columns of U are the eigenvectors of AAᵀ).

• Σ (Sigma): a diagonal matrix containing the singular values, representing the
importance of each basis vector.
Singular Value Decomposition (SVD) in Machine Learning
• Vᵀ (V transposed): the right singular matrix, whose orthonormal columns
(of V) span the data's row space (the columns of V are the eigenvectors of AᵀA).

• Dimensionality Reduction (Optional): Select a subset of the singular values
(those with the largest values) and reconstruct a lower-dimensional
approximation of the original matrix.

• What is the difference between SVD and PCA?

• SVD is a general matrix factorization that diagonalizes any matrix into
special matrices that are easy to manipulate and to analyze, while PCA can
be viewed as applying SVD to the mean-centered data matrix to find the
directions of maximum variance.

Singular Value Decomposition (SVD) in Machine Learning
• Advantages of SVD:

1. Versatility: Applicable to various data types (numerical, text) and tasks


(dimensionality reduction, noise reduction, pattern recognition).
2. Computationally Efficient: Efficient algorithms exist for calculating
SVD, making it suitable for large datasets.
3. Noise Reduction: Can effectively remove noise and redundancy from the
data.
4. Pattern Recognition: Excellent at uncovering hidden patterns and
relationships within complex datasets.
Singular Value Decomposition (SVD) in Machine Learning
• Disadvantages of SVD:

1. Interpretability: The singular values and basis vectors might not


be directly interpretable in human terms, requiring additional
analysis.
2. Computational Cost: While SVD is efficient, it can still be
computationally expensive for very large matrices.
3. Limitations in non-linear relationships : SVD assumes that the
relationships between features are linear, which can limit its
ability to capture complex non-linear relationships.
Singular Value Decomposition (SVD) in Machine Learning
• Applications of SVD:

1. Recommendation Systems: Identify hidden relationships between users and


items to recommend relevant products or content.
2. Image Compression: Compress images by discarding less significant singular
values, reducing file size while preserving key features.
3. Anomaly Detection: Detect data points that deviate significantly from the
patterns captured by the singular values, potentially indicating anomalies or
outliers.
4. Natural Language Processing (NLP): Analyze and understand text data by
uncovering latent semantic relationships between words and documents.
Singular Value Decomposition (SVD) in Machine Learning
5. Signal Processing: Separate signals from noise in audio or video
data by exploiting the different properties captured by the singular
values.

• Example:

• Let's take a simple 2×2 matrix A.

• For a given matrix A, the SVD is expressed as:

• A = U Σ Vᵀ
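
• Since the worked numbers are easiest to follow in code, here is a minimal numpy sketch of the factorization (the 2×2 matrix values are illustrative, not taken from the slides):

import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])            # a simple 2x2 matrix (illustrative values)

# numpy returns U, the singular values, and V transposed, so that A = U @ Sigma @ Vt
U, S, Vt = np.linalg.svd(A)
Sigma = np.diag(S)                    # place the singular values on a diagonal

print("U =\n", U)
print("Sigma =\n", Sigma)
print("V transposed =\n", Vt)
print("reconstruction U Sigma Vt =\n", U @ Sigma @ Vt)  # recovers A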
Singular Value Decomposition (SVD) in Machine Learning
from sklearn.datasets import load_iris
from sklearn.decomposition import TruncatedSVD

# Load the iris dataset (4 features per sample)
iris = load_iris()
X = iris.data

# Perform truncated SVD, keeping only the top 2 components
svd = TruncatedSVD(n_components=2)
X_reduced = svd.fit_transform(X)

# Print the shape of the reduced dataset
print("Shape of reduced dataset:", X_reduced.shape)
FA (Factor Analysis) in Machine Learning:
• What is Factor Analysis?

• Factor analysis is a statistical technique used in machine learning and data


analysis to uncover latent variables (hidden factors) that explain the observed
correlations between multiple measured variables.

• These latent variables represent underlying patterns or constructs that cannot


be directly observed but influence the observed data.
• Importance of Factor Analysis:

1. Dimensionality Reduction: Factor analysis simplifies complex datasets by


reducing the number of variables to a smaller set of latent factors, making
data easier to analyze and visualize.
FA (Factor Analysis) in Machine Learning:
2. Improved Model Performance: By focusing on the underlying factors that
drive the data, factor analysis can lead to better performance in machine
learning tasks like classification, clustering, and regression.
3. Enhanced Understanding: It helps to identify the underlying factors that
influence the observed data, providing insights into the relationships between
variables.
• Procedure of Factor Analysis:

1. Data Preparation:

• Ensure your data is clean, consistent, and properly pre-processed.
Handle missing values and outliers appropriately.
FA (Factor Analysis) in Machine Learning:
2. Correlation Matrix Calculation:

• Calculate the correlation matrix, which captures the linear relationships


between all pairs of variables.
3. Factor Extraction:

• Choose a factor extraction method, such as Principal Component Analysis


(PCA), Maximum Likelihood, or Exploratory Factor Analysis (EFA) to
identify the latent factors.
4. Factor Rotation:

• Factors might be rotated to improve interpretability. Rotation doesn't change


the underlying structure but redistributes the variance explained by each
factor.
FA (Factor Analysis) in Machine Learning:
5. Model Evaluation and Interpretation:

• Assess the model's fit using goodness-of-fit measures and evaluate the
interpretability of the extracted factors. You might need to refine the number
of factors or extraction method based on the results.
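
• A minimal sketch of factor extraction with scikit-learn's FactorAnalysis (the iris data set and the choice of two latent factors are illustrative assumptions):

from sklearn.datasets import load_iris
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

# Standardize the four observed variables before extraction
X = StandardScaler().fit_transform(load_iris().data)

# Extract two latent factors from the four observed variables
fa = FactorAnalysis(n_components=2, random_state=0)
X_factors = fa.fit_transform(X)

# Loadings: how strongly each observed variable relates to each factor
print("factor loadings:\n", fa.components_)
print("transformed shape:", X_factors.shape)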

• Advantages of Factor Analysis:

1. Dimensionality Reduction: Simplifies complex data and improves model


efficiency.

2. Improved Model Performance: Can lead to better results in various machine


learning tasks.

3. Enhanced Understanding: Uncovers hidden relationships and factors influencing


data.
FA (Factor Analysis) in Machine Learning:
4. Flexibility: Applicable to various data types and tasks.

• Disadvantages of Factor Analysis:

1. Subjectivity: Choosing the number of factors and interpreting them can be


subjective.

2. Assumptions: Relies on assumptions about the underlying structure of the data,


such as linearity and multivariate normality.
3. Limited Interpretability: Extracted factors might not always be easily
interpretable in real-world terms.
FA (Factor Analysis) in Machine Learning:
• Applications of Factor Analysis:

1. Psychometrics: Identifying latent traits like personality or intelligence from


psychological test scores.
2. Finance: Analyzing market factors that influence stock prices or financial risk.

3. Social Sciences: Exploring underlying factors influencing social phenomena or


survey responses.
4. Marketing: Understanding customer segmentation and preferences based on
underlying factors.
5. Image Processing: Identifying key components in images or videos using
factor analysis on pixel data.
Linear Discriminant Analysis (LDA)
• What is Linear Discriminant Analysis(LDA) in Machine Learning?

• In machine learning, Linear Discriminant Analysis (LDA) is a supervised learning


technique used for classification tasks.
• Why Use It?

1. Classification: LDA is specifically designed to classify data points into predefined


categories.

2. Dimensionality Reduction: LDA can also reduce the number of features


(dimensions) in your data while preserving the information most relevant for
classification.
3. Interpretability: Unlike some complex models, LDA results in a linear equation,
making it easier to understand how features contribute to the classification.
Linear Discriminant Analysis (LDA)
• How Does LDA Work?

1. Data Preparation: Ensure your data is clean and pre-processed, with labels
indicating the class for each data point.
2. Maximizing Class Separation: LDA finds a linear transformation that
maximizes the separation between the means of different classes in the data.
This helps to create a clear distinction between the classes.
3. Minimizing Within-Class Variance: Simultaneously, LDA aims to minimize the
variance within each class.
This ensures that data points within a class are tightly clustered, further
enhancing the separation between classes.
Linear Discriminant Analysis (LDA)
4. Dimensionality Reduction: LDA can project the data onto a lower-
dimensional space while maintaining the class separability.
This can be beneficial for reducing computational cost and
improving model performance in some cases.
5. Classification: New data points are projected onto the transformed
space and classified based on the learned decision boundary
between classes.
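
• A minimal classification-plus-projection sketch with scikit-learn (the iris data set and the 70-30 split are illustrative choices):

from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# LDA both classifies and projects onto at most (n_classes - 1) dimensions
lda = LinearDiscriminantAnalysis(n_components=2)
lda.fit(X_train, y_train)

print("test accuracy:", lda.score(X_test, y_test))
print("projected shape:", lda.transform(X_test).shape)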
Linear Discriminant Analysis (LDA)
• Advantages of LDA:

1. Effective Classification: LDA performs well for tasks where classes are linearly
separable.

2. Dimensionality Reduction: Simplifies data and potentially improves model


performance.

3. Interpretability: Provides insight into how features influence the classification.

• Disadvantages of LDA:

1. Linearity Assumption: LDA assumes a linear relationship between features,


which might not hold true for all datasets. Non-linear relationships can lead
to suboptimal performance.
Linear Discriminant Analysis (LDA)
2. Small Datasets: LDA might not be ideal for small datasets as it relies on
estimating class means and variances.

3. High-Dimensional Data: In high-dimensional settings, LDA might struggle


to find the optimal linear separation, potentially leading to overfitting.

• Limitations of LDA:

1. Sensitivity to Outliers: Outliers can significantly impact the calculation of


class means and variances, affecting LDA's performance.
2. Class Imbalance: LDA might not perform well when class distributions are
skewed (unequal number of data points in each class).
Linear Discriminant Analysis (LDA)
• Applications of LDA:

1. Facial Recognition: Classifying faces based on features like eyes, nose, and
mouth.

2. Spam Filtering: Identifying spam emails based on text content and other
features.

3. Image Classification: Classifying images like handwritten digits or different


types of objects.

4. Bioinformatics: Classifying genes or protein sequences based on their properties.

5. Text Classification: Classifying documents or emails into categories like spam,


news, or business.
